Semiformalishmaybe

Statistical Software Components

A few months ago I mentioned my big library of useful generic C/Perl functions (libpgunn). There are plenty of other general-purpose libraries out there, one of them being glib (which, as I understand it, started as part of libGTK back when GTK was just a toolkit for implementing the GIMP, so they weren't writing right on top of Xlib). Solid, widely used functions for generally useful things advance the art of programming. Sometimes these libraries are slightly esoteric: BLAS and LAPACK are the commonly used APIs for linear algebra, widely used enough that Intel and a number of other hardware/software vendors have made custom versions of them to show off their hardware or platform.

One of the things I'd really like to see is an API for doing exploratory statistics(1). Let's say you have an interactive "realm", operators you can apply in it, and a certain state in that realm. Let's further imagine you have a function that classifies results into rough categories. It'd be nice to have a set of functions that let you learn that realm while exploring it, automatically building competing models of it and homing in on better ones as you keep exploring. Yes, this sounds kind of airy; we'd initially need to pare it down to reasonable subproblems, but it'd be broadly useful, and I think that after implementing enough simplified versions and seeing how people use them, we'd know how to progress upwards in capacity. It's a big problem, but it's tractable.
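To make the realm/operators/classifier idea a bit less airy, here is a minimal sketch in Python of about the smallest subproblem I can think of. Everything in it is made up for illustration (the names `FrequencyModel`, `explore`, and `run_toy_realm`, the toy realm of integers mod 6, the "even"/"odd" classifier); it is not an existing library or anyone's real API. Each competing model just counts which category it has seen for a given context, and the models differ only in how much context they pay attention to.

```python
import random
from collections import Counter, defaultdict


class FrequencyModel:
    """Hypothetical model: predicts the most common category seen per context."""

    def __init__(self, name, key_fn):
        self.name = name
        self.key_fn = key_fn  # maps (state, operator name) to a context key
        self.counts = defaultdict(Counter)
        self.hits = 0
        self.trials = 0

    def predict(self, state, op_name):
        seen = self.counts[self.key_fn(state, op_name)]
        return seen.most_common(1)[0][0] if seen else None

    def update(self, state, op_name, category):
        # Score the prediction first, then learn from the observation.
        self.trials += 1
        if self.predict(state, op_name) == category:
            self.hits += 1
        self.counts[self.key_fn(state, op_name)][category] += 1

    def accuracy(self):
        return self.hits / self.trials if self.trials else 0.0


def explore(state, operators, classify, models, steps, rng):
    """Apply random operators, let every model learn, return the best scorer."""
    for _ in range(steps):
        op_name = rng.choice(sorted(operators))
        new_state = operators[op_name](state)
        category = classify(new_state)
        for model in models:
            model.update(state, op_name, category)
        state = new_state
    return max(models, key=lambda m: m.accuracy())


def run_toy_realm(steps=500, seed=0):
    # Toy realm (an assumption for this sketch): states are integers mod 6.
    operators = {
        "inc": lambda s: (s + 1) % 6,
        "double": lambda s: (2 * s) % 6,
    }

    def classify(s):
        return "even" if s % 2 == 0 else "odd"

    # Competing models, from least to most context.
    models = [
        FrequencyModel("global", lambda s, op: None),
        FrequencyModel("operator", lambda s, op: op),
        FrequencyModel("state+operator", lambda s, op: (s, op)),
    ]
    best = explore(0, operators, classify, models, steps, random.Random(seed))
    return models, best


if __name__ == "__main__":
    models, best = run_toy_realm()
    for model in models:
        print(f"{model.name}: {model.accuracy():.3f}")
    print("best model:", best.name)
```

In this toy realm the outcome is fully determined by the state and the operator together, so the model that keys on both should pull ahead once it has seen each combination once. A real version would need smarter exploration than picking operators at random, and models richer than lookup tables, but the shape of the API (state, operators, a classifier, and a pool of scored models) is the part I care about.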

  • Note 1: I am using the term "exploratory statistics" more loosely than most people do.
