So I've made some progress in understanding what I want to do with dataframes; that can be seen in the RHO package on my github page (and can be downloaded from there). There is an example.lisp file and a unittests.lisp file there that should give an idea of what functionality I am looking for. I'm still exploring this a bit more, and have not finalized my thoughts. But the typing works, and next there will need to be a summary-generic written for each typed column, so that general summarization makes sense.
Construction of dataframes from vectors and from combinations of data frames and vectors is what I am thinking about right now, since I'm currently very interested in so-called meta-analytic datasets (combinations and compositions of self-coherent datasets into larger entities, in an approach which records the potential for incoherencies across the datasets).
I am not sure that the end result of RHO will be used for Common Lisp Statistics, so I've not added it to quicklisp (and please do not request such yet) as it still might make more sense to move the functionality, once I understand it a bit better, to one of the other packages that we've discussed for holding dataframes.
BUT, to clarify, I want to insist that every row of a dataset is independent (or conditionally independent) of the other rows, and that data structures (elements in a column) must have a means for summarizing down to a vector of summary statistics (possibly ordered by so-called relevance). That vector could be a single quantity (the value itself, in the case of numeric data) or multiple quantities (number of nodes, number of vectors, connectivity index in the case of a graph, or Cmax/Tmax/etc. in the case of a kinetic profile).
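To make the summarization requirement concrete, here is a minimal sketch of how a per-type summary could look with a generic function. The names SUMMARIZE, SUMMARIZE-COLUMN and the toy GRAPH structure are hypothetical and are not taken from RHO; the graph summary uses node and edge counts only, as a simplification.

```lisp
;; Hypothetical sketch: SUMMARIZE and GRAPH are illustrative names, not RHO's API.
(defgeneric summarize (element)
  (:documentation "Return a vector of summary statistics for ELEMENT."))

;; Numeric data summarizes to itself.
(defmethod summarize ((element number))
  (vector element))

;; A toy graph type standing in for a structured column element.
(defstruct graph
  (nodes '())
  (edges '()))

;; A graph summarizes to several quantities (here: node count, edge count).
(defmethod summarize ((element graph))
  (vector (length (graph-nodes element))
          (length (graph-edges element))))

(defun summarize-column (column)
  "Apply SUMMARIZE to every element of COLUMN (any sequence); return a list."
  (map 'list #'summarize column))
```

With this shape, supporting a new column type would only mean adding one more method on SUMMARIZE, which is roughly what I mean by a summary-generic per typed column.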
From that, different numerical matrices and criteria functions (for optimization) would be created depending on the data analysis desired, la-de-dah, etc.
But basically, I'm just providing a middle-of-progress update.
I've asked a question on comp.lang.lisp which I'd also like to ask here (no need to answer in both places :)
I'm making progress on the dataframes package (thanks Marco A for the seed code that is slowly evolving!). It's going well, except that one optional feature would be to be able to find all variables in a particular package which currently hold data (i.e. point to a place) of a particular type.
(yes, I know I've got the technical details and names wrong -- educational corrections welcome, thanks...)
;; I.e. I'd like to be able to
(defparameter my-s (make-strand ....))
(defparameter my-df (make-data-frame ....))
;;
(find-all-data-variables) ; => (my-s my-df)
;; What I thought I could do is something like:
(defun find-all-data-variables (&key (pkg *package*))
  (let ((lst ()))
    (do-symbols (s pkg)
      (if (typep s 'STRAND)
          (push s lst))
      (if (typep s 'DATA-FRAME)
          (push (data-frame-column-names s) lst)))
    lst))
;; but I think I am getting confused between variables and places (this is not
;; the first time, and it will not be the last time, I think...).
Is what I am doing possible, or am I just making an error in concept somewhere?
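One possible reading of the confusion: TYPEP above is applied to the symbol itself, which is always of type SYMBOL, rather than to the value the symbol is bound to. A minimal sketch of the intended behaviour follows, assuming the variables are global (special) ones created with DEFPARAMETER or DEFVAR, since SYMBOL-VALUE cannot see lexical bindings. The STRAND and DATA-FRAME structures here are hypothetical stand-ins for the real types.

```lisp
;; Hypothetical stand-ins for the real STRAND and DATA-FRAME types.
(defstruct strand (data '()))
(defstruct data-frame (columns '()))

(defun find-all-data-variables (&key (pkg *package*))
  "Return the symbols whose home package is PKG and whose global value
is a STRAND or a DATA-FRAME."
  (let ((home (find-package pkg))
        (result '()))
    (do-symbols (s home)
      ;; DO-SYMBOLS also visits inherited symbols, so keep only those
      ;; that live in HOME, and test the bound value, not the symbol.
      (when (and (eq (symbol-package s) home)
                 (boundp s)
                 (typep (symbol-value s) '(or strand data-frame)))
        ;; PUSHNEW because DO-SYMBOLS may visit a symbol more than once.
        (pushnew s result)))
    result))
```

After (defparameter my-s (make-strand)) and (defparameter my-df (make-data-frame)), a call to (find-all-data-variables) should return a list containing MY-S and MY-DF, in no particular order.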
best,
-tony