Hi Mike,

Thank you for the quick feedback. I think that:

a) It is a good idea. However, matrix/array operations should only produce matrices/arrays. Otherwise we get complicated cases: for example, what should the result of matrix multiplication of two datasets be?
b) Columns should have unique names. The question is how to treat conflicts: automatically rename the duplicate column, or raise an exception?
c) It is a must-have feature.
d) I don't see the advantage of it if we treat a dataset as a matrix and provide a comprehensive enough API.
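To make the two options in (b) concrete, here is a minimal sketch of both conflict policies. It is written in Python purely as an illustration and is not core.matrix or Incanter code: the function name `resolve_column_names`, the `on_conflict` parameter, and the `-1`, `-2` suffix scheme are all hypothetical, and column names are assumed to be strings.

```python
from typing import List, Sequence


def resolve_column_names(names: Sequence[str], on_conflict: str = "raise") -> List[str]:
    """Return the column names with duplicates handled by the chosen policy.

    on_conflict="raise"  -> fail on the first duplicate name
    on_conflict="rename" -> keep the first occurrence, suffix later ones
    (Both policy names are hypothetical, chosen for this sketch.)
    """
    if on_conflict not in ("raise", "rename"):
        raise ValueError(f"unknown policy: {on_conflict!r}")

    seen = set()
    result = []
    for name in names:
        if name not in seen:
            seen.add(name)
            result.append(name)
            continue
        if on_conflict == "raise":
            raise ValueError(f"duplicate column name: {name!r}")
        # Rename: try name-1, name-2, ... until the candidate clashes neither
        # with a name already emitted nor with any original name.
        n = 1
        candidate = f"{name}-{n}"
        while candidate in seen or candidate in names:
            n += 1
            candidate = f"{name}-{n}"
        seen.add(candidate)
        result.append(candidate)
    return result


renamed = resolve_column_names(["a", "b", "a", "a"], on_conflict="rename")
```

Raising keeps the behaviour explicit and leaves the decision to the caller; renaming is more forgiving when loading messy data, but the resulting names depend on column order.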
On Saturday, July 5, 2014 7:37:06 PM UTC+2, Mike Anderson wrote:

Hi Alexandr,

I think it would be good to define exactly what we mean by a dataset as an abstraction. The definition would help us to be more precise about what should / shouldn't be in the API.

For example, we might define a dataset as something that has the following properties:

a) It can be treated as a 2D array / matrix
b) It has (uniquely?) named columns
c) It supports heterogeneous data types as columns
d) It is seqable? (producing a sequence of rows - core.matrix can't guarantee that in general, but we could make it a requirement for datasets...)

That may or may not be the right definition....
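As a rough illustration of what the four candidate properties would require of a value, the check below tests them against plain data. It is a Python sketch, not a proposal for the core.matrix API: the function name `check_dataset` and the list-of-rows representation are assumptions made only for this example.

```python
from typing import Any, Hashable, Sequence, Tuple


def check_dataset(column_names: Sequence[Hashable],
                  rows: Sequence[Sequence[Any]]) -> Tuple[int, int]:
    """Check the candidate dataset properties; return the (rows, columns) shape."""
    # (b) column names must be unique
    if len(set(column_names)) != len(column_names):
        raise ValueError("duplicate column names")
    # (a) every row needs exactly one value per column, so the data is a 2D array
    for i, row in enumerate(rows):
        if len(row) != len(column_names):
            raise ValueError(
                f"row {i} has {len(row)} values, expected {len(column_names)}")
    # (c) values are deliberately not type-checked: columns may differ in type
    # (d) `rows` is already a sequence of rows, so it can be iterated directly
    return (len(rows), len(column_names))


names = ["name", "age", "active"]
rows = [["Ann", 31, True], ["Bob", 27, False]]
shape = check_dataset(names, rows)
first_row = dict(zip(names, rows[0]))  # one row viewed as a name -> value map
```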
On Saturday, 5 July 2014 18:07:09 UTC+1, Aleksandr Sorokoumov wrote:

Hi all,

As part of the Incanter and core.matrix integration process, there is an idea to evolve the existing dataset type in core.matrix and use it in Incanter. In order to do that, the Incanter dataset functions should be implemented in core.matrix.

Can you please look at my user API proposal and share your opinion on it? As it should be used instead of the existing Incanter dataset API, most of the functions are copied from it.

Thanks.
--
You received this message because you are subscribed to the Google Groups "Numerical Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to numerical-cloj...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Regarding the column names - do we continue to support Incanter's approach and allow both keywords and strings? Or limit them to only one type?

Hi,

Dispatch dataset function - good idea, it's easy to extend to new types later...
A thought occurs to me that we may want two different dataset implementations:
a) Dataset that stores a column vector for each column
b) Dataset that wraps an arbitrary array / matrix and just adds column names
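
The difference between the two layouts can be sketched as follows. This is a Python illustration only, not core.matrix code: the class names `ColumnDataset` and `WrappedDataset`, the methods `column` and `rows`, and the use of a list of rows as a stand-in for an arbitrary matrix are all assumptions made for the example.

```python
from typing import Any, Dict, List, Sequence


class ColumnDataset:
    """Option (a): stores one separate vector per column."""

    def __init__(self, columns: Dict[str, Sequence[Any]]):
        if len({len(col) for col in columns.values()}) > 1:
            raise ValueError("all columns must have the same length")
        self.column_names = list(columns)
        self._columns = {name: list(col) for name, col in columns.items()}

    def column(self, name: str) -> List[Any]:
        # Direct lookup: the column is already stored as its own vector.
        return self._columns[name]

    def rows(self) -> List[List[Any]]:
        # Rows are assembled on demand by zipping the column vectors.
        return [list(r) for r in zip(*(self._columns[n] for n in self.column_names))]


class WrappedDataset:
    """Option (b): wraps an existing 2D array and only adds column names."""

    def __init__(self, column_names: Sequence[str], matrix: Sequence[Sequence[Any]]):
        if any(len(row) != len(column_names) for row in matrix):
            raise ValueError("each row must have one value per column name")
        self.column_names = list(column_names)
        self._matrix = matrix  # kept as-is, not copied

    def column(self, name: str) -> List[Any]:
        # The column is extracted from the wrapped array by index.
        j = self.column_names.index(name)
        return [row[j] for row in self._matrix]

    def rows(self) -> List[List[Any]]:
        return [list(row) for row in self._matrix]


by_column = ColumnDataset({"name": ["Ann", "Bob"], "age": [31, 27]})
wrapped = WrappedDataset(["x", "y"], [[1.0, 2.0], [3.0, 4.0]])
```

In this sketch, (a) is the natural fit for heterogeneous columns and cheap column access, while (b) leaves the underlying array untouched, which may suit purely numeric data that is also used in matrix operations.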
--
With best wishes, Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)
Skype: alex.ott