I thought I'd start a new thread since the subject is different. I've been thinking about how best to implement factors and missing values. Missing values are, I think, relatively easy. There was an earlier discussion on this (under a thread named 'Julia'), and I've got it mostly done, using nil as the symbol for missing values. There is room for improvement, but it's enough for a start.
For factors, I looked at the factor implementation in Rho and noticed that a sequence is used to enumerate the possible levels. My thought was to use symbols for factor levels, with each symbol's value indicating ordering or type; simple and easy to implement. That won't enumerate all possible values, but I can't see any use case for knowing the possible values other than error checking. If you are reading data from an external source, you probably won't know all the possibilities unless the data source happens to provide them. Rho also has a bit vector for 'used levels'; any idea what use case might be behind that?
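
To make the comparison concrete, here is a rough sketch of the enumerated-levels approach, written in Python purely for illustration. It is not Rho's code and not what I've implemented; the names Factor, from_values and used_levels are made up for the example, and None stands in for nil as the missing-value marker. The used_levels method is only a guess at what a 'used levels' bit vector might record: which of the declared levels actually occur in the data.

```python
from dataclasses import dataclass


@dataclass
class Factor:
    """Hypothetical factor: an ordered list of levels plus integer codes."""
    levels: list  # enumerated levels, in order
    codes: list   # index into levels per observation; None marks a missing value

    @classmethod
    def from_values(cls, values, levels=None):
        if levels is None:
            # Levels not supplied (e.g. external data): collect them in
            # order of first appearance, skipping missing values.
            levels = list(dict.fromkeys(v for v in values if v is not None))
        index = {level: i for i, level in enumerate(levels)}
        # A value outside the declared levels raises KeyError here,
        # which is the error-checking use of enumerating levels.
        codes = [None if v is None else index[v] for v in values]
        return cls(list(levels), codes)

    def used_levels(self):
        """Return one boolean per level: True if that level occurs in the data."""
        used = [False] * len(self.levels)
        for code in self.codes:
            if code is not None:
                used[code] = True
        return used


f = Factor.from_values(["low", "high", None, "low"],
                       levels=["low", "medium", "high"])
print(f.codes)          # [0, 2, None, 0]
print(f.used_levels())  # [True, False, True]
```

With symbols alone, the codes and the used/unused distinction disappear, which is simpler; the sketch just shows what would be given up, namely rejecting unknown values and telling declared-but-absent levels apart from present ones.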