Interesting ideas.
I've been using the PICK delimiters within the Anji text editor, which means I end up with illegal Unicode characters. My original workaround was to let the user switch between Unicode and Latin-1 (ISO 8859-1). That scheme gave the user full visibility of either UTF-8 or the delimiters, but not both at the same time. It works well for storing the data in the database, but not so well when the data travels around to browsers etc.
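For what it's worth, a quick way to see why the delimiters and UTF-8 can't coexist: the classic PICK marks sit at 0xFC through 0xFF, and none of those bytes can appear anywhere in well-formed UTF-8, while Latin-1 decodes them happily, which is why the switch works. A minimal Python check (my own illustration, nothing to do with Anji's internals):

```python
# The four classic PICK mark characters, all in the 0xFC-0xFF range.
PICK_MARKS = {
    0xFF: "segment mark",
    0xFE: "attribute mark",
    0xFD: "value mark",
    0xFC: "subvalue mark",
}

for byte, name in PICK_MARKS.items():
    raw = bytes([byte])
    try:
        raw.decode("utf-8")
        verdict = "valid UTF-8"
    except UnicodeDecodeError:
        verdict = "rejected by UTF-8"
    # Latin-1 maps every byte to a character, so this never fails.
    print(f"0x{byte:02X} ({name}): {verdict}; Latin-1 -> {raw.decode('latin-1')!r}")
```

All four come back "rejected by UTF-8", so any record containing a mark is un-decodable as UTF-8 no matter where the mark falls.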
As I rewrite that bit, I'll store metadata in the database to indicate which delimiters are used natively, and convert to your scheme for display/editing.
Time to come up with a name for this scheme. Perhaps EDE for Exodus Delimited Encoding?
Taking this further, I often hit problems with data that already contains delimiters when I want to store it within an existing record structure. I have to convert all of its delimiters so they suit the containing record. This is normal, and I'm sure you've hit it before. It's getting more frequent now that I'm processing data from XML sources that often has 5 (or more) levels of nesting.
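To make the conversion problem concrete, here's a hypothetical sketch (my own placeholder code, not from Exodus or anything else): embedding an existing dynamic array one level deeper means demoting every mark in it, and with a fixed three-level hierarchy you soon run out, which is exactly where five-deep XML bites.

```python
AM = "\xFE"   # attribute mark
VM = "\xFD"   # value mark
SVM = "\xFC"  # subvalue mark

def demote(record: str) -> str:
    """Shift each mark in `record` down one level so it can be nested."""
    if SVM in record:
        raise ValueError("record already uses subvalue marks: no level left")
    # Demote the lower level first so newly-created marks aren't re-demoted.
    return record.replace(VM, SVM).replace(AM, VM)
```

So demote("NAME\xFEVAL1\xFDVAL2") gives "NAME\xFDVAL1\xFCVAL2", the same data one level down; but a record already using all three marks has nowhere left to go.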
I've been contemplating an alternative approach, where there are just three delimiters: one to indicate the start of a lower level, one to separate data items, and a third to indicate a return to a higher level. This should allow an unlimited number of levels, and remove the need for the delimiter conversion I described above. I've not taken this any further yet, but would be interested in hearing your thoughts on it.
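A rough sketch of how I imagine the three-delimiter idea working, with printable stand-ins (the [, |, ] characters are just placeholders for illustration; a real scheme would pick three reserved characters). Because the nesting is structural rather than positional, an existing record can be embedded untouched, just wrapped in the down/up pair:

```python
DOWN, SEP, UP = "[", "|", "]"  # descend a level, separate items, ascend

def serialise(items) -> str:
    """Encode nested lists of strings using the three delimiters."""
    return SEP.join(
        DOWN + serialise(i) + UP if isinstance(i, list) else i
        for i in items
    )

def parse(text: str):
    """Decode back into nested lists (inverse of serialise)."""
    stack = [[]]
    buf = ""
    after_list = False  # previous item was a just-closed sub-list

    def flush():
        nonlocal buf, after_list
        if not after_list:          # a closed sub-list is already stored
            stack[-1].append(buf)
        buf, after_list = "", False

    for ch in text:
        if ch == SEP:
            flush()
        elif ch == DOWN:
            child = []
            stack[-1].append(child)  # the sub-list is itself an item
            stack.append(child)
            buf, after_list = "", False
        elif ch == UP:
            flush()
            stack.pop()
            after_list = True
        else:
            buf += ch
    flush()
    return stack[0]
```

With these stand-ins, parse("a|[b|c]|d") gives ["a", ["b", "c"], "d"], and any depth round-trips through serialise/parse with no conversion step at all.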
--
Ashley Chapman