Of course, there is the cfhttp csv import thing, but that requires a
public URL, and requires knowing about it (someone wondering "is there a
csv function?" isn't likely to find it).
Why doesn't CFML have a function that accepts either a filename or a
string (like XmlParse) and other suitable arguments
(headers, delimiters, qualifiers) and can be used to provide robust and
reliable importing, so people can stop trying to roll their own using
the List* functions?
Like this:
CsvParse, returns query
Input, required string (filename, url or text)
Delimiter, string default ','
TextQualifier, string default '"'
FirstRowAsHeaders, boolean default true
Columns, optional string (list of colnames)
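To make the argument list above concrete, here is a rough sketch of the semantics, written in Python rather than CFML (purely because its csv module makes the behaviour easy to pin down). The name csv_parse, the list-of-dicts return value standing in for a query, the column_N default names, and the rule that an explicit Columns list overrides the header row are all assumptions for illustration, not anything an engine implements. It also only accepts text, not a filename or URL.

```python
import csv
import io


def csv_parse(text, delimiter=",", text_qualifier='"',
              first_row_as_headers=True, columns=None):
    """Parse delimited text into a list of row dicts (a stand-in for a query)."""
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=delimiter, quotechar=text_qualifier)
    rows = [row for row in reader if row]  # csv yields [] for blank lines; skip them
    if not rows:
        return []

    if first_row_as_headers:
        header, rows = rows[0], rows[1:]
    else:
        # Assumed naming scheme when there is no header row.
        header = ["column_%d" % (i + 1) for i in range(len(rows[0]))]

    if columns is not None:
        # Assumption: an explicit column list overrides the header row's names.
        header = [name.strip() for name in columns.split(",")]

    result = []
    for row in rows:
        padded = row + [""] * (len(header) - len(row))  # short rows get empty cells
        result.append(dict(zip(header, padded)))  # extra cells are dropped by zip
    return result


sample = 'name,notes\nAnn,"likes a, b"\nBob,\n'
for record in csv_parse(sample):
    print(record)
```

Tab-delimited input would just be csv_parse(text, delimiter="\t"). What should happen when Columns and FirstRowAsHeaders are both given is exactly the kind of detail worth settling before raising tickets.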
So yeah, those seem obvious choices, but if anyone might have better
ideas, feel free to give feedback.
After any discussion, I'll go raise feature requests for all three
engines with whatever turns out to be most popular.
(It's such a trivial and yet useful function, I'd hope they'll all
just accept the idea and add it to their next releases.)
So yeah, anyone think I'm crazy here, or have any ideas about this?
--
CFML Conventional Wisdom
http://groups.google.com/group/cfml-conventional-wisdom?hl=en?hl=en
Heh, well that's good, and also having a toCsv makes sense.
I can't help but think "readcsv" is the wrong name though - so far as
consistency with rest of CFML goes.
Having delimiter/qualifier last probably makes more sense than the
order I had above though.
Hmm, no qualifier argument for toCsv?
Todd Rafferty wrote:
> With Railo, you can make this a built in function by creating a UDF and
> shoving it into one of the magic directories and you're done.
Yep, and that's great, but why I'm posting here (instead of responding
to the recent railo post about this) is that this needs to be the same
for all engines, and part of the documentation for all engines.
I'd also be proposing for it to be Core CFML if there was a hint of a
way how to do that...?
Write up the function and submit it as a patch in Jira. Explain that OpenBD already has it built in.
On Jun 28, 2010 11:23 AM, "Peter Boughton" <boug...@gmail.com> wrote:
Matthew Woodward wrote:
> Clearly you're not crazy. ;-)
>
MD
Mark Drew
Railo Technologies UK
Professional Open Source
skype: mark_railo
email: ma...@getrailo.com
gtalk: ma...@getrailo.com
tel: +44 7971 85 22 96
web: http://www.getrailo.com
--
We bubbled up these functions because we had them for the SpreadSheet
and cfhttp functionality. It seemed a shame not to set them free.
I would normally agree, except that the *basic* delimited-text format
is well enough defined that a single solution will work for 90% of all
cases and that currently, people build custom solutions by looping
with lists and keep running into the same problems over and over:
empty cells, qualified cells (removing the qualifiers) and embedded
commas within the qualified cells (resulting in two cells when it
should have been one).
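A small sketch of those three problems, in Python rather than CFML as a neutral illustration. naive_list_split is a hypothetical helper that mimics looping with list functions, assuming the default list behaviour of skipping empty elements; parse_line uses Python's csv module as a stand-in for a proper parser.

```python
import csv


def naive_list_split(line, delimiter=","):
    """Mimic list-function looping: split on the delimiter, drop empty elements."""
    return [cell for cell in line.split(delimiter) if cell != ""]


def parse_line(line, delimiter=",", qualifier='"'):
    """Parse one line with a real CSV parser (Python's csv module)."""
    return next(csv.reader([line], delimiter=delimiter, quotechar=qualifier))


line = '1,,"Smith, John",42'

# Empty cell lost, qualifiers kept, embedded comma splits one cell into two.
print(naive_list_split(line))  # ['1', '"Smith', ' John"', '42']

# Four cells, as intended.
print(parse_line(line))        # ['1', '', 'Smith, John', '42']
```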
It's true that different methods for quoting fields, escaping those
quotes, etc. might exist. But the vast majority of the time you're
dealing with comma-separated, double-quote-qualified files. Being
able to change just those characters allows for basic tab-delimited
files as well. More complex than that, and you're back to custom
code. However, an entry-level programmer could use CsvRead() or
equivalent. I wouldn't expect them to write a *correct* version
themselves, though. I've had to explain to many a developer to
pre-format their lists by replacing ",," with something like ",NULL,"
and then converting back later... and that was ignoring qualified
fields (since they didn't exist in that project... yet).
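For what it's worth, even that placeholder trick is easy to get wrong. A minimal sketch in Python (not CFML; pad_empty_cells is a hypothetical helper name): a single replace pass misses runs of empty cells because matches cannot overlap, and leading or trailing empty cells need separate handling.

```python
def pad_empty_cells(line, placeholder="NULL"):
    """Fill empty cells with a placeholder so list functions keep the column count."""
    padded = line
    # One pass turns 'a,,,b' into 'a,NULL,,b', so repeat until no ',,' remains.
    while ",," in padded:
        padded = padded.replace(",,", "," + placeholder + ",")
    # Empty first and last cells are not caught by the ',,' pattern at all.
    if padded.startswith(","):
        padded = placeholder + padded
    if padded.endswith(","):
        padded = padded + placeholder
    return padded


print("a,,,b".replace(",,", ",NULL,"))  # a,NULL,,b  (single pass: still broken)
print(pad_empty_cells("a,,,b"))         # a,NULL,NULL,b
print(pad_empty_cells(",a,"))           # NULL,a,NULL
```

This still ignores qualified fields (a quoted cell containing ",," would be corrupted), which is the point: the workaround only goes so far.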
It's perfectly valid to say they should use cflib or somesuch... but
the CSV functions on cflib are *also* bad, or at least they were the
last time I looked. This one seems okay
http://cflib.org/udf/CSVtoArray but it needs an argument for the field
delimiter (easy enough). Actually, I'm sure this wasn't there last
time I looked, because it even seems to handle embedded commas. The
CSVToQuery function doesn't look like it handles qualified cells.
Really, this is more of a "library" function than a core language
function. Perhaps some sort of distinction between core/logic
functions, "Library" functions and "UI" functions would be useful?
Probably not.
I agree this could be seen as a "library function" rather than "core
logic", but I (and hopefully others) are using "core" based on the
CFML Advisory Committee terms - Core/Extended/Vendor-specific.
Using these three categories, I'd say it fits into the "Core"
definition rather than the other two.
I want it added to "Core" because that means all (compliant) engines
are required to implement it (whereas "Extended" just says, "you
should implement it, like this", and Vendor-specific means it can be
implemented differently by any/all engines).
It would be a shame for something so basic as CSV to not be
implemented uniformly across the engines.
Oops. Yes, sorry. I didn't mean to confuse the terminology. That's
the definition of "core" that should be used by this list.
So, what I'll do (this evening) is create a proposed set of functions
and test cases and submit them to this list for review/comments/etc.
If the OBD team go "yay, we love it", then all is good. Otherwise we
can perhaps discuss/vote on the differences with an aim of coming up
with something everyone (on this list, at least) is happy with, and
then proceed as you said.
Sound good?
--
Well, the order is the first thing - we've got FileRead, ImageCrop, XmlNew, etc.
(Even though I would prefer <verb><object>, consistency with existing
functions is important.)
And I'd pick "parse" over "read" because its primary function is
parsing data - same as XmlParse - whether a file is read before data
parsed is less relevant here.
(I put this in a different category to the
FileRead/ImageRead/SpreadsheetRead/etc type of functions, where their
primary function is reading the files).
> The CSV stuff is brand new, so feel free to create tickets for things you'd
> like to see changed:
> http://code.google.com/p/openbluedragon/issues/list
Yep, I'll certainly do that, (once we've got a consensus).
I'll do likewise on the Railo Jira, and wherever that ACF bugtracker is.
> This is one of the reasons this list was created, because there are no
> public submission and discussion outlets for the CFML Advisory Committee. I
> believe there are representatives from all the engines on this list though.
Good, I hoped that was the case. :)
Though it would be nice to have an official channel, (or at least
acknowledged/unofficial) - any chance of getting this group
specifically mentioned on the OpenCFML wiki?
Yes we can do that.
Consider that OFFICIAL! ;)
That's fine and dandy (I had to enable javascript to get the full
effect, but those are beautiful docs!), I'm more than happy to let
other people name stuff and whatnot. That's the hard part! =)
I was more wondering about, like, maybe using the same underlying java
libraries or something along those lines.
BTW we'll probably want to add escape chars and newline stuff, neh?
At least I exposed those in mine, and they were useful.
HSQLDB has some freakishly nice ways of dealing with CSV files, just
to toss that out there as well.
:Den
--
The incarnation is true, not of Christ exclusively, but of Man
universally, and God everlastingly.
James Martineau
To be clear, this is an "in general" type of question, versus a CSV
specific one.
Should the engines each write their own implementations of
functionality X, or /lean/ that way at least?
There are pros and cons to each approach... compatibility being at the
heart of it.
:Den
--
The pinafore of the child will be more than a match for the frock of
the bishop and the surplice of the priest.
James Martineau
Anyway, you're basically asking "should the engines re-invent the
wheel", and the obvious answer is probably not!
One of the key benefits to Open Source is not wasting effort re-doing
things that have already been done.
Unless there is a specific reason/benefit in re-writing a particular
functionality, the default case for adding functionality should be to
find a suitably licensed existing project and integrate it in a clean
CFML way.
That way, we have more time to spend on progressing other things!
While that sounds logical enough, I think CFML is one of the absolute
worst as far as Not Invented Here syndrome goes. =)
We're slowly bucking that trend, is the good news. At least I think we are.
The only argument, specifically for leveraging java libs vs writing
our own java, is things like classloader issues. Theoretically Java 7
will have some magic for dependency management, but I'm not holding my
breath. :)
So we can avoid library "issues" by rolling our own(s), but then the
incompatibility is shifted into the actual engines, which, IMO, is
probably harder to manage.
Either way, the various engine folks should probably try to
collaborate on included java libraries (also IMO) to some extent.
Eh. *shrug*
Thanks for the reply, Peter! No need for apologies, I feel this stuff
is "evolution", and will happen one way or another. :)
:Denny
--
Democracy is the road to socialism.
Karl Marx