Andreas
How are you getting the data out of FLEx into Excel? It only takes a
couple of seconds to delete extra columns in Excel, so I'm wondering
what other work/process you are going through to 'sanitize' your data...
Craig.
I'm inclined to agree with Mark about exporting as CSV. I think that there would be many views of the data that can be displayed in
FLEx which users would like to be able to deal with in a spreadsheet or table. Not many users will be able to deal with XML
themselves, so they are dependent on others providing tools.
An "Export" option that exports the current view in real CSV would be useful. It would seem to be something that would be simple to
implement, though no doubt there are problems like how to deal with multiline fields, and repeating the entry data for each sense.
This could be useful for gathering statistics, or any kind of analysis that FLEx doesn't (yet) do. I'm sure that more uses would be
found for the feature if it were there.
Another idea: if we export to CSV, it would perhaps be good to be able to import from CSV too.
However, I like using spreadsheets so this is a fairly biased view. It would be good to know whether there are many linguists that
would use the feature before adding it.
FWIW.
David.
Heidi Rosendall
Wycliffe Nigeria
Robert
----- Original Message -----
From: "David Baines JarMail" <david_...@sil.org>
To: <flex...@googlegroups.com>
Sent: Thursday, June 26, 2008 4:00 PM
Subject: [FLEx] Re: Browse view: allow us to select the data in oen column
I'm not sure that I've followed your question...
Exporting as CSV shouldn't make any assumptions, it should just export the data in the columns of the current view. When the CSV
file is opened in a spreadsheet it should look similar to the data in FLEx.
All of the data to be exported is already in a table. The aim of the export would be to facilitate the transfer of that view into a
spreadsheet.
So for Lexicon Edit, it would only be the data in the Entries pane that is exported. For Lexicon - Browse, Bulk Edit Entries, Bulk
Edit Senses, and Bulk Edit Reversal Entries, the data is already in a tabular format and it shouldn't be difficult to export it.
I imagine that the option wouldn't be available for the Dictionary since it isn't already in a tabular form.
It would be good to provide the option for all the tabular data in FLEx, including most of the tools in the Grammar area and the
concordance tools: wherever there is data already in a table.
In Bulk Edit Senses List Choice, on the development version, one can use cut and paste to get the data from the table into a
spreadsheet. In this case, entry information, such as Headword, is repeated for each sense. Unfortunately this doesn't work in
Bulk Edit Entries, or Lexicon Edit. The cells are not respected and data ends up in the wrong column.
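To make the "entry information is repeated for each sense" layout concrete, here is a rough sketch in Python. The `entries` structure and the column names are made up for illustration; they are not FLEx's real data model or export code.

```python
import csv
import io

# Hypothetical in-memory data; FLEx's real model is much richer than this.
entries = [
    {"headword": "bank", "senses": [
        {"gloss": "river edge", "pos": "n"},
        {"gloss": "financial institution", "pos": "n"},
    ]},
    {"headword": "run", "senses": [{"gloss": "move fast", "pos": "v"}]},
]

def rows_per_sense(entries):
    """Yield one row per sense, repeating the entry-level headword each time."""
    for entry in entries:
        for sense in entry["senses"]:
            yield [entry["headword"], sense["gloss"], sense["pos"]]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Headword", "Gloss", "Category"])
writer.writerows(rows_per_sense(entries))
csv_text = buffer.getvalue()
print(csv_text)
```

An entry with two senses simply becomes two rows that share the same Headword value, which is how a field that occurs several times in one entry can still fit in a flat table.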
Does this answer your question?
David B.
I assume that CSV is like a spreadsheet, so you would only have each field once.
If this is so, my concern would be that in a dictionary, not a simple
wordlist, you can have the same field many times in one dictionary entry. I
can't see how this would work with a spreadsheet. But perhaps I don't
understand CSV and need educating.
Robert
PS did you get my zip sent via YouSendIt ?
Robert
----- Original Message -----
From: "David Baines" <david_...@sil.org>
To: <flex...@googlegroups.com>
Sent: Friday, June 27, 2008 11:17 AM
But there are a few tricky points:
1. How do we delimit fields? Strict CSV uses commas...but data could well
contain commas, confusing things. At least some CSV programs allow quotes
to be put around data containing commas...but data could contain quotes,
too. Then you start wondering how different possible clients escape quotes.
On the whole, I would be inclined to actually delimit with tab characters
between fields. FieldWorks data can't have tab characters embedded, so this
is unambiguous, and some (most?) CSV programs, certainly Excel, can also
handle tab-delimited. Of course we could make it configurable, but the more
complex you make the task, the less likely it gets done.
2. What do we do about multiple paragraphs in a cell? Some of our views
show multiple values on separate lines within a cell (e.g., sense glosses
if looking at entries, or multiple semantic domains). We don't want
newlines in a CSV format except one at the end of each record. Would it
make sense to use a slash or a vertical bar to separate 'lines' within a
cell? Any better ideas?
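As a rough illustration of points 1 and 2, here is a small Python sketch that writes the same record both ways. It uses Python's standard `csv` module for the comma version and a plain join for the tab version. The function names, the sample record, and the " | " separator are only assumptions for the example, not anything FieldWorks does today.

```python
import csv
import io

def flatten_cell(lines, separator=" | "):
    """Join the separate 'lines' shown in one cell into a single-line value."""
    return separator.join(lines)

def to_csv_line(cells):
    """Comma-delimited: the csv module quotes cells containing commas or quotes."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(cells)
    return buffer.getvalue()

def to_tab_line(cells):
    """Tab-delimited, no quoting: relies on cells never containing tabs or newlines."""
    for cell in cells:
        if "\t" in cell or "\n" in cell:
            raise ValueError("cell contains a tab or newline")
    return "\t".join(cells) + "\n"

# Invented sample record: headword, two glosses in one cell, one semantic domain.
record = [
    flatten_cell(["say"]),
    flatten_cell(['to utter "words"', "to state, declare"]),
    flatten_cell(["3.5.1 Say"]),
]

csv_line = to_csv_line(record)
tab_line = to_tab_line(record)
print(csv_line, end="")
print(tab_line, end="")
```

In the comma version the gloss cell has to be wrapped in quotes with the inner quotes doubled, while the tab version leaves the data untouched. The vertical bar has the same weakness as the comma if the data itself can contain one, so the separator would probably need to be checked or made configurable.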
Being able to import is much harder. Some issues that immediately come to
mind:
- Will imported entries replace existing ones? All the existing ones, or
just ones with the same headword? Does 'same headword' mean lexeme form,
citation form, or what? Does it include homograph number? Or will all the
imported entries be additional, and it will be up to the user to merge
unwanted homographs? Or should we merge automatically (perhaps only if the
imported entry is identical)?
- How do we tell how the columns in an imported file correspond to fields
in the FW database? How do we even know whether the CSV represents one line
per entry or one per sense?
- How do we match up sense data in an Entry import (e.g., if one column has
a list of glosses and another a list of semantic domains, which domains
belong to which senses)? For that matter, how do we even know how multiple
glosses corresponding to multiple senses are delimited?
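The sense-matching question can be shown with a few lines of Python. This sketch assumes, purely for illustration, that multiple values in a cell are separated by a vertical bar and that values pair up by position; the function names and sample values are invented.

```python
def split_cell(value, separator="|"):
    """Split a multi-valued cell into trimmed parts; an empty cell gives no parts."""
    return [part.strip() for part in value.split(separator)] if value else []

def pair_senses(gloss_cell, domain_cell):
    """Pair glosses with semantic domains by position, refusing to guess on a mismatch."""
    glosses = split_cell(gloss_cell)
    domains = split_cell(domain_cell)
    if len(glosses) != len(domains):
        raise ValueError(
            f"{len(glosses)} glosses but {len(domains)} domains: cannot tell which belongs to which"
        )
    return list(zip(glosses, domains))

pairs = pair_senses("river edge | money house", "1.3 Water | 6.8 Finance")
print(pairs)
```

Positional pairing only works when every sense has exactly one value in each column. As soon as one sense has two domains, or none, the counts differ and the importer has no way to recover the original structure.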
The basic problem is that the row-oriented data represents a selection and
simplification of what is really in FLEx. To import it we have to
regenerate the internal complexity. Being able to import anything we can
export might be too hard a goal for any time soon.
That said, there are probably some simple cases we could fairly easily
handle, such as assuming a single entry/sense per line and importing to an
empty database. And that much would be helpful for some situations.
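For what it's worth, the simple case might look something like the following Python sketch: tab-delimited input, one entry with one sense per line, an explicit column mapping, and a refusal to import into a non-empty database. `COLUMN_MAP`, `import_simple`, and the field names are hypothetical and do not correspond to actual FieldWorks fields or APIs.

```python
import csv
import io

# Hypothetical mapping from spreadsheet column headers to field names.
COLUMN_MAP = {"Headword": "lexeme_form", "Gloss": "gloss", "Category": "pos"}

def import_simple(text, existing_entries):
    """Import tab-delimited text, treating each line as one entry with one sense."""
    if existing_entries:
        raise ValueError("this sketch only imports into an empty database")
    entries = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        # Keep only the columns we know how to map; ignore the rest.
        fields = {COLUMN_MAP[col]: value for col, value in row.items() if col in COLUMN_MAP}
        entries.append({
            "lexeme_form": fields.get("lexeme_form", ""),
            "senses": [{"gloss": fields.get("gloss", ""), "pos": fields.get("pos", "")}],
        })
    return entries

sample = "Headword\tGloss\tCategory\nbank\triver edge\tn\nrun\tmove fast\tv\n"
imported = import_simple(sample, existing_entries=[])
print(imported)
```

Restricting the import to an empty database sidesteps the replace-or-merge questions, and the explicit column mapping sidesteps guessing which column is which, at the cost of the user having to supply it.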
John Thomson