Tim Piganelli asked:
> Has anyone heard of a tool that will extract the text out of a DAT file?
> I know you can get Multi-Page Text files exported out of Concordance
> once the DAT file is loaded. I am trying to bypass loading the data into
> Concordance and try to get the Text field "parsed" out and put into a
> text file.
>
> Any ideas?
Tim,
I routinely use simple AWK and Perl scripts to do this. Perl and other modern scripting languages (AWK, Python, Ruby, PowerShell) excel at simple text and file processing tasks like this one: extracting the plain text of whole documents embedded in a character-separated values (CSV) file (e.g., a Concordance DAT file) and writing each document's text to its own file on the file system, with a sensible name derived from data in the same CSV file (e.g., the beginning Bates number).
There's a commercial software application called TextPipe Pro that you can use to accomplish the same task. However, TextPipe Pro is a GUI application, so it doesn't support standard I/O streams (stdin/stdout/stderr, redirection and pipelines), and it has a steep learning curve. To do useful things with it, you must use filters, and these filters are written in VBScript and JScript, scripting languages that do *not* excel at text processing tasks compared to Perl and similar languages. In my opinion, if you're going to spend the time it takes to learn a text processing tool, you'd be better off in the long run learning AWK or one of its descendants (Perl, Python, Ruby) instead of TextPipe Pro.
AWK is easy to learn compared to other options. You can quickly begin wielding it to do very useful things. For example, a trivial version of an AWK script to extract the text of whole documents from a Concordance DAT file would probably be less than twenty lines of code and would be very understandable, even to a novice programmer.
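To give a feel for how little code is involved, here is a rough sketch of the same idea in Python rather than AWK. It is only an illustration under several assumptions, not a finished tool: it assumes the DAT file uses the usual Concordance default delimiters (ASCII 20 between fields, ASCII 254 around each value, ASCII 174 standing in for newlines inside a field), that each record sits on one physical line after a header row, and that the file is Latin-1 encoded. The field names BEGDOC and TEXT and the function names are hypothetical; substitute whatever your load file actually uses.

```python
import os

# Assumed Concordance default delimiters; adjust if your DAT file differs.
FIELD_SEP = "\x14"  # ASCII 20, separates fields
QUOTE = "\xfe"      # ASCII 254, wraps each field value
NEWLINE = "\xae"    # ASCII 174, stands in for a newline inside a field


def unquote(field):
    """Remove one pair of surrounding qualifier characters, if present."""
    if len(field) >= 2 and field.startswith(QUOTE) and field.endswith(QUOTE):
        return field[1:-1]
    return field


def split_record(line):
    """Split one DAT line into a list of unquoted field values."""
    return [unquote(f) for f in line.rstrip("\r\n").split(FIELD_SEP)]


def extract_text(dat_path, out_dir, id_field="BEGDOC", text_field="TEXT",
                 encoding="latin-1"):
    """Write each record's text field to <out_dir>/<id>.txt.

    id_field and text_field are hypothetical column names.
    Returns the list of paths written.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with open(dat_path, encoding=encoding) as dat:
        header = split_record(next(dat))  # first line holds the field names
        id_col = header.index(id_field)
        text_col = header.index(text_field)
        for line in dat:
            fields = split_record(line)
            if len(fields) <= max(id_col, text_col):
                continue  # skip blank or short lines
            # Restore real newlines from the in-field placeholder.
            text = fields[text_col].replace(NEWLINE, "\n")
            out_path = os.path.join(out_dir, fields[id_col] + ".txt")
            with open(out_path, "w", encoding=encoding) as out:
                out.write(text)
            written.append(out_path)
    return written
```

Calling something like extract_text("load.dat", "text_out") would then write one text file per record, named after the beginning Bates number. A real load file may need a different encoding or extra handling for identifiers containing characters that aren't legal in file names.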
AWK, Perl, Python and Ruby are all free (free-as-in-freedom and free-as-in-free-beer) and available for all modern operating systems (Microsoft Windows, Mac OS X, Unix/Linux). TextPipe Pro costs money, is proprietary, and is only available for Windows.
I hope this suggestion helps you.
--
Jim Monty
Tempe, AZ