exporting index of text-field

xeoman

Mar 17, 2002, 2:53:21 PM

hello,
i know, it might sound an odd thing to do, but i want to have a
complete wordlist of all words used in a database (or just one
text-field). at first glance this appears an easy thing to fetch from
a database: just export the index of this field and you're done. well,
but how to?
my current solution (several combined import/export scripts) is so
slooooow and awkward that i was wondering if someone could think of a
savvy way to export the index that filemaker uses internally.
thanks for any posts,
xeoman.

John Weinshel

Mar 17, 2002, 4:01:13 PM

Value Lists use the index, and the ValueListItems() function will return the
list of index entries for a field. Try to avoid actually creating a field that
stores the value list, as that can cause the index, and thus the file size,
to bloat (and the field can take several hours to build when you exit Define
Fields). Rather, see if your setup will accommodate setting a text field to
ValueListItems(Status(CurrentFile), Your_VL), and then export that field.

Note also that any such field will be limited by FileMaker's 64K text limit,
as well as the 20/3 index limit (up to 20 characters per 'word', for a total
of 60 characters, including the spaces between the 'words'). Also, the
method of storage (English or another language, or ASCII) affects what will
be stored. So, if you are after indexing larger text fields, you would need
to loop through them in a rather gargantuan loop, appending each new word to
an 'index' field and then comparing that word to the field in question.
--
John Weinshel
Datagrace
Vashon Island, WA
(206) 463-1634
Associate Member, Filemaker Solutions Alliance

"xeoman" <goo...@xeoman.com> wrote in message
news:38417df2.0203...@posting.google.com...

Tim Booth

Mar 17, 2002, 4:43:52 PM

Not sure about what can be done to get this, but note
that the rough rule appears to still be that the index
only covers the first 20 characters of the first 60 words.

Hmmm - actually, maybe that's changed. I just added an
impossible word to the end of a long post in the FAQ,
and searched on it, and the record was found...

What is the indexing capability of FM 5.x then??

(Sorry, this doesn't answer the actual question ;-)

Webko

John Weinshel

Mar 17, 2002, 6:12:30 PM

Search and Index operate differently; remember, you can search on related
and unindexed fields.

The index (and thus fields used as keys) stores up to 60 characters, but
only in 'words' of up to 20 characters. It would thus store, for example, 20
characters, space, 20 characters, space, 18 characters. Or 18, 20, 20.
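
As a rough illustration of the limit just described, here is a minimal Python sketch that models it: each word is cut to 20 characters, and the joined entry is cut to 60 characters including spaces. The function name `index_entry` and its parameters are hypothetical, and this only mimics the rule as stated in this post; it is not FileMaker's actual indexing code.

```python
def index_entry(text, max_word_len=20, max_total_len=60):
    """Model the limit as described in the post (an assumption, not FileMaker's code).

    Each whitespace-separated word is cut to max_word_len characters,
    then the space-joined result is cut to max_total_len characters.
    """
    words = [word[:max_word_len] for word in text.split()]
    return " ".join(words)[:max_total_len].rstrip()


# Three 25-character words end up stored as 20 + space + 20 + space + 18.
sample = "a" * 25 + " " + "b" * 25 + " " + "c" * 25
entry = index_entry(sample)
```

With three over-long words the sketch keeps 20, 20 and 18 characters, which matches the first example given above.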

The Exact() function appears to handle text outside those limits.

--
John Weinshel
Datagrace
Vashon Island, WA
(206) 463-1634
Associate Member, Filemaker Solutions Alliance

"Tim Booth" <T.B...@isu.usyd.edu.au> wrote in message
news:3C950E18...@isu.usyd.edu.au...

Redwing

Mar 18, 2002, 12:04:32 AM

Geez guys..., you were sooooooo close...

Put the target text field all by its lonesome on a new layout. Copy all records
(1.5GB RAM helps). Paste into text editor/word processor. Replace all tabs,
spaces, soft returns, etc., with hard returns. Save. Import into new FM file
"Words.fp5", where each word ("Word") will get its own private record. De-dupe
in some usual fashion. Now the hard part is done.
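
The text-editor step above (turn every tab, space and soft return into a hard return, then de-dupe) could also be done outside the editor. The following is a minimal Python sketch of that idea, not a tested FileMaker workflow. The names `wordlist` and `to_import_text` are hypothetical, and the case-insensitive de-duping is an assumption about what "de-dupe" should mean here.

```python
def wordlist(dump):
    """Return the unique words of a pasted text dump, in first-seen order.

    str.split() with no argument splits on any run of whitespace: spaces,
    tabs, newlines, and the vertical tab often used as a soft return.
    """
    seen = set()
    unique = []
    for word in dump.split():
        key = word.lower()  # assumption: "Index" and "index" count as one word
        if key not in seen:
            seen.add(key)
            unique.append(word)
    return unique


def to_import_text(dump):
    """One word per line, ready to import as one record per word."""
    return "\n".join(wordlist(dump))


words = wordlist("the cat\tThe dog\nthe\x0bcat")
```

Punctuation stays attached to the words in this sketch, so a real run would probably need an extra cleanup pass.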

Then create a global in Words.fp5, and relate it ("GetWords") to target file's
unique record ID field. Next, create a calc ("NewWord") based on this
relationship that parses the target file's target field one word per record:

MiddleWords(GetWords::TargetField, Status(CurrentRecordNumber), 1)
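
To make the calc above easier to follow, here is a small Python sketch of what it does: record number N in Words.fp5 picks out word number N of the target field. The `middle_words` helper is a hypothetical stand-in that splits on whitespace only; FileMaker's own word-boundary rules are likely to differ, so treat the results as approximate.

```python
def middle_words(text, start, count):
    """Rough stand-in for MiddleWords(text, start, count).

    Word numbers start at 1, as in the calc above. Splitting on whitespace
    only is an assumption, not FileMaker's actual word-boundary rule.
    """
    if start < 1 or count < 1:
        return ""
    words = text.split()
    return " ".join(words[start - 1:start - 1 + count])


# Each record number plays the role of Status(CurrentRecordNumber).
target_field = "export the index of this field"
new_words = [middle_words(target_field, record_number, 1)
             for record_number in range(1, 8)]
```

Record numbers past the last word give an empty string, which is one source of the "empties" that get flagged in the next step.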

Relate NewWord to Word, and create a calc field based on this relationship to
flag empties, nulls, not valids (or whatever...), that show NewWord doesn't
have a matching Word. Pretty easy to find those flags, and turn the flagged
NewWord's into Word's for the next new record, or edited record, in the target
file. With a constant relationship between the two files, it's easy to set the
unique ID from the target file into Words.fp5's global, and process any newly
found previously unused Word's. Words.fp5 can then rather gracefully maintain
the target file's target field index in a readily manageable fashion.
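
The maintenance step comes down to a membership test: for a new or edited record, which of its words have no match in the existing word list? Here is a minimal Python sketch of that check. The name `unmatched_words` is hypothetical, and matching without regard to case is an assumption.

```python
def unmatched_words(field_text, known_words):
    """Return words from field_text that have no match in known_words.

    Plays the role of the flag calc: anything returned here would become
    a new Word record. Case-insensitive matching is an assumption.
    """
    known = {word.lower() for word in known_words}
    flagged = []
    for word in field_text.split():
        key = word.lower()
        if key not in known:
            known.add(key)  # do not flag the same new word twice
            flagged.append(word)
    return flagged


to_add = unmatched_words("export the Index again again", ["the", "index"])
```

Only the flagged words need new records, so each edit to the target file costs a small update rather than a full rebuild of the list.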

Just a little food to put under your thinking caps.

Sincerely,

Roger E. Grodin
REDWING FINANCIAL GROUP
"Where OUR business is the ultimate database for YOUR business"
red...@aol.com
==========================================================