
strtok()'ing from hell


Gary M. Greenberg

Jan 13, 1997

Hi y'all. Happy New Year.

<background details at end>
Read this if you feel like puzzling out a string parsing snippet
and helping me determine a better method for the next time I do
something like this. Otherwise, skip this post.

Given a textfile of input having the form:

...
Obj, GetClassName, , String,, ""
Obj, Is, "(aClass)", Boolean,, ""
Application, DelayedRun, "(aString, aObj, aNumber)", Nil,Obj, ""
Application, FindDoc, "(aString)", Doc,Obj, ""
...

I used strtok() to parse each line and build HTML tables
where the first field is used as the basis for an
8.3-style output filename. The code I wrote is, in part:

/*
** Note: All headers, error checking, and
** much else elided from posting
*/
FILE *rfp;
...
const char ctok[]=",";
char line[MAXLINE]; /* MAXLINE == 1024 */
char *pstr, *pstr2, *ch;
...
/* This is inside a while(fgets(line,sizeof(line),rfp)!=NULL) loop */
...
    if((ch=strchr(line,'\"'))!=NULL && *(ch+1)=='(')
    {
        pstr=strtok(line,ctok);
        pstr=strtok(NULL,"\0"); /*** START OF KLUDGE!!! ***/
        if( pstr[1] == '\"')
        {
            pstr[1]=' ';
            pstr2=strtok(pstr,"\"");
            /* fprintf() ... */
        }
        while((pstr=strtok(NULL,ctok))!=NULL)
        {
            /* manipulate rest of line */
        }
    }
    else
    {
        pstr=strtok(line,ctok);
        while((pstr=strtok(NULL,ctok))!=NULL)
        {
            /* reuse manipulation */
        }
    }
} /* end of the while(fgets(...)) loop */

/* end code snippet */

I accomplished what I set out to do, but the method I used troubles me;
instinct tells me that I've missed something in the logic for the
parsing. I guess part of it is using strtok() and then changing the
string used in strtok() based on whether or not the current input string
has a `"' followed by a `(' although that's the only way to distinguish
from the source input file whether or not there were any parameters provided
for the Messages. Your thoughts on the matter are appreciated.

Reply by email AND/OR post as _you_ prefer (will summarize if warranted).
I will read here (as always) for knowledge.


Irrelevant background:
Recently, I kludged together a utility for some programming I do
in a proprietary language called Avenue. I took a text file and
converted it to an HTML-indexed set of Classes, Messages,
Parameters, and such. It worked out fine.
Other Avenue programmers, see:
"http://users.southeast.net/~garyg/class.htm"


Ciao4now,

gary /* the Sorcerer's Apprentice */
1996 - 97 UF GATORS -*- NATIONAL CHAMPIONS
-=- visit The C Programmers' Reference -=-
http://users.southeast.net/~garyg/main_page.html

Darin Johnson

Jan 15, 1997

>Obj, GetClassName, , String,, ""
>Obj, Is, "(aClass)", Boolean,, ""

One important note on strtok: multiple delimiters in a row count as
only a single delimiter. Thus, consecutive commas won't parse out as
an empty field.
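
To make that concrete, here is a minimal sketch. The helper name
split_strtok is made up for illustration and is not from the original
code. It splits a buffer on commas with strtok() and counts the tokens.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits buf in place on commas with strtok() and
   stores up to max token pointers in out. Returns the token count. */
int split_strtok(char *buf, char *out[], int max)
{
    int n = 0;
    char *tok;

    assert(buf != NULL && out != NULL && max > 0);
    for (tok = strtok(buf, ","); tok != NULL && n < max;
         tok = strtok(NULL, ","))
        out[n++] = tok;   /* each token points into buf */
    return n;
}
```

Fed the second sample line, this should report five tokens even though
the line has six comma-separated fields: the empty field between the two
adjacent commas is skipped, so every later field shifts down one slot.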

> pstr=strtok(NULL,"\0"); /*** START OF KLUDGE!!! ***/

That's very odd. I guess it might work, but could be nonportable (i.e.,
"\0" is an empty string, so you are passing an empty delimiter set).
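
For what it's worth, a small sketch of what that call does. This assumes
the usual ISO C description of strtok, where a token runs until the next
character found in the delimiter string; with no delimiters at all, the
token should simply run to the end of the line. The function name
rest_after_first_field is invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the remainder of line after the first
   comma-delimited field, or NULL if nothing follows it.
   line is modified in place. */
char *rest_after_first_field(char *line)
{
    assert(line != NULL);
    if (strtok(line, ",") == NULL)   /* no first field at all */
        return NULL;
    return strtok(NULL, "");         /* empty set: nothing ends the token */
}
```

Note that "\0" and "" look the same to strtok, since it stops reading
the delimiter string at the first null character.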


>I guess part of it is using strtok() and then changing the
>string used in strtok() based on whether or not the current input string
>has a `"' followed by a `(' although that's the only way to distinguish
>from the source input file whether or not there were any parameters provided
>for the Messages. Your thoughts on the matter are appreciated.

Changing the delimiter is allowed in strtok. Parsing is almost always
a kludge. If you ever get the choice, you should always try to
dictate your own file formats; it makes life so much easier :-)
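
When the format is fixed, one alternative to strtok is a small
hand-written scanner. The sketch below is only one possible approach,
not the method from the original post. The name split_fields and its
rules are assumptions: fields are separated by commas outside double
quotes, leading blanks are skipped, surrounding quotes are dropped, and
there are no escaped quotes inside a field.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits line in place on commas that fall outside
   double quotes. An empty field is stored as "". Stores up to max field
   pointers in fields and returns how many were stored. */
int split_fields(char *line, char *fields[], int max)
{
    char *p = line;
    int n = 0;

    assert(line != NULL && fields != NULL && max > 0);
    while (n < max) {
        char *end;
        int last;

        while (*p == ' ' || *p == '\t')
            p++;                        /* skip leading blanks */
        if (*p == '"') {
            fields[n++] = ++p;          /* field starts after the quote */
            while (*p != '\0' && *p != '"')
                p++;                    /* commas in here are data */
        } else {
            fields[n++] = p;
            while (*p != '\0' && *p != ',' && *p != '\n')
                p++;
        }
        end = p;                        /* where the field text stops */
        while (*p != '\0' && *p != ',')
            p++;                        /* move on to the separator */
        last = (*p == '\0');
        *end = '\0';                    /* terminate the field */
        if (last)
            break;
        p++;                            /* step past the comma */
    }
    return n;
}
```

With this, the DelayedRun sample line should come out as six fields,
with "(aString, aObj, aNumber)" kept whole as the third one, and the
GetClassName line should also give six fields, three of them empty. That
removes the need to test for `"' followed by `(' before choosing how to
parse the line.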

--
Darin Johnson
da...@connectnet.com
