What is parsing?

Al Christensen

Apr 11, 1999, 3:00:00 AM
I am finishing my second semester of COBOL, and as a programmer
wannabe, I am curious. I have read through many of the discussions
here, and have found references to "parsing". It is not referred to at
all in my text. What is it, how does it work, what does it do?
Thanks in advance for your reply.

Al Christensen


William J. Lightner

Apr 11, 1999, 3:00:00 AM
Almost. Compilers definitely do parsing. But not only compilers do parsing.

Parsing is the extraction of data elements from an input stream, by
recognizing delimiters between the fields. The delimiters are arbitrary, as
applicable to the specific problem and data stream. A common example would
be a Cobol program extracting the fields from a comma (or tab) delimited text
file.
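
As an illustration of this delimiter-based sense of the word, here is a minimal Python sketch that splits one comma-delimited (or tab-delimited) record into named fields. The field names and the sample record are assumptions made up for the example, not taken from any real file layout.

```python
# Hypothetical record layout; these field names are assumptions for illustration.
FIELD_NAMES = ["customer_id", "name", "balance"]

def parse_record(line, delimiter=","):
    """Split one delimited line into a dict of named fields."""
    values = line.rstrip("\n").split(delimiter)
    if len(values) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} fields, got {len(values)}"
        )
    # Pair each extracted value with the field it belongs to.
    return dict(zip(FIELD_NAMES, values))

record = parse_record("1001,Al Christensen,250.00")
print(record["name"])
```

Passing delimiter="\t" handles the tab-delimited case. A plain split like this does not cope with quoted fields that contain the delimiter.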

Ken Foskey wrote:

> Parsing is what a compiler does. It takes a text form and turns
> it into something closer to what a machine would need to
> understand.
>
> For example:
>
> MOVE abc to XYZ
>
> gets converted to
>
> token-1, token-2, token-3, token-2
>
> Where token-1 refers to the 'MOVE', token-2 to a WS-variable, etc.
> A program can simply step through an array of these tokens to
> determine what it needs. The parser will be generic and used by
> multiple programs. For example, to compile you need a parser, to
> reformat you need a parser, and to QA cobol (check that goto's only
> go to the exit in THIS paragraph, etc.) you need a parser. If you
> write a common piece of software, it quickly becomes solid through
> repeated reuse.
>
> This will be discussed if you do a compiler subject. This is a
> fairly weak explanation but it gives the gist of it.
>
> Ken
> Open source means many eyes makes light work.
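
The token conversion Ken describes can be sketched roughly as follows in Python. The token kinds (VERB, KEYWORD, IDENTIFIER) and the tiny keyword table are assumptions for illustration; a real COBOL compiler would define far more.

```python
# Hypothetical keyword table; anything not listed is treated as a data name.
KEYWORDS = {"MOVE": "VERB", "TO": "KEYWORD"}

def tokenize(statement):
    """Turn a statement into a list of (kind, text) tokens."""
    tokens = []
    for word in statement.split():
        # COBOL reserved words are case-insensitive, so compare in upper case.
        kind = KEYWORDS.get(word.upper(), "IDENTIFIER")
        tokens.append((kind, word))
    return tokens

for kind, text in tokenize("MOVE abc to XYZ"):
    print(kind, text)
```

A later stage can then step through this list without looking at individual characters again, which is why the same tokenizer can serve a compiler, a reformatter, or a QA tool.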


Arnold Trembley

Apr 11, 1999, 3:00:00 AM
William J. Lightner wrote:
>
> Almost. Compilers definitely do parsing. But not only compilers do parsing.
>
> Parsing is the extraction of data elements from an input stream, by
> recognizing delimiters between the fields. The delimiters are arbitrary, as
> applicable to the specific problem and data stream. A common example would
> be a Cobol program extracting the fields from a comma (or tab) delimited text
> file.

This sounds incorrect to me. I've read some books on compiler writing,
but I've never worked with Unix tools lex and yacc, nor have I ever
built a compiler. I think parsing comes after lexical analysis or
scanning.

As I understand it, if you use lex and yacc to build a compiler (or some
other kind of translator program), lex is used to generate the
lexical analyzer or scanner program. Yacc would be used to generate the
parser program. The parser would call the lexer to get the next token
from the input stream. The lexer collects characters until it
identifies a valid token.

The parser is supposed to analyze the stream of tokens (words), usually
by building a tree structure from them, to determine if a statement is
syntactically correct. The parser needs to
know the grammar of the language being translated. The lexer only needs
to know how to extract a token from the input character stream.
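
The division of labour described above can be sketched in Python. This is a hand-written toy, not lex or yacc output, and the one-rule grammar (MOVE identifier TO identifier) is an assumption chosen to keep the example short.

```python
RESERVED = ("MOVE", "TO")

def lexer(text):
    """Collect characters until a complete token (a word) is identified."""
    current = ""
    for ch in text:
        if ch.isspace():
            if current:
                yield current
                current = ""
        else:
            current += ch
    if current:
        yield current

def parse_move(text):
    """Check the toy grammar: MOVE <identifier> TO <identifier>."""
    tokens = lexer(text)

    def next_token():
        # The parser asks the lexer for one token at a time.
        return next(tokens, None)  # None signals end of input

    def is_identifier(token):
        return token is not None and token.upper() not in RESERVED

    if (next_token() or "").upper() != "MOVE":
        return False
    if not is_identifier(next_token()):
        return False
    if (next_token() or "").upper() != "TO":
        return False
    if not is_identifier(next_token()):
        return False
    return next_token() is None  # no trailing tokens allowed

print(parse_move("MOVE abc TO XYZ"))
print(parse_move("MOVE TO XYZ"))
```

The lexer knows nothing about statement order; only parse_move encodes the grammar, so the second call is rejected even though every word in it is a valid token.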

--
Arnold Trembley
http://home.att.net/~arnold.trembley/
"Y2K? Because Centuries Happen!"

Ken Foskey

Apr 12, 1999, 3:00:00 AM

Rick Smith

Apr 12, 1999, 3:00:00 AM

Al Christensen wrote in message <3711208A...@tdconnect.com>...

>I am finishing my second semester of COBOL, and as a programmer
>wannabe, I am curious. I have read through many of the discussions
>here, and have found references to "parsing". It is not referred to at
>all in my text. What is it, how does it work, what does it do?
>Thanks in advance for your reply.
>
The Free On-Line Dictionary of Computing is located at:
< http://wombat.doc.ic.ac.uk/foldoc/index.html >

The dictionary entries are cross-referenced, allowing exploration
of related terms.
------------------
Rick Smith

Donald Tees

Apr 12, 1999, 3:00:00 AM
Seems that the word "parse" has changed its meaning since I went to school.

Parse: To break a sentence down into its component parts of speech, with an
explanation of the form, function, and syntactical relationship of each
part. (Heritage)

The meaning in Computer Science is the same as the meaning in English.

Arnold Trembley wrote in message <37117585...@worldnet.att.net>...

Thane Hubbell

Apr 12, 1999, 3:00:00 AM
Parse: To break down giving the form and function of each part.

I have written several "Parsers". All it means REALLY is to break
down a sentence into individual words.


Howard Brazee

Apr 12, 1999, 3:00:00 AM
Al Christensen wrote:
>
> I am finishing my second semester of COBOL, and as a programmer
> wannabe, I am curious. I have read through many of the discussions
> here, and have found references to "parsing". It is not referred to at
> all in my text. What is it, how does it work, what does it do?
> Thanks in advance for your reply.
>
> Al Christensen


Parse = analyze (a sentence) grammatically, telling its parts of speech
and their use in the sentence.

Different languages have different parsing rules. For instance, in
English and Cobol a double negative is a positive; in Spanish it is a
negative, and it is required.

Donald Tees

Apr 12, 1999, 3:00:00 AM
No. It also includes determining what each word represents grammatically.
For example, even parsing a command line requires determining which element
is a file name, which are switches, etc. Simply unstringing them is *not*
parsing, only part of it.
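
A minimal Python sketch of the command-line case, assuming the common convention that switches start with "-" and everything else is a file name (that convention, and the sample arguments, are assumptions for illustration):

```python
def parse_command_line(args):
    """Classify each already-split argument as a switch or a file name."""
    switches, files = [], []
    for arg in args:
        # A lone "-" is conventionally a file placeholder, not a switch.
        if arg.startswith("-") and len(arg) > 1:
            switches.append(arg)
        else:
            files.append(arg)
    return {"switches": switches, "files": files}

# Splitting the line into words is only the "unstringing" step...
words = "-v report.txt -x data.csv".split()
# ...deciding what each word is for is the part that makes it parsing.
print(parse_command_line(words))
```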

Thane Hubbell wrote in message <3711e99d...@news1.ibm.net>...

Ken Foskey

Apr 13, 1999, 3:00:00 AM
Thane Hubbell wrote:
>
> Parse: To break down giving the form and function of each part.
>
> I have written several "Parsers". All it means REALLY is to break
> down a sentence into individual words.

I agree with Donald: what you describe is tokenising. A parser
goes one step further, though not much.

We are probably drawing a fine distinction here, though.

Ken

Thane Hubbell

Apr 13, 1999, 3:00:00 AM
On Mon, 12 Apr 1999 13:58:50 -0400, "Donald Tees"
<don...@willmack.com> wrote:

Websters had (sentence) in parens and I took editorial license and
removed it <G>.

>No. It also includes determining what each word represents grammatically.
>For example, even parsing a command line requires determining which element
>is a file name, which are switches, etc. Simply unstringing them is *not*
>parsing, only part of it.
>
>Thane Hubbell wrote in message <3711e99d...@news1.ibm.net>...

William J. Lightner

Apr 13, 1999, 3:00:00 AM
Parsing is breaking a known input structure into its component &
semantic parts. So tokenizing a tab-delimited file into tokens with
understood/implied/structural meaning IS parsing. Tokenizing a language
string into its components for input into a compiler isn't, technically,
parsing, because the application of the semantic content is delayed
until the compiler deals with the tokenized result, a pass or two later.

I didn't really mean to start an argument here. But the main difference
between parsing and tokenizing is that the former generally includes the
latter, but not necessarily vice-versa. Tokenizing is mechanical,
according to a set of structural rules. Parsing requires semantics.
However, the former set of rules can be sufficiently complex to blur the
line from that direction, and the latter set of semantics can be
sufficiently simplistic to blur the line from the other direction. You
can tokenize an input, and still need to parse the results, but not
vice-versa.

Enough on this.

Later,

Ken Foskey wrote:

> Thane Hubbell wrote:
> >
> > Parse: To break down giving the form and function of each part.
> >
> > I have written several "Parsers". All it means REALLY is to break
> > down a sentence into individual words.
>
