Spelling Error analysis
Simon  
Newsgroups: comp.ai, comp.ai.nat-lang
From: spj.wal...@virgin.net (Simon)
Date: 20 Mar 2003 10:22:47 +1100
Local: Wed, Mar 19 2003 6:22 pm
Subject: Spelling Error analysis
I am researching the analysis of spelling errors with a view to
offering advice on remedial action that can be taken. I have decided
to attempt to build a shell that analyses the data, on which rules can
be offered or patterns can be shown if there is no rule available for
the pattern. I want to be able to group 'like errors', then find a rule
to match the most prevalent errors, or just offer the error pattern if
a rule is not available, i.e. I have not coded it.
I would like to find a pattern once, then tot up the occurrences of the
pattern. I can do this for inflection and derivation errors but am having
problems when it comes to errors in a root word. Without a rule I can
say what the difference is between the dictionary form and the error form,
but some rules require information about the surrounding graphemes, and
here I am stumped.
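
A minimal sketch of how a root-word error pattern could be recorded together
with its surrounding graphemes and then tallied. Everything here is an
assumption for illustration: the function names, the use of Python's difflib
to align the two forms, the '#' word-boundary marker, and treating one letter
as one grapheme. It is not the shell described above.

```python
from collections import Counter
from difflib import SequenceMatcher


def error_patterns(correct, error, context=1):
    """Yield (left, expected, written, right) for each differing span.

    left/right are up to `context` letters of the dictionary form on either
    side of the difference; '#' marks a word boundary (an assumed convention).
    """
    padded = "#" + correct + "#"  # padded index = correct index + 1
    matcher = SequenceMatcher(None, correct, error, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        left = padded[max(0, i1 + 1 - context): i1 + 1]
        right = padded[i2 + 1: i2 + 1 + context]
        yield (left, correct[i1:i2], error[j1:j2], right)


def tally_patterns(pairs, context=1):
    """Count how often each pattern occurs over (dictionary, error) pairs."""
    counts = Counter()
    for correct, error in pairs:
        counts.update(error_patterns(correct, error, context))
    return counts


if __name__ == "__main__":
    sample = [
        ("definite", "definate"),
        ("separate", "seperate"),
        ("definite", "definate"),
    ]
    for pattern, n in tally_patterns(sample).most_common():
        print(n, pattern)
```

Patterns with no matching rule can simply be reported as the tuple itself,
most frequent first. Note that a plain alignment like this reports a swapped
pair of letters as an insertion plus a deletion rather than as a single
transposition.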

[ comp.ai is moderated.  To submit, just post and be patient, or if ]
[ that fails mail your article to <comp...@moderators.isc.org>, and ]
[ ask your news administrator to fix the problems with your system. ]


 
Thomas Raffill  
Newsgroups: comp.ai, comp.ai.nat-lang
From: r...@cse.ucsc.edu (Thomas Raffill)
Date: 20 Mar 2003 13:52:46 +1100
Local: Wed, Mar 19 2003 9:52 pm
Subject: Re: Spelling Error analysis

>I would like to find a pattern once, then tot up the occurrences of the
>pattern. I can do this for inflection and derivation errors but am having
>problems when it comes to errors in a root word. Without a rule I can
>say what the difference is between the dictionary form and the error form,
>but some rules require information about the surrounding graphemes, and
>here I am stumped.

I have worked on this kind of issue before. Here are some rough
overall categories of errors. If anyone knows more categories,
please follow up and post them.

1. Typographical: missing letter, extra letter, transposition of
letters, or substitution of a letter for another (for example, the
adjacent letter on a typewriter). There is a lot of published
literature on these kinds of "string edits." Given a pair of strings,
the minimum number of edits needed to transform from one to the other
is called the "edit distance." Most of the automatic spelling correction
programs today can handle misspelled words that are at an edit distance
of 1 from a dictionary word and will return a list of all dictionary words
within 1 string edit.

2. Phonetic: substitution of a particular spelling for another
phonetically related spelling. An example of this would be to
misspell the word "rough" as "ruf." This kind of error is applicable
to languages like English where spelling is not very tightly phonetic.
To handle this kind of error, you can create tables of correspondences
between letter sets and phonetic units. You can transform everything
into phonetic units, then use the edit distance techniques on them.
There is a small body of literature related to this sort of thing.

3. Context-sensitive: substitution of a correctly spelled but inappropriate
dictionary word for the intended word. For example, substitution
of "passed" for "past." Groups of words that are often inappropriately
substituted for each other in this way are called "confusion sets" in
the literature. This is a more difficult task and you can use all of
the techniques of natural language processing to tackle it. Most
spelling correction programs today are not capable of handling
this kind of error.
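
As a rough illustration of categories 1 and 2, here is a minimal sketch in
Python. The edit distance is the "optimal string alignment" variant, which
counts an adjacent transposition as one edit. The phonetic table is a toy
made up for this example (for instance, "ough" is not always pronounced
"uf"), and the function names are hypothetical. Category 3 is not covered.

```python
def edit_distance(a, b):
    """Edits needed to turn a into b: insert, delete, substitute,
    or transpose two adjacent letters (optimal string alignment)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def suggest(word, dictionary, max_distance=1):
    """Return the dictionary words within max_distance edits of word."""
    return [w for w in dictionary if edit_distance(word, w) <= max_distance]


# Toy spelling-to-sound table (an assumption, far from complete).
PHONETIC_TABLE = [("ough", "uf"), ("ph", "f"), ("ck", "k")]


def phonetic_key(word):
    """Rewrite a word into crude phonetic units using the table above."""
    key = word.lower()
    for spelling, sound in PHONETIC_TABLE:
        key = key.replace(spelling, sound)
    return key


if __name__ == "__main__":
    print(suggest("teh", ["the", "ten", "tea", "toe"]))
    print(edit_distance("rough", "ruf"))
    print(edit_distance(phonetic_key("rough"), phonetic_key("ruf")))
```

Comparing phonetic keys instead of raw spellings brings "ruf" much closer to
"rough" than the plain letter-level distance does.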

Hope this helps.

Thomas Raffill

Kyongho Min  
Newsgroups: comp.ai, comp.ai.nat-lang
From: kyongho....@aut.ac.nz (Kyongho Min)
Date: 21 Mar 2003 12:53:05 +1100
Local: Thurs, Mar 20 2003 8:53 pm
Subject: Re: Spelling Error analysis

spj.wal...@virgin.net (Simon) wrote in message <news:b5au47$h6p$1@mulga.cs.mu.OZ.AU>...
> I am researching the analysis of spelling errors with a view to
> offering advice on remedial action that can be taken. I have decided
> to attempt to build a shell that analyses the data, on which rules can
> be offered or patterns can be shown if there is no rule available for
> the pattern. I want to be able to group 'like errors', then find a rule
> to match the most prevalent errors, or just offer the error pattern if
> a rule is not available, i.e. I have not coded it.
> I would like to find a pattern once, then tot up the occurrences of the
> pattern. I can do this for inflection and derivation errors but am having
> problems when it comes to errors in a root word. Without a rule I can
> say what the difference is between the dictionary form and the error form,
> but some rules require information about the surrounding graphemes, and
> here I am stumped.

I have studied this task and implemented a system covering three levels:
lexical, syntactic, and semantic.
If you visit the following URL, you will find some spelling-error source
data, a bibliography, and my papers.
I hope it will be helpful to you.

URL: www.cse.unsw.edu.au/~min

Regards,

Kyongho MIN

Apokrif  
Newsgroups: comp.ai, comp.ai.nat-lang
From: Apokrif <apokr...@yahoo.com>
Date: 22 Mar 2003 20:26:01 +1100
Local: Sat, Mar 22 2003 4:26 am
Subject: Re: Spelling Error analysis
Thomas Raffill :

> I have worked on this kind of issue before. Here are some rough
> overall categories of errors. If anyone knows more categories,
> please follow up and post them.
> 1. Typographical
> 2. Phonetic

I have written a Pascal program for these two categories. It's in French,
but one can easily adapt it (e.g. by changing the tables used for phonetic
equivalences).

km  
Newsgroups: comp.ai.nat-lang
From: "km" <m...@me.me>
Date: Thu, 27 Mar 2003 05:51:58 GMT
Local: Thurs, Mar 27 2003 12:51 am
Subject: Re: Spelling Error analysis
"Kyongho Min" <kyongho....@aut.ac.nz> wrote in message

news:b5dra1$ob1$1@mulga.cs.mu.OZ.AU...

"Damerau's Misspelt Words(...)"

Regional differences should be factored in as well, lest you take
misspelt for misspelled...

- Paul


 