Java review

16 views
Skip to first unread message

Jean-Philippe Barette-LaPierre

unread,
Jan 6, 2010, 9:07:58 PM1/6/10
to mo...@googlegroups.com
Court, I peaked into the code. I found 2 errors.

First, I'm quite sure that the characteristic vector must be 2n + 1,
not 2n + 2. If you replace all 2n + 2 occurences
in the code with 2n + 1, then the crash goes away.

Second, the RecognizeMapping index update by this line in "recognize" function:

stack.push(new RecognizeMapping(mapping.working_string + c,
dictionary_next, levenshtein_next, mapping.index + 1));

the "mapping.index + 1" is just wrong. As you can see on table 5.2 of
Schulz and Mihov, the next index is determined by the
next levenshtein state.

Courtney Falk

unread,
Jan 9, 2010, 10:35:25 PM1/9/10
to mo...@googlegroups.com
On Wed, Jan 6, 2010 at 9:07 PM, Jean-Philippe Barette-LaPierre
<j...@rrette.com> wrote:
> Court, I peaked into the code. I found 2 errors.
>
> First, I'm quite sure that the characteristic vector must be 2n + 1,
> not 2n + 2.  If you replace all 2n + 2 occurences
> in the code with 2n + 1, then the crash goes away.

I changed the characteristic vector from 2n + 2 to 2n + 1. The crash
I get happens when this.chi = ChiType.MS. Then I get this stacktrace:

Exception in thread "main" java.lang.NegativeArraySizeException
at com.infiauto.datastr.auto.LevenshteinAutomaton.functionR(LevenshteinAutomaton.java:444)
at com.infiauto.datastr.auto.LevenshteinAutomaton.deltaE(LevenshteinAutomaton.java:453)
at com.infiauto.datastr.auto.LevenshteinAutomaton.delta(LevenshteinAutomaton.java:488)
at com.infiauto.datastr.auto.LevenshteinAutomaton.<init>(LevenshteinAutomaton.java:669)
at com.infiauto.datastr.auto.LevenshteinAutomaton.main(LevenshteinAutomaton.java:789)

functionR has often been the source of my bugs.


> Second, the RecognizeMapping index update by this line in "recognize" function:
>
> stack.push(new RecognizeMapping(mapping.working_string + c,
> dictionary_next, levenshtein_next, mapping.index + 1));
>
> the "mapping.index + 1" is just wrong. As you can see on table 5.2 of
> Schulz and Mihov, the next index is determined by the
> next levenshtein state.

You're right. I made a big mistake there. I'm going to have to look
at my design in order to fix that.

I'll go ahead and include the latest working copy of the only file I've changed.

Thanks again.


Court

LevenshteinAutomaton.java

Courtney Falk

unread,
Jan 18, 2010, 10:33:30 PM1/18/10
to mo...@googlegroups.com
Jean-Philippe,

There are two problems I'm struggling to understand right now:

1) The LA as I've built it right now is constructed of States, which are
just sets of Positions. When I build a characteristic vector and apply
the delta function to the current State I get a State back, but
shouldn't it be just a single Position? I mean, since an LA is a DFA I
shouldn't be getting multiple Positions for any given input, correct?

2) I'm looking at your Python code to try and help my understanding.
I'm looking at the ErrorTolerantRecognizer.recognize function in fsc.py.
Your code says:
if mPrime[0] != []:
So what kind of data structure (mPrime) is being returned by the delta
function in your code? A set of sets?


Courtney Falk

Jean-Philippe Barrette-LaPierre

unread,
Jan 22, 2010, 11:11:29 AM1/22/10
to Moman
Just a quick reply to say that I didn't had the time to answer to your
question,
but I didn't forgot you. I'll answer saturday.

Courtney Falk

unread,
Jan 22, 2010, 11:58:47 AM1/22/10
to mo...@googlegroups.com
Not a problem. I'm studying your Python code (version 0.2.1) at the moment.
Reply all
Reply to author
Forward
0 new messages