foreach a $list_a { foreach b $list_b { for {set i 0} {$i < [string length $b]} {incr i} { do_sth_with_a_and_b } }
}
What i need to do is check every item in list_a against every item in list_b. Problem is i run in "too many nested calls in Tcl_EvalObj (infinite loop?)" because list_b can be a few thousand entries, list_a is only a few entries. Recompiling tcl with an increased maxNestingDepth is not an option.
t...@huno.net (thomas) writes: > I've got the following code (simplified):
> foreach a $list_a { > foreach b $list_b { > for {set i 0} {$i < [string length $b]} {incr i} { > do_sth_with_a_and_b > } > } > }
> What i need to do is check every item in list_a against every item in > list_b. Problem is i run in "too many nested calls in Tcl_EvalObj > (infinite loop?)" because list_b can be a few thousand entries, list_a > is only a few entries. > Recompiling tcl with an increased maxNestingDepth is not an option.
> Is there any other way to do this?
of course there is, some might even work:
1: use the elements of list_b as the indexes of an array and for each item in list_a see if the index exists, man n info and look at exists
2: you could use the lindex command to search list_b for item_a
and both lookups can be done in 1 loop and an if, ok you need to loop over list_b to load it into the array but that is setup.
I have no idea why the inner for loop is there thou so I might have missed something.
> t...@huno.net (thomas) writes: > > What i need to do is check every item in list_a against every item > > in list_b. > You might look at the 'set' code in Tcllib.
I was referring to the URL that follows. Googling on 'set' with Tcl isn't going to be very productive:-)
thomas wrote: > I've got the following code (simplified):
> foreach a $list_a { > foreach b $list_b { > for {set i 0} {$i < [string length $b]} {incr i} { > do_sth_with_a_and_b > } > } > }
> What i need to do is check every item in list_a against every item in > list_b. Problem is i run in "too many nested calls in Tcl_EvalObj > (infinite loop?)" because list_b can be a few thousand entries, list_a > is only a few entries.
Have you tried it without the "for {set i 0} ..." line? That line appears to do more than you want. It looks like, in each iteration of the "foreach b" loop within the "foreach a" loop, it would run "do_sth_with_a_and_b" multiple times, not only once. The number of times would be determined by the number of *characters* (plus spaces, etc.) in the "b" list item; for example, if the "b" list item was the word "characters," it would run "do_sth_with_a_and_b" 10 times before proceeding. A plain old "foreach a, foreach b, do_sth_with_a_and_b" nested loop shouldn't be infinite if the "do_sth..." part doesn't do anything recursive.
thomas <t...@huno.net> wrote: > I've got the following code (simplified): > foreach a $list_a { > foreach b $list_b { > for {set i 0} {$i < [string length $b]} {incr i} { > do_sth_with_a_and_b > } > } > } > Problem is i run in "too many nested calls in Tcl_EvalObj > (infinite loop?)"
This error message has got *nothing* to do with the number of iterations. the number of nested calls is less than 5 regardless of the lists' lengths.
I guess your problem lies in the "do_sth_with_a_and_b", which seems to recursively call the code that contains the loops.
Thanks for the answers, i'll try to explain what the code actually does. It is a fuzzy compare of one (or multiple) strings against a list of many strings. Strings in list_a are usually shorther than strings in list_b. Lets take the following example:
list_a = peter list_b = foobarpeter, fooketerbar, foobarfoo
So "peter" is compared against each string in list_b. In the first iteration peter and foobarpeter face each other. Here is where the for loop and do_sth_with_a_and_b come into play. The word peter gets shifted through foobarpeter to see if there is a match (or a fuzzy match, the fuzzy proc returns a value between 0 and 1, and with >0.8 i give it a match). So basically like this:
peter fooba
peter oobar
peter obarp
and so on. (it actually checks peter against 6 characters, but that doesn't matter here). do_sth_with_a_and_b just calls the fuzzy proc that returns a value, if >0.8 the string from list_b gets added to a list and i break from the current loop.
Maybe that's a stupid way to do this, but it's the only way i came up with. About the error: I could never reproduce it on my computer, seems both the debian package and activestate tcl already come with a reasonable maxNestingDepth value. The computer where it happens is running the default redhat tcl package i think, but i got no access to it for recompiling or anything. I'd also prefer to fix the code instead of running around recompiling :)
PS. I can post the code if my words make no sense, but it's a few lines and probably not very understandable on first sight.
On Wed, 4 Aug 2004, thomas wrote: > It appears i just confused everyone with my reply :)
No, we are just silently waiting for your "a few" lines of code, and wondering how you have implemented this fuzzy matching, do you seek for contiguous pattern, or does your routine allow interference in the middle of data.
And being very jealous if you are truly calculating convolutions on strings, something I have only had wet dreams of :)
"Kaitzschu" <kaitzs...@kaitzschu.cjb.net.nospam.plz.invalid> wrote: > No, we are just silently waiting for your "a few" lines of code, and > wondering how you have implemented this fuzzy matching, do you seek for > contiguous pattern, or does your routine allow interference in the middle > of data.
It's very simple, don't know if it gets any simpler actually. It's only detecting swapped characters and accidentaly inserted characters (only one though). So it just looks one character behind and one ahead. I'm gonna post the code shortly, i'm gonna make a comment or two.
set a_list [split [string tolower $a] {}] set tmpresults []
foreach b $list_B {
# no need for fuzzy compare, there is a match if {[regexp -nocase $a $b]} { lappend tmpresults $b continue}\
set b_list [split [string tolower $b] {}]
#shift a through b, and compare a against a substring of b of the length a+1 #(because fuzzy looks one character ahead). do a simple fuzzy compare. for {set i 0} {$i < [expr {[llength $b_list] - [llength $a_list]}]} {incr i} { if {[fuzzzy $a_list [lrange $b_list $i [expr {[llength $a_list] + $i}]] 1 -1 0] > 0.75} { lappend tmpresults $b break}} } #if there are multiple strings in $list_A then all strings from $list_B that didn't show a match in the #previous run don't need to be checked again. set list_B $tmpresults }
set all_matches $list_B
proc fuzzzy {a b next previous result} { #next = 1, previous = -1
foreach i $a j $b { if {[string equal $i $j]} { set result [expr {$result + 1.00}]}\ elseif {[string equal $i [lindex $b $previous]]} { set result [expr {$result + 0.585}]}\ elseif {[string equal $i [lindex $b $next]]} { set result [expr {$result + 0.585}]} incr previous; incr next }
Could you be so kind and provide a sample set of list_A and list_B that would demonstrate usage for this? I am wondering whether or not that funny regexp is really supposed to match *anything* from list_B that includes something from list_A.
The Magick doesn't happen when only swap has happened (strings are the same length)?
And, finally, does the logic fail in the end, when "if there are multiple strings in $list_A then all strings from $list_B that didn't show a match in the previous run don't need to be checked again.", shouldn't you scrap those that _did_ match and keep those that didn't, or should list_A be some sort of iterative list that focuses bit-by-bit?
But very fancy snippet, if I only were awake enough to understand what it does... Well, maybe in the morning, and with sample lists :P
Could've sworn i wrote an answer about 2 days ago. Checked google groups but no message there. So here it is again.
You are right, i'm not really comparing strings here, what i'm actually doing is finding strings in strings. An example:
list_A = peter, newspaper list_B = "some peter guy who works at thenewspaper", "some peter who doesnt", "another petar who works at some nuwspapreveryday"
So first i filter all strings out that got peter in them and then i take the matches and check if any of them also matches newspaper. The regexp grabs all the matches withouth spelling/typing errors, the rest is for the fuzzy proc.
The fuzzy proc is not the problem. But what is triggering the infinite loop?