I'm a researcher in bioinformatics, and I'm trying to write a Tk application to view alignments of protein sequences. Typically, I have to display 100 to 1000 strings (all of the same length) of 1000-5000 characters each in a text widget. Each string (sequence) is made up of 20 different characters (amino acids), plus the "." sign. Each of the 20 different characters has to be tagged individually, so there are at least something like 50 000 tags in the widget.
When I initialize the widget, following earlier advice found in this newsgroup, I append every character, followed by its tag (or an empty tag), to a variable that I then insert into the text widget. This takes quite a lot of time!!! I have just tried another solution, which is to tag only the visible characters. This does not give a "smooth" scroll, and if you scroll too quickly, the display cannot keep up...
Does anyone have a hint?
Many thanks in advance!!
(I can send my test script if needed)
------------------------------- luc Moulinier, Laboratoire de Bioinformatique et Genomique Integratives IGBMC Illkirch, 67404 France
You could do it somehow like Excel: display the whole picture compactly in a canvas; display the selected line in an entry widget for editing; update the whole picture on <Return> in the entry.
> i'm a researcher in bioinformatics, and trying to make an application
> in tk in order to see alignment of protein sequences.
> Typically, i have to display in a text widget 100 to 1000 strings (same
> length) of 1000-5000 characters. Each string (sequence) is made of 20
> different characters (amino acids), plus the "." sign. Every of the 20
> different characters have to be tagged individually, so there are
> something like 50 000 tags at least in the widget.

Does each character, or each string, need to be tagged differently? And why? How are you using the tags?
You might try tktable...it might or might not be better, but something to try...
mou...@igbmc.u-strasbg.fr wrote:
> i'm a researcher in bioinformatics, and trying to make an application
> in tk in order to see alignment of protein sequences.
> Typically, i have to display in a text widget 100 to 1000 strings (same
> length) of 1000-5000 characters. Each string (sequence) is made of 20
> different characters (amino acids), plus the "." sign. Every of the 20
> different characters have to be tagged individually, so there are
> something like 50 000 tags at least in the widget.
That's an awful lot of tags, and "lots of tags" is the case that the text widget isn't heavily optimized for.
> When I try to initialize the widget, following previous remarks found
> in this newsgroup, i append every character + (tag or not) in a
> variable that I insert then in the widget text. This takes quite a lot
> of time !!!
> I just try an other solution, which is to tag only the visible
> characters. This gives not a "smooth" scroll, and if you scroll too
> quickly, the display is in a hurry ....
Err, why is it necessary to tag every character differently? Wouldn't it be easier to only tag things according to what they look like and process mouse/keyboard activity in the widget directly? That'd certainly be faster in this case...
There are only 20 letters. All "A" should be foreground white and background pink, "D" fg white, bg green, "A" fg black, bg orange, etc. I defined 20 tags and applied them to each letter. The code looks like this:
    set Lc [split [$wt get 1.0 end] ""]
    $w delete 1.0 end

    # Build one flat list: char tag char tag ...
    set Ldata {}
    foreach c $Lc {
        if {$c == "\n" || $c == "."} {
            set t {}
        } else {
            set t "Tag$c"
        }
        lappend Ldata $c $t
    }

    eval $w insert end $Ldata
Is there another way? I don't understand "tag things according to what they look like"...?
Also, can someone tell me where to download the ctext widget? tklib is no longer bundled with tcllib... I found a tklib archive on the net dated Feb 2003...
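For readers following along, the per-letter tag setup described in this post could look roughly like the sketch below. The proc name define_residue_tags is made up for illustration, only the colours for "A" (its first mention) and "D" come from the post, and the black-on-white default for the other letters is a placeholder, not the poster's actual colour scheme.

```tcl
# Hypothetical helper: configure one tag per amino-acid letter on text widget $w.
# Only the "A" and "D" colours are taken from the post; everything else is assumed.
proc define_residue_tags {w} {
    # letter -> {foreground background}
    array set colours {
        A {white pink}
        D {white green}
    }
    foreach c [split "ACDEFGHIKLMNPQRSTVWY" ""] {
        if {[info exists colours($c)]} {
            foreach {fg bg} $colours($c) break
        } else {
            # Placeholder colours for letters the post does not specify.
            set fg black
            set bg white
        }
        $w tag configure Tag$c -foreground $fg -background $bg
    }
}
```

It would be called once after creating the widget, e.g. "define_residue_tags .t". This makes it clear that only 20 tag definitions exist, however many characters each tag is later applied to.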
> There are only 20 letters. All "A" should be foreground white and
> background pink, "D" fg white, bg green, "A" fg black , bg orange, etc
> ....I defined 20 tags, and i applyed them for each letter
> The code looks like that :

> set Lc [split [$wt get 1.0 end] ""]
> $w delete 1.0 end

> foreach c $Lc {
>     if {$c == "\n" || $c == "."} {
>         set t {}
>     } else {
>         set t "Tag$c"
>     }
>     lappend Ldata $c $t
> }

> eval $w insert end $Ldata

> Is there an other way ?
There is no fundamentally different way, but you could avoid eval here and avoid building a huge list before displaying the tagged data, which speeds things up a bit. First insert your data into the text widget without any tags, then loop through the data and tag it afterwards, line by line (since your data is organized in lines):
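    # $myText holds the whole alignment, one sequence per line.
    set lineNumber 1
    foreach line [split $myText "\n"] {
        for {set i 0} {$i < [string length $line]} {incr i} {
            set c [string index $line $i]
            if {$c ne "."} { $w tag add Tag$c $lineNumber.$i }
        }
        incr lineNumber
    }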
This is quicker, since inserting text into the text widget is not the problem. Try your code above with plain text only (no eval, no tags); it will be much quicker. In the end, though, 50,000 characters is a lot, and the tagging is the slow part. I don't think ctext will help much: it is just a wrapper around the text widget that adds a layer for syntax highlighting, so the fundamental problem remains. If you want to give ctext a try, you need to get it from CVS on SourceForge (these are the Unix commands needed):
    cvs -d:pserver:anonym...@cvs.sourceforge.net:/cvsroot/tcllib login
    cvs -z3 -d:pserver:anonym...@cvs.sourceforge.net:/cvsroot/tcllib co -P tklib
I never understood why there is no current tklib in the "File releases" section.
> I don't understand "tag things like what they look like" ....?
I think this is what you already do: you tag the characters using a different tag for each letter (not for each character). But that makes only 20 different tags, not 50,000 tags as the subject line suggests.
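One possible refinement of the tag-after-insert loop in this reply, offered only as a sketch: the text widget's "tag add" subcommand accepts several index pairs in one call, so the ranges for each letter can be collected first and then applied with one call per tag. The proc name tag_residues is hypothetical, the Tag<letter> naming follows the thread, and whether this is measurably faster on real alignments would have to be timed.

```tcl
# Hypothetical helper: tag all residues in text widget $w, one "tag add" per letter.
# Assumes $text (one sequence per line) has already been inserted untagged into $w.
proc tag_residues {w text} {
    set lineNumber 1
    foreach line [split $text "\n"] {
        set len [string length $line]
        set i 0
        while {$i < $len} {
            set c [string index $line $i]
            # Find the end of the run of identical letters starting at column $i.
            set j [expr {$i + 1}]
            while {$j < $len && [string index $line $j] eq $c} { incr j }
            if {$c ne "."} {
                # Record the run as an index pair (end index is exclusive).
                lappend ranges($c) $lineNumber.$i $lineNumber.$j
            }
            set i $j
        }
        incr lineNumber
    }
    # One widget call per letter, passing all collected index pairs at once.
    foreach c [array names ranges] {
        eval [list $w tag add Tag$c] $ranges($c)
    }
}
```

For very large alignments the argument lists get long, so it may be more practical to call this on a block of lines at a time rather than on the whole text.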
Hmm, perhaps you could insert tiny coloured letter bitmaps instead of tagged letters. Would that speed up things??
That does not solve his problem with actually editing the contents, but it may be worthwhile to combine this with an overlaid widget to edit a single line ....
Why not "virtualize" the tagging? You could first insert the protein sequences as plain letters, without any tagging information. This is fast enough.
Then start by tagging only the portion of the text widget that is visible to the user (or perhaps one page more). This should be reasonably fast. When the user scrolls through the text, a binding on <ButtonRelease-1> then triggers the tagging of the newly visible part.
The hard part here would be finding the "visible part" of the text. The upcoming Tk 8.5 makes that much easier. And of course, scrolling all the way to the right end of the widget would probably keep the GUI very busy...
> Why not "virtualize" the tagging? You could just insert the aprotein
> sequences as plains letters wihout tagging information first. This is
> fast enough.

> Then, start by only tagging the portion of the text widget that is
> visible to the user (or perhaps one more page). This should be
> reasonably fast. When the user scrolls through the text, a binding on
> Button1-Release will then trigger the tagging of the new visible part.
Personally, I'd create a loop that does the tagging in the background a screenful of lines (or columns) at a time. Most likely the loop would always be ahead of any scrolling the user might do, but if not, the tagging code could do sanity checks to make sure to tag the currently visible region as soon as possible.
Roughly, like this (though with better error checking and such):
    proc tag_a_chunk {index} {
        # Set magicnum to whatever can be done in a few tens of
        # milliseconds so that the UI stays responsive
        # (500 is only a placeholder value).
        set magicnum 500
        for {set i 0} {$i < $magicnum} {incr i} {
            # <code to add a tag to "$index + $i chars">
        }
        # If index is before the visible part of the screen,
        # tag the visible portion of the screen, too.
        if {[.textwidget compare $index < @0,0]} {
            # <code to add tags to the text between @0,0 and @$width,$height>
        }
        # Reschedule for the next chunk until the end of the text is reached.
        set next [.textwidget index "$index + $magicnum chars"]
        if {[.textwidget compare $next < end]} {
            after 10 [list tag_a_chunk $next]
        }
    }
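To make the background-loop idea concrete, here is a minimal line-based sketch with the placeholders filled in. The proc names tag_line and tag_in_background, the default of 20 lines per chunk, and the Tag<letter> naming are assumptions for illustration. The sanity check that jumps ahead to the visible region is deliberately left out to keep the sketch short.

```tcl
# Hypothetical per-line tagger: one "tag add" per residue, "." left untagged.
proc tag_line {w lineNumber} {
    set line [$w get $lineNumber.0 "$lineNumber.0 lineend"]
    set len [string length $line]
    for {set i 0} {$i < $len} {incr i} {
        set c [string index $line $i]
        if {$c ne "."} { $w tag add Tag$c $lineNumber.$i }
    }
}

# Tag $linesPerChunk lines, then hand control back to the event loop
# and continue 10 ms later, so the UI stays responsive.
proc tag_in_background {w {lineNumber 1} {linesPerChunk 20}} {
    # Line number of the last character in the widget.
    set lastLine [lindex [split [$w index "end - 1 chars"] .] 0]
    set stop [expr {$lineNumber + $linesPerChunk}]
    for {} {$lineNumber < $stop && $lineNumber <= $lastLine} {incr lineNumber} {
        tag_line $w $lineNumber
    }
    if {$lineNumber <= $lastLine} {
        after 10 [list tag_in_background $w $lineNumber $linesPerChunk]
    }
}
```

After inserting the untagged text, a call such as "tag_in_background .textwidget" starts the loop; the chunk size would need tuning so that each call takes no more than a few tens of milliseconds.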
> The hard here would be to find the "visible part" of the text. The
> upcoming Tk8.5 makes that much easier. And of course, if you start
> scrolling all over to right end of the widget, would probably make the
> GUI very busy ...
Finding the visible part is easy; it's just index @0,0 and index @$width,$height.
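As a rough illustration of the two @x,y indices, the following sketch computes the visible region and tags only that rectangle. The proc names visible_region and tag_visible are made up for this example, and tag_visible assumes unwrapped lines of equal length in a fixed-width font, so that the same columns are visible on every line.

```tcl
# Hypothetical helper: return {firstLine firstCol lastLine lastCol} for the
# part of text widget $w that is currently on screen.
proc visible_region {w} {
    set topLeft     [$w index @0,0]
    set bottomRight [$w index @[winfo width $w],[winfo height $w]]
    foreach {l1 c1} [split $topLeft .] break
    foreach {l2 c2} [split $bottomRight .] break
    return [list $l1 $c1 $l2 $c2]
}

# Tag only the characters inside the visible rectangle.
proc tag_visible {w} {
    foreach {l1 c1 l2 c2} [visible_region $w] break
    for {set l $l1} {$l <= $l2} {incr l} {
        # Visible slice of this line (the end index is exclusive).
        set line [$w get $l.$c1 "$l.$c2 + 1 chars"]
        set col $c1
        foreach c [split $line ""] {
            if {$c ne "." && $c ne "\n"} { $w tag add Tag$c $l.$col }
            incr col
        }
    }
}
```

tag_visible could be called after each scroll, for instance from a wrapper around the scrollbar callbacks; regions tagged earlier are simply tagged again, which is harmless but could be avoided by remembering what has already been done.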