I am trying to parse a Verilog (hardware description language) file
using Tcl, and I have a problem with a non-greedy regexp.
Here's the string I am searching:
set a {module a
...
inst1 abc
...
endmodule
module b
...
inst2 xyz
...
endmodule
}
I would like to find the name of the module containing an instance of
type "inst2", called "xyz". For that I use the following non-greedy regexp
(in this example I am trying to get module b as the answer):
regexp "module.*?inst2 xyz" $a match
What I get in $match is the following:
module a
...
inst1 abc
...
endmodule
module b
...
inst2 xyz
Even though I used a non-greedy quantifier, I got what looks like a greedy
match: instead of returning "module b" as the start of the match, I get
"module a". I have had success implementing non-greedy regexps before, but
this one simply doesn't work.
I wonder what I am doing wrong.
arnon
Why not:
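set module_name ""
foreach line [split $a "\n"] {
    # Split on whitespace runs; calling lindex on a raw line can fail
    # if the Verilog text contains unbalanced braces
    set words [regexp -all -inline {\S+} $line]
    if {[lindex $words 0] eq "module"} {
        set module_name [lindex $words 1]
    }
    if {[string trim $line] eq "inst2 xyz"} {
        break
    }
}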
puts "Module: $module_name"
It may not be as compact as a regexp, but IMHO it is clearer.
Regards,
Arjen
arnon
% regexp -inline {a.*?b} acabad
acab
"ab" alone would have been a match too, but "acab" is encountered
first...
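To spell that out: the engine reports the leftmost position where a match can start, and the non-greedy quantifier only shortens the match from that position. A small sketch of both effects (the variable names `s` and `span` are just for illustration):

```tcl
# Leftmost start wins: the non-greedy .*? cannot skip past the first "a".
set s acabad
regexp -indices {a.*?b} $s span
puts $span                            ;# 0 3, i.e. "acab"

# From a fixed starting point, non-greedy does pick the shorter match.
puts [regexp -inline {a.*b} abab]     ;# abab
puts [regexp -inline {a.*?b} abab]    ;# ab
```

This is the same thing that happens with "module a" in the original string: a match can start there, so it does.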
arnon...@gmail.com wrote:
> OK. Now that I know whats wrong, Is there a way to bypass it using
> regexp ?
>
Yeah, add .* at the beginning to eat up the extra stuff before the section you care about.
(This assumes there is only one section you care about. If you want to match multiple ones,
then you need to adjust your RE and use the -inline and -all options to loop through the matches.)
Bruce
arnon
> thanks bruce, but I don't think I follow you . Can you please be more
> specific, and give the exact regexp ?
Hey that's good Bruce!
For the exact regexp, instead of
regexp "module.*?inst2 xyz" $a match
use
regexp ".*(module.*?inst2 xyz)" $a whole match
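As a quick check, here is that regexp run against the sample string from the original post. Pulling the header line out with split and lindex is just one possible way to get at the name, and the variable `header` is made up for this sketch:

```tcl
# Sample input from the original question.
set a {module a
...
inst1 abc
...
endmodule
module b
...
inst2 xyz
...
endmodule
}

# The greedy leading .* consumes everything up to the last "module"
# from which "inst2 xyz" can still be reached.
if {[regexp ".*(module.*?inst2 xyz)" $a whole match]} {
    # The first line of the captured text is the module header.
    set header [lindex [split $match "\n"] 0]
    puts $header                      ;# module b
}
```

Note that this relies on "inst2 xyz" occurring only once in the input.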
--
Donald Arseneau as...@triumf.ca
OK, sorry to be terse,
if there is only one match in the whole thing then
regexp {.*module\s(\S+).*?inst2\sxyz} $input ignoreThis moduleName
will give you the name
and if there are multiple matches, then I would use
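set names {}
# Every quantifier is non-greedy so each match stops at the first endmodule
foreach {block name} [regexp -all -inline {\mmodule\s+?(\S+?)\s.*?\mendmodule\M} $input] {
    # Keep the module only if its body contains the instance
    if {[regexp {\minst2\s+xyz\M} $block]} {
        lappend names $name
    }
}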
to get the list of names.
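One thing to watch when mixing quantifiers in a single Tcl RE: as I read the re_syntax manual page, a branch takes its preference (longest or shortest) from the first quantified atom in it, so a greedy \S+ ahead of a .*? can make the whole match greedy. A minimal sketch on a made-up one-line input (the string `s` and both variable names are hypothetical):

```tcl
# Hypothetical one-line input with two modules.
set s "module a x endmodule module b y endmodule"

# (\S+) is the first quantified atom and is greedy, so by the re_syntax
# preference rule the RE as a whole should prefer the longest match.
set greedyFirst [regexp -inline {module\s(\S+).*?endmodule} $s]

# With \S+? first, the RE as a whole prefers the shortest match; the
# following \s still forces the capture to cover the full name.
set lazyFirst [regexp -inline {module\s(\S+?)\s.*?endmodule} $s]

puts [lindex $greedyFirst 0]
puts [lindex $lazyFirst 0]            ;# module a x endmodule
```

Comparing the two printed lines on your own input is a cheap way to see which preference an RE ended up with.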
Bruce