
Simpler regexp maybe?

Bruno

Nov 1, 2007, 12:09:46 PM
Hi experts!

I'm trying to parse the following line:
HSD| avg| 15 - 16 | 586784

in order to extract only the numerical values : 15, 16 and 586784

The following regexp works:

regexp {(^HSD\|)(\s*)(avg\|)(\s*)(\d*)(\s*-\s*)(\d*)(\s*\|\s*)(\d*)} \
    $line match sub1 sub2 sub3 sub4 sub5 sub6 sub7 sub8 sub9

puts "$sub5 $sub7 $sub9"
15 16 586784

I was wondering, though, whether there is a simpler expression to
extract those numbers? My regexp doesn't look very optimized ;)

TIA.

--Bruno

Donald G Porter

Nov 1, 2007, 12:22:39 PM
Bruno wrote:
> Hi experts!
>
> I'm trying to parse the following line:
> HSD| avg| 15 - 16 | 586784
>
> in order to extract only the numerical values : 15, 16 and 586784

Have you considered [scan] for this?

% set line {HSD| avg| 15 - 16 | 586784}
HSD| avg| 15 - 16 | 586784

% scan $line {%[^|]|%[^|]|%d - %d |%d}
HSD { avg} 15 16 586784
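
If only the three numbers are wanted, scan's `*` flag can discard the two text fields instead of returning them. A minimal sketch, assuming the same line layout as above; the variable names nums, lo, hi and total are only illustrative:

```tcl
set line {HSD| avg| 15 - 16 | 586784}

# %*[^|] matches a run of non-"|" characters but discards it,
# so only the three %d conversions are returned.
set nums [scan $line {%*[^|]|%*[^|]|%d - %d |%d}]
puts $nums   ;# 15 16 586784

# With variable names, scan stores the converted values in them
# instead of returning a list.
scan $line {%*[^|]|%*[^|]|%d - %d |%d} lo hi total
puts "$lo $hi $total"
```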

--
| Don Porter Mathematical and Computational Sciences Division |
| donald...@nist.gov Information Technology Laboratory |
| http://math.nist.gov/~DPorter/ NIST |
|______________________________________________________________________|

Bruno

Nov 1, 2007, 12:59:41 PM
Donald G Porter wrote:

> Bruno wrote:
>> Hi experts!
>>
>> I'm trying to parse the following line:
>> HSD| avg| 15 - 16 | 586784
>>
>> in order to extract only the numerical values : 15, 16 and 586784
>
> Have you considered [scan] for this?
>
> % set line {HSD| avg| 15 - 16 | 586784}
> HSD| avg| 15 - 16 | 586784
> % scan $line {%[^|]|%[^|]|%d - %d |%d}
> HSD { avg} 15 16 586784
>

I didn't know about scan!

scan $line "%s %s %d %s %d %s %d" string1 string2 val1 string3 val2 \
    string4 val3

Works great ;) In fact simpler than regexp in such a case.
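
When variable names are given, scan returns the number of conversions it performed, which gives a cheap sanity check for lines that do not follow the expected layout. A rough sketch, assuming the seven-field format above; parseHsd is a hypothetical helper name, not something from this thread or from Tcl itself:

```tcl
# parseHsd is a hypothetical helper: it returns the three numbers
# as a list, or an empty list if the line does not fit the format.
proc parseHsd {line} {
    # scan returns how many conversions succeeded; 7 means every
    # field in the format string was matched.
    set n [scan $line "%s %s %d %s %d %s %d" s1 s2 val1 s3 val2 s4 val3]
    if {$n != 7} {
        return {}
    }
    return [list $val1 $val2 $val3]
}

puts [parseHsd {HSD| avg| 15 - 16 | 586784}]   ;# 15 16 586784
```

A line such as {HSD| avg| n/a} stops at the first %d, so the helper returns an empty list instead of leaving some variables unset.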

Thanks Donald.

Bruce Hartweg

Nov 1, 2007, 2:18:54 PM
The simplest way to grab all numbers is:

set numList [regexp -all -inline {\d+} $line]
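
If the fixed layout still matters (the HSD and avg labels must be present), a middle ground is to keep the anchored pattern but capture only the three numbers. This is a sketch under that assumption, with lo, hi and total as illustrative variable names:

```tcl
set line {HSD| avg| 15 - 16 | 586784}

# Only the digits are parenthesised, so three variables suffice.
# "->" is just a throwaway variable name that receives the full match.
if {[regexp {^HSD\|\s*avg\|\s*(\d+)\s*-\s*(\d+)\s*\|\s*(\d+)} $line -> lo hi total]} {
    puts "$lo $hi $total"   ;# 15 16 586784
}
```

Using \d+ rather than \d* also means a line with a missing number fails to match instead of yielding empty strings.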

Bruce

Glenn Jackman

Nov 2, 2007, 11:08:16 AM
At 2007-11-01 02:18PM, "Bruce Hartweg" wrote:
> Bruno wrote:
> > Hi experts!
> >
> > I'm trying to parse the following line:
> > HSD| avg| 15 - 16 | 586784
> >
> > in order to extract only the numerical values : 15, 16 and 586784
[...]

> simplest way to grab all numbers is
>
> set numList [regexp -all -inline {\d+} $line]

I'd add word boundaries in case the words contain digits:
set numList [regexp -all -inline {\m\d+\M} $line]
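
To see the difference, here is a small comparison on a made-up line where a label contains a digit (HSD2 is a hypothetical label, not from the original data):

```tcl
# Hypothetical input: the label "HSD2" contains a digit.
set line {HSD2| avg| 15 - 16 | 586784}

# Plain \d+ also picks up the digit inside the label.
set plain [regexp -all -inline {\d+} $line]
puts $plain     ;# 2 15 16 586784

# \m and \M only match at the start and end of a word, so digits
# embedded in "HSD2" are skipped.
set bounded [regexp -all -inline {\m\d+\M} $line]
puts $bounded   ;# 15 16 586784
```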

--
Glenn Jackman
"You can only be young once. But you can always be immature." -- Dave Barry
