Use of regular expressions on non-strings


David Whipp

Aug 1, 2002, 10:30:51 PM
to perl6-l...@perl.org
I'm wondering if Perl6's new regex can be applied to non-string things. I
seem to recall A5 mentioning something about strings tied to array
implementations; but I'm wanting something a little more powerful.

A bit of context: I use Perl for verification of big complex ASICs. We run a
simulation and get a waveform database (stores values of each signal over
time). We can then perform queries on that database to check various
properties.

Example: I want to find any occasions where the bus is requested twice in a
row, without an intervening grant signal. It's a fairly simple query to
implement; but it is, in fact, a regular expression. So I would like to
write (pseudo-code):

my @violations = ($waveform_database =~ m:any/ <bus_request> <not
bus_grant>* <bus_request and not bus_grant> /);

I can think of a number of ways of implementing this (one is to first create
a string by iterating the database and mapping each signal to a bit in the
[ascii?] value of a character, and then defining character classes -- though
I'd not want to lose the timestamps); but will Perl6 make my life any easier
when I write the module? I'm not trying to implement a full CTL* formal
property checker; just something that lets me run queries on an existing
simulation trace (which is an object). Ultimately, it would be nice to
define an entire protocol as a grammar -- the signal trace either conforms,
or doesn't (in which case I get the violations).
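
The string-encoding approach described above can be sketched in Python (not Perl 6, whose regex interfaces are still undecided in this thread). This is only an illustration under stated assumptions: the `trace` data, the two-signal layout, and the `encode` helper are hypothetical, and Python's `re` lookahead stands in for the `:any` modifier, returning only the shortest match from each start position. The timestamps are kept by mapping match offsets back into the trace.

```python
import re

# Hypothetical trace: (timestamp, bus_request, bus_grant), one sample per clock.
trace = [
    (0, 0, 0),
    (10, 1, 0),   # request
    (20, 0, 0),
    (30, 1, 0),   # second request with no grant in between
    (40, 0, 1),   # grant
    (50, 1, 0),   # request
    (60, 0, 1),   # grant
    (70, 1, 0),   # request
]

REQ, GNT = 0b01, 0b10  # one bit per signal

def encode(sample):
    """Map one sample to a character: '0' idle, '1' request, '2' grant, '3' both."""
    _, req, gnt = sample
    return chr(ord("0") + (req * REQ | gnt * GNT))

# One character per sample, so string offsets equal trace indices.
text = "".join(encode(s) for s in trace)

# <bus_request> <not bus_grant>* <bus_request and not bus_grant>
# [13] = request set, [01] = grant clear, 1 = request set and grant clear.
# The lookahead lets matches start at every position, even overlapping ones.
pattern = re.compile(r"(?=([13][01]*?1))")

# Recover timestamps from the match offsets.
violations = [
    (trace[m.start(1)][0], trace[m.end(1) - 1][0])
    for m in pattern.finditer(text)
]
```

Each entry in `violations` is a (first request time, second request time) pair. With only two signals every combination fits in one ASCII digit; a wider bus would need a larger alphabet.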


Dave.

--
Dave Whipp, Senior Verification Engineer,
Fast-Chip inc., 950 Kifer Rd, Sunnyvale, CA. 94086
tel: 408 523 8071; http://www.fast-chip.com
Opinions my own; statements of fact may be in error.

Markus Laire

Aug 3, 2002, 4:35:44 PM
to David Whipp, perl6-l...@perl.org
On 1 Aug 2002 at 19:30, David Whipp wrote:

> I'm wondering if Perl6's new regex can be applied to non-string things. I
> seem to recall A5 mentioning something about strings tied to array
> implementations; but I'm wanting something a little more powerful.

Yes, it can be applied to anything which can be tied to string. I
didn't understand your message completely so I'll just copy-paste
relevant parts from Synopsis 5, which IMO is easier to understand
than A5.

from http://dev.perl.org/perl6/synopsis/5.html

Matching against non-strings

* Anything that can be tied to a string can be matched against a
regex. This feature is particularly useful with input streams:
    my @array := <$fh>;           # lazy when aliased
    my $array is from(\@array);   # tie scalar

    # and later...

    $array =~ m/pattern/;         # match from stream

Backtracking control

* A <cut> assertion always matches successfully, and has the side
effect of deleting the parts of the string already matched.

* Attempting to backtrack past a <cut> causes the complete match to
fail (like backtracking past a <commit>). This is because there's now
no preceding text to backtrack into.

* This is useful for throwing away successfully processed input when
matching from an input stream or an iterator of arbitrary length.

--
Markus Laire 'malaire' <markus...@nic.fi>

David Whipp

Aug 5, 2002, 4:30:08 PM
to perl6-l...@perl.org, Markus Laire
Markus Laire 'malaire' <markus...@nic.fi> replied:

> Yes, it can be applied to anything which can be tied to string. I
> didn't understand your message completely ...

OK, sorry if I wasn't clear. My example was probably too specific,
so I'll try the abstract approach.

A regex can be applied to anything that can be tied to a string.
So the question becomes: what is a string?

I will assume that a string can be defined as an ordered,
random-access, collection of characters. Each character has
a definite position in the string: its index.

Let us assume that the data I want to match against meets
these criteria. I have an ordered collection of indexed data.

The second part of defining a string is the Character. I know
that they are no longer limited to ASCII values -- we fully
support Unicode. But presumably the definition of a character
is somewhat less liberal than "Character == any object".

If I assume that a character is defined as "any object that
implements interface X", then the question becomes whether I
can coerce my database record to implement that interface.

I can view my database object as a set of predicates. I can
concatenate these predicates to form a binary word. This binary
word would probably be easy to coerce into a character but for
two things: first, the number of bits in the word may run to
tens of thousands; second, I'd really like to define the
predicates themselves as rules within a regular expression.

The other issue with this is that my fundamental unit of
matching is a defined subset of predicates. This concept
is equivalent to a character class. If I can tie my database
to be a string, can I also tie sets of predicates to
character-class objects that I use (or construct!) as part
of a regular expression?
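
One way to picture predicate sets acting as character classes is the Python sketch below. It is not a Perl 6 interface, and every name in it (`records`, `predicates`, `valuation`, `char_class`) is hypothetical. It also swaps in a technique not proposed in this thread: rather than packing every predicate bit into one character, it gives a private-use code point to each distinct combination of predicate values actually seen in the data, which sidesteps the word-width problem as long as the number of distinct combinations stays small.

```python
import re

# Hypothetical database records and named predicates over them.
records = [
    {"t": 0, "req": 0, "gnt": 0},
    {"t": 10, "req": 1, "gnt": 0},
    {"t": 20, "req": 1, "gnt": 1},
    {"t": 30, "req": 0, "gnt": 1},
]
predicates = {
    "bus_request": lambda r: r["req"] == 1,
    "bus_grant": lambda r: r["gnt"] == 1,
}
names = sorted(predicates)

def valuation(record):
    """Evaluate every predicate on one record, in a fixed name order."""
    return tuple(predicates[n](record) for n in names)

# One private-use code point (U+E000 onward) per distinct valuation observed.
alphabet = {}
for r in records:
    alphabet.setdefault(valuation(r), chr(0xE000 + len(alphabet)))

# The "string" view of the database: one character per record.
text = "".join(alphabet[valuation(r)] for r in records)

def char_class(condition):
    """Build a regex class of all observed valuations satisfying condition."""
    chars = [c for v, c in alphabet.items() if condition(dict(zip(names, v)))]
    if not chars:
        return "(?!)"  # never matches
    return "[" + "".join(re.escape(c) for c in chars) + "]"

# A set of predicates used like a character class inside a pattern.
req_no_gnt = char_class(lambda p: p["bus_request"] and not p["bus_grant"])
m = re.search(req_no_gnt, text)
match_time = records[m.start()]["t"]
```

Because each character corresponds to exactly one record, a match offset leads straight back to the record and its timestamp.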

It feels like it should be possible to do everything that I
need. But I need greater understanding of the interfaces
involved.


Dave.
