A bit of context: I use Perl for verification of big complex ASICs. We run a
simulation and get a waveform database (stores values of each signal over
time). We can then perform queries on that database to check various
Example: I want to find any occasions where the bus is requested twice in a
row, without an intervening grant signal. It's a fairly simple query to
implement; but it is, in fact, a regular expression. So I would like to
my @violations = ($waveform_database =~ m:any/ <bus_request> <not
bus_grant>* <bus_request and not bus_grant> /);
I can think of a number of ways of implementing this (one is to first create
a string by iterating the database and mapping each signal to a bit in the
[ASCII?] value of a character, and then defining character classes -- though
I'd not want to lose the timestamps); but will Perl6 make my life any easier
when I write the module? I'm not trying to implement a full CTL* formal
property checker; just something that lets me run queries on an existing
simulation trace (which is an object). Ultimately, it would be nice to
define an entire protocol as a grammar -- the signal trace either conforms,
or doesn't (in which case I get the violations).
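For illustration, the string-encoding idea described above can be sketched
roughly as follows. This is only a minimal sketch, written in Python rather
than Perl; the trace layout, the sample values and the names TRACE, encode
and find_violations are all hypothetical and not taken from any real
waveform tool. Each sample becomes one character, character classes stand
in for the signal predicates, and the match offsets index back into the
trace so the timestamps are not lost.

```python
import re

# Hypothetical trace: one (timestamp, bus_request, bus_grant) tuple per sample.
TRACE = [
    (0, 0, 0),
    (10, 1, 0),   # request
    (20, 0, 0),
    (30, 1, 0),   # second request with no grant in between
    (40, 0, 1),   # grant
    (50, 1, 0),   # request after a grant
]

def encode(sample):
    """Pack the two signals into one character: bit 0 = request, bit 1 = grant."""
    _, request, grant = sample
    return str(request | (grant << 1))

# [13] = bus_request set, [01] = bus_grant clear, 1 = request without grant.
# The lookahead keeps each match zero-width so overlapping violations are
# all found; the lazy *? stops at the first repeated request.
VIOLATION = re.compile(r"(?=[13]([01]*?)1)")

def find_violations(trace):
    """Return (first_request_time, repeated_request_time) pairs."""
    text = "".join(encode(s) for s in trace)
    hits = []
    for m in VIOLATION.finditer(text):
        first = m.start()
        second = first + 1 + len(m.group(1))
        hits.append((trace[first][0], trace[second][0]))
    return hits

violations = find_violations(TRACE)
print(violations)
```

For the sample trace this reports the single pair (10, 30). The approach
only scales to as many signals as fit in one character, which is the
limitation raised later in the thread.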
> I'm wondering if Perl6's new regex can be applied to non-string things. I
> seem to recall A5 mentioning something about strings tied to array
> implementations; but I'm wanting something a little more powerful.
Yes, it can be applied to anything which can be tied to string. I
didn't understand your message completely so I'll just copy-paste
relevant parts from Synopsis 5, which IMO is easier to understand
Matching against non-strings
* Anything that can be tied to a string can be matched against a
regex. This feature is particularly useful with input streams:
my @array := <$fh>; # lazy when aliased
my $array is from(\@array); # tie scalar
# and later...
$array =~ m/pattern/; # match from stream
* A <cut> assertion always matches successfully, and has the side
effect of deleting the parts of the string already matched.
* Attempting to backtrack past a <cut> causes the complete match to
fail (like backtracking past a <commit>). This is because there's now
no preceding text to backtrack into.
* This is useful for throwing away successfully processed input when
matching from an input stream or an iterator of arbitrary length.
Markus Laire 'malaire' <markus...@nic.fi>
OK, sorry if I wasn't clear. My example was probably too specific,
so I'll try the abstract approach.
A regex can be applied to anything that can be tied to a string.
So the question becomes: what is a string?
I will assume that a string can be defined as an ordered,
random-access, collection of characters. Each character has
a definite position in the string: its index.
Let us assume that the data I want to match against meets
these criteria. I have an ordered collection of indexed data.
The second part of defining a string is the Character. I know
that they are no longer limited to ASCII values -- we fully
support Unicode. But presumably the definition of a character
is somewhat less liberal than "Character == any object".
If I assume that a character is defined as "any object that
implements interface X", then the question becomes whether I
can coerce my database record to implement that interface.
I can view my database object as a set of predicates. I can
concatenate these predicates to form a binary word. This binary
word would probably be easy to coerce into a character but for
two things: first, the number of bits in the word may run to
tens of thousands; second, I'd really like to define the
predicates themselves as rules within a regular expression.
The other issue with this is that my fundamental unit of
matching is a defined subset of predicates. This concept
is equivalent to a character class. If I can tie my database
to be a string, can I also tie sets of predicates to
character-class objects that I use (or construct!) as part
of a regular expression?
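To make the "predicates as character classes" idea concrete, here is a
rough sketch of matching directly over records instead of characters. It
is written in Python purely for illustration and says nothing about how
Perl 6 would expose this; the record layout and the names match_at and
find_all are hypothetical. A pattern is a list of (predicate, repeat)
pairs, where each predicate plays the role of a character class.

```python
def match_at(pattern, records, start):
    """Return the end index (exclusive) of the shortest match at start, or None."""
    def step(p, i):
        if p == len(pattern):
            return i
        pred, repeat = pattern[p]
        if repeat:
            # Lazy star: try zero repetitions first, then consume one record.
            end = step(p + 1, i)
            if end is not None:
                return end
            if i < len(records) and pred(records[i]):
                return step(p, i + 1)
            return None
        if i < len(records) and pred(records[i]):
            return step(p + 1, i + 1)
        return None
    return step(0, start)

def find_all(pattern, records):
    """Return (start_time, end_time) for every start position that matches."""
    found = []
    for start in range(len(records)):
        end = match_at(pattern, records, start)
        if end is not None:
            found.append((records[start]["time"], records[end - 1]["time"]))
    return found

# Hypothetical records; a real trace could carry thousands of signals each.
records = [
    {"time": 0, "bus_request": False, "bus_grant": False},
    {"time": 10, "bus_request": True, "bus_grant": False},
    {"time": 20, "bus_request": False, "bus_grant": False},
    {"time": 30, "bus_request": True, "bus_grant": False},
    {"time": 40, "bus_request": False, "bus_grant": True},
    {"time": 50, "bus_request": True, "bus_grant": False},
]

# <bus_request> <not bus_grant>* <bus_request and not bus_grant>
double_request = [
    (lambda r: r["bus_request"], False),
    (lambda r: not r["bus_grant"], True),
    (lambda r: r["bus_request"] and not r["bus_grant"], False),
]

matches = find_all(double_request, records)
print(matches)
```

Because the predicates see the whole record, the width of the record no
longer matters. The recursion is the simplest thing that works; a long run
of repeated records would need an iterative matcher instead.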
It feels like it should be possible to do everything that I
need. But I need greater understanding of the interfaces