In an attempt to extract interconnection information from VHDL files I
started using regular expressions, but the nature of VHDL makes those
regular expressions large and quite difficult to maintain and extend
as I add new features. Does anybody have a better idea than using
regexps to parse such a difficult language?
--
Svenn
>In an attempt to extract interconnection information from VHDL files I
>started using regular expressions, but the nature of VHDL makes those
>regular expressions large and quite difficult to maintain and extend
>as I add new features.
Yes, it's a loser. Stripping away all the comments makes
things a tad easier, but it's still not appropriate RE-fodder.
The received wisdom here on c.l.t is clearly opposed to the
use of REs for serious parsing applications.
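
As a rough illustration of the comment-stripping step mentioned above, here is a minimal sketch in Python (chosen only for illustration; the same idea ports to Tcl's [regsub]). The function name strip_vhdl_comments is hypothetical. It assumes the only places a "--" can hide are string literals, so it matches those first and leaves them alone; a character literal containing a double quote ('"') is not handled.

```python
import re

# Either a string literal ("..." where "" is an escaped quote) or a
# comment (-- up to the end of the line). Trying the string alternative
# first keeps a "--" inside a string from being taken for a comment.
_STRING_OR_COMMENT = re.compile(r'"(?:[^"\n]|"")*"|--[^\n]*')


def strip_vhdl_comments(text):
    """Return text with -- comments removed and string literals untouched."""
    def keep_strings(match):
        token = match.group(0)
        return token if token.startswith('"') else ''
    return _STRING_OR_COMMENT.sub(keep_strings, text)


if __name__ == '__main__':
    print(strip_vhdl_comments('P2 <= "0000"; -- A vector of bits'))
```

Whitespace in front of a removed comment is left in place, which is harmless for any later pattern that already tolerates arbitrary whitespace.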
> Does anybody have a better idea than using
>regexps to parse such a difficult language?
Have you looked at GHDL? You might be able to steal the VHDL
parser from that. I don't know whether it is able to write out
the syntax tree in a form that could easily be sucked into a
Tcl script, though.
The practical reality is that you will do far better to load
your VHDL model into a simulator, and use its foreign-language
interface (FLI, VHPI or what-have-you) to traverse the connectivity
of instantiated entities. If you consider all the complications
that can be introduced by generics, generates and configurations,
it is surely unreasonable to extract connectivity by looking
at the source code unless you're prepared to do the full process
of compilation and elaboration, just as a simulator does.
It's worth pinging comp.lang.vhdl for this too; there are a few
people there who've played with such things, I think.
There are some realistic parser packages for Tcl, but
I'll hand that over to the people who actually understand
them instead of misleading you with unreliable folklore.
Good luck.
--
Jonathan Bromley, Consultant
DOULOS - Developing Design Know-how
It *is* rather disappointing that Tcl, despite its massive use with
VHDL, hasn't inspired a similar native parser ... Perhaps there'll be
more to say in a follow-up.
Could this be of help?
xml tools for eda file format conversion:
http://xmleda.wiki.sourceforge.net/
uwe
It is actually amazing that a strictly typed language like VHDL can be so
lax about the syntax it allows. If there were one, and only one, allowed
way to begin and end entities and architectures, things would have been a
little easier, but that was before I looked at handling generates for
multiple instantiations of the same entity. At least I know that this is a
can of worms ....
>
> > Does anybody have a better idea than using
> >regexps to parse such a difficult language?
>
> Have you looked at GHDL? You might be able to steal the VHDL
> parser from that. I don't know whether it is able to write out
> the syntax tree in a form that could easily be sucked into a
> Tcl script, though.
I am using GHDL for my non-work projects. I thought that GHDL was
written in Ada, so I never dared to look at the source code. Maybe I
should give it a try. Every good simulator should have an embedded Tcl
interpreter :-)
>
> The practical reality is that you will do far better to load
> your VHDL model into a simulator, and use its foreign-language
> interface (FLI, VHPI or what-have-you) to traverse the connectivity
> of instantiated entities. If you consider all the complications
> that can be introduced by generics, generates and configurations,
> it is surely unreasonable to extract connectivity by looking
> at the source code unless you're prepared to do the full process
> of compilation and elaboration, just as a simulator does.
Yes, I am not a programmer, and all this talk about parsing and lexing
and syntax trees frightens me, as most people seem to be happy if they can
validate that a piece of code is syntactically valid. I want to
extract real information from whatever a parser fills my memory with.
Looking at lex and yacc some years back made me turn to scripting
instead of C for my helper utilities.
Perl seems to have a VHDL package, but _that_ package is of course not
part of the standard Perl distribution, so I'll probably have to
go to CPAN to get it.
--
Svenn
While that is certainly true of REs alone, don't underestimate the
joint power of [regsub] and the Tcl parser itself. On countless
occasions I've found myself morphing random domain-specific languages
into Tcl and just calling [eval]. And that was before XML ;-)
Now I have absolutely no idea of how well this applies to VHDL...
-Alex
>don't underestimate the
>joint power of [regsub] and the Tcl parser itself. On countless
>occasions I've found myself morphing random domain-specific languages
>into Tcl and just calling [eval]. And that was before XML ;-)
>
>Now I have absolutely no idea of how well this applies to VHDL...
That sounds suspiciously like a dropped gauntlet...
Given that VHDL has a rather strict nested-block structure,
and that the OP is probably (I'm guessing here) happy to
run his Tcl script only on VHDL code that is already known
to be syntactically legal, it may apply quite well to the
purely syntactic side of things.
Unfortunately, as I mentioned earlier, the connectivity
and instance hierarchy of a VHDL model (roughly, executable)
cannot be determined, even locally, until the whole mess
is elaborated. Analyzing its syntax is only the beginning.
It's worth a look, though; thanks for the prod :-)
Good guessing. While entities are fairly simple to extract,
instantiations of those entities are a nightmare, as they can be
instantiated iteratively with generate loops. If VHDL weren't so
strict about entity declarations, I guess they would be a nightmare too.
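
To make the "entities are fairly simple to extract" part concrete, here is a hedged Python sketch (Python is used only for illustration). It assumes the source is syntactically legal, has had its comments removed, and that every port declaration spells out its mode. The names entity_ports, matching_paren and SAMPLE are hypothetical, and the sketch says nothing about instantiations or generate loops.

```python
import re

ENTITY_RE = re.compile(r'\bentity\s+(\w+)\s+is\b', re.IGNORECASE)
END_RE = re.compile(r'\bend\b', re.IGNORECASE)
PORT_RE = re.compile(r'\bport\s*\(', re.IGNORECASE)
DECL_RE = re.compile(
    r'\s*(?:signal\s+)?([\w\s,]+?)\s*:\s*(in|out|inout|buffer|linkage)\b\s*(.*)',
    re.IGNORECASE | re.DOTALL)


def matching_paren(text, open_index):
    """Return the index of the ')' that closes the '(' at open_index."""
    depth = 0
    for i in range(open_index, len(text)):
        if text[i] == '(':
            depth += 1
        elif text[i] == ')':
            depth -= 1
            if depth == 0:
                return i
    raise ValueError('unbalanced parentheses')


def entity_ports(text):
    """Map each entity name to a list of (port, mode, type) tuples."""
    result = {}
    for ent in ENTITY_RE.finditer(text):
        ports = []
        # The port clause sits in the entity header, before the first 'end'.
        end = END_RE.search(text, ent.end())
        limit = end.start() if end else len(text)
        clause = PORT_RE.search(text, ent.end(), limit)
        if clause:
            open_index = clause.end() - 1
            body = text[open_index + 1:matching_paren(text, open_index)]
            for decl in body.split(';'):
                m = DECL_RE.fullmatch(decl)
                if not m:
                    continue
                # Drop any default value and normalise whitespace in the type.
                type_text = ' '.join(m.group(3).split(':=')[0].split())
                for name in m.group(1).split(','):
                    ports.append((name.strip(), m.group(2).lower(), type_text))
        result[ent.group(1)] = ports
    return result


# Comment-free version of a small entity, laid out with leading semicolons.
SAMPLE = """
entity E is
  generic ( G1 : integer := 5 ; G2 : boolean );
  port
    ( P1 : in bit
    ; P2 : out bit_vector(3 downto 0)
    ; P3 : inout std_logic_vector
    );
end;
"""

if __name__ == '__main__':
    for name, ports in entity_ports(SAMPLE).items():
        print(name, ports)
```

Parentheses are matched by counting rather than by regexp, because array types such as bit_vector(3 downto 0) nest inside the port list.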
>
> Unfortunately, as I mentioned earlier, the connectivity
> and instance hierarchy of a VHDL model (roughly, executable)
> cannot be determined, even locally, until the whole mess
> is elaborated. Analyzing its syntax is only the beginning.
I have started looking at how GHDL does things, and it _is_ written in
Ada, which looks very much like VHDL. My hopes of easily extracting how
things are done faded very quickly. The next thing would be to look at how
the Perl guys solve the problem. Maybe I have to move to Perl/Tk.
--
Svenn
There's one thing I don't quite grasp here: are you guys saying that
the syntactic level is pretty simple to convert to any internal
representation, but that the real complexity lies in the semantics
(topology) ? In that case, the scripting language used should be
completely neutral, the only real problem being algorithmic... Or are
you saying something else ?
-Alex
>There's one thing I don't quite grasp here: are you guys saying that
>the [VHDL] syntactic level is pretty simple to convert to any internal
>representation, but that the real complexity lies in the semantics
>(topology) ? In that case, the scripting language used should be
>completely neutral, the only real problem being algorithmic... Or are
>you saying something else ?
First off, you need to recognise that I (and I suspect the same
applies to the OP) am not a compiler wonk. I have a reasonable
idea of how such things work, and I'm not stupid about writing
software, but even writing just a lexer for a serious programming
language would probably defeat me unless I had LOTS of spare time.
However, my point was this: given a language like C, once you have
the syntax tree you know everything about the static structure
of your program and you're ready to start execution. And I'm
aware that Tcl can do a rather nice job on this, because it's
often rather straightforward to create the syntax tree in a form
that happens to be an executable Tcl script - I've done it myself
on several occasions, most recently to write a nice assembly
language for a bizarre custom CPU we used here.
But hardware description languages like VHDL pose the additional
problem that (from a software point of view) their execution
proceeds in two distinct phases, elaboration and execution.
Elaboration builds the static structure of a set of concurrent
processes and the signals that are used to communicate between
them. Execution simulates the behaviour of the modelled system
by having those concurrent processes respond to changes on the
signals as simulated time elapses. It sounds as though Svenn,
like many other hardware folk, is interested in the static
structure that is created by elaboration; given a suitably
constrained model, that static structure can be in one-to-one
correspondence with the digital hardware that will ultimately
be implemented and that was simulated by the model.
The tools that VHDL provides for describing this static structure
are fairly limited, precisely because they are restricted to
the description of a static structure; but they have enough
flexibility that, in the general case, there is no alternative
to actually running the elaboration phase in order to find out
what you've got. This often comes as a surprise to folk who
can easily look at a trivial piece of VHDL and (correctly)
identify the static structure it represents merely by looking
at its syntax.
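
A toy Python sketch of why syntax alone is not enough: the same generate statement yields a different number of instances depending on the generic values that are only known at elaboration time. The function expand_generate and its argument layout are hypothetical, and the label(index) naming is just one plausible way of writing instance paths.

```python
def expand_generate(label, low, high, generics):
    """List the instance names created by 'for i in low to high generate'.

    low and high are either integers or names of generics; names are
    looked up in the generics mapping, as elaboration would do.
    """
    def resolve(bound):
        return generics[bound] if isinstance(bound, str) else bound

    first, last = resolve(low), resolve(high)
    # An empty range (last < first) creates no instances at all.
    return [f'{label}({i})' for i in range(first, last + 1)]


if __name__ == '__main__':
    # 'for i in 1 to G1 generate' with G1 = 5, then with G1 overridden to 2.
    print(expand_generate('RepeatedInstances', 1, 'G1', {'G1': 5}))
    print(expand_generate('RepeatedInstances', 1, 'G1', {'G1': 2}))
```

The source text is identical in both calls; only the generic value, which may be set several levels up the hierarchy, decides what hardware exists.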
And so we come back to my first point, which is: we are not
compiler writers, we're hardware grunts who happen to have
a slightly better-than-average clue about software. And we
have what is ostensibly a simple hardware-related problem,
and we find we need to write truly non-trivial software to
solve it. Since we know what's going on, we are not satisfied
with the quick hack that works only in a few simple cases; but
since we're not compiler writers, we find the task of doing
it properly to be rather daunting. Such is the fate of anyone
who has the temerity to cross a boundary between disciplines :-)
And yes, the precise scripting language is not really an issue.
But now another cultural problem rears its head: Tcl is ubiquitous
in the electronics design business, being used as the command
front-end to pretty much every serious design tool. So Svenn
and I are likely to know Tcl reasonably well, and perhaps have
eschewed Perl or other script languages because it's too much
like hard work to learn them, and Tcl can do anything we need.
So we are, not for the first time, searching for the right
blade on the Tcl Swiss-army knife. Both the syntax and
elaboration-time semantics of VHDL are non-trivial. Easy for
a serious compiler writer, I'm sure, but hard for me and -
it seems - hard for Svenn too.
Thanks for the detailed explanation.
My only problem with this is that Svenn said he was about to switch to
Perl...
In other circumstances I'd just have wished him good luck, but you got
me hooked ;-)
It definitely sounds like an interesting challenge for Tcl because:
(1) The syntactic level seems to be within reach (especially on a
constrained subset of cases)
(2) Once we have the stuff in memory, *if* someone can provide the
relevant algorithms, Tcl's nice data structures could really shine (I
know this from experience with other applications of graph theory).
And if Perl does it, no reason Tcl should have perf issues.
Possible next step: Svenn, can you provide an example VHDL file, and a
description in plain English of what the output should look like ?
-Alex
>It definitely sounds like an interesting challenge for Tcl because:
>
>(1) The syntactic level seems to be within reach (especially on a
>constrained subset of cases)
OK, I have to believe you about that :-) But the constraints are
probably not as tight as you might wish for comfort.
>(2) Once we have the stuff in memory, *if* someone can provide the
>relevant algorithms, Tcl's nice data structures could really shine (I
>know this from experience with other applications of graph theory).
No question. VHDL elaboration proceeds strictly top-down, the
only small complication being that at each node you may need to
merge two structures: the design unit and its configuration.
Given the syntax tree in memory, I reckon that wouldn't be too
hard for anyone who knows VHDL in detail. The library mechanism,
which is somewhat akin to Ada's, will make life more complicated
but probably not much more difficult.
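
The top-down walk described above might be sketched as follows, in Python purely for illustration. DESIGN is a hypothetical, hand-built stand-in for a parsed design library, and elaborate is a made-up name. Configurations and the library mechanism are deliberately not modelled; the sketch only shows generics being passed down and generate loops being expanded at each node.

```python
# Hypothetical stand-in for a parsed design library. Each entity lists its
# generic defaults and the instances in its architecture as tuples of
# (label, child entity, generic map, generate bound). A generate bound of
# None means a single instance; otherwise it names the generic that gives
# the upper bound of a 'for i in 1 to <generic> generate' loop.
DESIGN = {
    'top':  {'generics': {'N': 2},
             'instances': [('u_ctrl', 'ctrl', {}, None),
                           ('g_lane', 'lane', {'W': 'N'}, 'N')]},
    'ctrl': {'generics': {}, 'instances': []},
    'lane': {'generics': {'W': 1},
             'instances': [('u_bit', 'cell', {}, 'W')]},
    'cell': {'generics': {}, 'instances': []},
}


def elaborate(entity, overrides=None, path=None, out=None):
    """Return (hierarchical path, entity) pairs, walking strictly top-down."""
    unit = DESIGN[entity]
    generics = {**unit['generics'], **(overrides or {})}
    path = path or entity
    out = [] if out is None else out
    out.append((path, entity))
    for label, child, generic_map, repeat in unit['instances']:
        # Actuals in the generic map are evaluated in the parent's context.
        actuals = {formal: generics[actual] if isinstance(actual, str) else actual
                   for formal, actual in generic_map.items()}
        if repeat is None:
            elaborate(child, actuals, f'{path}.{label}', out)
        else:
            for i in range(1, generics[repeat] + 1):
                elaborate(child, actuals, f'{path}.{label}({i})', out)
    return out


if __name__ == '__main__':
    for instance_path, entity_name in elaborate('top'):
        print(instance_path, entity_name)
```

Changing one generic at the top, for example elaborate('top', {'N': 3}), changes the instance count two levels down, which is the part that cannot be read off any single source file.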
>And if Perl does it, no reason Tcl should have perf issues.
I don't think performance was ever likely to be a limiting factor.
Million-line VHDL programs are about as common as rocking-horse shit.
>Possible next step: Svenn, can you provide an example VHDL file, and a
>description in plain English of what the output should look like ?
That may be a rather tall order unless Svenn has more time
on his hands than I do. VHDL is definitely more complex than,
say, C or Pascal; its syntax is loosely derived from Ada's,
but there are many differences. I can show you a very
simple example...
library ieee;  -- declares "ieee" as a library name,
               -- some external mapping mechanism maps it to
               -- a physical library structure. Libraries
               -- contain compiled versions of design units.
use ieee.std_logic_1164.all;  -- Make this package visible
                              -- in the (lexically) next design unit.

entity E is  -- Starting a new DESIGN UNIT. An entity
             -- describes the external interface of a
             -- structural block; an architecture describes
             -- its implementation.
  generic
    ( G1 : integer := 5  -- generic parameter with default value
    ; G2 : boolean       -- generic with no default
    );
  port  -- Entities can have port lists. Ports are SIGNALS.
    ( P1 : in bit                      -- Input port, of built-in bit type
    ; P2 : out bit_vector(3 downto 0)  -- array port
    ; P3 : inout std_logic_vector      -- unconstrained array port
                                       -- of a type taken from the
                                       -- std_logic_1164 package
    );
end;

architecture A of E is  -- Architecture is a distinct design unit;
  -- an entity can have multiple architectures of which
  -- just one is chosen to populate each instance of the
  -- entity in a parent architecture
  signal SB: boolean;               -- signals carry information
  signal SI: integer range 0 to 7;  -- from one concurrent process
  signal ST: bit := '0';            -- to another.
  -- Now the fun starts. P3 was unconstrained, so each instance
  -- has a port P3 constrained by whatever signal got connected
  -- to the port on that instance. To discover those constraints
  -- we can use attributes:
  constant Upper: integer := P3'HIGH;      -- upper bound of subscript
  constant Width: integer := P3'LENGTH;
  signal P3A: std_logic_vector(P3'RANGE);  -- clones the subscript
begin
  process (P1)  -- Process sensitive to (wakes up on) changes on P1
    variable S: boolean;  -- A static variable of this process
  begin
    P2 <= "0000";     -- A vector of bits
    S := (P1 = '1');  -- '0', '1' are the enum literals for 'bit'
    if S then
      P2(0) <= '1';   -- Signal assignment has
    else              -- delayed-update semantics
      P2(1) <= '1';
    end if;
  end process;
  P3A <= P3;  -- Shorthand for a process:
              --   process (P3) begin
              --     P3A <= P3;
              --   end process;
  RepeatedInstances: for i in 1 to G1 generate
    -- instances, processes can go here,
    -- multiple instances are created during elaboration
    -- by the generate loop
  end generate;
end;
Ummmm, not trivial. Nicely block structured, for sure, but not
trivial. And I haven't started describing packages, subprograms,
instances, configurations, type declarations and the rest of the
zoo of nice language features. Designers use some part of all
these features in real hardware designs, so leaving some of them out
is not an option.
Not an afternoon's work, even for Alexandre, I fear.
Sure :-)
However, I'd be disappointed to sleep on the idea that this problem
has only, and will only ever have, a Perl solution...
At the same time I have the feeling that you possess the keys to the
hardest parts of the problem, and that a very small piece of
additional help on the parsing side would do the job... Am I wrong ?
From another standpoint: what about the wanted output ?
From yet another standpoint: what does the Perl tool's output look
like ?
-Alex
uwe
posted this via google a couple of days ago, seen it?
> Possible next step: Svenn, can you provide an example VHDL file, and a
> description in plain English of what the output should look like ?
I'll try to put a test case together which includes some of the
features that are used in the project I am working on now. The problem is
that I have to anonymize it a bit to avoid giving out trade
secrets :-) I'll have to do this in my spare time, so it may take a few
days. I could maybe just smash some modules from opencores.org
together in a toplevel and vary the indentation
of the files to cover possible "personal tastes of how code should
look".
--
Svenn
Yes, I saw that, and I haven't really had time to investigate. It
should be on my list.
--
Svenn
I know nothing about VHDL
BUT
might there be a 'text book' example that would ILLUSTRATE the problem?
Jonathan Bromley provided a short example in a previous post, but a
short example cannot really illustrate all the problems that can occur
when parsing VHDL. Textbook examples tend to be "clean" VHDL, that
is, they are both correct and look "good". In code
written by other designers, personal feel and taste differ. I usually
hijack files and send them through the VHDL mode in xemacs before I
start modifying them. That is OK during development, but in production
that kind of behaviour results in loads of whitespace and
line-formatting changes in the diff files. A simple example, though:
From http://tams-www.informatik.uni-hamburg.de/vhdl/tools/grammar/vhdl93-bnf.txt
entity_declaration ::=
    entity identifier is
        entity_header
        entity_declarative_part
  [ begin
        entity_statement_part ]
    end [ entity ] [ entity_simple_name ] ;
The simplest problem is just how many different ways there are to end
an entity:
entity mymodule is -- mymodule is the name of the block
  -- definitions
end entity; -- or
end mymodule; -- or
end entity mymodule;
This is without taking any care of what is *inside* the entity
definition.
I was able to handle this and some of the definitions with a regexp,
but that was about it before I realized that with regexps alone I would
not get the job done.
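
For the entity-ending variants alone, a single pattern is enough. The following is a small Python sketch (Python only for illustration; the pattern translates directly to a Tcl regexp). It follows the BNF line "end [ entity ] [ entity_simple_name ] ;" and matches case-insensitively, as VHDL keywords and basic identifiers are. The helper name entity_end_pattern is hypothetical.

```python
import re


def entity_end_pattern(name):
    """Regexp for 'end [ entity ] [ entity_simple_name ] ;' of one entity."""
    return re.compile(
        r'\bend(?:\s+entity)?(?:\s+' + re.escape(name) + r')?\s*;',
        re.IGNORECASE)


if __name__ == '__main__':
    pattern = entity_end_pattern('mymodule')
    for closing in ('end;', 'end entity;', 'end mymodule;',
                    'end entity mymodule;', 'end process;'):
        print(repr(closing), bool(pattern.fullmatch(closing)))
```

This only recognises the closing line itself. It does not know which "end" belongs to the entity when the entity statement part contains its own "end process;" or similar, which is where regexps alone start to run out.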
Also have a look at the hyperlinked BNF of VHDL-93 at
http://tams-www.informatik.uni-hamburg.de/vhdl/tools/grammar/vhdl93-bnf.html
and enjoy jumping around among possible and optional arguments.
--
Svenn