I've been programming with Haskell for a few years and love it. One of my favorite applications of Haskell is using for domain specific languages. However, after designing a handful of DSLs, I continue to hit what appears to be a fundamental hurdle -- or at least I have yet to find an adequate solution.
My DSLs invariably define a datatype to capture expressions; something like this:
data Expression = Add Expression Expression | Sub Expression Expression | Variable String | Constant Int deriving Eq
Using the datatype Expression, it is easy to mass a collections of functions to help assemble complex expressions, which leads to very concise programs in the DSL.
The problem comes when I want to generate efficient code from an Expression (ie. to C or some other target language). The method I use invovles converting the tree of subexpressions into an acyclic graphic to eliminate common subexpressions. The nodes are then topologically ordered and assigned an instruction, or statement for each node. For example:
let a = Add (Constant 10) (Variable "i1") b = Sub (Variable "i2") (Constant 2) c = Add a b
would compile to a C program that may look like this:
a = 10 + i1; b = i2 - 2; c = a + b;
The process of converting an expression tree to a graph uses either Eq or Ord (either derived or a custom instance) to search and build a set of unique nodes to be ordered for execution. In this case "a", then "b", then "c". The problem is expressions often have shared, equivalent subnodes, which dramatically grows the size of the tree. For example:
let d = Add c c e = Add d d -- "e" now as 16 leaf nodes.
As these trees grow in size, the equality comparison in graph construction quickly becomes the bottleneck for DSL compilation. What's worse, the phase transition from tractable to intractable is very sharp. In one of my DSL programs, I made a seemingly small change, and compilation time went from milliseconds to not-in-a-million-years.
Prior to Haskell, I wrote a few DSLs in OCaml. I didn't have this problem in OCaml because each "let" expression was mutable, and I could use the physical equality operator to perform fast comparisons. Unfortunately, I have grown to love Haskell's type system and its lack of side effects, and could never go back.
Is there anything that can be done to dramatically speed up comparisons, or is there a better approach I can take to extract common subexpressions? I should point out I have an opportunity to get Haskell on a real industrial application. But if I can't solve this problem, I may have to resort to far less eloquent languages. :-(
> I've been programming with Haskell for a few years and love it. One > of my favorite applications of Haskell is using for domain specific > languages. However, after designing a handful of DSLs, I continue to > hit what appears to be a fundamental hurdle -- or at least I have yet > to find an adequate solution.
> My DSLs invariably define a datatype to capture expressions; something > like this:
> data Expression > = Add Expression Expression > | Sub Expression Expression > | Variable String > | Constant Int > deriving Eq
> Using the datatype Expression, it is easy to mass a collections of > functions to help assemble complex expressions, which leads to very > concise programs in the DSL.
> The problem comes when I want to generate efficient code from an > Expression (ie. to C or some other target language). The method I use > invovles converting the tree of subexpressions into an acyclic graphic > to eliminate common subexpressions. The nodes are then topologically > ordered and assigned an instruction, or statement for each node. For > example:
> let a = Add (Constant 10) (Variable "i1") > b = Sub (Variable "i2") (Constant 2) > c = Add a b
> would compile to a C program that may look like this:
> a = 10 + i1; > b = i2 - 2; > c = a + b;
> The process of converting an expression tree to a graph uses either Eq > or Ord (either derived or a custom instance) to search and build a set > of unique nodes to be ordered for execution. In this case "a", then > "b", then "c". The problem is expressions often have shared, > equivalent subnodes, which dramatically grows the size of the tree. > For example:
> let d = Add c c > e = Add d d -- "e" now as 16 leaf nodes.
> As these trees grow in size, the equality comparison in graph > construction quickly becomes the bottleneck for DSL compilation. > What's worse, the phase transition from tractable to intractable is > very sharp. In one of my DSL programs, I made a seemingly small > change, and compilation time went from milliseconds to > not-in-a-million-years.
> Prior to Haskell, I wrote a few DSLs in OCaml. I didn't have this > problem in OCaml because each "let" expression was mutable, and I > could use the physical equality operator to perform fast comparisons. > Unfortunately, I have grown to love Haskell's type system and its lack > of side effects, and could never go back.
> Is there anything that can be done to dramatically speed up > comparisons, or is there a better approach I can take to extract > common subexpressions? I should point out I have an opportunity to > get Haskell on a real industrial application. But if I can't solve > this problem, I may have to resort to far less eloquent languages. > :-(
> I've been programming with Haskell for a few years and love it. One > of my favorite applications of Haskell is using for domain specific > languages. However, after designing a handful of DSLs, I continue to > hit what appears to be a fundamental hurdle -- or at least I have yet > to find an adequate solution.
> My DSLs invariably define a datatype to capture expressions; something > like this:
> data Expression > = Add Expression Expression > | Sub Expression Expression > | Variable String > | Constant Int > deriving Eq
> Using the datatype Expression, it is easy to mass a collections of > functions to help assemble complex expressions, which leads to very > concise programs in the DSL.
> The problem comes when I want to generate efficient code from an > Expression (ie. to C or some other target language). The method I use > invovles converting the tree of subexpressions into an acyclic graphic > to eliminate common subexpressions. The nodes are then topologically > ordered and assigned an instruction, or statement for each node. For > example:
> let a = Add (Constant 10) (Variable "i1") > b = Sub (Variable "i2") (Constant 2) > c = Add a b
> would compile to a C program that may look like this:
> a = 10 + i1; > b = i2 - 2; > c = a + b;
> The process of converting an expression tree to a graph uses either Eq > or Ord (either derived or a custom instance) to search and build a set > of unique nodes to be ordered for execution. In this case "a", then > "b", then "c". The problem is expressions often have shared, > equivalent subnodes, which dramatically grows the size of the tree. > For example:
> let d = Add c c > e = Add d d -- "e" now as 16 leaf nodes.
> As these trees grow in size, the equality comparison in graph > construction quickly becomes the bottleneck for DSL compilation. > What's worse, the phase transition from tractable to intractable is > very sharp. In one of my DSL programs, I made a seemingly small > change, and compilation time went from milliseconds to > not-in-a-million-years.
> Prior to Haskell, I wrote a few DSLs in OCaml. I didn't have this > problem in OCaml because each "let" expression was mutable, and I > could use the physical equality operator to perform fast comparisons. > Unfortunately, I have grown to love Haskell's type system and its lack > of side effects, and could never go back.
> Is there anything that can be done to dramatically speed up > comparisons, or is there a better approach I can take to extract > common subexpressions? I should point out I have an opportunity to > get Haskell on a real industrial application. But if I can't solve > this problem, I may have to resort to far less eloquent languages. > :-(
> The process of converting an expression tree to a graph uses either Eq > or Ord (either derived or a custom instance) to search and build a set > of unique nodes to be ordered for execution.
in similar situation, i've added hash field to each node, initialized by smart constructor:
data Expression = Add Hash Expression Expression | ... type Hash=Int
add x y = Add (x*y+1234567) x y ..
-- Best regards, Bulat mailto:Bulat.Zigans...@gmail.com
There is some concern, and rightly so, that observable sharing is dangerous, and that your Haskell program will explode if you use it, and perhaps even that anyone who uses it is "dirty" and should be sent to matron's for a good scrubbing! However, when used "safely", it is not a hack, nor even dirty, but a nice, simple solution to an otherwise nasty problem. Below I define what I mean by "safely".
First consider the circumstances under which obserevable sharing is useful. Typically we have some Haskell function "f" that produces a symbolic representation of an expression. For whatever reason, we'd like to *generate a program* that computes the value of this expression, rather that computing it in Haskell. For example, in Lava, we want to generate a VHDL program so that the expression can be computed on, say, an FPGA. In Tom's case, he wants to generate a C program to compute the expression. All perfectly reasonable, and in my opinion, a very powerfull way to use Haskell.
Now recall that referential transparency lets you replace equals with equals without changing the *value produced* by a program. Note that it says nothing about preserving *runtime behaviour*. Sharing, for example, may be lost. So if you do equational reasoning on function "f" (above), and loose some sharing, then you can only expect that the same sharing will also be also lost in the generated program. As long as the generated program computes the same result as it did before, referential transparency will be, overall, preserved; it would only be lost intermediately. This is what I mean by "safe".
Now, there remains the concern that Haskell's semantics does not enforce sharing. A Haskell compiler is free to change the sharing a program at a whim, unknowingly to the programmer who may be relying on it in for an efficient program. However, to my knowledge, it is an unwritten rule of Haskell compilers that sharing *is* preserved, and that they do perform *graph* reduction. Clean, a similar language to Haskell, indeed has a semantics based on graphs. So I don't believe Haskell being non-strict (not necessarily lazy) is a good reason for not using observable sharing. Though I do feel better when I compile without -O. :-)
Finally, when I say "observable sharing", I don't necessarily mean it as defined by Koen Claessen and David Sands. I simply mean the use of unsafePerformIO to detect sharing, whether or not this is done by an "eq" predicate on Refs. (I say this because I think there are simpler ways to detect sharing, though these will probably not have the nice semantic properties of observable sharing.)
On 2/8/08, Matthew Naylor <mfn-haskell-c...@cs.york.ac.uk> wrote:
> it in for an efficient program. However, to my knowledge, it is an > unwritten rule of Haskell compilers that sharing *is* preserved, and > that they do perform *graph* reduction. Clean, a similar language to
I'm not sure that programmers ought to be relying on this rule. Sure, all Haskell compilers I know of preserve sharing and do graph reduction. But conventional wisdom is not the same thing as an unwritten rule. Someday, someone might come along and write a Haskell compiler that isn't based on graph reduction and doesn't preserve sharing at the implementation level (while still preserving the informal semantics of Haskell). A programmer who had written code that failed to compile correctly under this hypothetical compiler would be a very naughty Haskell programmer indeed.
> Haskell, indeed has a semantics based on graphs. So I don't believe
Haskell doesn't have a semantics, graph-based or not... or at least not a formal one, and if not a formal one, I don't know what you mean :-)
Cheers, Tim
-- Tim Chevalier * http://cs.pdx.edu/~tjc * Often in error, never in doubt "There are no sexist decisions to be made. There are antisexist decisions to be made. And they require tremendous energy and self-scrutiny, as well as moral stamina..." -- Samuel R. Delany _______________________________________________ Haskell-Cafe mailing list Haskell-C...@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
On 2/8/08, Emil Axelsson <e...@cs.chalmers.se> wrote:
> I know of a few of ways to express sharing in a pure language:
> 1) "Observable sharing", which, in general, is unsafe. > 2) Using Template Haskell > 3) Matthew Naylor has done some work on "expressible sharing", which has > 4) Use a monad (but I'm sure this is what you're trying to avoid).
Or...
5) Forget embedding the DSL, and write a direct compiler.
In addition to the sharing problem, another shortcoming of Haskell DSLs is they can not fully exploit the benefits of algebraic datatypes. Specifically, pattern matching ADTs can only be used to control the compile-time configuration of the target, it can't be used to describe the target's behavior -- at least for DSLs that generate code that executes outside of Haskell's runtime.
Writing a real compiler would solve both of these problems. Is there any Haskell implementation that has a clean cut-point, from which I can start from a fully type-checked, type-annotated intermediate representation?
And thanks for the link to John's paper describing Hydra's use of Template Haskell. I will definiately consider TH.
> On 2/8/08, Emil Axelsson <e...@cs.chalmers.se> wrote: > > I know of a few of ways to express sharing in a pure language:
> > 1) "Observable sharing", which, in general, is unsafe. > > 2) Using Template Haskell > > 3) Matthew Naylor has done some work on "expressible sharing", which has > > 4) Use a monad (but I'm sure this is what you're trying to avoid).
> Or...
> 5) Forget embedding the DSL, and write a direct compiler.
> In addition to the sharing problem, another shortcoming of Haskell > DSLs is they can not fully exploit the benefits of algebraic > datatypes. Specifically, pattern matching ADTs can only be used to > control the compile-time configuration of the target, it can't be used > to describe the target's behavior -- at least for DSLs that generate > code that executes outside of Haskell's runtime.
> Writing a real compiler would solve both of these problems. Is there > any Haskell implementation that has a clean cut-point, from which I > can start from a fully type-checked, type-annotated intermediate > representation?
Taking the output of GHC's intermediate phase, after optimising leaves you with type checked, optimised, 'Core' -- basically lambda calculus with extras.
It's a good start if you then want to hand-compile that down.
> Finally, when I say "observable sharing", I don't necessarily mean it > as defined by Koen Claessen and David Sands. I simply mean the use of > unsafePerformIO to detect sharing, whether or not this is done by an > "eq" predicate on Refs. (I say this because I think there are simpler > ways to detect sharing, though these will probably not have the nice > semantic properties of observable sharing.)
ghc actually provides a primop for this:
reallyUnsafePtrEquality# :: a -> a -> Int#
Use at your own risk.
Note that you can only check for equality uing that primop. To detect cycles in data structures efficiently, a total order would be better, but in the presence of copying garbage collection that's asking too much.
> Now, there remains the concern that Haskell's semantics does not > enforce sharing. A Haskell compiler is free to change the sharing a > program at a whim, unknowingly to the programmer who may be relying on > it in for an efficient program. However, to my knowledge, it is an > unwritten rule of Haskell compilers that sharing *is* preserved, and > that they do perform *graph* reduction.
That is not true anymore for the threaded runtime of ghc. If two threads demand the same thunk, one of them will usually block, but there is a small window where both threads can start evaluting the expression. To prevent this, you'd have to take a lock or otherwise synchronize the threads upon entering each thunk, which is prohibitively expensive.
> In addition to the sharing problem, another shortcoming of Haskell > DSLs is they can not fully exploit the benefits of algebraic > datatypes. Specifically, pattern matching ADTs can only be used to > control the compile-time configuration of the target, it can't be used > to describe the target's behavior -- at least for DSLs that generate > code that executes outside of Haskell's runtime.
you can embed algebraic data types and pattern matching in Haskell with your own semantics, and retain type inference. It goes something like this:
(nil, (|>)) = datatype (cons0 [] \/ cons2 (:))
map f xs = match xs rules where rules (x, xs) = [ nil --> nil , x |> xs --> f x |> map f xs ]
here, map :: (Term a -> Term b) -> Term [a] -> Term [b].
The main issue is that you have to quantify the free variables in patterns, hence the "rules" function. I don't know if this helps you.
> Writing a real compiler would solve both of these problems. Is there > any Haskell implementation that has a clean cut-point, from which I > can start from a fully type-checked, type-annotated intermediate > representation?
The Yhc.Core library is very simple to use and fairly mature (Neil's been tweeking it for about 3 years now), but it doesn't meet your type-annotated requirement. (Untyped core is still pretty useful, though.)
If you go the real compiler route, would it not make sense to take the DSL as the source language rather than Haskell? Or are the DSL and Haskell quite similar? Or perhaps you are thinking of a two language system, where some code is evaluated at compile time by Haskell, and some is compiled to the target language? If so, maybe you just want domain specific syntax inside a Haskell program, in which case the paper "Why it's nice to be quoted: quasiquoting for haskell" might be relevant (it's actually supported in GHC I think).
Anyway, all very thought provoking!
Matt.
P.S.
Tom Hawkins wrote: > Emil Axelsson wrote: > > I know of a few of ways to express sharing in a pure language:
> > 1) "Observable sharing", which, in general, is unsafe. > > 2) Using Template Haskell > > 3) Matthew Naylor has done some work on "expressible sharing", which has > > 4) Use a monad (but I'm sure this is what you're trying to avoid).
> Or...
> 5) Forget embedding the DSL, and write a direct compiler.
Taking options 2 or 5 just to solve the sharing problem sounds to me like a lot of hard work for little reward. But don't worry, I won't repeat my observable sharing speech. :-) _______________________________________________ Haskell-Cafe mailing list Haskell-C...@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
On Fri, 8 Feb 2008, Tom Hawkins wrote: > 5) Forget embedding the DSL, and write a direct compiler.
> In addition to the sharing problem, another shortcoming of Haskell > DSLs is they can not fully exploit the benefits of algebraic > datatypes. Specifically, pattern matching ADTs can only be used to > control the compile-time configuration of the target, it can't be used > to describe the target's behavior -- at least for DSLs that generate > code that executes outside of Haskell's runtime.
Also in a pure Haskell library you will try to avoid direct access to constructors, because the internal data structures might change. Better are functions that access the internal data of a type, like 'maybe' and 'either' for 'Maybe' and 'Either', respectively. _______________________________________________ Haskell-Cafe mailing list Haskell-C...@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
On Feb 9, 2008 12:28 AM, Tom Hawkins <tomahawk...@gmail.com> wrote:
> 5) Forget embedding the DSL, and write a direct compiler.
> In addition to the sharing problem, another shortcoming of Haskell > DSLs is they can not fully exploit the benefits of algebraic > datatypes. Specifically, pattern matching ADTs can only be used to > control the compile-time configuration of the target, it can't be used > to describe the target's behavior -- at least for DSLs that generate > code that executes outside of Haskell's runtime.
Only partly true. Probably you are not aware of them (I myself learned about its existence a few days ago) but pattern quasiquoting (available in GHC's HEAD) can be used for that.
> Writing a real compiler would solve both of these problems. Is there > any Haskell implementation that has a clean cut-point, from which I > can start from a fully type-checked, type-annotated intermediate > representation?
If you have to write a compiler why not define a language which fits better with the semantics of the embedded language instead of using plain Haskell?
The approach you propose has the disadvantages of both the embedded and the standalone languages. On one hand you have to stick with the syntax of the host language which may not fit with your exact semantical requirements and, on the other hand, you cannot take advantage of all the existing machinery around the host language (you have to code your own compiler).
Furthermore, the first citizen status of functions make it impossible (or really difficult at least) to compile EDSL descriptions avoiding runtime and simply applying a static analysis approach (using Core or plain Haskell as input).
> And thanks for the link to John's paper describing Hydra's use of > Template Haskell. I will definiately consider TH.
Well, TH would be one of those static analysis approaches. Actually, O'Donell's implementation (which uses an outdated version of Template Haskell) only works with a small Haskell subset. So using TH you'd probably be changing the host language anyhow.
Furthermore, the TH approach consists in adding node labels by preprocessing the EDSL description, making sharing observable. That makes the original EDSL description inpure. The only difference is that side effects are added by preprocessing instead of using runtime unsafe functions.
----
Some pointers covering the topic:
[1] and [2] summarize what are the alternatives to observe sharing in Haskell whereas [3] compares the embedded approach vs standalone approach and advocates the last one.
On Fri, 8 Feb 2008, Matthew Naylor wrote: > Now recall that referential transparency lets you replace equals with > equals without changing the *value produced* by a program. Note that > it says nothing about preserving *runtime behaviour*. Sharing, for > example, may be lost. So if you do equational reasoning on function > "f" (above), and loose some sharing, then you can only expect that the > same sharing will also be also lost in the generated program. As long > as the generated program computes the same result as it did before, > referential transparency will be, overall, preserved; it would only be > lost intermediately. This is what I mean by "safe".
I think there are degrees of observability. If a Haskell library immediately talks to a C library and shares resources generated by the library, then this sharing can be hardly observed and the method is somehow safe. If you generate a C program with Haskell and write it to a disk it can be easily observed and people might rely on a particular resulting C program. If the C program is piped to a C compiler which is immediately run, then sharing can be hardly observed. Even within Haskell sharing is somehow observable, the Haskell program could observe the free memory of the machine and thus it can see a difference between sharing and duplicated objects. _______________________________________________ Haskell-Cafe mailing list Haskell-C...@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
On Feb 9, 2008 1:07 PM, Matthew Naylor <mfn-haskell-c...@cs.york.ac.uk> wrote:
> If you go the real compiler route, would it not make sense to take the > DSL as the source language rather than Haskell? Or are the DSL and > Haskell quite similar?
The two are nearly identical. In fact the only significant difference between the languages is the semantics of top level monad; it wouldn't be IO, but something else. With the syntax the same, it could leverage much of Haskell's standard library.
> Or perhaps you are thinking of a two language > system, where some code is evaluated at compile time by Haskell, and > some is compiled to the target language?
Not necessarily in the same compilation flow, but I can think of several scenarios where it would be advantageous for code written in this other language to be pulled into a conventional Haskell program.
> Taking options 2 or 5 just to solve the sharing problem sounds to me > like a lot of hard work for little reward. But don't worry, I won't > repeat my observable sharing speech. :-)
So is the general strategy with observable sharing to use unsafePerformIO with Data.Unique to label expressions at construction? Ahh...clever! I did not think of this. Of course, now that you have me reading up on Yhc.Core, option #5 is looking considerably more fun.
> So is the general strategy with observable sharing to use > unsafePerformIO with Data.Unique to label expressions at > construction?
something like that, yes. Basically, you just need:
{-# NOINLINE ref #-} ref x = unsafePerformIO (newIORef x)
and you can write expressions like
ref False == ref False
and
let x = ref False in x == x
However, while referential equality is enough for sharing detection, I *suspect* it's simpler to use the fact that refs are IORefs and you can read and write them (in the IO monad). So a very simple Lava might look like
module Lava (Bit,Netlist,low,high,nand2,netlist) where
import Data.IORef import System.IO.Unsafe
{-# NOINLINE ref #-} ref x = unsafePerformIO (newIORef x)
type Ref = IORef (Maybe Int)
data Bit = Gate String Ref [Bit]
type Netlist = [(String, Int, [Int])] -- gate, output, inputs
netlist :: Bit -> IO Netlist netlist x = do i <- newIORef (0 :: Int) ; f i x where f i (Gate str r xs) = do val <- readIORef r num <- readIORef i case val of Nothing -> do writeIORef r (Just num) writeIORef i (num+1) rest <- mapM (f i) xs let is = map ((\(g,o,is) -> o) . head) rest return ((str,num,is):concat rest) Just j -> return [("indirection",j,[])] -- explicit sharing!
Indirections can be filtered out at the end, they don't actually give the netlist any information.
> Of course, now that you have me reading up on Yhc.Core, option #5 is > looking considerably more fun.
Yeah, I think Yhc.Core is pretty nifty too. Thank Neil!
Henning Thielemann <lemm...@henning-thielemann.de> wrote in article <Pine.GSO.4.56.0802080908310.12...@haydn.informatik.uni-halle.de> in gmane.comp.lang.haskell.cafe:
> It seems to become a FAQ. I think all DSLs suffer from the same problems: > sharing and recursion. I've used wrappers for CSound, SuperCollider, > MetaPost, they all have these problems.
What do you mean by the "recursion" problem?
Sometimes (or perhaps even often), sharing in an EDSL can be expressed in two ways. First, to reuse a -value- in the embedded language, you could introduce a "let" construct in the embedded language.
let_ expr body = body expr
Second, to reuse an -expression- in the embedded language, if your interpreter is compositional (here by "interpreter" I include a compiler, and by "compositional" I mean a fold), then you can represent an embedded expression simply as its interpretation.
add x y = x + y let expr = add x y in add expr expr
I've been working on another code-generating graphics compiler, generating GPU code. As always, I run into the problem of efficient common subexpression elimination. In Pan, Vertigo & Pajama, I used lazy memoization, using stable pointers and weak references, to avoid the worst-case-exponential behavior you mention below. I'm now using a bottom-up CSE method that's slower and more complicated than I'm going for.
On Thu, Feb 7, 2008 at 11:33 PM, Tom Hawkins <tomahawk...@gmail.com> wrote: > I've been programming with Haskell for a few years and love it. One > of my favorite applications of Haskell is using for domain specific > languages. However, after designing a handful of DSLs, I continue to > hit what appears to be a fundamental hurdle -- or at least I have yet > to find an adequate solution.
> My DSLs invariably define a datatype to capture expressions; something > like this:
> data Expression > = Add Expression Expression > | Sub Expression Expression > | Variable String > | Constant Int > deriving Eq
> Using the datatype Expression, it is easy to mass a collections of > functions to help assemble complex expressions, which leads to very > concise programs in the DSL.
> The problem comes when I want to generate efficient code from an > Expression (ie. to C or some other target language). The method I use > invovles converting the tree of subexpressions into an acyclic graphic > to eliminate common subexpressions. The nodes are then topologically > ordered and assigned an instruction, or statement for each node. For > example:
> let a = Add (Constant 10) (Variable "i1") > b = Sub (Variable "i2") (Constant 2) > c = Add a b
> would compile to a C program that may look like this:
> a = 10 + i1; > b = i2 - 2; > c = a + b;
> The process of converting an expression tree to a graph uses either Eq > or Ord (either derived or a custom instance) to search and build a set > of unique nodes to be ordered for execution. In this case "a", then > "b", then "c". The problem is expressions often have shared, > equivalent subnodes, which dramatically grows the size of the tree. > For example:
> let d = Add c c > e = Add d d -- "e" now as 16 leaf nodes.
> As these trees grow in size, the equality comparison in graph > construction quickly becomes the bottleneck for DSL compilation. > What's worse, the phase transition from tractable to intractable is > very sharp. In one of my DSL programs, I made a seemingly small > change, and compilation time went from milliseconds to > not-in-a-million-years.
> Prior to Haskell, I wrote a few DSLs in OCaml. I didn't have this > problem in OCaml because each "let" expression was mutable, and I > could use the physical equality operator to perform fast comparisons. > Unfortunately, I have grown to love Haskell's type system and its lack > of side effects, and could never go back.
> Is there anything that can be done to dramatically speed up > comparisons, or is there a better approach I can take to extract > common subexpressions? I should point out I have an opportunity to > get Haskell on a real industrial application. But if I can't solve > this problem, I may have to resort to far less eloquent languages. > :-(
> I've been working on another code-generating graphics compiler, generating > GPU code. As always, I run into the problem of efficient common > subexpression elimination. In Pan, Vertigo & Pajama, I used lazy > memoization, using stable pointers and weak references, to avoid the > worst-case-exponential behavior you mention below. I'm now using a > bottom-up CSE method that's slower and more complicated than I'm going for.
> What's your latest wisdom about CSE in DSELs?
> Thanks, - Conal
One common trick that Tom didn't seem to mention in the 2008-02-07T23:33 post is hash cons'ing.
Given a perfect hash function, traverse the term bottom-up storing each (hash,subterm) pair in a memo table and replacing the subterm by its hash. Once that's done, equality checks are trivial, and the memotable can be converted to SSA rather easily.
This works best if you amortize the memoization by doing it with smart constructors, so that you don't need to worry about the exponential duplication of work for expressions with DAGy structure sharing in the Haskell. Since it's stateful, that means the smart constructors may need to be in an appropriate monad/applicative for passing the memo table around (some hash functions may not need to store the table explicitly).
Maybe this is the too-slow too-complex solution you're using already?
On Tue, May 26, 2009 at 6:49 PM, Conal Elliott <co...@conal.net> wrote: > Hi Tom,
> I've been working on another code-generating graphics compiler, generating > GPU code. As always, I run into the problem of efficient common > subexpression elimination. In Pan, Vertigo & Pajama, I used lazy > memoization, using stable pointers and weak references, to avoid the > worst-case-exponential behavior you mention below. I'm now using a > bottom-up CSE method that's slower and more complicated than I'm going for.
> What's your latest wisdom about CSE in DSELs?
I wasn't able to find a solution that offered both performance and elegance, so I changed the fundamental operation of the DSL (in this case, atom). When atom was still a hardware description language, the compiler would combine several user defined expressions together resulting in very wide and deep expression trees, resulting in the same problem you are observing. But when I switch the target of atom from HDL to C, the compiler no longer needed to perform the same expression expansion. And since the user defined expressions are generally shallow -- at least in the case of my applications -- atom is able to get away with exhaustive equality comparison (deriving Eq).
Let-expression in the EDSL indeed solves the sharing problem, but only partially. Recursion appears when you have a leaf node pointing back to the root node or another branch and forming a cyclic graph in the data structure. It is often desirable to recover cyclic sharing when showing/reading/interpretating EDSL programs.
One possible solution is to further introduce a fixed point data constructor, a Rec or even LetRec to explicitly capture cycles. But then you still incur much overheads interpreting them, and syntax wise it just gets more and more complicated to the point that turning the EDSL into a DSL (or even a preprocessor with your own lexer and parser) becomes more attractive.
Another alternative is to express the EDSL as Monad/MonadFix, or Arrows/ArrowLoop. There are still interpretive overheads, but at the very least they could help with the syntax.
The tagless paper is really nice, but I doubt it offers solutions to the (cyclic) sharing problem.
On 2/13/08, Chung-chieh Shan <ccs...@post.harvard.edu> wrote:
> Henning Thielemann <lemm...@henning-thielemann.de> wrote in article > <Pine.GSO.4.56.0802080908310.12...@haydn.informatik.uni-halle.de> in > gmane.comp.lang.haskell.cafe: >> It seems to become a FAQ. I think all DSLs suffer from the same problems: >> sharing and recursion. I've used wrappers for CSound, SuperCollider, >> MetaPost, they all have these problems.
> What do you mean by the "recursion" problem?
> Sometimes (or perhaps even often), sharing in an EDSL can be expressed > in two ways. First, to reuse a -value- in the embedded language, you > could introduce a "let" construct in the embedded language.
> let_ expr body = body expr
> Second, to reuse an -expression- in the embedded language, if your > interpreter is compositional (here by "interpreter" I include a > compiler, and by "compositional" I mean a fold), then you can represent > an embedded expression simply as its interpretation.
> add x y = x + y > let expr = add x y in add expr expr
BTW, I doubt the (cyclic) sharing problem relates that much to purity, because in an impure language (or the unsafe observable sharing), you still have to remember whether something has been traversed or not and in the worst case accumulates everything that's been traversed so far before releasing all of them in the end. If you remember the history by mutating some states, e.g., using a dirty tag, you lose the ability to do simultaneous traversals.
Adding a simple indirect reference only to the places where sharing is needed (and thus making it explicit) could alleviate this problem. But this solution exists in both pure and impure languages.
> I've been working on another code-generating graphics compiler, > generating GPU code. As always, I run into the problem of efficient > common subexpression elimination. In Pan, Vertigo & Pajama, I used > lazy memoization, using stable pointers and weak references, to > avoid the worst-case-exponential behavior you mention below. I'm > now using a bottom-up CSE method that's slower and more complicated > than I'm going for.
What do you mean with `exponential behavior'? Exponential related to what?
For my FRP EDSL to JavaScript (toy) compiler[1] I've been implementing CSE as well. I traverses the expression tree recursively and creates an small intermediate language containing id's (pointers) to expressions instead of real sub-expressions.
Maybe (probably) I am very naive, but I think this trick takes time linear to the amount of sub-expressions in my script. When using a trie instead of a binary tree for the comparisons there should be no more character (or atomic expression) comparisons that the amount of characters in the script.
So the problem seems not to be CSE algorithm, but the fact that EDSL itself tends to blow up because it is hosted in Haskell. Like Tom's example:
> let d = Add c c > e = Add d d -- "e" now as 16 leaf nodes.
But again, I might be missing some important point here.
> What's your latest wisdom about CSE in DSELs?
> Thanks, - Conal
> On Thu, Feb 7, 2008 at 11:33 PM, Tom Hawkins <tomahawk...@gmail.com> > wrote: >> ...
Sebastiaan Visser wrote: > On May 27, 2009, at 1:49 AM, Conal Elliott wrote: >> Hi Tom,
>> I've been working on another code-generating graphics compiler, >> generating GPU code. As always, I run into the problem of efficient >> common subexpression elimination. In Pan, Vertigo & Pajama, I used >> lazy memoization, using stable pointers and weak references, to avoid >> the worst-case-exponential behavior you mention below. I'm now using >> a bottom-up CSE method that's slower and more complicated than I'm >> going for.
> What do you mean with `exponential behavior'? Exponential related to > what?
> For my FRP EDSL to JavaScript (toy) compiler[1] I've been > implementing CSE as well. I traverses the expression tree recursively > and creates an small intermediate language containing id's (pointers) > to expressions instead of real sub-expressions.
> Maybe (probably) I am very naive, but I think this trick takes time > linear to the amount of sub-expressions in my script. When using a > trie instead of a binary tree for the comparisons there should be no > more character (or atomic expression) comparisons that the amount of > characters in the script.
> So the problem seems not to be CSE algorithm, but the fact that EDSL > itself tends to blow up because it is hosted in Haskell. Like Tom's > example:
> > let d = Add c c > > e = Add d d -- "e" now as 16 leaf nodes.
> But again, I might be missing some important point here.
That's exactly right. But it's pretty inconvenient to have your expression tree to blow up exponentially in relation to the code the user actually wrote! You can indeed construct an intermediate language that collapses this blowup, but the pass to create it must take exponential time if written completely purely, since it has to visit everything at least once.
In my experience [1], observable sharing using GHC's stable names is a pretty effective solution to this problem.
=========================================================================== ==== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html =========================================================================== ====