[Haskell-cafe] how to organize a parallel parser/compiler

Maurizio Vitale

May 28, 2015, 12:50:58 PM

to Haskell Cafe

hello haskell-cafe,

suppose you have a compiler pipeline roughly similar to this one (from Write You a Haskell):

modl :: FilePath -> L.Text -> CompilerM ()
modl fname
  =   parseP fname
  >=> dataP
  >=> groupP
  >=> renameP
  >=> desugarP
  >=> inferP
  >=> evalP

Now suppose the language lets you compile multiple files, and compilation must behave _as if_ the files were concatenated in the order given on the command line (there might be side effects that affect the parsing of subsequent files). Still, in the average case there are no dependencies and the files could be compiled in parallel. Some of the dependencies can also be handled at the AST level, when one can prove they don't affect parsing but only aspects of the AST that can be patched.

What are good strategies for dealing with this, i.e. for rerunning some of the parseP functions when needed (and, in the simplest solution, the complete pipeline after them)?

The strategy I have in mind now doesn't mix well with the above pipeline, so I'd like to see whether there are alternative solutions.

Basically, what I have in mind is:

- each parse function gets a file, something to watch for the result of the previous parse (let's say an MVar, or some variation of speculation, not sure yet) and an input environment (the same for everybody). It produces an AST, plus what it is sensitive to (i.e. what would have affected the parsing) and what it generates that the next parser must be sensitive to.

- before producing the 'what it generates' part, it must be sure to have completed a valid parse, so it waits on the input MVar to learn that the previous files have been parsed properly and that whatever side effects they caused wouldn't affect its own parsing. This wait is done on a thread that doesn't count towards the parallelism limit, since it is presumably cheap and doesn't really serialize the parsing.
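To make the two points above concrete, here is a rough sketch of the chaining using only base. Everything in it is made up for illustration: the Parsed record, the Parser type, the field names and compileAll are hypothetical, "what it is sensitive to" is reduced to a list of names, and a conflicting file is simply reparsed with the extra definitions added to the environment (that reparse policy is an assumption, not something decided above). It also forks one thread per file and does not implement any parallelism limit.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, newMVar, putMVar, readMVar)
import Control.Monad (foldM)
import Data.List (intersect)

type Name = String

-- Hypothetical result of parsing one file.
data Parsed a = Parsed
  { tree        :: a       -- the AST
  , sensitiveTo :: [Name]  -- names that would have changed this parse
  , defines     :: [Name]  -- names later files must be sensitive to
  }

-- Hypothetical parser: environment of known names, then the source text.
type Parser a = [Name] -> String -> IO (Parsed a)

-- Parse all sources speculatively in parallel; results come back in
-- command-line order, as if the files had been parsed sequentially.
compileAll :: [Name] -> Parser a -> [String] -> IO [Parsed a]
compileAll env0 parser srcs = do
  start <- newMVar []  -- nothing is defined before the first file
  (_, resultVars) <- foldM step (start, []) srcs
  mapM readMVar (reverse resultVars)
  where
    step (prevVar, acc) src = do
      outVar <- newEmptyMVar  -- cumulative definitions up to this file
      resVar <- newEmptyMVar  -- validated parse of this file
      _ <- forkIO $ do
        speculative <- parser env0 src  -- optimistic parse, common environment
        prevDefs <- readMVar prevVar    -- block until earlier files are validated
        final <-
          if null (sensitiveTo speculative `intersect` prevDefs)
            then return speculative               -- speculation was valid
            else parser (env0 ++ prevDefs) src    -- conflict: reparse
        putMVar outVar (prevDefs ++ defines final)
        putMVar resVar final
      return (outVar, resVar : acc)
```

The wait here happens in the same lightweight thread as the parse, which is cheap because a GHC thread blocked on an MVar uses no CPU; with a bounded worker pool the wait would have to move off the worker, as described above. The sketch also ignores exceptions: if a parser throws, the threads for later files stay blocked on its MVar.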

[A similar question related to the above pipeline: how do I fit in any kind of global transformation pass?]

Are there any libraries or other ideas for combining multiple pipelines where we want to run them with maximum parallelism, but the outcome of (even a partial step of) one pipeline can invalidate others and require rerunning them?

Thanks for any ideas,

Maurizio
