unixFilter chaining


Ethan Schoonover

Mar 30, 2011, 1:08:54 AM
to hak...@googlegroups.com
I realize that I'm having trouble with the following because of my noob status with Haskell. Regardless, I've tried many (reasonable) permutations to get this working and now turn to the Hakyll meta-brain for assistance.

I'm pulling in a standalone GitHub repo as a subdir of my main website repo (via git subtree, but that's not important). The beauty of this is that I can use the GitHub-friendly README.md as my index.html and avoid duplicating effort. Of course I have some images in the README.md (at https://github.com/altercation/solarized ) and I could just leave the paths pointing to GitHub images even when hosting off my own site, but this seems like a bad idea. Deep linking isn't nice, etc.

So I want to pull the project repo down to live as a subdir of my webroot (done, works fine) and process it into a full webpage (tested, fine). Ideally, then, I will process the URLs in the README to switch from absolute https://github... URLs to either local or absolute (to my webserver) paths. This is trivial with sed, so I built a unixFilter compiler:

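    unGitHub :: Compiler String String
    -- Note the doubled backslash in \\1: a bare \1 inside a Haskell string
    -- literal is a numeric character escape, not a sed back-reference.
    unGitHub = unixFilter "sed" ["s+https://github.com/[^/]*/\\([^:/]*\\)/raw/master+/\\1+g"]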


This simply changes urls in the README.md like this:


to this:

[![haskell light](/solarized/img/solarized-screen-haskell-light-th.png)](/solarized/img/solarized-screen-haskell-light.png)
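
For readers who want to see what the substitution does without shelling out to sed, here is a minimal pure-Haskell sketch of the same rewrite. It is not part of the original post: the name `unGitHubPure` and the sample URL are made up for illustration, and it uses only `base`, not Hakyll.

```haskell
import Data.List (stripPrefix)

-- | Hypothetical pure equivalent of the sed expression above:
--   "https://github.com/<user>/<repo>/raw/master" becomes "/<repo>".
unGitHubPure :: String -> String
unGitHubPure [] = []
unGitHubPure s@(c:cs) =
    case stripPrefix "https://github.com/" s >>= rewrite of
        Just rewritten -> rewritten
        Nothing        -> c : unGitHubPure cs
  where
    rewrite rest = do
        -- <user>: everything up to the next '/', like [^/]* in the sed pattern
        let (_user, afterUser) = break (`elem` "/\n") rest
        afterSlash <- stripPrefix "/" afterUser
        -- <repo>: everything up to ':' or '/', like \([^:/]*\)
        let (repo, afterRepo) = break (`elem` ":/\n") afterSlash
        remainder <- stripPrefix "/raw/master" afterRepo
        -- Emit "/<repo>" and keep scanning the rest of the input
        return ('/' : repo ++ unGitHubPure remainder)

main :: IO ()
main = putStrLn (unGitHubPure
    "![shot](https://github.com/someuser/somerepo/raw/master/img/shot.png)")
```

Run on the made-up input above, this prints `![shot](/somerepo/img/shot.png)`. Text that does not match the pattern passes through unchanged.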


This unixFilter works fine if nothing else is happening. However I want to chain it together into the standard pageCompiler >>> applyTemplate compiler chain. This is where I'm stuck. Rather than stumble through even more failed implementations, I seek wisdom, oh Hakyll elders.

For reference, here's where I'm using (breaking) it:

    -- README index files are special - no subdir, and they
    -- get some special processing to strip github paths

    forM_   [ "pages-listed/*/README.md"
            , "pages-unlisted/*/README.md"] $ \page -> do
        route   page $ setRoot `composeRoutes` setExtension "html"
        compile page $ unGitHub >>> pageCompiler, etc...

(Yes, I know that unGitHub needs getResourceString >>> in front of it above. I'm not even sure this is where a unixFilter compiler belongs, though.)

Help :)

-Ethan



Jasper Van der Jeugt

Mar 30, 2011, 2:44:13 AM
to hak...@googlegroups.com
Hello Ethan,

`pageCompiler` is actually a very simple compiler, and it is supposed
to be a "sane default". It is defined more or less like this:

    pageCompiler = readPageCompiler >>> addDefaultFields >>>
        arr applySelf >>> pageRenderPandoc

This simply:

- reads the page;
- adds some default fields (e.g. $url$, $path$);
- fills in $url$, $path$, etc. in its own body;
- renders using Pandoc.

If you want to have custom transformations in between, you will have
to specify your custom `pageCompiler`. Basically, you want to replace
the `readPageCompiler` part, which has the type `Compiler Resource
(Page String)` and is defined as

    readPageCompiler = getResourceString >>> arr readPage

So your definition should probably be something like:

    myPageCompiler :: Compiler Resource (Page String)
    myPageCompiler = getResourceString >>> unGitHub >>>
        arr readPage >>> addDefaultFields >>> arr applySelf >>>
        pageRenderPandoc
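
To see why the filter has to go between `getResourceString` and `arr readPage`, it may help to model the chain with plain functions, which also form an arrow and so compose with `>>>`. The sketch below is only a toy under stated assumptions: `Resource`, `Page`, `getText`, `stripHost`, `readToyPage`, `renderToy` and `toyPageCompiler` are hypothetical stand-ins, not Hakyll's real types or functions, and it depends only on `base`.

```haskell
import Control.Arrow (arr, (>>>))
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Toy stand-ins; none of these are Hakyll's real types or functions.
type Resource = String
newtype Page = Page String deriving (Eq, Show)

getText :: Resource -> String          -- plays the role of getResourceString
getText = id

stripHost :: String -> String          -- plays the role of unGitHub
stripHost s = fromMaybe s (stripPrefix "https://example.invalid" s)

readToyPage :: String -> Page          -- plays the role of readPage
readToyPage = Page

renderToy :: Page -> Page              -- plays the role of pageRenderPandoc
renderToy (Page body) = Page ("<p>" ++ body ++ "</p>")

-- The String -> String step only type-checks while the value is still a
-- String, that is, before it has been turned into a Page.
toyPageCompiler :: Resource -> Page
toyPageCompiler = getText >>> stripHost >>> arr readToyPage >>> renderToy

main :: IO ()
main = print (toyPageCompiler "https://example.invalid/img/a.png")
```

Moving `stripHost` after `arr readToyPage` in this toy makes the compiler reject the chain, which mirrors the reason `unGitHub >>> pageCompiler` cannot work as written in the question.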

Hope this helps,
Cheers,
Jasper

Ethan Schoonover

Mar 30, 2011, 3:59:09 PM
to hak...@googlegroups.com
Jasper, that was a huge help and got me through this issue beautifully.

Hakyll is just brilliant, lovely work. Thanks.

Best regards,
Ethan