This is a bit beyond my skill level too, but here's how I would consider attempting it. The first bit is based on code I use to transform LaTeX/TikZ in math environments to SVG files, which was inspired by this post. It's not really all that complicated, but it isn't very clean, it isn't very fast, and it has some limitations.
I'll try to implement magicalExtraction. First we want a data type to hold our reverse citations.
```
data ReverseCitation = ReverseCitation
  { source      :: Identifier -- ^ Identifier of the page hosting the link
  , destination :: String     -- ^ Destination of the link
  } deriving (Show)
```
I don't know how to use the Provider type in Hakyll, which is required as an argument for resourceBody. Instead I just parse the raw files sitting on disk. Once we have the pandoc AST, we can easily query it for a list of the links on the page. It should look roughly like this.
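```
-- Needs, besides Hakyll: Control.Monad.Extra (concatMapM) from the "extra" package,
-- qualified Data.Text.IO as TI, Text.Pandoc (readMarkdown, runIOorExplode),
-- Text.Pandoc.Definition (Inline (..)) and Text.Pandoc.Walk (query).
-- pandocReaderOptions is assumed to be defined elsewhere in your site code.

-- | This function needs to happen outside of the Compiler monad
getReverseCitationsR :: Pattern -> Rules [ReverseCitation]
getReverseCitationsR p = do
  identifiers <- getMatches p
  preprocess $ concatMapM getReverseCitationsIO identifiers

-- | Read one source file from disk, parse it, and collect its internal links
getReverseCitationsIO :: Identifier -> IO [ReverseCitation]
getReverseCitationsIO identifier = do
  fileContents <- TI.readFile $ toFilePath identifier
  -- Copy readPandocWith for other types or actual error handling
  ast <- runIOorExplode $ readMarkdown pandocReaderOptions fileContents
  -- The partially applied function must be parenthesised for query
  return $ query (getReverseCitationsAST identifier) ast

-- | Turn a single link into a reverse citation; every other inline yields nothing.
-- On pandoc-types >= 1.20 the url is Text, so convert it with T.unpack first.
getReverseCitationsAST :: Identifier -> Inline -> [ReverseCitation]
getReverseCitationsAST identifier (Link _ _ (url, _))
  | isInternal url = [ReverseCitation identifier url]
  | otherwise = []
getReverseCitationsAST _ _ = []
```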
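To see the `query` step in isolation, here is a minimal sketch that runs it over a small hand-built AST instead of a parsed file. It assumes pandoc-types 1.20 or later, where link targets are `Text`; `linkTargets` and `sampleDoc` are names I made up for the illustration, not part of Hakyll or pandoc.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Pandoc.Definition (Block (..), Inline (..), Pandoc (..), nullAttr, nullMeta)
import Text.Pandoc.Walk (query)

-- | Collect the target URL of every link in the document, in document order.
-- (Hypothetical helper, named for this example only.)
linkTargets :: Pandoc -> [Text]
linkTargets = query go
  where
    go :: Inline -> [Text]
    go (Link _ _ (url, _)) = [url] -- a link contributes its URL
    go _ = []                      -- every other inline contributes nothing

-- | A made-up two-paragraph document with one internal and one external link.
sampleDoc :: Pandoc
sampleDoc =
  Pandoc
    nullMeta
    [ Para [Str "See", Space, Link nullAttr [Str "about"] ("/about.html", "")]
    , Para [Link nullAttr [Str "elsewhere"] ("https://example.com", "")]
    ]
```

Loading this in GHCi and evaluating `linkTargets sampleDoc` should give both targets. `query` visits every `Inline` in the tree and concatenates the per-node results, which is why the helper only has to handle one node at a time.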
Here's an iffy way to determine whether a link is internal. It might actually be useful to collect external links, but let's toss them for now.
```
isInternal :: String -> Bool
isInternal url
  | "." `isPrefixOf` url = True
  | "#" `isPrefixOf` url = False -- Throw these out for now, they're basically self-referencing.
  | "//" `isPrefixOf` url = False -- Protocol relative links.
  | "/" `isPrefixOf` url = True
  | otherwise = False
```
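A quick way to check what this heuristic keeps and drops is to run it over a handful of URLs. The sketch below repeats `isInternal` so it stands alone; the `samples` list and the `internalSamples` name are invented for the illustration.

```haskell
import Data.List (isPrefixOf)

-- | Same classification as above, repeated so the snippet is self-contained.
isInternal :: String -> Bool
isInternal url
  | "." `isPrefixOf` url = True   -- "./page.html", "../index.html"
  | "#" `isPrefixOf` url = False  -- same-page anchors
  | "//" `isPrefixOf` url = False -- protocol-relative links; must be tested before "/"
  | "/" `isPrefixOf` url = True   -- site-absolute paths
  | otherwise = False

-- | Hypothetical sample URLs covering each guard.
samples :: [String]
samples =
  [ "./notes.html"
  , "../index.html"
  , "/posts/a.html"
  , "#top"
  , "//cdn.example.com/x.js"
  , "https://example.com/"
  , "posts/a.html"
  ]

-- | The subset of the samples that the heuristic treats as internal.
internalSamples :: [String]
internalSamples = filter isInternal samples
```

Here `internalSamples` comes out as `["./notes.html", "../index.html", "/posts/a.html"]`. Note that a bare relative link such as `posts/a.html` falls through to the `otherwise` case and is discarded, which is one of the ways the check is iffy.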
Now we can get a big list of reverse citations with the magicalExtraction function.
```
main :: IO ()