Lens.Converter has gone local

Michael Aufreiter

Sep 7, 2013, 7:21:40 PM
to elife...@googlegroups.com
Great news regarding the Lens article conversion process!

What started as an experiment turned out to be a really flexible approach that allows Lens to display any NLM XML file. In about a week and a half we managed to port most of Refract's functionality into a client-side implementation that uses the browser-native XML parser. The code is available on GitHub: https://github.com/elifesciences/lens-converter

This has a number of implications for the whole project:

- Lens now understands NLM directly, so there is no need for the conversion pipeline we had before. Essentially we no longer need to host a converter, seed it with a list of XML files and host the resulting JSON files somewhere. All of that now happens on the fly on the client side. Just give Lens a URL to an NLM XML file.
- The conversion is a matter of milliseconds. I don't think there's a need for precompiled / hosted JSON files anymore.
- We only need to host Lens and the XML files. No precompiled JSON files anymore. This will save us a lot of time maintaining infrastructure.
- For development this means we can debug more easily using the browser's dev tools (we didn't have that luxury with our Node-based converter).
- Drag and drop any NLM file into Lens: this just works out of the box with the new browser-native converter. No server roundtrip needed.
- Oliver, Ivan, Rebecca and I have implemented a generic version of the converter (based on the NLM spec) which can be tweaked by publisher-specific configurations (e.g. for resolving figure URLs). See: https://github.com/elifesciences/lens-converter/blob/0.1.x/src/configurations/elife.js or https://github.com/elifesciences/lens-converter/blob/0.1.x/src/configurations/landes.js
- Whenever we make a change to the converter, we can just redeploy Lens without touching anything else. The improved Lens will just work on the same NLM files.
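
To illustrate the "generic converter plus publisher-specific configuration" idea from the list above, here is a minimal sketch. It is not the actual lens-converter API: `defaultConfiguration`, `examplePublisherConfiguration`, `createConverter`, `resolveFigureUrl` and the example.org URL layout are all hypothetical names invented for this example. The real configurations are in the files linked above.

```javascript
// Hypothetical sketch only: none of these names come from lens-converter.

// Generic defaults that should work for any NLM file.
const defaultConfiguration = {
  // By default, use the href found in the XML unchanged.
  resolveFigureUrl(href, article) {
    return href;
  }
};

// A publisher configuration overrides only what differs from the defaults.
// The URL layout below is invented for illustration.
const examplePublisherConfiguration = {
  resolveFigureUrl(href, article) {
    return "https://cdn.example.org/articles/" + article.id + "/" + href + ".jpg";
  }
};

// Layer the publisher overrides on top of the generic defaults.
function createConverter(overrides = {}) {
  const config = Object.assign({}, defaultConfiguration, overrides);
  return {
    // Turn an already-parsed figure description into a JSON node.
    convertFigure(figure, article) {
      return {
        type: "figure",
        id: figure.id,
        url: config.resolveFigureUrl(figure.href, article)
      };
    }
  };
}
```

With this shape, `createConverter()` gives the generic behaviour and `createConverter(examplePublisherConfiguration)` swaps in the publisher's figure URL scheme, without the conversion code itself changing.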

While I was working on the converter last week I had a bunch of realisations that I'd like to share:

- I had hoped that we could have a mapping back from our JSON format to NLM in order to implement basic editing in the future (e.g. for paragraphs). However, I no longer think this is a good idea. NLM has many presentation-specific bits in it, and the output of our converter can't be mapped back to NLM without losing information. I think this is fine. We should not try to do too many things at once.
- I'm not sure anymore if it's a good idea to integrate Lens into the peer review process. I think this should be part of a separate project which lives outside the scope of Lens as a viewer for NLM.
- Stats tracking: this also adds some overhead. We'll prepare the data to support click tracking by storing the original element ids in our document. However, I don't think this should be part of 0.2.x, as we have so many other priorities. I've discussed this with Ivan already.
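
As a rough idea of what "storing the original element ids" could look like, here is a small sketch. The `source_id` field, the generated node id scheme and both function names are assumptions made for illustration. They are not the actual Lens document format.

```javascript
// Hypothetical sketch only: field and function names are assumptions.

// Returns a function that creates converted nodes with fresh ids
// while remembering the id of the NLM element they came from.
function createNodeFactory() {
  let counter = 0;
  return function createNode(type, sourceElement) {
    counter += 1;
    return {
      id: type + "_" + counter,
      type: type,
      // Keep the id from the NLM element, or null if it had none.
      source_id: sourceElement.id || null
    };
  };
}

// Build a lookup from original NLM ids to converted node ids,
// which a click tracker could later use to report against the XML.
function indexBySourceId(nodes) {
  const index = {};
  for (const node of nodes) {
    if (node.source_id) index[node.source_id] = node.id;
  }
  return index;
}
```

The point is only that the original id travels with each converted node, so tracking can be added later without changing the converter again.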

Conclusion: Imo we should focus on the core idea and turn Lens into the best viewer for NLM content. The default implementation will work for any publisher, and it will be easy to provide custom configurations. We should hold off on turning Lens into an editing/peer-reviewing tool. Now that I've dug deeper into the NLM spec, I'm convinced this should be done in a separate long-term project that does not rely on NLM as an input format.

Anyway, here's a sample document that is converted on the fly in your browser:

More updates soon.

-- Michael