82b919e: revised xml/html importers

Edward K. Ream

Apr 16, 2017, 7:33:58 AM
to leo-editor
The new code is simpler than the old, and should be more robust in the face of dubious syntax. All unit tests pass, and Rob's recent files import correctly. Still, more real-world testing is required.  Please report any problems immediately.

Edward

P.S. The code that handles "<", "</", ">" and "/>" has changed as follows:

1. The stack of open tags is now "stateless".  That is, it contains only tags.
2. The stack now contains all open tags, including void tags and tags not in @data import_html_tags.

The new code completely ignores ">" tags except when the top of the stack is a void tag. This is an important simplification.  It should allow more robust error checking. We shall see...
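
A minimal sketch of the stack discipline described above, for readers who want to experiment. This is not the importer's actual code: the `scan_tags` function, the `TOKEN` regex, and the short `VOID_TAGS` set are illustrative assumptions, and the scan does not handle "<" or ">" inside attribute values or comments.

```python
import re

# Assumption: a small subset of HTML void elements, for illustration only.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

# Matches "</name", "<name", "/>" or ">", tried in that order.
TOKEN = re.compile(r"</\s*([A-Za-z][\w:-]*)|<([A-Za-z][\w:-]*)|(/>)|(>)")


def scan_tags(text):
    """Return the list of tags still open after scanning text."""
    stack = []  # "Stateless": holds tag names only, nothing else.
    for match in TOKEN.finditer(text):
        close_name, open_name, self_close, gt = match.groups()
        if open_name:
            # Every open tag is pushed, void or not.
            stack.append(open_name.lower())
        elif close_name:
            name = close_name.lower()
            if name in stack:
                # Pop back to the matching open tag, discarding unclosed tags.
                while stack.pop() != name:
                    pass
            # An unmatched close tag is ignored.
        elif self_close:
            # "/>" closes the tag on top of the stack.
            if stack:
                stack.pop()
        elif gt and stack and stack[-1] in VOID_TAGS:
            # ">" matters only when the top of the stack is a void tag.
            stack.pop()
    return stack
```

For example, `scan_tags("<div><br><p>text</p>")` leaves only `div` open: the `br` is popped by its own ">", and the `p` by its close tag.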

EKR

Joe Orr

Apr 22, 2017, 7:41:40 PM
to leo-editor
Probably a dumb question, but how do I convert this file to Leo format: https://www.ibiblio.org/xml/examples/shakespeare/lear.xml

I tried the import entry on the File menu, but that didn't seem to work.

I can write an XSL stylesheet to do the conversion if necessary, but if that is the case, is there a guide to the Leo file format? (I can figure it out by looking at it, but I would like to know if there are things I shouldn't miss.)

Joe

Terry Brown

Apr 22, 2017, 11:26:01 PM
to leo-e...@googlegroups.com
On Sat, 22 Apr 2017 16:41:40 -0700 (PDT)
Joe Orr <joe...@gmail.com> wrote:

> Probably a dumb question, but how do I convert this file to Leo
> format: https://www.ibiblio.org/xml/examples/shakespeare/lear.xml
>
> I tried the import entry on the file menu, that didn't seem to work.

What do you expect it to look like when it's imported?

You could use an `@auto lear.xml`, hmm, apparently that's no different
to `@edit lear.xml`, just a single node import of the XML.

Perhaps try the xml_edit plugin. I just loaded it with that, took a
little time on my relatively fast desktop, so be patient. I suspect
that gives you what you're looking for demo wise.

Cheers -Terry

Edward K. Ream

Apr 23, 2017, 7:25:12 AM
to leo-editor
On Sat, Apr 22, 2017 at 10:25 PM, Terry Brown <terry...@gmail.com> wrote:
> On Sat, 22 Apr 2017 16:41:40 -0700 (PDT)
> Joe Orr <joe...@gmail.com> wrote:
>
> > Probably a dumb question, but how do I convert this file to Leo
> > format: https://www.ibiblio.org/xml/examples/shakespeare/lear.xml
> >
> > I tried the import entry on the file menu, that didn't seem to work.
>
> What do you expect it to look like when it's imported?
>
> You could use an `@auto lear.xml`, hmm, apparently that's no different
> to `@edit lear.xml`, just a single node import of the XML.

You have to tell the html and xml importers which elements are to form new nodes. See @data import_xml_tags in leoSettings.leo.

Edward

Joe Orr

Apr 23, 2017, 7:31:47 AM
to leo-editor
Any tips on how to get that plugin to work? It does not show on the plugins menu pulldown. Leo 5.5, Mac OS.

Also, when I click on the 'plugins menu' item, I get raw md text in the view rendered pane. The view rendered pane only displays raw output, on both Mac and Windows.

Joe

Joe Orr

Apr 23, 2017, 7:39:51 AM
to leo-editor
Oh, I realized I needed to add it in myLeoSettings.leo.

Did so, and got this message:

loadOnePlugin: error importing plugin: leo.plugins.xml_edit
Traceback (most recent call last):
  File "/Users/josephorr/Development/leo-editor/leo/core/leoPlugins.py", line 502, in loadOnePlugin
    __import__(moduleName)
  File "/Users/josephorr/Development/leo-editor/leo/plugins/xml_edit.py", line 103, in <module>
    from lxml import etree
ImportError: No module named lxml
loadOnePlugin: can not load enabled plugin: leo.plugins.xml_edit

Joe

Terry Brown

Apr 23, 2017, 8:33:55 AM
to leo-e...@googlegroups.com
Ah, I guess you need to install lxml for that to work. Not something that can be bundled with Leo.

Joe Orr

Apr 23, 2017, 9:38:52 AM
to leo-editor
Thanks for the quick reply. I'm new to the Python world; I tried `sudo port install py27-lxml` but got the same result. Any ideas on what is missing?

I'll try on Linux later if I can't get Mac to work.
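
One possible cause, stated only as an assumption, is that the package manager installed lxml for a different Python interpreter than the one that launches Leo. A small check like the one below, run with the same interpreter that starts Leo, shows which interpreter is in use and whether it can import a given module. The `module_status` helper is a hypothetical name for illustration, not part of Leo.

```python
import sys


def module_status(name):
    """Report whether the running interpreter can import the named module."""
    try:
        module = __import__(name)
    except ImportError as exc:
        # The module is not visible to this interpreter.
        return {"interpreter": sys.executable, "found": False, "detail": str(exc)}
    # Built-in modules have no __file__, hence the getattr default.
    return {
        "interpreter": sys.executable,
        "found": True,
        "detail": getattr(module, "__file__", None),
    }


if __name__ == "__main__":
    status = module_status("lxml")
    for key in ("interpreter", "found", "detail"):
        print("%s: %s" % (key, status[key]))
```

If `found` is False, installing lxml with that same interpreter (for example by running pip as a module of it) is the usual next step.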

Joe

lewis

Apr 23, 2017, 11:38:33 AM
to leo-editor
I enabled the xml_edit plugin and installed lxml via pip; it installed v3.7.3:

pip install lxml

I created a new Leo file, then chose Plugins > xml_edit > xml2leo (browse to file), and it successfully imported the lear.xml file, which I had downloaded.

I'm running Python 3.6.1, PyQt version 5.8.0 on Windows 10.

Regards
Lewis

On Sunday, April 23, 2017 at 11:38:52 PM UTC+10, Joe Orr wrote:
> ... Any ideas on what is missing?

Terry Brown

Apr 23, 2017, 10:34:49 PM
to leo-e...@googlegroups.com
I think the xml_edit plugin's going to give you the most appealing
results, seeing it hides the XML tags. Well, not the tag names, but
all the < /> stuff.
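
To illustrate the idea of keeping tag names while dropping the angle-bracket markup, here is a rough sketch using only the standard library. It is not how the xml_edit plugin is implemented: `element_to_outline`, `dump_outline`, and the `SAMPLE` fragment are made up for the example, and attributes and tail text are ignored.

```python
import xml.etree.ElementTree as ET


def element_to_outline(element):
    """Convert an element into a nested dict: headline, body, children."""
    return {
        "headline": element.tag,                 # Tag name, without "<" or "/>".
        "body": (element.text or "").strip(),    # Leading text of the element.
        "children": [element_to_outline(child) for child in element],
    }


def dump_outline(node, depth=0):
    """Return the outline as indented lines, one per node."""
    label = node["headline"]
    if node["body"]:
        label += ": " + node["body"]
    lines = ["  " * depth + label]
    for child in node["children"]:
        lines.extend(dump_outline(child, depth + 1))
    return lines


# Hypothetical fragment, loosely in the style of a play marked up as XML.
SAMPLE = "<PLAY><TITLE>Sample Play</TITLE><ACT><TITLE>ACT I</TITLE></ACT></PLAY>"
outline = element_to_outline(ET.fromstring(SAMPLE))
print("\n".join(dump_outline(outline)))
```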

Cheers -Terry

Edward K. Ream

Apr 27, 2017, 6:04:14 PM
to leo-editor
On Sun, Apr 23, 2017 at 8:38 AM, Joe Orr <joe...@gmail.com> wrote:
> Thanks for quick reply. New to python world here, tried sudo port install py27-lxml
> but same result. Any ideas on what is missing?

Just checking on this thread. Joe, have you been able to install lxml?

Edward

Joe Orr

Jun 3, 2017, 10:29:45 AM
to leo-editor
I gave up on that approach and decided it would be better to make specific transformers to get exactly the output I want.

Here is a project with a transformer for the Moby Shakespeare XML. I'm planning to make a bunch more; each XML format just needs its own XSL.

The result in Leo Viewer:

Question:
If you look at an example output file:
You'll see that I didn't use the type of ID specified in the Leo filespec. I just used the XSL unique id. I was going to write a function to change those to the Leo format (unique datetime), but this seems to work, so I was going to skip it. Is this a problem?
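
For anyone who does want timestamp-style IDs, the conversion could be sketched roughly as follows. The `userid.YYYYMMDDHHMMSS.n` layout is an assumption based on how such IDs look in .leo files, not a statement of the file spec, and `make_gnx_factory`, the user id, and the source ids are hypothetical.

```python
import itertools
import time


def make_gnx_factory(user_id, now=None):
    """Return a function that maps arbitrary source ids to timestamp-style ids."""
    # One timestamp per run; a counter keeps ids unique within the run.
    stamp = time.strftime("%Y%m%d%H%M%S", now or time.localtime())
    counter = itertools.count(1)
    mapping = {}

    def to_gnx(source_id):
        # The same source id always yields the same result, so nodes that
        # shared an id before the conversion still share one afterwards.
        if source_id not in mapping:
            mapping[source_id] = "%s.%s.%d" % (user_id, stamp, next(counter))
        return mapping[source_id]

    return to_gnx
```

Calling `to_gnx` twice with the same source id returns the same string, which matters if shared ids are what link clones together.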

Joe

Edward K. Ream

Jun 16, 2017, 9:39:49 AM
to leo-editor
On Sat, Jun 3, 2017 at 9:29 AM, Joe Orr <joe...@gmail.com> wrote:
> I gave up on that approach and decided it would be better to make specific transformers to get exactly the output I want.
>
> Here is a project with a transformer for the Moby Shakespeare XML. I'm planning to make a bunch more; each XML format just needs its own XSL.
>
> The result in Leo Viewer:
>
> Question:
> If you look at an example output file:
> You'll see that I didn't use the type of ID specified in the Leo filespec. I just used the XSL unique id. I was going to write a function to change those to the Leo format (unique datetime), but this seems to work, so I was going to skip it. Is this a problem?

Sorry for the delay in answering. It might be a problem--are clones linked/shown correctly?

Edward

Joe Orr

Aug 2, 2017, 8:58:22 AM
to leo-editor
Sorry for the slow reply; I joined a startup, so my spare time has gone way down.

It doesn't seem to be causing any problems, but if I notice any I will add the code to duplicate the timestamp IDs.

Going to post soon with some updates to the HTML5 project.

Joe

Edward K. Ream

Aug 3, 2017, 2:10:51 PM
to leo-editor
On Wed, Aug 2, 2017 at 7:58 AM, Joe Orr <joe...@gmail.com> wrote:
> Sorry for slow reply, joined a startup so my spare time has gone way down.

Congratulations, I think ;-)

> Doesn't seem to be causing any problems but if I notice any I will add the code to duplicate the timestamp IDs.

Thanks for the status report.

Edward