xml (html) parser ignores <section> tags

Largo84

May 19, 2016, 1:57:13 PM

to leo-editor

When Leo attempts to parse an html page using either @auto or @clean (refresh from disc), it does a pretty good job of creating nodes from <div> and other tags. However, it doesn't create nodes for <section> tags. For large files, it's a real pain to go through the code and find where they begin and end, then to manually insert nodes. Should I post an enhancement request for this on GitHub or is there something I'm missing?
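As a stopgap for locating the tags by hand, the positions of every opening and closing `<section>` tag can be listed with a short script. This is only a sketch using Python's standard `html.parser` module; it is not part of Leo and does not reflect how Leo's importer works. The names `TagLocator` and `locate_tags`, and the sample markup, are made up for illustration.

```python
from html.parser import HTMLParser


class TagLocator(HTMLParser):
    """Record the line number of every opening and closing tag with a given name."""

    def __init__(self, tag_name):
        super().__init__()
        self.tag_name = tag_name.lower()
        self.events = []  # list of (line_number, "open" or "close")

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == self.tag_name:
            self.events.append((self.getpos()[0], "open"))

    def handle_endtag(self, tag):
        if tag == self.tag_name:
            self.events.append((self.getpos()[0], "close"))


def locate_tags(html_text, tag_name="section"):
    """Return (line_number, kind) pairs for each matching tag, in document order."""
    locator = TagLocator(tag_name)
    locator.feed(html_text)
    locator.close()
    return locator.events


# Hypothetical sample; in practice, read the text of the real file instead.
sample = """<html>
<body>
<section id="Instructions" class="main-content-section">
<p>Some text.</p>
</section>
</body>
</html>
"""

for line_number, kind in locate_tags(sample):
    print(line_number, kind)
```

The reported line numbers are 1-based, so they can be used directly to jump to each tag in an editor.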

Also, is there a setting to prevent the parser from 'overdoing it'? For example, to have the parser *not* create nodes for <p>, <li>, or other user-selectable tags.

Rob..........

PS Try it for yourself with the attached file.

example.html

Edward K. Ream

May 20, 2016, 8:02:40 AM

to leo-editor

On Thu, May 19, 2016 at 12:57 PM, Largo84 <Lar...@gmail.com> wrote:

> Should I post an enhancement request for this on GitHub or is there something I'm missing?

Thanks for asking first. The @data import_xml_tags setting is probably what you want. Let me know if it doesn't work for you.

EKR

Largo84

May 20, 2016, 10:30:59 AM

to leo-editor

I added an @data import_xml_tags node under my @settings node (in the local file; I didn't change myLeoSettings) with the following list:

# lowercase xml tags, one per line.
html
body
head
div
table
section
ul
ol
dl
form
tbody

<section> is still *not* picked up.
Other tags that I don't want *are* picked up (e.g. <i>, <li>, <a> and many others).

I tried with and without a similar @data import_html_tags node, with the same results (I'm not really sure when to use one over the other).

Rob.............

Largo84

May 23, 2016, 9:21:10 PM

to leo-editor

That doesn't work (see other post for details). Any other suggestions?

Rob.....

On Friday, May 20, 2016 at 8:02:40 AM UTC-4, Edward K. Ream wrote:

Edward K. Ream

Sep 23, 2016, 4:38:52 PM

to leo-editor

On Thursday, May 19, 2016 at 12:57:13 PM UTC-5, Largo84 wrote:

> When Leo attempts to parse an html page using either @auto or @clean (refresh from disc), it does a pretty good job of creating nodes from <div> and other tags. However, it doesn't create nodes for <section> tags.

Sorry to take so long to respond properly to this.

It looks to me that your opening section tags end with `>` instead of `/>`. For example,

`<section id="Instructions" class="main-content-section">`

not:

`<section id="Instructions" class="main-content-section"/>`

Because of this html error, Leo doesn't look for the matching `</section>` tags.

In other words, the fault is in example.html, not in Leo's html importer.
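To see which of the two spellings a given file actually uses, something like the following could be run against it. This is a rough sketch built on Python's standard `html.parser` module, not on Leo's importer, so it only reports how that parser classifies each tag; it says nothing about what Leo itself accepts. The names `SectionTagAudit` and `audit_sections` are invented for this example.

```python
from html.parser import HTMLParser


class SectionTagAudit(HTMLParser):
    """Count how each <section> tag is written: open, self-closing, or close."""

    def __init__(self):
        super().__init__()
        self.counts = {"open": 0, "self_closing": 0, "close": 0}

    def handle_starttag(self, tag, attrs):
        # Called for tags written as <section ...>.
        if tag == "section":
            self.counts["open"] += 1

    def handle_startendtag(self, tag, attrs):
        # Called for tags written as <section ... />.
        if tag == "section":
            self.counts["self_closing"] += 1

    def handle_endtag(self, tag):
        # Called for </section>.
        if tag == "section":
            self.counts["close"] += 1


def audit_sections(html_text):
    """Return a dict of counts for the three ways a section tag can appear."""
    audit = SectionTagAudit()
    audit.feed(html_text)
    audit.close()
    return audit.counts


# The two spellings quoted above, used here as sample input.
plain = '<section id="Instructions" class="main-content-section">\n</section>'
slashed = '<section id="Instructions" class="main-content-section"/>'

print(audit_sections(plain))
print(audit_sections(slashed))
```

If the "open" and "close" counts differ, the file has unbalanced section tags that are worth checking by hand.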

Edward
