xml (html) parser ignores <section> tags


Largo84

May 19, 2016, 1:57:13 PM
to leo-editor
When Leo attempts to parse an html page using either @auto or @clean (refresh from disc), it does a pretty good job of creating nodes from <div> and other tags. However, it doesn't create nodes for <section> tags. For large files, it's a real pain to go through the code and find where they begin and end, then to manually insert nodes. Should I post an enhancement request for this on GitHub or is there something I'm missing?

Also, is there a setting to prevent the parser from 'overdoing it'? For example, to have the parser *not* create nodes for <p>, <li> or other user-selectable tags.

Rob..........

PS Try it for yourself with the attached file.
example.html

Edward K. Ream

May 20, 2016, 8:02:40 AM
to leo-editor
On Thu, May 19, 2016 at 12:57 PM, Largo84 <Lar...@gmail.com> wrote:

> Should I post an enhancement request for this on GitHub or is there something I'm missing?

Thanks for asking first. The @data import_xml_tags setting is probably what you want. Let me know if it doesn't work for you.

EKR

Largo84

May 20, 2016, 10:30:59 AM
to leo-editor
I added the @data import_xml_tags to my @settings node (local file, didn't change myLeoSettings) with the following list:

# lowercase xml tags, one per line.

html
body
head
div
table
section
ul
ol
dl
form
tbody

  1. <section> is still *not* picked up.
  2. Others that I don't want *are* picked up (e.g. <i>, <li>, <a> and many others).

I tried with and without a similar @data import_html_tags, with the same results (I'm not really sure why I would use one over the other).
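
As a sanity check outside Leo, the tags in the file can be counted to see which ones the setting would need to cover and whether every <section> has a matching close. This is only a rough sketch using Python's standard html.parser module; it is not Leo's importer and says nothing about how Leo reads the setting. The names TagCounter and count_tags are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count opening and closing tags (hypothetical helper, unrelated to Leo)."""

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        self.opened[tag] += 1

    def handle_endtag(self, tag):
        self.closed[tag] += 1


def count_tags(html_text):
    """Return (opened, closed) Counters for the given HTML text."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.opened, parser.closed
```

Reading example.html into a string and passing it to count_tags would show, per tag, how many opens and closes the file contains; a tag whose two counts differ is a candidate for closer inspection.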

Rob.............

Largo84

May 23, 2016, 9:21:10 PM
to leo-editor
That doesn't work (see other post for details). Any other suggestions?

Rob.....


Edward K. Ream

Sep 23, 2016, 4:38:52 PM
to leo-editor
On Thursday, May 19, 2016 at 12:57:13 PM UTC-5, Largo84 wrote:

> When Leo attempts to parse an html page using either @auto or @clean (refresh from disc), it does a pretty good job of creating nodes from <div> and other tags. However, it doesn't create nodes for <section> tags.

Sorry to take so long to respond properly to this.

It looks to me as if your opening section tags end with `>` instead of `/>`. For example, the file has:

`<section id="Instructions" class="main-content-section">`

not:

`<section id="Instructions" class="main-content-section"/>`

Because of this html error, Leo doesn't look for the matching `</section>` tags.

In other words, the fault is in example.html, not in Leo's html importer.
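
To see the two tag forms side by side, here is a small sketch using Python's standard html.parser module, which reports a tag ending in `>` and a tag ending in `/>` through different callbacks. It is an illustration only, not Leo's html importer, and the names TagFormLogger and tag_events are hypothetical.

```python
from html.parser import HTMLParser


class TagFormLogger(HTMLParser):
    """Record how each tag is reported (illustration only, not Leo's importer)."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called for tags written as <tag ...>
        self.events.append(("open", tag))

    def handle_startendtag(self, tag, attrs):
        # Called for tags written as <tag .../>
        self.events.append(("self-closing", tag))

    def handle_endtag(self, tag):
        # Called for </tag>
        self.events.append(("close", tag))


def tag_events(html_text):
    """Return the list of (kind, tag) events for the given HTML text."""
    parser = TagFormLogger()
    parser.feed(html_text)
    parser.close()
    return parser.events
```

Feeding it the two example lines above, one at a time, shows which form each line is reported as.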

Edward