HTML mapping

rsperberg

May 31, 2006, 6:42:35 PM
to FBReader
I'd like to put down somewhere what the formatting options in FBReader
are for HTML text.

I'm going to list here what I have observed, and ask others to
correct me where I'm wrong and add anything I've omitted. I tested only
the main elements (from my perspective).

The HTML elements are formatted according to the mapped element from
the FictionBook2 element list, which populates FBReader's styles list.

Roger

p tag
Maps to Regular paragraph in the styles list. Doesn't have to be the
same as <base> of course.

i and em tags
Both map to Emphasis in the styles list. (And not to Italic at all.)

b and strong tags
Both map to Emphasis in the styles list. (And not to Bold at all.)

h1 - h6 tags
All six map to Section Title, and all six automatically have a
page-break before. It would be nice if h1 and h2 could map to Title and
h5 and h6 to Subtitle to provide some differentiation.

code tag
Maps to Code.

ol and ul (and li)
An li element that is a child of either ol or ul shows up with a
bullet. This doesn't appear to be mapped to any element in the styles
list (and of course there is no mechanism for tagging (or generating)
lists in the FictionBook2 format).

pre
Nothing. Was hoping this would map to Preformatted text.

hr (horizontal rule)
Nothing.

blockquote
Nothing. Since there is no way to indicate special indenting for any
style, this would not be reader-controllable. Still, I was hoping it
might map to Stanza, Verse or Cite.

rsperberg

May 31, 2006, 6:46:22 PM
to FBReader
I wrote above:

>
> The HTML elements are formatted according to the mapped element from
> the FictionBook2 element list, which populates FBReader's styles list.
>

Of course, it would be really great if there were some way to notify
FBReader that the text in question used the HTML tags and the styles
list actually showed HTML elements and not FB2 elements.

Roger

rsperberg

May 31, 2006, 6:50:47 PM
to FBReader
rsperberg wrote:
>
> i and em tags
> Both map to Emphasis in the styles list. (And not to Italic at all.)
>
> b and strong tags
> Both map to Emphasis in the styles list. (And not to Bold at all.)
>

Of course the b and strong tags map to Strong and not to Emphasis.
Apologies for the typo.

Roger
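
Pulling together the observations above, including the b/strong
correction, the mapping can be sketched as a small lookup table. This is
only a summary of what was reported in this thread; the names
OBSERVED_STYLE, UNMAPPED, SUGGESTED_HEADINGS, style_for and
suggested_style_for are made up for illustration and are not part of
FBReader.

```python
# Observed HTML-tag -> styles-list mapping, as reported in this thread.
# All names here are illustrative; this is not FBReader's internal API.
OBSERVED_STYLE = {
    "p": "Regular paragraph",
    "i": "Emphasis",
    "em": "Emphasis",
    "b": "Strong",
    "strong": "Strong",
    "code": "Code",
    # All six heading levels were seen to map to the same style.
    **{f"h{n}": "Section Title" for n in range(1, 7)},
}

# Tags that did not appear to map to any entry in the styles list.
# (li under ol/ul is drawn with a bullet, but has no style of its own.)
UNMAPPED = {"ol", "ul", "li", "pre", "hr", "blockquote"}

# The differentiation wished for in the first post (a suggestion only).
SUGGESTED_HEADINGS = {
    "h1": "Title",
    "h2": "Title",
    "h5": "Subtitle",
    "h6": "Subtitle",
}


def style_for(tag):
    """Return the observed styles-list entry for an HTML tag, or None."""
    return OBSERVED_STYLE.get(tag.lower())


def suggested_style_for(tag):
    """Like style_for, but with the suggested h1/h2/h5/h6 overrides."""
    tag = tag.lower()
    return SUGGESTED_HEADINGS.get(tag, style_for(tag))
```

For example, style_for("h1") gives "Section Title" today, while
suggested_style_for("h1") would give "Title"; h3 and h4 stay at
"Section Title" in both.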

rsperberg

Jun 1, 2006, 12:43:39 PM
to FBReader
I haven't run this test with an OEB file, but since the OEB package
file (OPF) would merely point to that same test HTML file, I'm going to
assume for the moment that the same results will occur.

Note too that FBReader doesn't yet handle tables, so I didn't include
table tags in the quick-and-dirty test that I ran.

geometer

Jun 2, 2006, 7:29:05 AM
to FBReader
rsperberg wrote:

> Of course, it would be really great if there were some way to notify
> FBReader that the text in question used the HTML tags and the styles
> list actually showed HTML elements and not FB2 elements.

Hi Roger,

I think the best solution is to add new styles like 'H1'..'H6' to the
styles list. The only problem with this solution is that the styles list
may become too large. I'll try to add additional styles in 0.7.4b.
(Styles 'Bold' and 'Italic' are already present in 0.7.4a, but are used
for <b> and <i> tags in OEB only.)

-- Nikolay Pultsin

rsperberg

Jun 2, 2006, 9:43:41 AM
to FBReader
Well, this depends on whether you want to keep a single list of all
tags for all formats, or have separate lists for separate formats.

HTML doesn't have subtitle, stanza, verse, epigraph and so on, and FB2
doesn't have h1-h6, pre, ol, ul and li, and so on.

For the short term, having a combined list (or mapping some elements in
one format to elements in another, the way h1-h6 are now) may work OK,
but it seems to me that it will break down as additional types of
formats are added.

Of course, adding XML vocabularies won't require technical knowledge of
the file format itself, as for instance adding Mobipocket does. And it
is mostly the capability to handle other vocabularies, like DocBook or
TEI, that I personally am interested in.
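
To make the trade-off concrete, here is a rough sketch of the two
options: a separate styles list per format versus one combined list.
Everything in it is hypothetical (STYLE_LISTS, styles_for_format and
combined_styles are invented names), and the entries are limited to the
styles and tags mentioned in this thread rather than FBReader's actual
list.

```python
# Hypothetical per-format style lists, limited to names from this thread.
STYLE_LISTS = {
    "fb2": [
        "Regular paragraph", "Title", "Section Title", "Subtitle",
        "Emphasis", "Strong", "Code", "Preformatted text",
        "Stanza", "Verse", "Cite", "Epigraph",
    ],
    "html": [
        "p", "h1", "h2", "h3", "h4", "h5", "h6",
        "i", "em", "b", "strong", "code", "pre",
        "ol", "ul", "li", "hr", "blockquote",
    ],
}


def styles_for_format(fmt):
    """Option 1: show only the styles that belong to the book's format."""
    return list(STYLE_LISTS[fmt])


def combined_styles():
    """Option 2: one list for every format; it grows with each new format."""
    seen = []
    for names in STYLE_LISTS.values():
        for name in names:
            if name not in seen:
                seen.append(name)
    return seen
```

With only these two formats the combined list already holds every entry
from both, and a further vocabulary such as DocBook or TEI would add its
own elements on top, which is the growth I'm worried about.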
