[First project] Tom Brown's School Days

Robert Dixon

Dec 14, 2022, 1:26:56 AM
to Standard Ebooks
Hello,

I'm interested in producing Tom Brown's School Days. This book is slightly over 100k words, but it does appear in the list of recommended first projects at https://standardebooks.org/contribute/wanted-ebooks.

Text: https://gutenberg.org/ebooks/32224 (Note: I believe this text is better and more complete than the one at https://gutenberg.org/ebooks/1480)

Robby

Alex Cabal

Dec 14, 2022, 2:08:56 AM
to standar...@googlegroups.com
Great, that one would be a good start.

You can cut all of the illustrations. Since there is a preface, you'll
have to add a half title page. I see there are epigraphs for each
chapter, see the manual for how to style those.

It will be easiest to remove the page numbers with a regex before you
split the chapters.
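
As a rough sketch of that regex step in Python: the pattern below assumes the transcription marks page numbers as <span class="pagenum"> elements with no nested spans, which is a guess about this particular file and not something confirmed here, and strip_page_numbers is only an illustrative name. Check the real markup and adjust the pattern before running it over the source.

```python
import re

# Assumption: page markers look like <span class="pagenum">...</span>
# and never contain another <span>. Inspect the source file first.
PAGENUM = re.compile(r'<span class="pagenum">.*?</span>', re.DOTALL)


def strip_page_numbers(xhtml: str) -> str:
    """Return xhtml with every assumed page-number span removed."""
    return PAGENUM.sub("", xhtml)


sample = '<p>He ran<span class="pagenum"><a id="Page_12">[Pg 12]</a></span> home.</p>'
print(strip_page_numbers(sample))
```

Doing this on the single source file first means the markers never reach the individual chapter files.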

The transcription you found looks like it has corrections in underline
with explanations... that will be something you'll have to remove in the
text. Not sure how many there are but that will probably have to be done
by hand.

Make sure to read the Standard Ebooks Manual of Style before starting,
as you won't know what to fix if you haven't read the standards. In
particular, please closely review the semantics, high level patterns,
and typography sections:

https://standardebooks.org/manual

https://standardebooks.org/manual/latest/4-semantics

https://standardebooks.org/manual/latest/7-high-level-structural-patterns

https://standardebooks.org/manual/latest/8-typography

The step by step guide will take you from start to finish:

https://standardebooks.org/contribute/producing-an-ebook-step-by-step

Please email often if you have any questions at all. Our standards are
well-established so there is probably already a standard for formatting
whatever problem you've encountered.

When you're ready, email back with a link to your Github repository so
that I can mark you as having started.

Have fun! :)
> --
> You received this message because you are subscribed to the Google
> Groups "Standard Ebooks" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to standardebook...@googlegroups.com
> <mailto:standardebook...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/standardebooks/a58494ba-ce9b-4e28-a8c5-46e4b793abbfn%40googlegroups.com <https://groups.google.com/d/msgid/standardebooks/a58494ba-ce9b-4e28-a8c5-46e4b793abbfn%40googlegroups.com?utm_medium=email&utm_source=footer>.

Robert Dixon

Dec 14, 2022, 4:15:35 AM
to standar...@googlegroups.com

Robert Dixon

Dec 14, 2022, 5:16:08 AM
to standar...@googlegroups.com
I've done the initial "Split files and clean" step. Before I start thinking about typography, I'd appreciate it if someone could do a quick sanity check to make sure everything looks reasonable so far.

B Keith

Dec 14, 2022, 9:22:27 AM
to Standard Ebooks
You are going to have to split it based on chapters, not parts.
And the headers will need to be adjusted, i.e. <section id="chapter-3" epub:type="chapter"> is actually <section id="part-1"…, etc. Take a look at these relevant sections of the SEMOS


And of course dump the illustrations.

Bruce

Alex Cabal

Dec 14, 2022, 12:21:15 PM
to standar...@googlegroups.com
Emma, can you manage this with Vince reviewing?

Emma Sweeney

Dec 14, 2022, 3:33:25 PM
to Standard Ebooks
Sure!

Emma

Robert Dixon

Dec 16, 2022, 6:56:29 PM
to standar...@googlegroups.com
Here's a typography question for which I can't find an answer in the manual of style. In some cases, semicolons appear inside closing quotation marks when they're not semantically part of the quotation, like this:

> there is no crying “hold;” the shepherd is an old hand and up to all the dodges

I know commas and punctuations generally go inside quotation marks, but should these semicolons be moved outside?

Robert Dixon

Dec 16, 2022, 6:57:24 PM
to standar...@googlegroups.com
Sorry, "commas and punctuations" should read "commas and periods".

Emma Sweeney

Dec 16, 2022, 7:45:00 PM
to Standard Ebooks
You can move semicolons outside of the quotation mark, but don't touch the commas and periods. Commas and periods stay inside quotes for typographical aesthetics (SEMOS 8.7.2).
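
If there are many of these, a small Python sketch like the following could do a first pass. It assumes the text already uses curly closing quotes, move_semicolons_out is a made-up helper name, and every change should still be reviewed by eye afterwards.

```python
import re

# Match a semicolon sitting just inside a closing curly quote.
# Commas and periods are deliberately not matched, so they stay inside.
SEMICOLON_INSIDE = re.compile(r";([”’])")


def move_semicolons_out(text: str) -> str:
    """Swap ';”' to '”;' (and ';’' to '’;') wherever it occurs."""
    return SEMICOLON_INSIDE.sub(r"\1;", text)


line = "there is no crying “hold;” the shepherd is an old hand"
print(move_semicolons_out(line))
```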

Emma

Robert Dixon

Dec 18, 2022, 1:34:12 AM
to standar...@googlegroups.com
Thanks all for the guidance so far. Another question: there's a full-page epigraph at the beginning of Part II, but not Part I. It's not true "front matter" since it's not at the beginning of the book. Should it still be put in its own file, and if so should the file be called epigraph.xhtml or something else?

Robby

Emma Sweeney

Dec 18, 2022, 1:58:28 AM
to Standard Ebooks
The epigraph can go in part-2.xhtml. See SEMOS 7.4.5 on how to format epigraphs in section headers.

Emma

Robert Dixon

Mar 7, 2023, 2:14:14 AM
to standar...@googlegroups.com
Still working on this. Looking at modernize-spelling, my inclination is not to collapse the many occurrences of "School-house" into "Schoolhouse," since it doesn't refer to a small school building (usually a small town elementary school) but to one of several "houses" into which the larger institution is divided. Does that make sense? Or perhaps it should be changed to "School House"?

Emma Sweeney

Mar 7, 2023, 6:14:56 PM
to Standard Ebooks
You can keep the hyphen in "School-house".

Emma

Robert Dixon

Mar 19, 2023, 11:19:46 PM
to standar...@googlegroups.com
I got the following error when running build-toc and have no idea what any of it means. What's going on?

Traceback (most recent call last):
  File "/home/robby/.local/bin/se", line 8, in <module>
    sys.exit(main())
  File "/home/robby/.local/pipx/venvs/standardebooks/lib/python3.6/site-packages/se/main.py", line 81, in main
    sys.exit(getattr(module, command_function)(args.plain_output))
  File "/home/robby/.local/pipx/venvs/standardebooks/lib/python3.6/site-packages/se/commands/build_toc.py", line 39, in build_toc
    toc = se_epub.generate_toc()
  File "/home/robby/.local/pipx/venvs/standardebooks/lib/python3.6/site-packages/se/se_epub.py", line 1279, in generate_toc
    toc_xhtml = generate_toc(self)
  File "/home/robby/.local/pipx/venvs/standardebooks/lib/python3.6/site-packages/se/se_epub_generate_toc.py", line 768, in generate_toc
    landmarks, toc_list = process_all_content(self, self.spine_file_paths)
  File "/home/robby/.local/pipx/venvs/standardebooks/lib/python3.6/site-packages/se/se_epub_generate_toc.py", line 735, in process_all_content
    process_headings(dom, textf.name, toc_list, single_file, single_file_without_headers)
  File "/home/robby/.local/pipx/venvs/standardebooks/lib/python3.6/site-packages/se/se_epub_generate_toc.py", line 423, in process_headings
    special_item.level = get_level(content_item[0], toc_list)
IndexError: list index out of range

David at Standard Ebooks

Mar 20, 2023, 12:30:27 AM
to standar...@googlegroups.com
I think the problem is in your Preface, which should have <h2 epub:type="title"> in the heading.

Emma Sweeney

Mar 20, 2023, 12:33:33 AM
to Standard Ebooks
Check all your file titles. Make sure they are formatted correctly and have the right semantics (SEMOS 7.2). Running se lint . can help you find these issues.

Emma

David at Standard Ebooks

Mar 20, 2023, 12:34:48 AM
to standar...@googlegroups.com
No, actually the problem is deeper than that. It's the way you've structured headings, which need a hgroup around them. Check out the manual at https://standardebooks.org/manual/1.7.0/single-page#7.2

Vince

Mar 20, 2023, 12:36:46 AM
to Standard Ebooks
For reasons I’m not sure of (although it is reasonable), it’s choking on the fact there is no <p> in the dedication. So replacing the <div> with a <p> will solve the traceback.
But in fact the dedication html needs to be completely redone.

- As noted, the <div> should be a <p>.
- We don’t use <big> (use <b> instead and then target it with CSS to the appropriate size).
- There should be no all-caps in the text. (If it’s needed, it should be handled with CSS.)
- There should be no spaces in the “B Y T H E…”; again, this should be handled with CSS (letter-spacing).

Once you get past this, the preface also has a problem (which won’t cause a traceback, but will give a nice error message) in that the header doesn’t have an epub:type.

Alex Cabal

Mar 20, 2023, 12:38:27 AM
to standar...@googlegroups.com
We should have an error message with a hint for whatever case is going
on here, instead of a crash and exception traceback. David could you
take a look?
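
To illustrate the idea only: the traceback ends in an IndexError from indexing an empty list, and a guard along these lines would turn that into a readable message. TocError and first_item_or_hint are hypothetical names invented for this sketch, and the wording of the message is an assumption; none of it is the toolset's actual code.

```python
class TocError(Exception):
    """Hypothetical error type for this sketch, not a real toolset class."""


def first_item_or_hint(items: list, file_name: str):
    """Return items[0], or raise a readable error instead of IndexError."""
    if not items:
        # Name the offending file so the producer knows where to look.
        raise TocError(f"No content found to build a ToC entry from in: {file_name}")
    return items[0]


try:
    first_item_or_hint([], "dedication.xhtml")
except TocError as err:
    print(err)
```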

Vince Rice

Mar 20, 2023, 12:47:26 AM
to Standard Ebooks
David, as I noted a few minutes ago, the traceback is caused by the absence of a <p> in the dedication.
The missing epub:type in preface just generates an error that says exactly what the problem is (Couldn’t find title in: preface.xhtml.)
The absence of hgroups is incorrect from the standpoint of our structure, but it doesn’t cause an error in build-toc (the generated ToC is just wrong, with two entries per chapter instead of one). Don’t know if it should?

Robert Dixon

Mar 20, 2023, 1:52:30 AM
to standar...@googlegroups.com
So there were three separate issues here:
- As David noted, the preface was missing an epub:type attribute in the heading.
- As David also noted, <hgroup> tags were missing. This was due to me misreading the MoS: for some reason I was under the impression that each file should have either <hgroup> or <header>, but not both.
- As Vince said, replacing <div> with <p> in the dedication allows build-toc to run successfully, but I haven't yet redone the markup there.
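
A quick way to spot the first of these before running build-toc is sketched below in Python. It is not part of the se toolset and no substitute for se lint; headings_missing_epub_type is a hypothetical helper, and it only assumes the files are well-formed XHTML using the standard XHTML and EPUB namespaces.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
EPUB = "http://www.idpf.org/2007/ops"


def headings_missing_epub_type(xhtml: str) -> list:
    """Return the tag names of h1-h6 elements that have no epub:type attribute."""
    root = ET.fromstring(xhtml)
    missing = []
    for level in range(1, 7):
        for heading in root.iter(f"{{{XHTML}}}h{level}"):
            if f"{{{EPUB}}}type" not in heading.attrib:
                missing.append(f"h{level}")
    return missing


sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:epub="http://www.idpf.org/2007/ops"><body>'
    '<section id="preface" epub:type="preface"><h2>Preface</h2></section>'
    '</body></html>'
)
print(headings_missing_epub_type(sample))
```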

Thanks for the help!

David at Standard Ebooks

Mar 20, 2023, 3:24:37 AM
to standar...@googlegroups.com
OK, I'll have a look at the code. We certainly want to handle any exceptions thrown, not just crash.

Vince

Apr 23, 2023, 7:58:22 PM
to Standard Ebooks
Just curious, David, did you find out what was causing the crash?

David at Standard Ebooks

Apr 23, 2023, 8:27:55 PM
to Standard Ebooks
Still on my “to do” list, I’m afraid, but thanks for the reminder!

David Grigg

May 21, 2023, 7:42:48 PM
to Standard Ebooks
Alex just merged my pull request to fix this crash.

Vince

May 21, 2023, 8:01:21 PM
to standar...@googlegroups.com
Excellent, thanks, David!