Hello, here is a tool I created to split big .docx files, like a book with chapters, for use with Library

Midori Koçak

Jan 31, 2021, 1:45:39 PM
to NYT Library Community

This is not a feature request, but I would like to get your opinions.

http://github.com/midorikocak/docsplitter

Problem Description

For a project I created for a client, the content and sections were saved in large .docx files with small chapters. As far as I know, Library requires the Google Docs files to be separate. Splitting many large .docx files into at least 36 sections by hand would be a tedious task. I looked for a tool to split .docx files according to headings but could not find one. Using master documents and generating chapter documents one by one was also confusing. So I decided to use the python-docx library to iterate over the documents and generate zipped .docx files with chapter numbering.
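To give an idea of the approach, here is a minimal sketch of splitting on headings with python-docx. It is not the actual docsplitter code: the function names (heading_level, group_by_heading, split_docx) are hypothetical, it assumes the document uses the built-in "Heading N" styles, and it copies paragraph text only, so formatting, tables and images are lost.

```python
import io
import zipfile
from pathlib import Path


def heading_level(style_name):
    """Return N for a style named 'Heading N', otherwise None."""
    prefix = "Heading "
    if style_name and style_name.startswith(prefix):
        rest = style_name[len(prefix):]
        if rest.isdigit():
            return int(rest)
    return None


def group_by_heading(paragraphs, level=1):
    """Group (style_name, text) pairs into (title, body) chapters.

    A new chapter starts at every heading of the given level. Text that
    appears before the first such heading is dropped in this sketch.
    """
    chapters = []
    for style_name, text in paragraphs:
        if heading_level(style_name) == level:
            chapters.append((text, []))
        elif chapters:
            chapters[-1][1].append((style_name, text))
    return chapters


def split_docx(path, level=1):
    """Write one .docx per chapter into '<stem>.zip' in the current folder."""
    from docx import Document  # python-docx, imported lazily

    source = Path(path)
    paragraphs = [(p.style.name, p.text) for p in Document(str(source)).paragraphs]
    zip_path = Path(source.stem + ".zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        chapters = group_by_heading(paragraphs, level)
        for number, (title, body) in enumerate(chapters, start=1):
            chapter = Document()
            chapter.add_heading(title, level=1)
            for _style, text in body:
                chapter.add_paragraph(text)  # text only, formatting is not copied
            buffer = io.BytesIO()
            chapter.save(buffer)  # save() accepts a path or a file-like object
            # Assumes the heading contains no characters that are invalid in file names.
            archive.writestr(f"{number} - {source.stem} - {title}.docx", buffer.getvalue())
    return zip_path
```

Keeping the grouping step separate from the python-docx calls makes it possible to check the chapter boundaries on plain tuples before touching any real file.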

Feature

Suppose you have a file named book with long chapters.docx.

$ docsplitter -f book\ with\ long\ chapters.docx

will generate book with long chapters.zip in the folder where you run docsplitter. When you extract the zip file, the split files should appear like 1 - book with long chapters - Chapter Name.docx

If you don't want the filename as a prefix, you can use the -nn, --noname option.

$ docsplitter -f book\ with\ long\ chapters.docx -nn

The generated files will not include the original filename and will have only the headings, like 1 - Chapter Name.docx
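The two naming patterns can be summarized in a few lines of Python. This is only an illustration of the naming scheme described here, not the tool's own code; the function name chapter_filename and its parameters are hypothetical.

```python
from pathlib import Path


def chapter_filename(number, heading, source_path, noname=False):
    """Build '<number> - <source name> - <heading>.docx'.

    With noname=True the source file name is left out, which mirrors
    what the -nn / --noname option is described as doing.
    """
    parts = [str(number)]
    if not noname:
        parts.append(Path(source_path).stem)  # file name without '.docx'
    parts.append(heading)
    return " - ".join(parts) + ".docx"


print(chapter_filename(1, "Chapter Name", "book with long chapters.docx"))
print(chapter_filename(1, "Chapter Name", "book with long chapters.docx", noname=True))
```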

If you want to split your Word documents at a different heading level, you can use the -l, --level option.
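For anyone curious how these three options fit together, here is a hypothetical argparse sketch. The flag names come from the description above, but everything else is an assumption: the real tool may parse its arguments differently, and treating the level as an integer that defaults to 1 is a guess.

```python
import argparse


def build_parser():
    """Declare the options mentioned in this post (a sketch, not the real CLI)."""
    parser = argparse.ArgumentParser(prog="docsplitter")
    parser.add_argument("-f", dest="file", required=True,
                        help="path of the .docx file to split")
    parser.add_argument("-nn", "--noname", action="store_true",
                        help="leave the original file name out of chapter file names")
    # Assumption: the level is an integer heading level, defaulting to 1.
    parser.add_argument("-l", "--level", type=int, default=1,
                        help="heading level at which to split")
    return parser


# Parse an explicit argument list instead of sys.argv, for illustration.
args = build_parser().parse_args(["-f", "book with long chapters.docx", "-l", "2"])
print(args.file, args.noname, args.level)
```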

Additional Information

It would be great if you could give feedback on this tool and on what could be improved for compatibility.

Thank you very much.

Midori

Link to issue: https://github.com/nytimes/library/issues/251
