boxedit, a web-based box file editor


Dan Vanderkam

Dec 18, 2014, 10:48:28 AM
to tesser...@googlegroups.com
I had to edit a few Tesseract box files to generate training data recently and didn't find any of the existing tools to my liking. I wanted something that ran on Mac OS X and showed letters inside their boxes.

So I built a web-based tool which I'm calling boxedit.

Demo with preloaded data: http://www.danvk.org/boxedit/demo.html
Source code & instructions: https://github.com/danvk/boxedit/

A few things to like about it:
- It's entirely browser-based, so it runs on any platform and requires no installation.
- You can use the browser's zoom in/out features.
- It shows OCR'd letters on top of the source image, so the accuracy is easy to gauge.
- It can split boxes N ways.
- You can edit the raw box data or use the GUI; either works, and they stay in sync.
- It's easy to get going: drag & drop an image and its box file to get started.
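
To make the "split boxes N ways" idea concrete, here is a minimal Python sketch. It is not boxedit's actual code (boxedit runs as JavaScript in the browser). It assumes the usual Tesseract box file line layout of `symbol left bottom right top page`, with pixel coordinates measured from the bottom-left corner of the image. The names `Box`, `parse_box_line`, `format_box_line`, and `split_box` are hypothetical, and splitting into equal-width pieces is only one plausible policy.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Box:
    """One line of a box file (assumed layout: symbol left bottom right top page)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int = 0


def parse_box_line(line: str) -> Box:
    # Split from the right so the symbol field is kept whole.
    char, left, bottom, right, top, page = line.rstrip("\n").rsplit(" ", 5)
    return Box(char, int(left), int(bottom), int(right), int(top), int(page))


def format_box_line(box: Box) -> str:
    return f"{box.char} {box.left} {box.bottom} {box.right} {box.top} {box.page}"


def split_box(box: Box, n: int) -> list[Box]:
    """Split a box into n side-by-side boxes of (nearly) equal width.

    Assumption: if the symbol has exactly n characters, each piece gets one
    of them; otherwise every piece keeps the original symbol for later editing.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    width = box.right - box.left
    # Integer edges, so the pieces tile the original box with no gaps or overlap.
    edges = [box.left + (width * i) // n for i in range(n + 1)]
    chars = list(box.char) if len(box.char) == n else [box.char] * n
    return [
        Box(chars[i], edges[i], box.bottom, edges[i + 1], box.top, box.page)
        for i in range(n)
    ]
```

For example, a box where OCR read two letters as one symbol could be split with `split_box(parse_box_line("rn 10 20 40 50 0"), 2)` and each piece written back out with `format_box_line`.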

A few things to dislike:
- The UI could use some work: the overlaying of transcribed letters could be much clearer.
- Saving your changes back to disk is tedious (my best solution is to copy/paste back into the box file).
- It's missing a few important features (e.g. n-way merge and moving/resizing boxes visually).

If people find this useful, I'm happy to polish it a bit more. Feel free to file issues on GitHub.

  - Dan

Helmut Wollmersdorfer

Dec 19, 2014, 4:28:28 AM
to tesser...@googlegroups.com
Nicely done. But how do I install the missing components?

TIA 

Dan Vanderkam

Dec 22, 2014, 10:25:27 AM
to tesser...@googlegroups.com
What do you mean by "missing components"?

Helmut Wollmersdorfer

Dec 22, 2014, 10:39:08 AM
to tesser...@googlegroups.com


On Monday, December 22, 2014 at 16:25:27 UTC+1, Dan Vanderkam wrote:
What do you mean by "missing components"?

The third-party JavaScript code. It would be enough to describe it in an installation HOWTO on GitHub.

Dan Vanderkam

Dec 22, 2014, 10:47:56 AM
to tesser...@googlegroups.com
Fair enough! I posted some instructions here:

Basically, you need to run "npm install".
