You should also consider the PAGE format. You can convert Tesseract output to it with this tool: http://www.primaresearch.org/tools/TesseractOCRToPAGE
Thanks, everyone, for the helpful pointers! These all appear to be different ways of describing the position of the identified words on the page, is that right? That definitely seems like it would help me produce structured data: I could classify each word as belonging to a particular attribute of a per-page JSON object, based on its vertical and horizontal position.
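To make that idea concrete, here is a minimal sketch of position-based classification. Everything in it is an assumption for illustration: the `REGIONS` layout, the attribute names, and the `(text, left, top, width, height)` word tuples are hypothetical, and in practice the boxes would be parsed from whichever positional format (hOCR, TSV, PAGE) is being used.

```python
import json

# Hypothetical page layout: attribute name -> (left, top, right, bottom) in pixels.
# These regions are made up; real ones depend on the scanned form.
REGIONS = {
    "title": (0, 0, 1000, 150),
    "date": (700, 150, 1000, 250),
    "body": (0, 250, 1000, 1400),
}


def classify_words(words, regions=REGIONS):
    """Assign each word to the region that contains the centre of its box.

    `words` is an iterable of (text, left, top, width, height) tuples.
    Words whose centre falls outside every region are dropped.
    """
    page = {name: [] for name in regions}
    # Sort top-to-bottom, then left-to-right, so joined text reads in order.
    # Real OCR boxes jitter vertically, so a real version would first
    # group words into lines instead of sorting on the raw top value.
    for text, left, top, width, height in sorted(words, key=lambda w: (w[2], w[1])):
        cx, cy = left + width / 2, top + height / 2
        for name, (x0, y0, x1, y1) in regions.items():
            if x0 <= cx < x1 and y0 <= cy < y1:
                page[name].append(text)
                break
    return {name: " ".join(tokens) for name, tokens in page.items()}


# Made-up word boxes, deliberately out of reading order.
words = [
    ("Report", 240, 40, 180, 50),
    ("Annual", 40, 40, 180, 50),
    ("world", 160, 300, 100, 40),
    ("2019-03-01", 750, 170, 200, 40),
    ("Hello", 40, 300, 100, 40),
]

result = classify_words(words)
print(json.dumps(result, indent=2))
```

Using the centre of each box rather than its top-left corner is a design choice that makes the assignment a bit more tolerant of words that straddle a region boundary; fixed rectangles like these only work if every page shares roughly the same layout.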