Hi, I'm not exactly sure what the problem is in your case. What
follows is based on my personal experience only.
The BOX file you expected just contains the bounding boxes that
tesseract marks automatically, which simply saves us the labour of
obtaining them by hand. This bounding box info can be inaccurate.
Hence, we need to check the boxes one by one and correct any errors
we find. Otherwise, the training steps that follow won't work
correctly.
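
To make the one-by-one check a bit less tedious, a small script can
point out boxes that look obviously wrong. This is only a rough sketch
of my own, not part of tesseract. It assumes each line of the box file
has the form "symbol left bottom right top page", and the function
names and the min_size threshold are made up for illustration.

```python
def parse_box_line(line):
    """Split one box-file line into (symbol, left, bottom, right, top).

    Assumes whitespace-separated fields: symbol left bottom right top [page].
    """
    parts = line.split()
    if len(parts) < 5:
        raise ValueError("not a box line: %r" % line)
    symbol = parts[0]
    left, bottom, right, top = (int(p) for p in parts[1:5])
    return symbol, left, bottom, right, top


def suspicious_boxes(lines, min_size=3):
    """Return (line_number, symbol, width, height) for boxes that are
    narrower or shorter than min_size pixels (an arbitrary threshold)."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        symbol, left, bottom, right, top = parse_box_line(line)
        width, height = right - left, top - bottom
        if width < min_size or height < min_size:
            flagged.append((number, symbol, width, height))
    return flagged


# Hypothetical usage with a file named sample.box:
#     with open("sample.box", encoding="utf-8") as f:
#         for entry in suspicious_boxes(f):
#             print(entry)
```

It only catches degenerate boxes, so you still have to look at the
rest by eye (or in a box editor) to see whether each box really
surrounds the right character.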
Since you are training on Chinese characters, it is no surprise if
the initial bounding box info obtained from tesseract contains some
errors, because I suppose tesseract currently does not recognise
Chinese characters well enough.
Maybe you will need to mark the bounding box of each character in
your sample image manually?