I do not use tesstrain.sh for training, but I assume it does the right thing, so a little overlap between boxes is unlikely to be a problem. In the many messages I've read on this mailing list, I've never seen it reported as an issue.
I use ocrd-train, which generates boxes for the whole line rather than for individual characters, and it works perfectly, at least for Latin characters. Also, the way an LSTM OCR engine works makes me think the boxes might get joined together into a single line anyway, but I'm just speculating here.
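To illustrate what I mean by boxes being joined into a line, here is a rough Python sketch. It assumes the usual box file layout of "glyph left bottom right top page" and the helper names (parse_box_line, line_box) are made up for this example. It is not what tesstrain.sh, ocrd-train or Tesseract do internally, only a way to see that overlap between neighbouring character boxes has no effect on the enclosing line box.

```python
from typing import List, Tuple

# (glyph, left, bottom, right, top); the page number is dropped
Box = Tuple[str, int, int, int, int]


def parse_box_line(line: str) -> Box:
    """Parse one box file line, assumed to be 'glyph left bottom right top page'."""
    # Split from the right so the glyph field is kept whole.
    glyph, left, bottom, right, top, _page = line.rsplit(" ", 5)
    return glyph, int(left), int(bottom), int(right), int(top)


def line_box(boxes: List[Box]) -> Box:
    """Join per-character boxes into one box that encloses the whole line."""
    text = "".join(b[0] for b in boxes)
    left = min(b[1] for b in boxes)
    bottom = min(b[2] for b in boxes)
    right = max(b[3] for b in boxes)
    top = max(b[4] for b in boxes)
    return text, left, bottom, right, top


# Hypothetical sample: the boxes for 'a' and 'b' overlap by 2 pixels.
sample = [
    "a 10 20 30 50 0",
    "b 28 20 48 55 0",
    "c 50 20 70 50 0",
]

merged = line_box([parse_box_line(s) for s in sample])
print(merged)
```

The merged box depends only on the outermost edges, so the small overlap between 'a' and 'b' disappears once the line is treated as a unit.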
Bye
Lorenzo