Importing a PDF into OpenSongApp with chord recognition


gerdjohannes

Sep 2, 2025, 4:15:57 AM
to OpenSongApp
Hello! Someone in our band is putting a lot of effort into creating lead sheet PDFs (see the attachment for an example). I know how to import a PDF into OpenSongApp, but how can I import one so that the chords are recognized as chords? Plain text recognition doesn't work well for this; it only recognizes about half of the chords.

50ways to leave your lover-1-1.pdf

gerdjohannes

Sep 2, 2025, 4:25:21 AM
to OpenSongApp
Here is an example of an attempt to import the file into OpenSongApp using text recognition.
ops beispiel weg.jpg

Gareth Evans

Sep 16, 2025, 5:32:34 PM
to gerdjohannes, OpenSongApp
Hi,

OCR extraction doesn't distinguish lyrics from chords, and it doesn't preserve spacing. It simply returns blocks of text, and the app then has to put the blocks back together. Currently the app joins these blocks with a single space between them, so the chords often end up bunched together. The automatic detection of chord lines is then done by the app, not the OCR. It does this by looking for lines that meet one of the following criteria:
  1. The line contains lots of short (1-3 character) blocks and not much else (the blocks should average 2.4 characters or less).
  2. After removing all of the white space (normal spaces), the remaining text should be 25% or less of the entire line length including spaces.
Passing either check identifies the line as likely to be a chord line (the app will then add . at the beginning of the line).
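The two checks can be sketched roughly as follows. This is only an illustration in Python, not the app's actual code; the function names and the keyword parameters `max_avg` and `max_fill` are made up for the example, and it treats every whitespace-separated word as one "block".

```python
def average_block_length(line: str) -> float:
    """Mean number of characters per whitespace-separated block."""
    blocks = line.split()
    if not blocks:
        return 0.0
    return sum(len(b) for b in blocks) / len(blocks)


def fill_ratio(line: str) -> float:
    """Share of the line (including spaces) that is not a normal space."""
    if not line:
        return 0.0
    return len(line.replace(" ", "")) / len(line)


def looks_like_chord_line(line: str, max_avg: float = 2.4, max_fill: float = 0.25) -> bool:
    """Passing either check marks the line as a likely chord line."""
    if not line.strip():
        return False  # blank lines are never chord lines in this sketch
    return average_block_length(line) <= max_avg or fill_ratio(line) <= max_fill


def mark_chord_line(line: str) -> str:
    """Prefix likely chord lines with '.', leave other lines unchanged."""
    return "." + line if looks_like_chord_line(line) else line
```

The thresholds are keyword arguments so that different cut-offs can be tried against the same lines.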

Here are some examples from your song showing why the lines have or have not been identified as chord lines.

G Bb
The average block length is (1 + 2)/2 = 3/2 = 1.5 - passes check 1
The percentage of the line used (text / all text including spaces) is (3/4) * 100 = 75% - fails check 2
Passing one of these suggests that it is likely to be a chord line.

Em D6 Cmaj7 H7
The average block length is (2 + 2 + 5 + 2)/4 = 11/4 = 2.75 - fails check 1
The percentage of the line used (text / all text including spaces) is (11/14) * 100 = 79% - fails check 2
Failing both of these suggests that it is not a chord line, even though it actually is one.

It will never be perfect though, as the next line shows:

La la la lah
The average block length is (2 + 2 + 2 + 3)/4 = 9/4 = 2.25 - passes check 1
The percentage of the line used (text / all text including spaces) is (9/12) * 100 = 75% - fails check 2
This lyric line would wrongly be identified as a chord line.

I've had a go at improving the OCR feature by estimating how many spaces might fit between the text blocks and adding these back in. It isn't perfect when the font isn't monospaced, but it should help. I will also raise the threshold for check 1 to an average of 2.75 or less and raise the maximum percentage used in check 2 to 30%, to help with some other lines. I've got quite a lot of other code changes in progress, so I can't release this until I finish those too.
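To give an idea of what guessing the spaces could look like, here is a rough Python sketch. It is not the app's code. It assumes, hypothetically, that the OCR reports each block's left and right edge in pixels (the `Block` class and its fields are invented for the example) and it estimates one average character width for the whole line, which is exactly where non-monospaced fonts cause errors.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    left: float   # hypothetical: left edge of the block in pixels
    right: float  # hypothetical: right edge of the block in pixels


def rebuild_line(blocks: list[Block]) -> str:
    """Join OCR blocks, padding each gap with an estimated number of spaces."""
    blocks = sorted(blocks, key=lambda b: b.left)
    total_chars = sum(len(b.text) for b in blocks)
    total_width = sum(b.right - b.left for b in blocks)
    if total_chars == 0 or total_width <= 0:
        # No usable geometry: fall back to a single space between blocks.
        return " ".join(b.text for b in blocks)

    char_width = total_width / total_chars  # average width of one character
    line = ""
    prev_right = None
    for b in blocks:
        if prev_right is not None:
            gap = b.left - prev_right
            # Always keep at least one space so blocks never merge.
            line += " " * max(1, round(gap / char_width))
        line += b.text
        prev_right = b.right
    return line
```

For example, `rebuild_line([Block("G", 0, 10), Block("Bb", 110, 130)])` estimates a character width of 10 pixels and puts 10 spaces between the two chords. The text is then only 3 of 13 characters (about 23%), so a widely spaced chord line like this would pass the percentage check instead of failing it at 75%.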

Watch out for the next release (it will be the one that includes light/dark modes for the entire app, not just the song display).

Using your PDF with the new code would give this output (not perfect, but definitely better):
image.png

Best wishes,

Gareth
