PDF Convert not always importing text and lyrics

• Mar 15, 2021 - 08:27

Sometimes when I use the online tool to convert a PDF music score to MSCZ format, it includes all text (e.g. headings, chords, etc.) and lyrics. Other times it includes only the music, with no text or lyrics.

It has nothing to do with the quality of the PDF. I just did two that were virtually identical. Both were perfectly crisp and sharp. One copied over all of the text and lyrics, the other one didn't. What am I doing wrong?

I have attached them both below. "Isn't she lovely" converts all of the text and lyrics (almost) perfectly. "Lucille" does not get any of the text or lyrics copied over.

Attachment Size
Isnt She Lovely.pdf 61.08 KB
Lucille.pdf 539.07 KB

Comments

The AI tool that handles the conversion, called Audiveris, is like all AI technology: not perfect. It just does the best it can. You're probably not doing anything wrong, except expecting too much :-)

In reply to by Marc Sabatella

There is definitely something strange going on. I have processed about 10 files, some of them quite poor quality. One didn't work at all, but every other one captured at least some of the text, for example the song name if nothing else.

What I can add is that the one below that worked (Isn't She Lovely) is only 62 KB in size. The other one that didn't work (Lucille) is 540 KB in size. In both cases the text is actual text, not a jpg image, as the text in the PDF can be copied and pasted.

I tried splitting up the pages of Lucille and processing each page individually but that made no difference. Even the third page that just contains 2 lines did not work.

I also did a screen capture of page 3, saved it as a jpg file, imported it into Word, saved it as a PDF file, then tried it, but no luck. Yet some of the ones that did work were just straight jpg images saved as PDF, with no selectable text in them.

When a conversion works immediately, very quickly, then most likely you're getting the result of an old, previous, cached conversion of the same PDF, likely created with an older version of Audiveris in the backend.
