Why is Import confused, and how can I fix the file?

• Dec 4, 2021 - 19:24

Hello everyone!

I tried to import a very clean PDF that starts like this:
original.png
And the result that came back looks like this:
imported_result.png
This happens a lot: the measures are not aligned properly. There are also odd symbols in the staff, like the little rectangle in the treble staff, and "hidden" rests inserted everywhere. And as you can see from my selection, the score sort of knows which notes belong together in a measure, but the barlines disagree. Really weird. And double weird because the original is crystal clear in these regards.

1) So I guess my first question is, why is it getting this wrong? Can I "white out" something in the original to help it be less confused?

2) But my more urgent question is, how can I salvage this import? I don't even know how to select some of these strange characters or how to fix this score in general. It's not just a matter of shifting everything to the left 1/4 note.

I'd really value your help here, since the alternative is doing this all by hand.

Thanks!

p.s. The "import" score is also fully attached.
2909532d859d2b034e7904395b7824e216cb20c3.mscz


Comments

Opening your attached score shows that it is seriously corrupted:

Bar 1, stave 1 incomplete. Expected: 4/4; Found: 2/1
Bar 1, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 2, stave 1 incomplete. Expected: 4/4; Found: 2/1
Bar 2, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 4, stave 1 incomplete. Expected: 4/4; Found: 2/1
Bar 4, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 5, stave 1 incomplete. Expected: 4/4; Found: 2/1
Bar 5, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 8, stave 1 incomplete. Expected: 4/4; Found: 7/4
Bar 8, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 9, stave 1 incomplete. Expected: 4/4; Found: 7/4
Bar 9, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 10, stave 1 incomplete. Expected: 4/4; Found: 7/4
Bar 10, stave 2 incomplete. Expected: 4/4; Found: 6/4
Bar 11, stave 1 incomplete. Expected: 4/4; Found: 31/16
Bar 11, stave 2 incomplete. Expected: 4/4; Found: 31/16
Bar 12, stave 1 incomplete. Expected: 4/4; Found: 24/16
Bar 12, stave 2 incomplete. Expected: 4/4; Found: 24/16
Bar 13, stave 1 incomplete. Expected: 4/4; Found: 3/2
Bar 13, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 14, stave 1 incomplete. Expected: 4/4; Found: 3/2
Bar 14, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 15, stave 1 incomplete. Expected: 4/4; Found: 3/2
Bar 15, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 16, stave 1 incomplete. Expected: 4/4; Found: 31/16
Bar 16, stave 1, voice 2 too long. Expected: 4/4; Found: 47/32
Bar 16, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 17, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 18, stave 1 incomplete. Expected: 4/4; Found: 7/4
Bar 18, stave 2 incomplete. Expected: 4/4; Found: 7/4
Bar 19, stave 1 incomplete. Expected: 4/4; Found: 6/4
Bar 19, stave 2 incomplete. Expected: 4/4; Found: 6/4
Bar 20, stave 1 incomplete. Expected: 4/4; Found: 31/16
Bar 20, stave 2 incomplete. Expected: 4/4; Found: 31/16
Bar 21, stave 1 incomplete. Expected: 4/4; Found: 31/16
Bar 21, stave 2 incomplete. Expected: 4/4; Found: 31/16
Bar 22, stave 1 incomplete. Expected: 4/4; Found: 30/16
Bar 22, stave 2 incomplete. Expected: 4/4; Found: 28/16
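A report like this is easier to scan once the durations are turned into numbers. What follows is only a minimal Python sketch, not anything MuseScore ships: the function name `parse_report` and the `excess` field are made up for illustration, and the pattern assumes nothing beyond the line format shown above.

```python
import re
from fractions import Fraction

# Matches lines such as:
#   "Bar 1, stave 1 incomplete. Expected: 4/4; Found: 2/1"
#   "Bar 16, stave 1, voice 2 too long. Expected: 4/4; Found: 47/32"
LINE_RE = re.compile(
    r"Bar (?P<bar>\d+), stave (?P<stave>\d+)(?:, voice (?P<voice>\d+))? "
    r"(?P<kind>incomplete|too long)\. "
    r"Expected: (?P<exp>\d+/\d+); Found: (?P<found>\d+/\d+)"
)


def parse_report(text):
    """Return one dict per recognised line; unrecognised lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        expected = Fraction(m["exp"])
        found = Fraction(m["found"])
        rows.append({
            "bar": int(m["bar"]),
            "stave": int(m["stave"]),
            "voice": int(m["voice"]) if m["voice"] else None,
            "kind": m["kind"],
            "expected": expected,
            "found": found,
            # Surplus duration, measured in whole notes (hypothetical field).
            "excess": found - expected,
        })
    return rows


sample = "Bar 16, stave 1, voice 2 too long. Expected: 4/4; Found: 47/32"
for row in parse_report(sample):
    print(row["bar"], row["stave"], row["voice"], row["excess"])  # 16 1 2 15/32
```

Run over the list above, every entry comes out with a positive excess (2/1 against 4/4 is one whole note too many, for instance), so the bars hold more than 4/4 even where the message says "incomplete".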

It would be interesting to see the original PDF score which resulted in such a mixed-up import.

In reply to DanielR

Yes, tons of error messages.

"It would be interesting to see the original PDF score which resulted in such a mixed-up import."

Well, I included a snapshot including the very first measure in my OP. I can't include the whole PDF, since I purchased it online.

1) Why? Because OMR is still mostly experimental, especially the service offered on musescore.com, which I believe even lags a release behind.

2) In a case like this, my advice would be to just enter the score yourself. The proficiency needed to "fix" that is higher than the skill required to enter it.

I have an update.

I was wondering if all this content above the staff might be confusing the OMR:
super-staff content.png
So I used a PDF editor to blank it out. Then I printed the PDF and scanned it back in as a new PDF, and submitted the result for OMR processing. The result was almost completely accurate.

So I think I have identified why the engine is doing poorly here, in answer to my original question.

Note that trying to OMR the edited PDF without going through the print+scan flow did NOT result in improved results. The editing tool I'm using, Preview (macOS), only places white rectangles on top of the content. Audiveris must be looking at the postscript content underneath, which is removed by print+scan.
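For anyone who wants to try the same thing without a printer and scanner, rendering each page to a bitmap and wrapping the bitmaps in a new PDF should discard whatever sits under the white rectangles in much the same way. This is a hedged sketch only: it assumes the Poppler tool `pdftoppm` and the `img2pdf` command-line tool are installed, the helper names are made up, and whether the result behaves like a real print+scan in the OMR service is untested here.

```python
import subprocess
from pathlib import Path


def rasterize_cmd(pdf_path, out_prefix, dpi=300):
    """Build a pdftoppm command that renders every page to a PNG bitmap."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(out_prefix)]


def rebuild_cmd(png_paths, out_pdf):
    """Build an img2pdf command that wraps the bitmaps in a new PDF."""
    return ["img2pdf", *map(str, png_paths), "-o", str(out_pdf)]


def flatten_pdf(pdf_path, out_pdf, workdir="pages", dpi=300):
    """Rasterize pdf_path and rebuild it as an image-only PDF (hypothetical helper)."""
    work = Path(workdir)
    work.mkdir(exist_ok=True)
    subprocess.run(rasterize_cmd(pdf_path, work / "page", dpi), check=True)
    # pdftoppm zero-pads page numbers, so a plain sort keeps page order.
    pngs = sorted(work.glob("page-*.png"))
    subprocess.run(rebuild_cmd(pngs, out_pdf), check=True)


# Example call (file names are placeholders):
# flatten_pdf("score_masked.pdf", "score_flat.pdf")
```

The resolution of 300 dpi is a guess at what a typical scan would produce, not a value known to suit Audiveris.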

In reply to reggoboy

Just for fun, I converted your PNG above to a PDF with CutePDF Writer. Then I ran it through the MuseScore PDF import. The result did not retain any of the text, nor did it understand the two-measure repeat sign. But all the right notes were in the right places.

In reply to bobjp

Interesting. I guess that matches my hypothesis. By the time I had created the PNG, all the postscript was stripped out, even after you converted back to PDF. So then the import process didn't get confused. Hopefully they can fix this.

In reply to reggoboy

Okay, so a bit more of an update. My comment above just involved OMR on the first PDF page, since I didn't want to take the effort to blank out everything until I was convinced it would help. But when the above example worked, I blanked out the rest of the "extra" text above the staves, and then the OMR returned a bunch of errors again, including messing up even the first few measures again! So somehow, content on page 2 or later messed up measures at the beginning of the song.

So then I just submitted page 2 separately, and that appeared to work okay.
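Pulling single pages out for separate submission can be scripted too. A small sketch, assuming the Poppler tool `pdfseparate` is installed; the helper names and file names are placeholders of mine.

```python
import subprocess


def single_page_cmd(pdf_path, page, out_pattern="page-%d.pdf"):
    """Build a pdfseparate command that extracts one page (1-based)."""
    # -f and -l set the first and last page; %d becomes the page number.
    return ["pdfseparate", "-f", str(page), "-l", str(page),
            str(pdf_path), out_pattern]


def extract_page(pdf_path, page):
    """Write the requested page to its own PDF (hypothetical helper)."""
    subprocess.run(single_page_cmd(pdf_path, page), check=True)


# Example call: extract_page("score_flat.pdf", 2) would write page-2.pdf
```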

Very odd.

Anyway, I emailed the original unedited/unmasked PDF to my friend with Sibelius, and his import routine handled it swimmingly. It ignored almost all text/chords/lyrics, but got almost all the chords right. The only place it did worse was a short 32nd-note run, which it got wrong. I wonder if Sibelius has their own algorithm...

In reply to reggoboy

So I've done a bit more with this. I only have your photo to work with.
This time I used the Microsoft PDF printer as I'm on Windows.

MuseScore PDF import:
MS1.png
I have Sibelius 7.5.1. It uses PhotoScore to create files Sibelius can read.
Native PhotoScore file in Sibelius:
opt1.png
PhotoScore will also create an .mxl (compressed MusicXML) file:
sibxml1.png

All of them ignore any text and the two-measure repeats. But all the notes seem to be correct. I notice only MuseScore retained dynamics.

In reply to bobjp

Converting from a graphic format to a music format is always problematic, and one cannot expect every detail to be recognized. Even with commercial programs, manual corrections are always necessary.
The conversion on musescore.com is also based on the free Audiveris, but on an older version.
