Musical content in PDF's?

• Jan 23, 2019 - 16:54

I have heard claims that there is structured musical information (beyond images) in PDF's generated by MuseScore (and other music editors). Is there truth to this? Is this documented? With what does one access it, if so? Or is this a yeti?


Comments

Here's a quote from http://www.myriad-online.com/en/products/pdftomusic.htm.
"Because it only processes PDF files that have been exported from a score editor software, PDFtoMusic offers a unique reliability and outstanding results.
Therefore, scanned sheet music cannot be managed by PDFtoMusic."

On the other hand, Audiveris (https://bacchushlg.gitbooks.io/audiveris-5-1/content/quick/load.html) can transcribe images:
"Audiveris accepts a variety of image formats as input, notably PDF, TIFF, JPG, PNG, BMP."

Regarding the difference in pdf music scores:
A pdf can be created from an image (e.g. scanned); or it can be exported from a scorewriter software.
The export from a scorewriter software maintains discrete elements, as opposed to a "picture image".

Consider the 'litmus test' comparison below:
[Attached image: Image_vs_score_elements.jpg]
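
One crude way to automate this kind of litmus test is to look for font and image markers in the raw bytes of the file. The sketch below is only a heuristic under stated assumptions: the function name `classify_pdf` is made up for illustration, it only sees uncompressed object dictionaries (PDF 1.5 and later may pack them into compressed object streams, where this finds nothing), and a scan that carries an OCR text layer would also contain fonts.

```python
import re

def classify_pdf(raw: bytes) -> str:
    """Rough guess at how a PDF was made: 'vector', 'scanned', or 'unknown'.

    Hypothetical helper, not part of any library. It searches the raw file
    bytes for font and image dictionaries and nothing more.
    """
    has_font = re.search(rb"/Type\s*/Font\b", raw) is not None
    has_image = re.search(rb"/Subtype\s*/Image\b", raw) is not None
    if has_font:
        # Scorewriters draw noteheads, clefs, etc. as glyphs from a music font.
        return "vector"
    if has_image:
        # Only raster images found: likely a scan, so a candidate for OMR.
        return "scanned"
    return "unknown"
```

To try it on a file, something like `classify_pdf(open("score.pdf", "rb").read())` would do, where "score.pdf" is a placeholder name.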

Regards.

In reply to by Jm6stringer

I looked at a PDF produced by MuseScore with a tool that re-ASCII-fies the PostScript code, and all I saw was PostScript code drawing font elements and lines -- I saw nothing (especially names) that looked like musical structure. Surely, one can learn what the PostScript output from MS via Qt looks like, and "disassemble" it, but that would have no relevance to the output of any other score writer, nor help anyone else who was attempting to reconstruct musical information from the PDF...
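
To make concrete what "font elements and lines" means here: a PDF page description is a stream of PostScript-like drawing operators (for example `Tf` selects a font, `Tj` shows glyphs, `m` and `l` start and extend a path, `S` strokes it). The following is a minimal sketch, assuming the content stream has already been extracted and decompressed, and using a deliberately simplified tokenizer with a made-up name (`operator_histogram`). It just tallies which operators occur; nothing in that vocabulary says "note" or "measure". The sample stream is hand-written for illustration, not actual MuseScore output.

```python
import re
from collections import Counter

# Simplified token pattern: literal strings, hex strings, array brackets,
# names (/F1), and everything else (numbers and operators).
TOKEN = re.compile(
    rb"\((?:\\.|[^\\()])*\)"   # (literal string), nested parentheses not handled
    rb"|<[0-9A-Fa-f\s]*>"      # <hex string>
    rb"|[\[\]]"                # array delimiters
    rb"|/[^\s/\[\]()<>]*"      # /Name
    rb"|[^\s/\[\]()<>]+"       # number or operator
)
NUMBER = re.compile(rb"[+-]?(?:\d+\.?\d*|\.\d+)")

def operator_histogram(stream: bytes) -> Counter:
    """Count drawing operators in a decoded PDF content stream (sketch only)."""
    ops = Counter()
    for tok in TOKEN.findall(stream):
        if tok[:1] in (b"(", b"<", b"[", b"]", b"/"):
            continue  # operand: string, array bracket, or name
        if NUMBER.fullmatch(tok):
            continue  # operand: number
        ops[tok.decode("ascii", "replace")] += 1
    return ops

# Hand-written sample: select a font, show one glyph, then draw one line.
sample = b"BT /F1 20 Tf 100 700 Td (q) Tj ET 50 600 m 550 600 l S"
hist = operator_histogram(sample)
```

Here `hist` holds only text and path operators such as `Tj` and `l`; any musical meaning would have to be inferred from which glyphs land where.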

In reply to by [DELETED] 1831606

"that would have no relevance to the output of any other score writer..."
You are correct. There is no "musical" syntax in a pdf.

"...nor help anyone else who was attempting to reconstruct musical information from the PDF..."
Well...
PDFs produced from scorewriters contain discrete image/elements.
Software like PDFtoMusic can transcribe those discrete image elements into notation.
PDFtoMusic Pro (http://www.myriad-online.com/en/products/pdftomusicpro.htm) "rebuilds the original score, and exports it for instance into MusicXML format, useable in most of the professional score editors."

The referenced 'litmus test' I use only to decide whether a pdf is a candidate for PDFtoMusic or Audiveris.
If one looks at the scanned pdf image shown above, it's obvious to a human that the page was slanted when placed on the scanner bed. All the horizontal/vertical elements (lines especially) are skewed. For a machine, establishing a horizontal and vertical frame of reference is needed for the score to be 'recognized' by an OMR like Audiveris - a job more daunting.
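
The frame-of-reference step can be illustrated with a small sketch. Assuming some earlier stage has already found the endpoints of a few staff lines in pixel coordinates (the hard part, and not shown), the page skew is roughly the median of their angles. The function name and the sample coordinates are invented for illustration; this is not a description of how Audiveris actually does it.

```python
import math
from statistics import median

def estimate_skew_degrees(lines):
    """Estimate page skew from staff-line endpoints.

    lines: iterable of ((x1, y1), (x2, y2)) in pixel coordinates.
    Returns the median line angle in degrees (sign follows the image's y axis).
    """
    angles = [
        math.degrees(math.atan2(y2 - y1, x2 - x1))
        for (x1, y1), (x2, y2) in lines
    ]
    return median(angles)

# Three made-up staff lines, each drifting about 20 px in y over 1000 px in x.
staff_lines = [
    ((0, 100), (1000, 120)),
    ((0, 110), (1000, 130)),
    ((0, 120), (1000, 141)),
]
skew = estimate_skew_degrees(staff_lines)
```

Rotating the image by the negative of `skew` would then bring the staff lines back to horizontal before any symbol recognition is attempted.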

In reply to by Jm6stringer

It just occurred to me that these tools almost certainly do not reverse-engineer the PostScript, but rather, run it in their own PostScript interpreter, and observe carefully what comes out, a process midway between optical recognition and "knowing". It is "musical information", but not "musical syntax". I leave you to ponder to what degree a sound recording (or live concert) or photograph (or live viewing) of a score "contains musical information".

In reply to by [DELETED] 1831606

"I leave you to ponder to what degree a sound recording (or live concert) or photograph (or live viewing) of a score 'contains musical information'."
(Perhaps that was a rhetorical statement?)

Anyway, my 2¢...
Though a sound recording, to the human ear, does convey musical information: e.g. structure, harmony, rhythm, orchestration, tempo, dynamics....
the Jojo-Schmitz metaphor (regarding transcriptions of sound files into notation) states:
"An egg cannot be unscrambled."

So, machines (currently) cannot transcribe sound recordings into notation as adroitly as the human brain-ear combination.

Regarding a score photograph...
A printed score displays musical syntax figures (glyphs, numbers, lyrics, symbols) in addition to conveying much of the other musical information like rhythm, harmony, dynamics, etc. (which truly comes into existence during actual performance of the score).

In reply to by Jm6stringer

Yes, it was a rhetorical statement, like Dylan's early "Are birds free from the chains of the sky-way?" I have heard rumors of programs that can (somewhat) produce score from sound -- I don't know of their truth or their efficacy (is that zeugma or antanaclasis?). Mozart actually had a device that could hear Allegri's 9-part Miserere and write it back down later, but he was born with it and they're very hard to acquire, as you say...

In reply to by [DELETED] 1831606

A MuseScore-generated PDF contains as much musical information and syntax as the .mscx source file (assuming that the generated PDF is accurate). It just needs the right decoding equipment to understand that information. Fortunately our brains do a good job of decoding a displayed or printed PDF so we can extract all the information in real time and play the music. Computer systems seem to have some way to go yet but will probably get there eventually.

In reply to by yonah_ag

By that standard, so does a live performance (MORE so). The first sentence is provably false. Things in the .mscx that are not in the pdf or printed page: hidden elements, including those that have an effect on performance, and all piano-roll editor and similar articulations, as well as the execution of ornaments. These are the easy proof. By your standard, a .png contains "musical information and syntax", too, and begs the question of what each of us means by these things.

In reply to by [DELETED] 1831606

Information is a non-material property that requires a system for transmission and an agreed set of rules for its interpretation (e.g. the letters of an alphabet; words made from these letters to form a vocabulary; grammar rules for constructing sentences, etc.), but it is quite hard to pin down with a simple definition.

A live performance could well contain even more music information, but this information is transferred in a different format, i.e. sound waves.

A .PNG file can contain just as much musical information and syntax as a .PDF or a .mscx file. The information may, once again, be in a different format but, provided it's not a lossy representation (i.e. it's "accurate" as I said in my post), then there's no difference in the amount of musical info.

You have proved nothing to the contrary with your example but have only asserted that MuseScore's PDF conversion is not 100% accurate, i.e. it is a lossy process (like a JPEG image). The PDF certainly shows articulations so I'm not sure which details are actually lost. I don't think that a .mscx file contains a piano-roll editor but, rather, that the MuseScore software provides this editor and uses the .mscx data to present for editing in the user interface.

In reply to by yonah_ag

Please learn what the MuseScore piano-roll editor is. It is a user interface to allow you to control the on and off time of notes, categorically "articulation", "performance artifacts" which DO NOT APPEAR IN THE PRINTED SCORE, no matter what the visual medium (but do persist in the .mscx).

The argument about whether a live performance or a music book contains music is pointless. If they didn't, they couldn't be sold. The original question was from someone wanting to extract an editable (with a music editor) form from a PDF so that he could edit it; it is possible to edit a (what used to be) "tape" of a performance. But it is far easier to change notes with a music editor. Saying they have "the same information" or "as much information" in a Shannon sense is pointless, too. We don't have tools that can change the oboe Eb in m 3 to an F# in a sound recording, but with a music editor, it is easy. It is also possible to edit music in PNG/JPG form with Photoshop or GIMP, but MuseScore is orders of magnitude easier. I think it is folly to claim that they are equivalent.

In reply to by [DELETED] 1831606

I'm not talking in a Shannon sense, I am referring to apobetics.

Your points about the OP's question are correct and I made no comment on these, but, as often happens in a forum, the thread digressed into the interesting realm of information theory.

Of course it's easier to change notes in a music editor - did I suggest otherwise? No, I only commented on musical information. This by no means suggests they are equivalent in all respects. Your "folly" comment simply knocks down a straw man.

You still seem to be ignoring my caveat about accuracy and hence you subsequently knock down another straw man.

There is no inherent reason why a PDF score has to have less information than an electronic score. If the MuseScore programmers have made a design decision to leave out certain elements then they are indeed reducing the information in the PDF but this is a programming decision rather than a lack in PDF.

(By the way, the playback pitch of a sound recording of an oboe could be altered by speeding up or slowing down the playback, e.g. play a vinyl record at the wrong speed and you'll hear what I mean).
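
The arithmetic behind the vinyl example, as a small sketch (the function names are hypothetical): changing playback speed by a ratio r shifts every pitch by 12·log2(r) equal-tempered semitones and changes the tempo by the same ratio. It acts on the whole recording at once, not on a single instrument.

```python
import math

def semitone_shift(speed_ratio: float) -> float:
    """Pitch change, in equal-tempered semitones, for a playback speed ratio."""
    return 12 * math.log2(speed_ratio)

def speed_ratio_for(semitones: float) -> float:
    """Playback speed ratio needed to shift pitch by the given semitones."""
    return 2 ** (semitones / 12)

# A 33 1/3 rpm record played at 45 rpm: a ratio of 1.35, a bit over 5 semitones up.
lp_at_45 = semitone_shift(45 / (100 / 3))

# Eb up to F# is 3 semitones, which needs roughly a 19% speed increase.
eb_to_fsharp = speed_ratio_for(3)
```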

In reply to by yonah_ag

Really, you can change the pitch of one instrument in a recorded complex orchestral texture? Tell me about it.... I was not born yesterday.

No, exact timings of when notes start and stop (short of their notated time value) do not appear in ANY music notation, or image of a score. It is not an "MS design decision".

Yes, the accuracy of a printed professional edition is 100%. It is best not edited at all, as it is probably under copyright. The structure of information and ease of navigating, traversing, and editing it is not a function of its "accuracy".

In reply to by [DELETED] 1831606

Oh no, more straw men go tumbling!

Your first post on oboe pitches made no mention of an orchestra.

Given the tempo of a piece in bpm and the note values, there is good timing information.
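
As a rough sketch of that arithmetic, assuming the tempo is given in quarter-note beats per minute and note values are expressed in quarter notes (the function name is hypothetical). The result is the nominal, notated timing only, not the exact on and off times that a performer or a piano-roll edit would produce.

```python
def nominal_onsets(bpm, note_values):
    """Return (onset_seconds, duration_seconds) for consecutive notes.

    note_values are in quarter notes: 1 = quarter, 0.5 = eighth, 2 = half.
    Assumes a constant tempo and a single voice with no rests or ties.
    """
    seconds_per_quarter = 60.0 / bpm
    t = 0.0
    out = []
    for value in note_values:
        dur = value * seconds_per_quarter
        out.append((t, dur))
        t += dur
    return out

# Quarter, two eighths, then a half note at 120 bpm.
timings = nominal_onsets(120, [1, 0.5, 0.5, 2])
```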

Professional printouts and copyright have nothing to do with my points. My term "accuracy" refers to how well and completely the MuseScore software renders the .mscx file on conversion to PDF (this is where the design and programming decisions apply), so your comment about "... ease of navigating ..." is specious.
