MusicXML import: Some accidentials not imported

• Nov 25, 2013 - 15:59
Type
Functional
Severity
S4 - Minor
Status
closed
Project
Tags

Only accidentals within a xml file, that have one of the attributes parenthesis, editorial, or cautionary, are really imported. All others are not imported from xml data but generated by MuseScore after the import process. Therefore all accidentals without attribute, that are "not necessary", vanish.

That is fatal when importing xml data generated by music OCR from copies of printed sheet music. All courtesy accientals are gone. Hours of work for corrections! By the way, the same happens with xml dta exported from MuseScore and imported again.

A very simple solution would be to change line 4937 of importxml.cpp from

if (editorial || cautionary || parentheses) {

to

if (accidental != Accidental::ACC_NONE) {

As far as I can see, this would not cause any other problems, but may be someone who knows the whole import process better, could give a comment to this.


Comments

I don't know the import process that well, but isn't courtesy the same as cautionary? In which case, shouldn't your scanning software have attached the cautionary attribute to those accidentals? Same with export from MuseScore, I guess. Then it would have imported properly (although I just tested it, and cautionary accidentals always get parentheses, which is not correct).

Not that I see any harm in preserving untagged accidentals even when unnecessary. But if MusicXML provides a cautionary tag, seems it should be used on export where appropriate, so the fact that it isn't is the real bug here (and your scanning software would seem to have a similar bug).

Smaller pure music OCR tools with no mighty music editing software in the background would not be able to give such a deep analysis of a score to manage this perfectly. They try to get, what they "see". Just users of those tools need a free software that is able to import their scans. And the decision whether an accidental is courtesy or superfluus finally is a matter of taste resp. the cultural background of the score.

I think, when importing xml we should be generous and follow the principle: get what we can get, as long as we get no trouble. Those who just import perfect music xml are not concerned, and others (like me) will be thankful.

Unfortunately, this is not as easy as it seems. The proposed change (corrected to also handle editorial accidentals) works OK for "unnecessary" accidentals, changes all imported accidentals from type ACC_AUTO into ACC_USER (which seems to have no side effects) but it also results in spurious accidentals. E.g. a sequence of four notes with step F alter 1 (F sharps) in one measure where only the first one has an explicit sharp accidental, now results in two F's with a sharp (the first one ACC_USER, the second ACC_AUTO).

This is clearly not acceptable and requires further investigation.

After looking closer to the problem I think, we cannot put all accidentals to ACC_USER, because the ACC_USER attribute is used to prevent accidentals from censorship by the function "updateAccidental()" and should only be used, when really needed. The problem with the two F's in Leon Finkens's example results from that, and even if this special problem is an error - this is the wrong way to solve the problem.

I think, it is necessary to definitely verify at the above given part of code, whether an accidental coming from XML is "nessessary" (as usual) or is "not necessary", which is possible with a few lines of code, and only in the last case add the accidental here like the cautionary accidentals and mark it as USER_ACC to prevent it from vanishing. Then all other accidentals are not concerned and side effects are avoided.

I will send a pull request these days.

First impression, based only on a quick look at the diffs, without actually testing anything, is that it is probably not correct. The reason is that the PR handles key signature changes and accidentals in MusicXML file order. Unfortunately, because MusicXML supports multiple voices and backups and forwards in time, these are not necessarily correctly ordered to determine required and extra accidentals. In order to do this correctly, you need to read a whole measure. Only then do you have sufficient context to determine the accidentals needed.

Furthermore, if you want to determine which accidentals would automatically be added by MuseScore versus which are "extra", you either need to copy the MuseScore accidental algorithm into the MusicXML importer (not good for several reasons), or have the MusicXML importer call this algorithm (a better solution, but not implemented).

You are right. If a second voice introduces an accidental on a certain line into the measure, then the first voice will not know about and may make a wrong decision. But those accidentals have an affect for one measure. So you can easily fix the problem by setting the USER_ACC flag for "extra" accidentals not "notewise", but at the end of each mesure. Give me some days to implement that.

I see no other solution. Calling the MuseScore accidental algorithm from the XML importer, I think, will fail, because it needs knowledge of the whole score that is not available during import.

On the other hand, it is a real important quality of a music notation program to be able to import usual notations "as they are".

FWIW, I ran into this problem with the following test file from Daniel Spreadbury of Steinberg, who was using it to compare accidental layout in various different programs. The file was generated by music21, a Python-based package from MIT. It contains lots of unnecessary natural signs which are ignored. It would be really nice to fix this.

In principle, setting all accidentals to ACC_USER seems fine to me. I don't care if that "locks" them in place for further editing; I can always remove them. However, the side effect with spurious accidentals Leon mentions in #5 obviously needs to be addressed.

Attachment Size
steinberg-randomAccs2-excerpt.xml 73.85 KB

Looks almost OK to me. Based only on inspecting the diffs in the files section in the pull request (meaning I did not actually test the changed code), these are my comments:

First of all, the strategy of handling the accidentals at measure end is the correct one. I expect the algorithm to work OK.

Detailed code review comment:
Replace magic numbers at lines 1988 and 1991 by constants or explain them in comments
Replace int pointer parameter in xmlNote by int reference

Finally, please add one or more test cases and add comments describing the behavior.