[MusicXML export] Lyrics do not handle flat and sharp signs

• Aug 10, 2012 - 22:55
S4 - Minor
needs info

Found this in 1.2, verified it with my own build of the sources from a few days back; I have no reason to expect it's been fixed since then.

Steps to reproduce:

1. Open attached score (created in 1.2)
2. Note the use of flat and sharp signs in the lyrics - these were added through the text symbol palette
3. Export to MusicXML
4. Open the resulting MusicXML file

Expected result: lyrics appear as before, with flat and sharp signs
Actual result: garbage characters are present in the MusicXML file

I am not sure if there is any sort of standard escape sequence that could be used here - that is, I don't fully expect other programs would recognize the symbols. But I'd hope at least MuseScore could recognize its own symbols.

Attachment Size
6.mscz 2.61 KB


There is no easy answer for this one, at least none I am aware of. A few observations / remarks:

The issue does not reproduce on my Linux system. The attached snapshot shows what I get after importing the MusicXML file: flat and sharp are present, but the font has been changed to the default. This is to be expected, as the MusicXML export does not export font information for lyrics.

MuseScore uses Unicode characters E10D and E10C for the flat and sharp symbols in the lyrics. These are correctly exported to MusicXML and imported into MuseScore again. E10D and E10C are in the Unicode "private use area", so using these codes may not be portable.

The most likely explanation seems to be font handling.

Attachment Size
snapshot1.png 37.22 KB
Title: MusicXML export of lyrics does not handle flat and sharp signs » [MusicXML export] Lyrics do not handle flat and sharp signs

I can reproduce - see attached PDF for result.

Using MuseScore 2.0 Nightly Build (c65f00f) - Mac 10.7.4.

Attachment Size
6.pdf 32.23 KB

Leon - are you saying that on Linux, your XML file at least has nice escape sequences for those Unicode characters? It doesn't on Windows. I get the raw character, which displays as garbage in a text editor. Can that much at least be fixed?

First question: yes and no.

Yes, the internal character code E10D is written to the UTF-8 encoded MusicXML file as the three-byte sequence EE 84 8D, which is correctly UTF-8 encoded. This is read back as E10D, which (on Linux) gets displayed as a flat symbol.

No, as far as I can tell, the actual bytes written to the MusicXML file are identical on Linux and Windows. I just tried an old MuseScore 1.2 prerelease version on XP, and the MusicXML file contains the exact same data.

Where my Linux and Windows versions differ is in the way the E10D character is displayed by MuseScore. On Linux it is displayed as a flat symbol; on Windows it shows a small, empty square. Apparently font handling differs on these two platforms.
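The byte sequence mentioned above can be checked independently of MuseScore. This is a minimal Python sketch, not MuseScore code; the variable names are made up for illustration. It only shows that UTF-8 encoding of U+E10D is deterministic, so the same bytes are expected on any platform.

```python
# Encode the private-use code point U+E10D as UTF-8 and decode it again.
# (Illustrative names only; this is not how MuseScore itself writes files.)
flat_private = "\ue10d"

encoded = flat_private.encode("utf-8")
print(encoded.hex(" ").upper())   # EE 84 8D

decoded = encoded.decode("utf-8")
print(f"U+{ord(decoded):04X}")    # U+E10D
```

Whether the decoded character is then drawn as a flat sign or as an empty square depends on the font used to display it, not on the file contents.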

Finally, I suspect that writing "private use" characters is incorrect anyway. Unicode supports the flat and sharp symbol as 266D and 266F. I may need to translate from MuseScore internal codes into standard Unicode on MusicXML export.
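A translation step of this kind could look roughly like the Python sketch below. It is an illustration only, not the actual exporter: the table and function names are hypothetical, and the pairing E10D = flat, E10C = sharp is assumed from the comments in this thread.

```python
# Hypothetical table: MuseScore-internal private-use code points mapped to
# standard Unicode. The E10D/E10C pairing is an assumption based on this thread.
PRIVATE_TO_STANDARD = {
    0xE10D: 0x266D,  # flat  -> U+266D MUSIC FLAT SIGN
    0xE10C: 0x266F,  # sharp -> U+266F MUSIC SHARP SIGN
}


def to_standard_unicode(text: str) -> str:
    """Replace private-use accidentals with their standard code points."""
    return text.translate(PRIVATE_TO_STANDARD)


lyric = "B\ue10d and F\ue10c"
print(to_standard_unicode(lyric))
```

On import, the inverse table would be needed only if the internal fonts still expect the private-use codes.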

This is partly solved in MuseScore 2.0. The F2 palette no longer uses the MScore1 font but FreeSerifMscore. This font follows Unicode, so the code points written to the file will be OK.

Nevertheless, one can still create a text element using several fonts, and these fonts might encode symbols differently. The best way to support it would be to export and import font information in MusicXML.

I'm a bit confused.

I had assumed that MusicXML files should use ASCII escape sequences for non-standardized multibyte Unicode characters. That is, there should be actual strings like "" (eight single-byte ASCII characters: &, #, x, e, 1, 0, d, and ;) or something like that. Otherwise, when you open the file in Wordpad or other simple text editors, you see garbage.

I see now upon reading more about MusicXML that this is not strictly required - only certain "control characters" need to be escaped in this fashion. So writing Unicode directly into the XML file is OK, even if it likely won't display well in a text editor. Seems an odd choice, but OK.
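To illustrate that both forms carry the same content, here is a small Python sketch using the standard library XML parser. The element name `text` is just a stand-in, not taken from an actual MuseScore export. A numeric character reference and the raw UTF-8 bytes parse to the same string, and an ASCII-only serialization can be produced when a file has to be readable in a simple editor.

```python
import xml.etree.ElementTree as ET

# The same lyric text, once with raw UTF-8 bytes and once with an
# ASCII numeric character reference for U+E10D.
raw_xml = "<text>B\ue10d</text>".encode("utf-8")
escaped_xml = b"<text>B&#xe10d;</text>"

raw_text = ET.fromstring(raw_xml).text
escaped_text = ET.fromstring(escaped_xml).text
print(raw_text == escaped_text)  # True

# Serializing as ASCII makes the writer emit character references
# instead of raw multibyte characters.
ascii_xml = ET.tostring(ET.fromstring(raw_xml), encoding="us-ascii")
print(ascii_xml)
```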

So if we are changing the Unicode codepoints for flat and sharp in our fonts, I guess I need to do this for MuseJazz too, yes? And then also update the chord style files that use this font? Not sure if I'd need to keep the old codepoints to avoid breaking older files that have these symbols embedded in text.

Also, are there any other codepoints that were moved? What about codas, segnos, etc?

MusicXML's default encoding is UTF-8, not ASCII. Regarding your other questions, can you open another bug or a discussion on the forum? Let's focus on MusicXML here.

No problem.

Hey, I just noticed that my attempt at writing out the Unicode escape sequence I was expecting to see was interpreted by the site software, yielding a square because the default font has nothing at that position :-)

Just checked (fairly recent trunk on Linux): musical symbols are written to and read from both MuseScore and MusicXML correctly. Both file formats use UTF-8 encoding.

When you use another application to look at these files, you may either see the correct symbol or a substitution such as a square. This depends on the fonts installed and the way the application handles Unicode. On my Linux system the default editor shows the flat but does not show the sharp up, while MuseScore displays both.

U+266D "music flat sign" is encoded as bytes E2 99 AD
U+1D130 "musical symbol sharp up" is encoded as bytes F0 9D 84 B0
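The two encodings listed above, and the difference from the old private-use code, can be inspected with Python's `unicodedata` module. This is a sketch for checking the values only; the helper name `describe` is made up for the example.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    """Return (Unicode name, general category, UTF-8 bytes as hex) for one character."""
    name = unicodedata.name(ch, "<no name>")
    category = unicodedata.category(ch)
    utf8_hex = ch.encode("utf-8").hex(" ").upper()
    return name, category, utf8_hex


# Standard flat, standard "sharp up", and the old private-use code point.
for ch in ("\u266d", "\U0001d130", "\ue10d"):
    name, category, utf8_hex = describe(ch)
    print(f"U+{ord(ch):04X}  {category}  {name}: {utf8_hex}")
```

The private-use code point has no name and reports category `Co`, which is why another application cannot know what symbol it is meant to be without the matching font.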

MuseJazz has been updated to use Unicode code points for its (small set of) musical symbols. I will want to update the chord descriptor files, but first I need to understand a bit more about what has changed for 2.0 in this area.