In Windows Explorer, files with special characters in filepath don't open when double-clicked

• Apr 19, 2019 - 12:57
Reported version: 3.0
Priority: P2 - Medium
Type: Functional
Frequency: Many
Severity: S3 - Major
Reproducibility: Always
Status: closed
Regression: Yes
Workaround: Yes

1) On Windows, open the native file manager (Windows Explorer).
2) Create a directory whose name contains at least one special character, e.g. "♫ Demo".
3) Save or copy a .mscz file into this new directory.
4) Open this file from Windows Explorer (double-click it, or select it and press [Enter]).
5) MuseScore shows an error message and the file doesn't open:
Cannot read file E:\? Demo\Test.mscz:
The filename, directory name, or volume label syntax is incorrect.

Note 1: the same issue happens if the filename itself contains special/exotic characters.
Note 2: saving a file whose name contains special characters, or saving into a directory whose name contains them, doesn't cause any issues.
Note 3: this issue isn't reproducible with other file types, such as .txt files, or with other apps such as Notepad, Notepad++, etc. And .mscz files can still be opened by any text editor without issues.

Workaround: drag and drop the file onto the MuseScore window; it then opens correctly.

Attachment Size
Test.zip 3.66 KB
screencast.gif 295.25 KB

Comments

This problem is actually quite common, and I have come across it in many different applications. Some of them have changed their code to handle it by adding Unicode support. IrfanView is one example.

Maybe someone could ask the author of IrfanView what changes he made to the source code in order to support non-ANSI characters in file paths.

This seems to be purely a MuseScore 3 issue: MuseScore 2 (I checked with 2.3.2) has no such problem. So it doesn't seem to be a plain Windows issue either.
Another working example is 7-Zip.

Regression: No → Yes

Might be an issue with Qt (5.4 for MuseScore 2, 5.9/5.12 for MuseScore 3) on Windows?
Or something about the installer, registering file types differently?

Hah! It is an MSVC issue, a MinGW build works!

Is that related to the compiler switches /execution-charset:utf-8 /source-charset:utf-8 (which could get shortened to just /utf-8 BTW)?
I found https://stackoverflow.com/questions/30829364/open-utf8-encoded-filename… which claims: On Windows, you MUST use 8bit ANSI (and it must match the user's locale) or UTF16 for filenames, there is no other option available

What is actually the difference between opening a score via File > Open and opening one that is passed as an argument to main()?
Maybe we just need to convert the command-line filenames into the same format that File > Open uses?

Title: "From the Windows Explorer: Double-click on files with special characters in filepath don't open (Windows)" → "In Windows Explorer, files with special characters in filepath don't open when double-clicked"

If this is a UTF-8/16 issue, could something like the patch attached below perhaps help? This is just a random guess, but it would be great if someone on Windows could check it.

Attachment Size
287955.patch 1.02 KB

I'll check.
Edit: there is no QString::toUtf16(), and the conversion from QByteArray to const short* in that QString::fromUtf16() call is ambiguous.

I just spent a few hours on this, and I now have a working fix for this bug. (Actually, the bug — or if you prefer, design flaw — is in Qt itself, and what I've come up with is a workaround at the application level. I can share the gory details if anyone is interested.)

My fix requires platform-specific code in MuseScoreApplication::parseCommandLineArguments() (inside an #ifdef Q_OS_WIN block). Is this acceptable? If so, I'll create a PR.

The gory details:

The problem is that the command-line arguments are passed to Qt as char *. On the Mac and Linux platforms, that's fine, because those platforms use UTF-8 everywhere for char *. Under Windows (whose design predates UTF-8), char * means Windows 8-bit “ANSI” in whatever the local code page is set to. Because of this, any characters that don't fit in the local code page are clobbered by the CRT before they even get to main().
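To illustrate the clobbering described above, here is a small simulation in portable C++. It is only a sketch: `narrowToLatin1Lossy` is a hypothetical name, and it assumes a simplified Latin-1-like code page rather than calling the real CRT or the Windows conversion routines. Any UTF-16 code unit that has no slot in the 8-bit code page is replaced with `?`, which would explain the `?` in the path shown in the error message.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the lossy UTF-16 -> "ANSI" narrowing.
// Assumption: a Latin-1-like code page, where only U+0000..U+00FF fit.
std::string narrowToLatin1Lossy(const std::u16string& wide)
{
    std::string narrow;
    narrow.reserve(wide.size());
    for (char16_t unit : wide) {
        // Characters outside the code page cannot be represented and
        // degrade to '?', so the original character is lost for good.
        narrow.push_back(unit <= 0xFF ? static_cast<char>(unit) : '?');
    }
    assert(narrow.size() == wide.size());  // one byte per UTF-16 code unit
    return narrow;
}
```

For example, `narrowToLatin1Lossy(u"E:\\\u266B Demo\\Test.mscz")` yields `E:\? Demo\Test.mscz`. Since `?` is not a legal character in a Windows filename, trying to open that path fails with a syntax error rather than a plain "file not found".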

The de facto way to get around this problem under Windows is to define an alternative entry point, wmain(), that takes the command-line arguments as wchar_t * strings that are encoded as UTF-16. I made this change to MuseScore, which worked as far as the entry point, but of course wouldn't build because the QApplication constructor wants char *.

OK, fine. So I put in calls to WideCharToMultiByte() to convert the UTF-16 wchar_t * strings back into UTF-8 char * strings. The application now built and ran, but failed at runtime because Qt was still interpreting the char * strings as Windows ANSI. Garbage in, garbage out.
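The conversion step itself can be sketched without the Windows API. The following is a portable stand-in for the transformation that WideCharToMultiByte() performs when asked for UTF-8 output. `utf16ToUtf8` is a hypothetical name, and replacing unpaired surrogates with U+FFFD is an assumption of this sketch, not necessarily what the Windows call does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encodes a UTF-16 string as UTF-8.
// Assumption: unpaired surrogates are replaced with U+FFFD.
std::string utf16ToUtf8(const std::u16string& wide)
{
    std::string out;
    out.reserve(wide.size() * 3);
    for (std::size_t i = 0; i < wide.size(); ++i) {
        std::uint32_t cp = wide[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < wide.size() && wide[i + 1] >= 0xDC00 && wide[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (wide[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high surrogate
        }

        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    assert(out.size() <= wide.size() * 3);  // UTF-8 needs at most 3 bytes per UTF-16 unit
    return out;
}
```

The bytes that come out are valid UTF-8, but as noted above that alone does not help if the receiving code still decodes them as Windows ANSI.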

Seemingly at a dead end, I went to see where the command-line arguments were actually being accessed by the app. The answer was just one place: MuseScoreApplication::parseCommandLineArguments(). So I put a Windows-specific shim right inside that function that queries the OS for the command-line arguments instead of asking Qt for them (since Qt's copies are already corrupted at this point) and then converts them to a QStringList before sending them off to the parser.

With that change, I was finally able to open scores with Chinese filenames on the command line.
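A minimal sketch of such a shim follows. It is not the code from the actual fix: `commandLineArgsUtf8` is a hypothetical helper, it returns `std::string` values instead of a QStringList so that it builds without Qt, and on non-Windows platforms it simply copies `argv` on the assumption that it is already UTF-8. On Windows it asks the OS for the UTF-16 command line via GetCommandLineW() and CommandLineToArgvW() (which needs linking against Shell32) and converts each argument with WideCharToMultiByte().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h>  // CommandLineToArgvW
#endif

// Hypothetical helper (not MuseScore's actual code): returns the process
// arguments as UTF-8 strings, bypassing the lossy argv on Windows.
std::vector<std::string> commandLineArgsUtf8(int argc, char* argv[])
{
    assert(argc >= 0);
    std::vector<std::string> args;
#ifdef _WIN32
    (void)argv;  // already narrowed to the ANSI code page, so ignore it
    int wargc = 0;
    LPWSTR* wargv = CommandLineToArgvW(GetCommandLineW(), &wargc);
    if (wargv == nullptr)
        return args;
    for (int i = 0; i < wargc; ++i) {
        // First call: measure the UTF-8 length, including the terminating NUL.
        const int size = WideCharToMultiByte(CP_UTF8, 0, wargv[i], -1,
                                             nullptr, 0, nullptr, nullptr);
        std::string utf8;
        if (size > 1) {
            utf8.resize(static_cast<std::size_t>(size));
            WideCharToMultiByte(CP_UTF8, 0, wargv[i], -1,
                                &utf8[0], size, nullptr, nullptr);
            utf8.resize(static_cast<std::size_t>(size) - 1);  // drop the NUL
        }
        args.push_back(utf8);
    }
    LocalFree(wargv);
#else
    // Assumption: on macOS and Linux, argv is already UTF-8.
    for (int i = 0; i < argc; ++i)
        args.emplace_back(argv[i]);
#endif
    return args;
}
```

In a Qt application, each returned string could presumably be turned into a QString with QString::fromUtf8() before being handed to the parser. Alternatively, the wide strings could be converted directly with QString::fromWCharArray(), skipping the UTF-8 round trip.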

Upon further reflection, though, I'm not happy with this workaround. What I'd really like to do is revisit the first approach and see if there's any way I can coax Qt into interpreting the char * strings as UTF-8 under Windows without having to touch the Qt source code.

So I'll play around with it some more and hold off on the PR at the moment. Once I either figure it out or give up I'll go ahead and create the PR, with either the current workaround or the hypothetical better one.

But also note that this is neither a Windows nor a Qt issue but an MSVC one, as a MinGW build does not have that issue. So it has got to have something to do with how MSVC handles the command line.

The reason it works under MinGW is that MinGW is kind enough to transparently wrap the strings in a tidy UTF-16-to-UTF-8 conversion layer in order to make things more Unix-like, thereby making Qt happier by giving it Unicode strings in the only format that it correctly accepts.

MSVC also does a transparent conversion, but it's from UTF-16 to ANSI, which causes data loss. However, in MSVC's defense, this is by design and has been well-documented for decades. (Remember, this all predates the existence of UTF-8.) The designers of Qt, which came much later, should have taken this into account by being able to accept UTF-16 strings (or even UTF-8 strings) under Windows.

Whether you consider this situation Windows' fault, MSVC's fault, or Qt's fault depends on how you look at it. The way I see it, it's clearly Qt's fault.

Status: PR created → fixed

Fixed in branch master, commit fb63d3e128

_Fix #287955: In Windows Explorer, files with special characters in filepath don't open when double-clicked

Fixed a problem that caused MuseScore to fail to open files that contained non-ASCII Unicode characters under Windows._

Fixed in branch master, commit 89cb85e7f6

_Merge pull request #5694 from Spire42/287955-Windows-Unicode-command-line-arguments

Fix #287955: In Windows Explorer, files with special characters in filepath don't open when double-clicked_

Fix version: 3.5.0