GSoC 2018 Student Introduction for Project: Machine Learning Dataset for Optical Music Recognition

Posted 6 years ago

Greetings,
I am Animesh Tewari, a Computer Science undergraduate at Jaypee Institute of Information Technology, Noida.
I am about to begin this awesome journey of GSoC with MuseScore.

Project Goal:
The basic idea is to support the ongoing integration of Audiveris OMR with MuseScore by accumulating a large, correct dataset. The OMR engine needs to be trained on a large dataset before it can provide precise output, which will help the music community. To carry out this task, a new metadata XML format, similar to the format Audiveris understands, is to be implemented in the application so that scores can be saved as the desired XML. This project will create a huge dataset of annotated music symbols that will help Audiveris perform better, which in turn will bring a new feature to MuseScore. The same functionality should also help support other OMR projects.
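
To make the idea of an annotated-symbol XML file more concrete, here is a minimal Python sketch of what one exported page could look like. The element and attribute names (Annotations, Page, Symbol, Bounds, shape) and the function build_annotations are assumptions made for illustration only; they are not taken from the actual MuseScore export code or from a confirmed Audiveris schema. The shape values are SMuFL glyph names.

```python
import xml.etree.ElementTree as ET


def build_annotations(page_name, symbols):
    """Serialize a list of symbols into a hypothetical annotation XML string.

    Each symbol is a dict with a 'shape' (SMuFL glyph name) and
    'bounds' (x, y, width, height) in page pixels.
    """
    root = ET.Element("Annotations")
    page = ET.SubElement(root, "Page", {"name": page_name})
    for sym in symbols:
        node = ET.SubElement(page, "Symbol", {"shape": sym["shape"]})
        x, y, w, h = sym["bounds"]
        # The bounding box tells the OMR trainer where the symbol sits on the page.
        ET.SubElement(
            node, "Bounds", {"x": str(x), "y": str(y), "w": str(w), "h": str(h)}
        )
    return ET.tostring(root, encoding="unicode")


# Made-up example data, not taken from a real score.
example_symbols = [
    {"shape": "noteheadBlack", "bounds": (120.0, 48.5, 11.8, 10.0)},
    {"shape": "gClef", "bounds": (30.0, 20.0, 26.0, 70.0)},
]
xml_text = build_annotations("score-page-1", example_symbols)
print(xml_text)
```

Pairing each symbol name with its bounding box is what makes such a file usable as training data: the rendered score image provides the pixels, and the XML provides the labels.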

Work to be done:
1. Porting previously done work from imeta to the master branch.
2. Changing the note annotation implementation to cover grace notes (acciaccatura and appoggiatura).
3. Implementing annotations for tuplets, brackets, repeat dots, etc.
4. To be tested: whether staccatos are present in the output XML when scores containing staccatos are exported. If not, the articulation annotation code in edata.cpp will be modified accordingly. Use of SMuFL symbols in the sym file.
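
The staccato check in item 4 could be automated along these lines. This is only a sketch under assumptions: it supposes the exported file stores each symbol as a Symbol element with a shape attribute holding a SMuFL glyph name, which is a hypothetical layout and not confirmed from the export code. The function count_staccatos and the sample strings are likewise made up for illustration.

```python
import xml.etree.ElementTree as ET

# SMuFL glyph names for staccato dots placed above or below a note.
STACCATO_SHAPES = {"articStaccatoAbove", "articStaccatoBelow"}


def count_staccatos(xml_text):
    """Return how many staccato symbols appear in an exported annotation XML."""
    root = ET.fromstring(xml_text)
    return sum(
        1 for sym in root.iter("Symbol") if sym.get("shape") in STACCATO_SHAPES
    )


# Hypothetical export of a score that contains one staccato note.
with_staccato = """
<Annotations>
  <Symbol shape="noteheadBlack"/>
  <Symbol shape="articStaccatoAbove"/>
</Annotations>
"""

# Hypothetical export where the staccato was dropped.
without_staccato = """
<Annotations>
  <Symbol shape="noteheadBlack"/>
</Annotations>
"""

print(count_staccatos(with_staccato))
print(count_staccatos(without_staccato))
```

If the count is zero for a score known to contain staccatos, that would indicate the articulation annotation code needs to be changed.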

Progress:
I have been working on bringing the developments from the imeta branch to the master branch, and I have submitted a PR for it.
Here is the PR link: https://github.com/musescore/MuseScore/pull/3669

I intend to deliver the best I can to MuseScore with this project. :D

Ciao