GSoC 2016 - Week 4 - Input analysis and voice separation


This was my fourth week working on note entry with MuseScore for Google Summer of Code. Last week I began the task of analysing the user's realtime input to make it look more like actual music. This week I reached what is probably the most difficult aspect of the analysis - the separation of overlapping notes into different voices.

This week’s summary:

  • Basic voice separation in semi-realtime mode.

Still to do:

  • Rhythmical note groupings based on time signature.

Voice separation

In music notation, voices are used to indicate overlapping notes. Some instruments are capable of playing more than one note at a time. If the notes begin and end together then they can be displayed as a chord in a single voice. However, if the notes start or finish at different times to each other, but there is some period of overlap for which both are being played, then additional voices become necessary.
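To make the overlap rule concrete, here is a small illustrative sketch (not MuseScore code, which is C++) that decides whether two notes can share a voice. Notes are represented as hypothetical `(start, end)` pairs in ticks:

```python
# Decide whether two notes, given as (start, end) tick intervals,
# can be written in a single voice.

def same_chord(a, b):
    # Notes that begin and end together can be displayed as one chord.
    return a == b

def overlaps(a, b):
    # The intervals intersect for some non-zero period of time.
    return a[0] < b[1] and b[0] < a[1]

def needs_extra_voice(a, b):
    # An extra voice is only needed for a partial overlap:
    # the notes sound together but do not start and end together.
    return overlaps(a, b) and not same_chord(a, b)

print(needs_extra_voice((0, 480), (0, 480)))    # identical: a chord
print(needs_extra_voice((0, 480), (240, 720)))  # partial overlap
print(needs_extra_voice((0, 480), (480, 960)))  # consecutive, no overlap
```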

The difficulty, as far as performance analysis is concerned, is that there are multiple ways to divide the notes between the different voices, and the task becomes more complex as the number of overlapping notes increases.

The algorithm I use starts by assuming all notes belong in Voice 1 and then looks for overlapping notes. When the first overlapping note is found it gets sent to Voice 2. If another overlapping note is found then the algorithm tries to send it to Voice 2 as well, but if it overlaps with anything already in Voice 2 then it goes into Voice 3, and so on. The voices quickly start to fill up as more overlapping notes are discovered, so higher “virtual voices” are created to store them.
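The greedy first-fit assignment described above can be sketched as follows. This is an illustrative Python version, not the actual MuseScore implementation; notes are `(start, end)` tick pairs, and for simplicity the sketch ignores chords (identical intervals are treated as overlapping):

```python
def overlaps(a, b):
    # Two (start, end) intervals intersect for a non-zero period.
    return a[0] < b[1] and b[0] < a[1]

def assign_voices(notes):
    """Greedily place each note in the first voice where it
    overlaps nothing; open a new "virtual voice" if none fits."""
    voices = []  # voices[i] holds the notes assigned to voice i+1
    for note in sorted(notes):
        for voice in voices:
            if not any(overlaps(note, placed) for placed in voice):
                voice.append(note)
                break
        else:
            voices.append([note])
    return voices

# A held note under two shorter notes needs two voices:
print(assign_voices([(0, 960), (0, 480), (480, 960)]))
```

Note that, unlike real MuseScore voices, this sketch places no upper limit on the number of voices it creates, matching the "virtual voices" described above.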

Once all of the notes in the measure have been assigned to a voice, the algorithm goes back and simplifies the durations, replacing tied short notes with long notes. Some ties may still remain after this step, and some notes that were previously considered overlapping may no longer actually overlap. At this point the voices must be analysed a second time to see whether any of them can be combined into a single voice.

A difficulty that arises here is how to represent the notes within the voices. For the first round of overlap checking it makes sense to use a MIDI-like representation with a start and end time for each note, rather than worrying about whether notes are actually made up of lots of tied notes. This representation is not how notes are normally stored within MuseScore, but it has the advantage that the normal rules do not apply. (For example, it permits having more than 4 voices.) For the second round of optimisation, though, it becomes essential to be able to tell where the ties are, so the notes must be converted back to MuseScore’s usual representation. At this point the number of voices might still be greater than the maximum of 4, and I still need to work out a way to solve this problem. Once that is solved, the remaining task will be ensuring that the tied notes are combined in a way that makes rhythmical sense.
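The second-pass merge can be sketched in the same illustrative style: once durations have been simplified, two voices can be combined whenever no note in one overlaps a note in the other. Again, this is a hypothetical Python sketch of the idea, not MuseScore's implementation, and it does not attempt to solve the "more than 4 voices" problem described above:

```python
def overlaps(a, b):
    # Two (start, end) intervals intersect for a non-zero period.
    return a[0] < b[1] and b[0] < a[1]

def compatible(v1, v2):
    # Two voices can merge if none of their notes overlap each other.
    return not any(overlaps(a, b) for a in v1 for b in v2)

def merge_voices(voices):
    """Greedily fold each voice into the first already-merged voice
    it is compatible with; otherwise keep it as its own voice."""
    merged = []
    for voice in voices:
        for target in merged:
            if compatible(target, voice):
                target.extend(voice)
                break
        else:
            merged.append(list(voice))
    return merged

# Two voices whose notes no longer overlap collapse into one;
# the sustained note stays separate:
print(merge_voices([[(0, 480)], [(480, 960)], [(0, 960)]]))
```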