SVG animation via WebVTT

• Oct 23, 2015 - 13:25

I am coding a project, currently for personal use, that will modify MuseScore's SVG Export to produce:
- Modified SVG that includes "id" and other attributes that will be used by the animation program
- A WebVTT file (.vtt) that contains timing information linked to SVG "id"
See here for a more practical perspective on WebVTT,
and here for a listing of browser support for WebVTT.

Outside of MuseScore I am developing the WebVTT event handler, which will handle more than just the MuseScore sheet music SVG animation events - there will be more animated SVG than just the sheet music in the final display. The sheet music part of the event handler will be relatively generic, in that all the information for highlighting and un-highlighting the notes will be embedded in the SVG itself.

I am just now starting to write/debug code in Qt Creator, having just finished setting up for compile/link. From what I understand so far, here is my MuseScore coding plan:

1) mscore/file.cpp: PaintElements()
Exclude animated Elements from SVG output. Note Heads, Dots, Accidentals, and Rests are excluded via a new boolean function parameter that defaults to FALSE. This step is complete and debugged as of yesterday.
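To illustrate the idea of that parameter change, here is a minimal sketch. The names, types, and signature below are my own stand-ins for illustration, not MuseScore's actual declarations:

```cpp
#include <vector>

// Illustrative only: Element, ElementType, and this signature are stand-ins,
// not MuseScore's actual declarations.
enum class ElementType { NoteHead, Dot, Accidental, Rest, Clef, BarLine };
struct Element { ElementType type; };

// The four element types to be animated (and so excluded from the main SVG).
static bool isAnimated(ElementType t) {
    return t == ElementType::NoteHead || t == ElementType::Dot
        || t == ElementType::Accidental || t == ElementType::Rest;
}

// The new boolean parameter defaults to false, so existing callers are
// unaffected; SaveSVG() will pass true to exclude elements-to-be-animated.
std::vector<Element> paintElements(const std::vector<Element>& all,
                                   bool excludeAnimated = false) {
    std::vector<Element> painted;
    for (const Element& e : all)
        if (!(excludeAnimated && isAnimated(e.type)))
            painted.push_back(e);
    return painted;
}
```

The default-false flag is what keeps the regular SVG export behaving exactly as before.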

2) mscore/file.cpp: SaveSVG()
Change the call to PaintElements(), passing TRUE for the new argument to exclude the elements-to-be-animated.
Then add a new set of loops to collect all the Note Heads, Dots, Accidentals, and Rests in playback order.
Then write the SVG and VTT files for these notes. This is a big step with a bunch of sub-steps.
I am thinking to write the loops in this order:

for each Measure in the Score
    for each Segment
        for each Track           // track = voice ("channel" in staff.cpp) across staves, right?
            for each ChordRest

Inside each ChordRest are the elements I need, including the ticks, which I will attempt to convert into elapsed time via the score's TempoMap::tick2time() function. WebVTT is text that deals with elapsed time in a fixed format, to the millisecond: HH:MM:SS.000. Here is a first sketch to get me to the ChordRest level in C++:

      foreach (Measure* m, score->measures()) {
          foreach (Segment* s, m->segments()) {
              for (int i = 0; i < score->ntracks(); i++) {
                  ChordRest* cr = s->cr(i); // each and every ChordRest in the Segment/Measure/Score
                  if (cr == 0)
                      continue;
                  // ...collect note heads, dots, accidentals and rests here...
              }
          }
      }
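Once tick2time() has given me seconds, formatting them as WebVTT timestamps is mechanical. A minimal sketch, assuming plain seconds in and nothing MuseScore-specific (the helper name is mine, not MuseScore code):

```cpp
#include <cstdio>
#include <string>

// Sketch: format a time in seconds (e.g. the result of TempoMap::tick2time())
// as a WebVTT timestamp, HH:MM:SS.mmm. The helper name is my own invention.
std::string vttTimestamp(double seconds) {
    int ms = int(seconds * 1000.0 + 0.5); // round to the nearest millisecond
    int h  = ms / 3600000;  ms %= 3600000;
    int m  = ms / 60000;    ms %= 60000;
    int s  = ms / 1000;     ms %= 1000;
    char buf[16];
    std::snprintf(buf, sizeof(buf), "%02d:%02d:%02d.%03d", h, m, s, ms);
    return std::string(buf);
}
```

Each cue is then a start/end timestamp pair whose payload points at the SVG "id" to highlight.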

I have looked at renderMidi() (rendermidi.cpp) and fixTicks() (score.cpp) as well as various other functions that loop around ChordRests, in and out of playback order, and I am emulating those to collect my Elements.
I am not dealing with RepeatSegments yet, though it seems straightforward enough to add them using rendermidi.cpp as the example, so I plan to tackle that after I get the basics working.
I will add Swing next, similarly using rendermidi.cpp as the example.
I don't yet want to consider grace notes, arpeggios or other Articulations that rendermidi executes because they require additional graphic types beyond note heads, dots, accidentals and rests.

I think that's it for now. Any suggestions are welcome. I can already see that I've left out Lyrics :-)


Sounds good. It could be worth adding your own method alongside the existing ones instead of modifying saveSVG(): something like saveVTT(), for example, which would save both your modified SVG and the VTT file.


Definitely in the future. For now I'm modifying saveSVG() because it makes it easier for me to get it working in the short term. As I said, the code is in isolated blocks and I'm somewhat overzealous with my internal docs, so later integration and separation should be straightforward.

I do think that the SVG and VTT files should be saved in one procedure for this purpose, as they share the same looping process and much of the same data. So it would be a saveSVG-VTT() function, or something like that. Linking the data in the two files is the key to the whole process.

I have succeeded in implementing the basic animation functionality using an SVG file and a linked VTT file, both generated by my modified MuseScore.exe. The results can be seen here:

It's not much, but it proves the concept and tests various details. It's version 2 because version 1 used JavaScript setTimeout() instead of requestAnimationFrame(), kept its timing data in SVG markup rather than a separate VTT file, and required manual markup of the SVG. It was a pre-proof-of-concept of limited usefulness.

requestAnimationFrame() locks to your screen refresh rate, generally 60Hz. That's 16.67 milliseconds per frame (60 frames per second, or fps). At a tempo of 900bpm, one frame is the length of a 1/16th note; at 225bpm, a 1/64th note. But even notes lasting only a couple of frames might animate imperfectly. Thankfully, even at 30fps a frame is only a 1/32nd note at 225bpm, and most scores avoid such blazing speed. Nancarrow fans might take issue with the accuracy of an animated score of his music, but I don't see any of his scores posted. Too bad, it would be a great test for this.

As you can see if you peek into the JavaScript, even though I'm using a WebVTT file and attaching it to the audio element in HTML, I'm not using HTML <track> element events. The TextTrack oncuechange event is officially unreliable in Firefox, but worse than that, it only fires at around 6Hz, not 60Hz. I take full advantage of the DOM textTrack object and its cues, but that's all done in a front-loading process that caches all the cues into my own small stew of collections. Possibly clumsy, but the animation runs spot-on in my tests so far, and even in a larger score it won't be much data to cache into memory. Further tests will tell... For smooth animation the per-frame calculations must complete within that 0.0167-second frame length, so caching data up front is a common technique in this context.

Any feedback is welcome. I'm continuing to develop further functionality - this example/test case is very basic, but there is a lot of potential here: text-based SVG and VTT files versus Flash/video media files, or in conjunction with those other media files. Combined with real audio, even live performances, and certainly studio performances. MIDI-generated audio is great, but recordings of human performances are richer, and support for MIDI playback is patchy in today's browsers anyway. With audio playback, versus MIDI playback, the artist can fully control the resulting sound, even if that sound is generated by MIDI in the artist's studio.
