Score Editing Slow

• Dec 31, 2018 - 00:31
Reported version
3.0
Priority
P2 - Medium
Type
Functional
Frequency
Few
Severity
S4 - Minor
Reproducibility
Always
Status
active
Regression
No
Workaround
No
Project

I heard of Musescore 3 implementing a new editing design that speeds the entire process up. It was explained during the release of Musescore 3 Alpha that instead of updating the entire document, it would update only the page the edit is made on, keeping edits taking the same amount of time for documents of any size. However, I see no proof of this, because I am editing a document that is 151 measures long, 20 pages long, and has 34 parts to it, and it takes several seconds for every update.

Attachment Size
Thunderbolt_March.mscz 596.81 KB

Comments

Out of curiosity, I ran MSVC 2017's profiler on one execution of score->update() from changing one note in the middle of that big score, and attached the function breakdown. (Note I'm on a debug build...I should run release...and note that this is my first time using MSVC's profiler, although I'm familiar with profiling with other tools in the past.)

What I notice initially is SkylineLine::add() takes up a lot (32.35% of total CPU when called directly from processLine() plus another 14.94% when called directly from layoutSystemElements()). In particular, 44% of SkylineLine::add() time is taken up to calculate SkylineLine's end() (this is if I'm interpreting this profiler data correctly).

It seems that the hottest line by far is this iteration line in skyline.cpp:76

      for (auto i = begin(); i != end(); ++i) {

Looking at the call of SkylineLine::add from processLines, that line takes up a full 22.76% of the total execution time of the entire update() (again if I'm interpreting this profiler correctly):

skyline-call-from-processlines.PNG

I'm guessing that 22.76% for that line is made up of 14.30% from end() plus 6.15% from != plus 2.08% from ++i. So my big

The other call of SkylineLine::add (from layoutSystemElements directly) also similarly has that iteration line taking up a large chunk...a total of 9.82% of total execution time (seems 6.55% from end() plus 2.80% from != plus 0.24% from ++i):

skyline-call-from-layoutsystemelements.PNG

So basically that one line takes up 22.76% + 9.82% = 32.58% of the entire update().

So that iteration line would be my first place to look when optimizing...I'm guessing the data structure is related to autoplacement(?) but it probably gets very big or hard to iterate over for some reason...maybe there is some way to break it up or a better way to test != end() or something...sorry, I don't know anything about that function to be honest ;P without digging in...all that code is stuff I've never looked at. But I was just curious myself...

I assume it is only recalculating the page on the single note change, but I didn't check that. The score has a lot of instruments, so it is a pretty dense page...I wouldn't be surprised if a single page takes a long time.

Attachment Size
doLayoutRange_profile-cpu-breakdown.PNG 49.95 KB

In reply to by ericfontainejazz

Just for my curiosity...Running again, this time in Release. This time I did a similar operation: change duration of one note in the middle of the score (wasn't the same note), and what I'm noticing now is not SkylineLine::add...I'm guessing that part of processing isn't mostly done on every update. But I am noticing a lot of the update() is spent in SlurSegment::computeBezier in this hot loop which consumes 45.20% of a 4-second update:

beizerloop.PNG

First thing I'm wondering is why nbShapes = 32 was specifically chosen...it seems reducing that to 4 dramatically helps, reducing the total update from 4 seconds to 2.35 seconds such that computeBezier only takes 10% of update time:

bezier-4shapes.PNG

It seems the QRectFs that function calculates aren't actually used for drawing, but I'm guessing just for collision detection. Maybe there is a way to just get a coarse estimate during update, and then refine later? Or maybe there is a simpler way to approximate the Bezier shape for collision detection? Or maybe just use some heuristics instead of actually calculating the shape.

Anyway, doing some more editing, and I notice a crash in the score...so I'm going to look into that now.

I've also always wondered if there is potential for multi-threading some of the layout code...of course that opens up a whole new can of worms. But I'm on a 12-core system, and musescore update only really uses one thread.

In reply to by Justin Bornais

Also, it seems like Musescore is used by a lot of people who have strong computers. My computer has 16 GB of RAM DDR3 and an Intel Core i7 Processor 3rd Generation. It's not the strongest one out there, but it's still strong.

I have some ideas on how to increase efficiency with Musescore editing, because I find it ridiculous how my strong computer still takes ages to make one simple edit. One thing that could speed the process up for people who have instrument parts is to only edit the parts that have the instrument that is edited. I don't know if Musescore already does this, but it seems like the editing time grows exponentially worse when I add parts, no matter if they include that instrument or not.

Additionally, I remember there was a topic (https://musescore.org/en/user/101731/blog/2016/06/02/developing-musesco…) that described how Musescore was going to run faster. They said something that caught my eye. They said, and I quote:

"If a change on one page results in the page breaking in a different place—i.e., also changing adjacent pages—then any affected pages will be laid out again, as well, but no more than necessary."

Could it be possible that simply checking to see if the edit affects other pages takes up more time than it should? What if it checks every page per edit to see if that page has a layout change?

Also, regarding multi-threading, is there a way to set Musescore to run multi-threaded in Task Manager? Plus, couldn't they leave multi-threading as an option in Musescore for people whose computers can handle it? This would make the entire process so much quicker and easier for the people who have the computers capable enough to do it, which it seems like there are lots of people like this. Almost all of the people I associate myself with, when it comes to computer science and music, have powerful computers. Some of them run Musescore.

Basically, checking to see which pages need to be edited can be just as bad as simply editing all of the pages. What if there was an option for the user to select an area they want to edit, and that area was locked in another view or something like that? It could be something as simple as selecting the first measure they want to edit, holding down Shift, then selecting the last measure. Then they could hit Ctrl + E, or something like that. Then, they can edit like that. Or they can select on a specific note to add or delete articulations. This would be simpler because it would only have to edit that little area. The user will most likely just click on the single measure they want to edit, and it would only have to update the one measure. Or maybe the single page view can edit only the page the user is viewing.

Then, if the score looks like it's all messed up because everything isn't aligned properly because of this editing technique, why can't there be a refresh button that, finally, edits all affected pages, or edits all the pages at once to make the score look better? Why do they need to update the whole score, or all affected pages, every single edit? Also, can they make this refresh function multi-threaded so you can continue to edit and view the score during the refresh?

This seems like a simpler and better approach to me. If the Musescore team disagrees with me on this, then how about another approach:

Assign a number value to every possible edit as an attribute to that edit (or a variable). Let's say, assign the number 0 to edits that don't change the layout at all (such as something like a staccato, accent, or other such edit that doesn't change the layout), then assign the number 1 to edits that change the layout (such as inserting measures and line and page breaks). We'll call the attribute "doesEditLayout".

Then, make an array that contains a link to the corresponding editing method. Give each possible edit (i.e. adding staccatos, accents, etc.) an index number as an attribute. This is more efficient than a large if statement. The code would look something like this:

@FunctionalInterface
interface ScriptTask {
    void execute(String s, int i);
}

class Script {

    private ScriptTask[] tasks;

    Script() {
        this.tasks = new ScriptTask[2];
        this.tasks[0] = this::foo;
        this.tasks[1] = this::bar;
    }

    private void foo(String s, int i) {
        System.out.println(s);
    }

    private void bar(String s, int i) {
        System.out.println(i);
    }

    void run() {
        for (int i = 0; i < tasks.length; i++) {
            tasks[i].execute("hello", i);
        }
    }
}

Then, check to see if the previously mentioned "doesEditLayout" attribute equals 0 or 1. If it equals 0, then no layout update is required, except for adding the staccato or other articulation to the note. If it equals 1, then run a separate editLayout method that edits the layout.

I also recommend that certain edits like adding measures and line and page breaks should only update the pages the breaks or measures are added to and the ones after it. This is because adding a measure on page 7 of a score will not change the view of pages 1 to 6.

These are my thoughts on how to diagnose this issue. Other than that, that's all I have to offer as possible contributions. I hope people read this comment and reshare it, or use it to come up with bigger and better ideas.

Regarding multithreading, it is not as simple as setting it in Task Manager. Rather, it needs significant modifications to the algorithms and figuring out ways to break the work up into tasks that are mostly independent of one another. And then dealing with the bugs that are inevitably introduced.

But still I think there are opportunities to make some parts of the single-threaded code faster anyway...such as, as I mentioned, approximating Beziers with 4 line segments rather than 32 rectangles as it appears to do (it seems the Shape class only understands non-rotated rectangles). And I suspect there are still performance bugs with unnecessary redundant layout...but I haven't looked at the code at all.

I didn't read the rest of your post, sorry.

In reply to by ericfontainejazz

I understand your point. It just frustrates me how something as simple as adding a staccato (yes, I'm going to keep using that example) takes so long to do when your score is larger.

To summarize what I said before, either Musescore should let you edit a certain area first, and then refresh later IF NEEDED, or let the program decide if it should refresh the score or not based on what edits are made (i.e. staccatos require no refresh, but line breaks and adding measures do). I also believe that adding measures and line breaks does not require the entire score to be refreshed; only the area where the new breaks and measures are being added, and the part of the score after it.

Also, why can't Musescore perform collision detection in a separate task, not included in the editing process? It can be its own separate button. Then, the user can choose to perform collision detection on the whole score, or just parts of the score. Also, the whole re-orienting the placements of slurs and other stuff like that can also be done on its own separate button. When I edit something, I only care about the music being changed. After that, then I worry about the layout.

For that
int nbShapes  = 32;  // (pp2.x() - pp1.x()) / _spatium;
instead of changing the 32 to a 4, why not make it dependent on the slur's/tie's length, like the comment implies? Or is that calculation already too expensive?
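A rough sketch of that idea, not taken from the MuseScore source: derive the count from the horizontal span in spatium units, as the commented-out expression suggests, and clamp it between a floor and the current fixed value. The function name shapeCountForSpan and the bounds 4 and 32 are assumptions for illustration only; the right bounds would have to come from measurements.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not MuseScore code: choose how many collision
// rectangles to compute from the horizontal span of the slur or tie,
// measured in spatium units, clamped to [minShapes, maxShapes].
int shapeCountForSpan(double x1, double x2, double spatium,
                      int minShapes = 4, int maxShapes = 32)
{
    assert(minShapes <= maxShapes);
    if (spatium <= 0.0)
        return maxShapes;  // no usable unit: fall back to the fixed value
    const int bySpan = static_cast<int>((x2 - x1) / spatium);
    return std::max(minShapes, std::min(bySpan, maxShapes));
}
```

With these assumed bounds, a short tie would get 4 rectangles and only a very long slur would reach 32. The division itself is a single operation, so it should be negligible next to evaluating the curve.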

> "why can't Musescore perform collision detection in a separate task, not included in the editing process..."

The collision detection is used for auto-placement as I understand, in addition to simply detecting mouse presses on the slurs...hypothetically it could be put in a separate thread...and as I suggested earlier "maybe there is a way to just get a coarse estimate during update, and then refine later" when I was talking about opportunities for multi-threading. Some elements might move around after the estimate gets refined...but at least the user could interact with the rough estimate of the layout.

> "Musescore should let you edit a certain area first"

Well basically what musescore 3 does (or is supposed to do) is only do relayout of the system(s) that have been affected by the change...so it is trying to do basically what you are asking for automatically already (although again it has problems). The big issue I've been noticing is that for scores with many instruments and many generated parts, the optimizations do work but unfortunately there are so many systems affected: if there are N instruments then there are N different generated part scores that each have a system affected, plus the main score has N staves. So basically there are roughly 2N systems that have to be re-laid out... If dealing with 50 instruments and 10 measures/system on average, then that is like 1,000 measures that have to be laid out. This is a big improvement over 2.3.2 layout...but there still could be improvement, because basically the user is only looking at the main score, so layout in other scores could possibly be delayed, and even hypothetically could be more ambitious by avoiding relayout of unchanged measures in a system for measures whose widths didn't change.
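To spell out the arithmetic behind that 1,000 figure, here is a back-of-the-envelope sketch. It assumes every instrument has exactly one generated part and one staff, and that exactly one system per score is affected; staffMeasuresToRelayout is a made-up name, not anything in MuseScore.

```cpp
#include <cassert>

// Hypothetical estimate, not MuseScore code: staff-measures touched by a
// single edit when each of the N instruments also has a generated part.
int staffMeasuresToRelayout(int instruments, int measuresPerSystem)
{
    assert(instruments >= 0 && measuresPerSystem >= 0);
    // N part scores, each with one affected single-staff system.
    const int inParts = instruments * measuresPerSystem;
    // One affected system in the main score, N staves tall.
    const int inMainScore = instruments * measuresPerSystem;
    return inParts + inMainScore;
}
```

For 50 instruments and 10 measures per system this gives 500 + 500 = 1,000, and half of that is in part scores the user is not looking at.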

Anyway, after studying the new layout code, what I do think is not too difficult is to do concurrent layout for each part score, because each part score's layout is independent of its sibling scores. About 12 years ago, I'd use an easy tool, OpenMP, to parallelize loops...which was perfect for such easy-to-parallelize loops. And I'm noticing Qt has Map-Reduce built in (https://doc.qt.io/qt-5/qtconcurrentmap.html), so it probably makes sense to use that, as it apparently works for built-in Qt structures like QLists. So what I'm looking at is:

void Score::update()
      {
      bool updateAll = false;
      for (MasterScore* ms : *movements()) {
            CmdState& cs = ms->cmdState();
            ms->deletePostponed();
            if (cs.layoutRange()) {
                  for (Score* s : ms->scoreList())
                        s->doLayoutRange(cs.startTick(), cs.endTick());
                  updateAll = true;
                  }
            }

That iteration over Scores in ms->scoreList() seems to be completely independent for each score, such that each score could be processed in any order and don't need to write to a shared data structure...until they each finish when they could write their updated laid-out systems back. The overhead from spawning threads will be well made up by the savings, I'm almost certain...but I'll take measurements.
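As a minimal sketch of the shape this could take, here is the same loop with one task per score. To keep it runnable without Qt it uses std::async from the standard library instead of QtConcurrent, and PartScore and layoutRange are hypothetical stand-ins for Score and doLayoutRange(). It only holds if each task really touches nothing but its own score, which is the assumption I still need to verify.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Hypothetical stand-in for a part score; not a MuseScore class.
struct PartScore {
    int measures = 0;  // input: measures in the affected range
    int laidOut = 0;   // output: written only by this score's own task
};

// Stand-in for doLayoutRange(): reads and writes only its own score.
void layoutRange(PartScore& s)
{
    assert(s.measures >= 0);
    s.laidOut = s.measures;
}

// Launch one task per score, wait for all of them, then combine the
// results on the calling thread.
int layoutAllConcurrently(std::vector<PartScore>& scores)
{
    std::vector<std::future<void>> tasks;
    tasks.reserve(scores.size());
    for (PartScore& s : scores)
        tasks.push_back(std::async(std::launch::async, [&s] { layoutRange(s); }));
    for (auto& t : tasks)
        t.get();  // blocks until done; rethrows any exception from the task

    int total = 0;
    for (const PartScore& s : scores)
        total += s.laidOut;
    return total;
}
```

One thread per score is the simplest thing to measure first; a thread pool (which QtConcurrent provides) would likely be the better fit for scores with dozens of parts.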

In reply to by Jojo-Schmitz

Jojo, yeah that previous code was probably onto something by making it depend on the horizontal length. I noticed it too but don't know why it was commented out. That calculation of x distance is nothing compared to the cost of each calculation of the n guide points of the bezier. Another way to determine n that might make sense is to count the number of chord rests spanned by the slur.

I'm running on an old, budget Asus laptop (X202E) and also feel like the overall speed in musescore has slowed down instead of sped up, EXCEPT when I drag an existing note up and down. In that case, there is almost no delay, even as multiple systems are shuffled around across several pages. This used to cause massive delays in version 2. By contrast, for entering notes, modifying notes using arrow keys, or typing text, each action takes half a second at least (when typing text, that means half a second for each individual letter). It seems to me like the local relayout is fast enough in practice, but is only implemented for the specific case of dragging notes up and down, and needs implementing in those other cases. Does that sound plausible?

In reply to by Thingy Person

@Thingy, what you wrote seems to be what I observe too: doing a full layout on 3 is actually slower than on 2 (I'm guessing all the auto-placement logic has increased the complexity of layout), while changing just a single note is faster for most cases due to 3's optimization of only doing relayout of the changed system.

However, the problem this issue is about seems to be for scores with many instruments and many parts that even changing a single note is slow, because there are so many staves in the system and there are so many part systems that all have to be relaid out, and that time adds up.

In reply to by ericfontainejazz

How about making a simple code to detect if your edit increases or decreases the size of the measure. If it doesn't, then it only needs to update that specific measure. However, if it does, then perform the relayout. To prevent this additional checking from taking more time, add an attribute to each edit that tells whether or not it changes the size of the measure. A simple if statement would do the trick.

MuseScore being slow is why I've relied on the Album feature to work with truly large scores - and I work with an orchestra, so symphonies are pretty standard. MS3 is definitely not fast enough for me to consider having a symphony in a single file, and as the album feature is currently kaput, I've been just eating the extra paper required for having every new movement start on a fresh page. I have also noticed that MS3 appears to actually be slower than MS2 - not that surprising with the auto-place.

Slur adjustments do take a really long time - and they are the ones that will affect the layout of an entire piece. I just flipped a slur; it took seconds to respond, and moved systems around. I'm not sure why systems need to move if you flip a slur, but that's what happens.

In reply to by Laurelin

Flipping a slur is more likely to affect the distance between staves in a system than adding a note. Until you get into a high register, notes do not affect the height of an individual measure, whereas flipping slurs almost always affects the height of a measure.

In reply to by ericfontainejazz

@Eric, It doesn't seem that way to me. Consider the attached score (it's made in 2.0.3 so import it while resetting the layout); 11 staves and 467 measures. If I move a note with the arrow keys, type text, or even so much as select a note, there is a 500ms delay, but I can click and drag a note into the sky and incur no slowdown while the surrounding systems get pushed down and onto the next page (you can do this easily on page two). It's not due to a lack of horizontal relayout either; click-drag a note when there is an accidental earlier in the measure and a natural sign will blip in and out of existence, pushing everything in the staff around all with no delay. If it can do this lightning fast, then it seems to me like the capability is already there, it just doesn't seem to be hooked in to the most common actions.

I have another score with 30 staves and 292 measures (albeit most of them empty) where dragging is still almost completely as fast as in the smaller score, but here the common actions take a full second. Admittedly, dragging is more choppy when the note is part of a slur (evaluating those new, gorgeous slurs takes time) but the delay is nothing compared to e.g. moving the note with the arrow keys.

edit: it's not the slur that makes it (marginally) slower, it's whether or not the measure is next to a line break, i.e. either the first or last bar in a system. Either way, still much faster...

Attachment Size
potpourri_score.mscz 142.93 KB

Slurs definitely adjust more layout than anything except for saving, for some reason. I've been adding tons of cue notes to my score, which requires lots of fiddly adjustments - if I move whole measure rests up and down, they move over to one side. If I add a note, make another rest small, or some other adjustment that doesn't only change placement, the whole rest snaps back to where it should be - but only on that particular system. However, if I add or delete a slur, the layout recalc appears to work on more than one system.

Which is why I can edit some, flip a slur and then have to go track down the thing I was working on because two systems switched pages in the adjustment. It wasn't height - this was a slur on midstaff notes, not very low or high notes. It just resets the layout on more than anything other than saving.

Why? Is this intended? It makes things slow. Heck, I'd almost ask for a 'refresh layout' button, so that I could do a bunch of work, hit the button and grab a snack. I don't like the relayout on save either, because it means saving takes forever, which means that I get really tempted to turn off autosave, which leads to bad things. Relayout on load is good. I don't think it should happen on save, I dislike the amount it happens with slurs, mostly because it makes no sense, and I... yep. I want a refresh button.

@Thingy Person well yes, your attached score is a bit slow. But it isn't anywhere near as slow as the many-instrument, many-part score in the original post. It seems that there are multiple sources of slowness. Size of the system seems to correlate with slowness (both number of measures and number of staves). Also the number of slurs seems to strongly correlate with slowness. And generally speaking, so does the density of elements in a system, of course. And there are other sources of slowness too.

I'm an old user. Been using this wonderful program since Musescore 1. So to add to this discussion, I did a sort of apples-to-apples comparison of Musescore 2 to 3 in editing a score I have that has 32 parts and is currently at 99 measures (seems longer since its run time at this point is 7 minutes). I had edited another score in Linux on the AppImage for 3 and I figured it was the VM I was using, but I had 2 installed on my Windows 10 machine, and decided to see how 3 did. I always use continuous view for big scores, and 2 runs like a dream. I opened the same score in 3, and it was painful. I would try to edit at my normal rate (I use the keys, so I move fast), and when I'd reach the end of typing a phrase, it would take the program several MINUTES to catch up. 2 was instantaneous. On my Linux VM (different score), I would type out my phrase (even when I tried switching to page view and zooming), and it would sometimes take up to ten minutes for it to finish. And that score was started from scratch, and it was slow right from measure 1. I'm using 2 until this is cleared up.

It still leaves nbShapes at that seemingly insanely high value of 32 for slurs and 15 for ties, reducing those should have an even more drastic performance gain, shouldn't it?

In reply to by ericfontainejazz

@ericfontainejazz It seems like I was right because I got the lightning speed I wanted in version 3.0.3, but it took until now (3.1) for me to notice because I've had the timeline open all this time which was slowing everything down tremendously. The timeline thing could still be investigated, but as far as I can tell, this issue can be closed, right?

Severity S2 - Critical S4 - Minor
Priority P2 - Medium

Also, originally continuous view had none of the "range layout" speed improvements of page view (only laying out the portion of a score that has changed, and only to the extent necessary). But as of 3.1 continuous view does similar range processing and should now be just about as fast.

No doubt there is always room for further improvement, but we should at this point be far better than 2.3.2 in both page and continuous view.

In reply to by Marc Sabatella

Just my 2 cents: I read the discussion about multi-threading some parts of musescore. For optimizing things like Bezier curves, I'd say that it is easier to vectorize (SIMD) this than to multi-thread it. Vectorizing is much less invasive than multi-threading and requires much less refactoring. Vectorizing is also compatible with multi-threading; both can be used together.
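To illustrate what I mean, here is a minimal sketch, not MuseScore code, of sampling one coordinate of a cubic Bezier in a loop whose iterations are independent and write to contiguous memory. That is the loop shape compilers can usually auto-vectorize at higher optimization levels, although whether a given compiler actually does so has to be checked in its output. The name sampleCubic and the signature are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: evaluate one coordinate of a cubic Bezier with
// control values p0..p3 at n evenly spaced parameters t in [0, 1].
// Requires n >= 2 so that both endpoints are sampled.
std::vector<double> sampleCubic(double p0, double p1, double p2, double p3,
                                std::size_t n)
{
    assert(n >= 2);
    std::vector<double> out(n);
    const double last = static_cast<double>(n - 1);
    for (std::size_t i = 0; i < n; ++i) {
        const double t = static_cast<double>(i) / last;
        const double u = 1.0 - t;
        // Bernstein form: no branches and no dependency between iterations.
        out[i] = u * u * u * p0 + 3.0 * u * u * t * p1
               + 3.0 * u * t * t * p2 + t * t * t * p3;
    }
    return out;
}
```

Calling it once for the x values and once for the y values keeps the data in separate arrays, which is generally friendlier to SIMD than an array of point structs.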

Note: I'm a just-retired PBR 3D render engine programmer getting back to music. Multi-threading and vectorizing are the sort of things I used to do.