Test suite needs to be more exhaustive.

• Mar 20, 2016 - 09:33

I notice that most of the folders in mtest/libmscore only contain a handful of tests, and that in general the tests are limited in scope. They also tend to only check issues that have previously been reported as broken. So when I've added a bug fix, I've adhered to this convention and only added test cases for the specific thing I think I fixed.

But I think the test suite should be much larger and should cover not just behavior that was reported as broken, but as much of MuseScore's currently functioning behavior as possible (within some reasonable limit), to guarantee that all the currently working behavior will continue to work into the future by quickly catching regressions the moment they are pushed to GitHub.

I also want to note that the test suite, when run on Travis with -j2, only takes about 30 seconds (and that includes the longer benchmarks), which is a small amount of time compared to the ~20 minutes Travis spends compiling. I don't think we should feel bad about using much more of that time...tests could hypothetically grow to ~25 minutes and still fit within the 50-minute limit Travis imposes (and could hypothetically exceed even that with concurrent test jobs). Keep in mind that when debugging on a home computer we only really need to run the particular module we are debugging, and we can all use our own personal Travis to run the entire test suite (https://musescore.org/en/developers-handbook/git-workflow#Run-tests-on-…).
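For reference, running just one module locally looks roughly like this (a sketch only; the build directory and the "note" test name are assumptions based on the usual mtest layout, so adjust to your own setup):

```sh
# from the build tree that contains the mtest targets (assumed here to be build.debug)
cd build.debug/mtest

# run only the tests whose names match "note" (ctest -R filters by regex)
ctest -R note

# or run everything in parallel, the way Travis does with -j2
ctest -j2
```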

There is also some behavior that is harder to fit into the specific folders already set up in mtest/libmscore, so one question is: is it OK to create more folders, and is it OK to create a general folder for things that cover multiple modules, so I don't waste time trying to decide which folder a test belongs in? Also, is it OK for me to submit PRs containing nothing more than tests for things that are already working?

Part of the reason I bring this up is that when I submit a PR and someone asks whether I have tested it, all I can say is that I've fiddled around with some things I think might be affected, and that of course my specific test covering the bug passed. Without a more exhaustive test suite, I can't say with much confidence that I haven't broken some (possibly remotely related) previously working behavior that isn't covered by the tests.


Comments

I don't think you'll find disagreement here. Just the reality that no one likes writing tests enough to spend the time that would be needed to make the test suite more exhaustive.

PRs that just add tests would be extremely welcome, I have to imagine!

Please do add more tests, folders, whatever. We do need more tests.

Of course, it might become a problem later on if we have too many tests.
We would probably need some test-management solution; just adding more folders without any documentation will not cut it. We will also need a way to know which behaviors are tested and which aren't. If you have any insights into good practices for test management, please share.

In reply to by [DELETED] 5

Regarding folder organization...I would make sure there is a single folder for every libmscore class (or .cpp file, as it seems some of them are organized per file), and then have subfolders inside each class folder for the important functions of that class. These subfolders would hold more rigorous tests that enumerate all the intended behavior of each of those functions, while generic tests for the class would stay outside the subfolders. A rough sketch of that layout is below.
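(All of the folder and file names here are hypothetical placeholders, just to show the shape I have in mind.)

```
mtest/libmscore/
    note/                     # one folder per libmscore class (or .cpp)
        tst_note.cpp          # generic tests for the class as a whole
        addNote/              # one subfolder per important function
            tst_addNote.cpp   # enumerates that function's intended behavior
        transpose/
            tst_transpose.cpp
```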

The important thing is to make sure the behavior of each function is clearly defined. So one task when writing these new tests might also be to update each function's comment header to define more precisely what the function is expected to do (I notice most of the function headers just repeat the name of the function, and adding these comments will also produce more useful Doxygen HTML output). Then, with that behavior defined, you want to systematically identify all the cases the function needs to cover, including corner cases, and have a small test covering each one. It is mainly a matter of being very diligent.
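As a concrete illustration of what I mean by a precise header plus enumerated cases, here is a sketch in the data-driven QtTest style. The function, its behavior, and every test row below are made up for the example, not taken from the codebase:

```cpp
#include <QtTest/QtTest>

//---------------------------------------------------------
//   transposeBySemitones  (hypothetical helper)
//      Returns pitch + semitones, clamped to the MIDI range
//      0..127. Spelling the behavior out here is what lets
//      the test below enumerate every case.
//---------------------------------------------------------

static int transposeBySemitones(int pitch, int semitones)
      {
      int p = pitch + semitones;
      if (p < 0)
            p = 0;
      if (p > 127)
            p = 127;
      return p;
      }

class TestTranspose : public QObject {
      Q_OBJECT

   private slots:
      // one data row per intended behavior, corner cases included
      void transpose_data() {
            QTest::addColumn<int>("pitch");
            QTest::addColumn<int>("semitones");
            QTest::addColumn<int>("expected");
            QTest::newRow("up a tone")         <<  60 <<   2 <<  62;
            QTest::newRow("down an octave")    <<  60 << -12 <<  48;
            QTest::newRow("zero offset")       <<  60 <<   0 <<  60;
            QTest::newRow("clamped at top")    << 126 <<   5 << 127;
            QTest::newRow("clamped at bottom") <<   1 <<  -5 <<   0;
            }
      void transpose() {
            QFETCH(int, pitch);
            QFETCH(int, semitones);
            QFETCH(int, expected);
            QCOMPARE(transposeBySemitones(pitch, semitones), expected);
            }
      };

QTEST_MAIN(TestTranspose)
#include "tst_transpose.moc"   // assumes this file is named tst_transpose.cpp
```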

In reply to by [DELETED] 5

btw, I'm setting up my CMake files to use GNU Gcov to report code coverage for the tests, which gives a metric and lets me know which lines of code aren't being exercised by the tests at all. There are more advanced code-coverage concepts like "condition coverage", which I don't think Gcov handles, but I haven't really searched for anything more advanced. At least simple line coverage lets me know where to start triaging. The gist of the CMake change is below.
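In case it's useful to anyone else, this is roughly the shape of that change (a sketch only; the option name is made up, and you may prefer to scope the flags to specific targets rather than set them globally):

```cmake
# Hypothetical coverage switch. GCC/Clang's --coverage is shorthand
# for -fprofile-arcs -ftest-coverage plus linking against gcov.
option(BUILD_COVERAGE "Instrument the build for gcov" OFF)

if (BUILD_COVERAGE)
    set(CMAKE_CXX_FLAGS        "${CMAKE_CXX_FLAGS} --coverage")
    set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} --coverage")
endif()
```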

btw, I was trying to get Gcov incorporated into Qt Creator with a plugin, but that plugin doesn't seem to be maintained anymore. It seems there is a front end for Gcov called LCOV, which generates browsable HTML coverage reports and should help me out; I'm going to look into it tomorrow.
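From what I've read so far, the basic LCOV workflow after an instrumented test run looks something like this (paths and output names are placeholders):

```sh
# collect the .gcda/.gcno data left behind by the instrumented test run
lcov --capture --directory . --output-file coverage.info

# render it as a browsable HTML report
genhtml coverage.info --output-directory coverage-html
```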

Just writing this incase any of you can suggest some more advanced open source tools or any suggestions on where to start before I get going.
