Vocals that actually sing

• Dec 28, 2017 - 16:47

I highly doubt this is even feasible in MuseScore, but I want to know if there is some sort of software that can make sampled (fairly realistic) voices sing actual text, as opposed to the 'ooos' and 'ahhs' you'd get from a soundfont. I am aware that some orchestra software such as EWQL and Garritan offers this tool, but I am looking for cheaper, preferably free, alternatives.


It's not feasible using just MIDI and soundfonts, although singing synthesis itself is technically possible to some extent. To work with MuseScore, there would need to be significant development effort to incorporate the necessary technology. It's something that gets looked at now and then, usually in conjunction with the Google Summer of Code, but I think the technical challenges are probably too great to be solved by one student in one summer.

In reply to by Marc Sabatella

The MIDI specification allows for user-defined messages, and although they're normally musical, they can control things other than music. For example, there are some concert lighting systems that use MIDI.

I would think that if you can define custom MIDI messages for things like "switch on light number 1640" or "set fog generator volume to 3", you could also define custom MIDI messages for things like "set vocal generator to syllable 'hel', pitch C4, velocity 5", "set vocal generator to syllable 'lo', pitch C4, velocity 5", and "set vocal generator to syllable 'world', pitch A4, velocity 4".
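To make that concrete, here is a rough Python sketch of what such messages might look like as bytes. Everything about the layout is my own assumption, not an existing standard: I'm wrapping each syllable in a System Exclusive message (start byte 0xF0, end byte 0xF7), using the manufacturer ID 0x7D that MIDI sets aside for non-commercial use, and the function name `vocal_sysex` is made up for illustration.

```python
SYSEX_START = 0xF0        # MIDI System Exclusive start byte
SYSEX_END = 0xF7          # MIDI System Exclusive end byte
NON_COMMERCIAL_ID = 0x7D  # manufacturer ID reserved for non-commercial use


def vocal_sysex(syllable: str, pitch: int, velocity: int) -> bytes:
    """Pack one sung syllable into a hypothetical SysEx message.

    Assumed layout: F0 7D <pitch> <velocity> <ASCII syllable bytes> F7
    """
    if not (0 <= pitch <= 127 and 0 <= velocity <= 127):
        raise ValueError("pitch and velocity must fit in 7 bits (0-127)")
    # ASCII keeps every data byte below 0x80, as SysEx data bytes must be.
    text = syllable.encode("ascii")
    header = bytes([SYSEX_START, NON_COMMERCIAL_ID, pitch, velocity])
    return header + text + bytes([SYSEX_END])


# "hel-lo world": C4 is MIDI note 60, A4 is MIDI note 69.
messages = [
    vocal_sysex("hel", 60, 5),
    vocal_sysex("lo", 60, 5),
    vocal_sysex("world", 69, 4),
]

for message in messages:
    print(message.hex(" "))
```

A real system would more likely send phoneme codes than spelled-out syllables, but the idea is the same.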

That said, just because the MIDI specification allows custom messages that could be used to control a speech generator doesn't necessarily mean anyone has done that.

I don't know how MuseScore works inside, but I can imagine a kludge that would allow MuseScore to directly generate the necessary custom MIDI: require the vocal track to include one voice that defines the pitch of the singing, and another voice (or multiple voices) that defines the syllables, phonemes, or other vocal elements that make up the lyrics. The pitch voice would be as simple to score as any other music, but the voices that define the vocal elements would be a mess to score, because the position on the staff wouldn't represent pitch, but rather vocalizations. It would be absurdly tedious, and completely tied to the technique used by the speech generator (unit selection synthesis, diphone synthesis, domain-specific synthesis, and so forth).
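As a rough illustration of the two-voice idea, this Python sketch pairs a pitch voice with a syllable voice by onset time. It has nothing to do with how MuseScore actually stores voices; the `SungNote` type, the `merge_voices` function, and the (onset in beats, value) tuples are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class SungNote(NamedTuple):
    onset: float    # position in beats from the start of the track
    pitch: int      # MIDI note number taken from the pitch voice
    syllable: str   # vocal element taken from the syllable voice


def merge_voices(
    pitch_voice: List[Tuple[float, int]],
    syllable_voice: List[Tuple[float, str]],
) -> List[SungNote]:
    """Attach to each pitch the syllable that starts at the same onset."""
    syllables = dict(syllable_voice)
    # A pitch with no matching syllable gets an empty string.
    return [
        SungNote(onset, pitch, syllables.get(onset, ""))
        for onset, pitch in pitch_voice
    ]


pitch_voice = [(0.0, 60), (1.0, 60), (2.0, 69)]
syllable_voice = [(0.0, "hel"), (1.0, "lo"), (2.0, "world")]
sung = merge_voices(pitch_voice, syllable_voice)
```

Each merged entry carries what a vocal generator would need for one sung syllable.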

More about speech synthesis: https://en.wikipedia.org/wiki/Speech_synthesis

Additionally, since soundfonts don't (as far as I'm aware) know how to treat one voice as a modifier to another (as in voice 1 controls pitch, and voice 2 controls phoneme selection), all MuseScore could do would be to play an approximation of the singing: a tone for the pitch played simultaneously with a spoken syllable. To hear the actual singing, you'd have to have MuseScore export to MIDI, then play the MIDI through a synthesizer that understands the custom MIDI codes.
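On the receiving end, "understanding the custom MIDI codes" could be as simple as the Python sketch below. The message layout is my own invention (a System Exclusive message with the non-commercial manufacturer ID 0x7D, followed by pitch, velocity, and the syllable in ASCII), and `parse_vocal_sysex` is a hypothetical name. Anything that doesn't match the layout is ignored, which is roughly what an ordinary synthesizer does with SysEx it doesn't recognize.

```python
from typing import Optional, Tuple


def parse_vocal_sysex(message: bytes) -> Optional[Tuple[str, int, int]]:
    """Decode a hypothetical vocal message: F0 7D <pitch> <velocity> <ASCII> F7.

    Returns (syllable, pitch, velocity), or None if the message is not ours.
    """
    if len(message) < 5:
        return None
    if message[0] != 0xF0 or message[-1] != 0xF7:
        return None  # not a complete System Exclusive message
    if message[1] != 0x7D:
        return None  # some other manufacturer's SysEx
    pitch, velocity = message[2], message[3]
    syllable = message[4:-1].decode("ascii")
    return syllable, pitch, velocity


vocal = bytes([0xF0, 0x7D, 60, 5, 0x6C, 0x6F, 0xF7])  # syllable "lo" at C4
note_on = bytes([0x90, 60, 64])                       # ordinary note-on

print(parse_vocal_sysex(vocal))
print(parse_vocal_sysex(note_on))
```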
