It's seriously just a modified MIDI synth. It's not much more complex than, say, Mario Paint Composer. It uses a set of sampled sounds, stored in its own soundfont-like format, that replicate a singer's voice across a scale. A more accurate comparison would be a 1985 Toyota Corolla and a 1992 Toyota Corolla: there were some improvements, modifications, and tweaks under the hood, but they run on the same principles and even share the same engine, just in a slightly different configuration.
It's not even a text-to-speech system. Each sound is named for its phoneme and note. When you compose, it finds the file that matches the lyric's phoneme and the note on the staff, then plays it, looping part of the sample to hold it for longer notes. It's the same principle as a conventional MIDI synthesizer, which finds the correct note of the correct instrument and puts it in place.
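To make that lookup concrete, here's a rough Python sketch of the idea as I've described it. This is not Vocaloid's actual code or file format; `Sample`, `render_note`, `bank`, and the loop markers are all names I made up for illustration, and the "audio" is just a list of numbers.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    frames: list     # audio frames (plain numbers here, purely illustrative)
    loop_start: int  # index where the sustain loop begins (assumed layout)
    loop_end: int    # index one past the end of the sustain loop


def render_note(bank, phoneme, note, length):
    """Return `length` frames for the sample keyed by (phoneme, note)."""
    sample = bank[(phoneme, note)]  # find the sound that matches lyric + pitch
    loop = sample.frames[sample.loop_start:sample.loop_end]
    if not loop:
        raise ValueError("sample has no sustain loop")
    # Play the attack and one pass of the loop, cut short for brief notes.
    out = sample.frames[:sample.loop_end][:length]
    # Repeat the loop segment until the note has been held long enough.
    while len(out) < length:
        out.extend(loop[:length - len(out)])
    return out


# Hypothetical one-entry bank: the syllable "ka" sung at C4.
bank = {("ka", "C4"): Sample(frames=[1, 2, 3, 4, 5], loop_start=2, loop_end=4)}

held = render_note(bank, "ka", "C4", 9)
print(held)  # [1, 2, 3, 4, 3, 4, 3, 4, 3]
```

Swap the `("ka", "C4")` key for `("piano", "C4")` and you have an ordinary sample-based synth, which is the whole point.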
I mean, I wish I could say it was revolutionary, but honestly, it just isn't. It's a MIDI synthesizer with somewhat unique soundfonts. I'm fairly certain it isn't even the first to have vocal soundfonts.
Now, if it generated the voices on its own (it uses samples of Japanese singers), I would be impressed. If it smoothed out changes in pitch the way a real voice does, I would be impressed. If it could articulate accurately, I would be impressed. Those would put it on track to a Sharon Apple. But all it does is beep like a MIDI synth, with the benefit of having vocal-like sounds ("vocaloid" would be the proper etymology) instead of sounding like The Legend of Zelda. That's old technology. Even current TTS programs are more advanced. I mean, they articulate.
The only reason this one succeeds is the characters associated with it. This isn't even the first release of the program. The current release is Vocaloid 2, which is four years old. The original failed to garner attention because it didn't have the characters or the "iconic" Miku voice.
I guess it could be worse. Could be autotuned S. Hawking.