Synthetic Voice-Overs: 'Hal' Has Arrived - But Where's He Going?
By Dave Courvoisier
Voice Actor & TV News Anchor
When I walked out of a movie theatre in the summer of 1968, having just seen 2001: A Space Odyssey, I had lots of questions.
Sci-Fi was MY genre, and yet here was a movie that just befuddled my 16-year-old Illinois farm-kid brain.
No matter, really, because I was a “Hal 9000” fan evermore. The soft, comforting tones of the spaceship’s onboard computer (voiced by Douglas Rain) stole the show for me.
Unbelievable! A computer that sounds more human than most humans!
Well, 2001 has come and gone … heck, even George Orwell’s 1984 is now a distant memory. And no plot-lines from those landmark stories have come true (ACLU claims notwithstanding).
Likewise, no computer-generated voice like Hal’s exists either. Or … does it?
Visit the Loquendo.com web site some day, and be amazed by the emotion, phrasing, and endearingly human qualities of their sample synthetic voices.
Most who visit say they’ve never heard a computer-generated voice sound that good.
To a lesser extent, Lessactech.com is also experimenting with speech that approximates the intonation and pacing of the human voice.
TWEET THE TECH
None of this should come as a surprise to anyone who’s watching recent trends in technology.
Do a Twitter search for the word “voiceover” and you’ll see a healthy number of tweets discussing the new functionality Apple has designed into its software, especially a feature of the iPhone 3GS called, appropriately, VoiceOver.
People who are visually impaired love the program because it reads the words on the screen out loud. Apple calls it a “spoken English interface.”
(Editor's note: Also hit the search engines for "Computer Voice." Quite an education.)
Amazon made similar capabilities available with the release of its second-generation Kindle electronic book reader.
Virtually any content held in memory on the device can be read out loud by software called “Read to Me.” Granted, the quality of the sound is quite mechanical, anonymous, flat, unemotional, and plodding … but it’s understandable … and don’t think for a minute the software designers and engineers working in this area are just going to leave it where it sits.
In fact, drilling down into this issue forces to the surface an intersection of creativity, marketing, and price-point that voice-actors can’t afford to ignore in the long term.
The innovation that produces a human-sounding voice from a computer - with all the lilts, nuances, and timing that make it genuine - requires perhaps as much technical artistry from the software engineer as it does experience from the voice actor.
What market forces will eventually force some clients and vendors to choose the “fake” voice over the human voice?
Some believe the writing is on the wall.
One observer told me that if the price point comes down by half, or if the quality goes up by another 20% - or even if the application to convert text to voice becomes easier, thus saving time and money for the ‘producer’ - then voice acting price arbitrage will open up to synthesized voices for sure.
Most voice actors feel that the encroachment of synthetic voices will hit the industrial/corporate market first, and that audiobook publishers will be the longer hold-outs.
MAKE THEIR OWN
Professional voice over artist Peter Drew said it this way:
Quite a strong contingent of audiobook listeners and narrators believe a computer voice will never replace a human one for capturing the delicate spirit of the spoken word.
Many point out that listeners of long-form narrations don’t tolerate compression or other forms of audio processing, citing “ear fatigue.”
Wouldn’t a long-form synthesized voice hold similar challenges?
CONSIDER MUSIC INDUSTRY
Analogies to the music industry might serve this discussion.
Take, for instance, the many imitation sounds engineered into some electronic keyboards today. The audiophile can discern the difference, but most average music listeners can’t, and don’t much care, if it means a less-expensive download for their iPod.
Vegas stage shows all used to have live orchestras, but now most musicians have a hard time finding work on the Strip. In fact, the electronic equivalent of human-generated music gained a foothold as a genre and a market all its own many years ago.
MARKET WILL DECIDE
So, could a computer-synthesized voice approach the anguish in the voice of a jilted lover, or a woman giving birth?
The jury’s still out, but technical innovators tinkering with ever more sophisticated mathematical formulas, and honing artificial intelligence programs to a sharp edge, will keep trying.
My guess is we’ll hear synthetic voices that rise to the level of chess-playing software in their ability to innovate, learn, and approximate human nuances. That’s when market forces will determine whether it’s worth the cost to customers.
In the meantime, whenever I hear someone say my name, “Dave,” I hear echoes of Hal, and wonder if it’s a real human, or just a computer-gone-wild.
ABOUT DAVE ...
Dave Courvoisier (“pronounced just like the fine cognac, only no relation”) is an Emmy Award-winning broadcaster, writer, producer, voice actor, and the main weeknight news anchor on KLAS-TV, Channel 8, the Las Vegas CBS affiliate. Savvy with technology and social media, he also writes “Voice-Acting in Vegas,” an informative and entertaining daily blog of voice acting adventures and industry observations.