Text-To-Speech

Last week I put up a link to an audio file I made out of the pope’s new encyclical, Deus Caritas Est, using a text-to-speech program.

I also suggested on Catholic Answers Live that a person might want to read the Catechism off the Vatican’s web site using such a program to help one get through it.

The result of these actions was that I got a number of requests for info about what program I use, how much it costs, etc.

I’ve blogged about this before, but it seemed opportune to hit this again, so here goes:

The program I use is called TextAloud. It’s produced by the folks at NextUp.Com, and it costs about $30. You can also download a trial version for free.

One of the nice things about TextAloud is that you can buy different voices to go with it, and some of the voices they have these days are REALLY cool.

The best voices currently are the AT&T Natural Voices, which sound so good that I suspect they are reverse-engineered from individual people. The two basic Natural Voices are known as Crystal and Mike. They come with the pack that you need to order to use Natural Voices. This pack costs $25 or $45 depending on the quality you want the voice to have (8 kHz vs. 16 kHz).

Incidentally, you can download both TextAloud and the AT&T Natural Voices online from the NextUp site. You don’t have to wait for CDs to ship, so you can be up and running with these programs in next to no time.

Personally, I use AT&T Natural Voice Mike (16 kHz) most of the time. If you want to hear what he sounds like, listen to THE POPE’S ENCYCLICAL or, if you don’t want to download 17 MB, then listen to THIS ADAPTATION I DID OF EDGAR ALLAN POE’S "THE RAVEN."

One of the nice things about TextAloud is that the current version integrates a plug-in for the Firefox web browser so that you can have it read web pages without having to copy and paste them into TextAloud. In fact, you can use your cursor to select specific text on a web page so that the program won’t read stuff on the page that you aren’t interested in. (HINT: Have it read the "printer friendly" version of a web page to eliminate even more junk.)

I do this all the time and, in fact, it’s the principal way that I get my news. I have Mike read me a bunch of printer-friendly news stories every day.
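For the tinkerers: the "read only the good stuff" idea can be sketched in a few lines of Python. This is NOT how TextAloud or its Firefox plug-in works internally (I have no idea how they do it). It is just a minimal illustration, using Python's standard html.parser module, of pulling the readable text out of a page while skipping the junk. The SKIP_TAGS list and the readable_text name are my own assumptions for the example.

```python
from html.parser import HTMLParser

# Tags whose contents are not worth reading aloud.
# This list is an assumption for the sketch, not a standard.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class ReadableText(HTMLParser):
    """Collect visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []     # pieces of text we decided to keep

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def readable_text(html):
    """Return the keepable text of an HTML string, one chunk per line."""
    parser = ReadableText()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    page = """
    <html><head><style>p { color: red; }</style></head>
    <body>
      <nav>Home | News | Sports</nav>
      <p>Story paragraph one.</p>
      <script>trackVisitor();</script>
      <p>Story paragraph two.</p>
      <footer>Copyright notice</footer>
    </body></html>
    """
    # Prints only the two story paragraphs.
    print(readable_text(page))
```

The text that comes out is what you would hand to whatever text-to-speech program you use. A printer-friendly page simply gives a filter like this less junk to wade through in the first place.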

TextAloud will also read a file into .mp3 format, and you can control the speed of the reading. It doesn’t read out loud in this mode, so it can go really, really fast: Mike read the pope’s encyclical to .mp3 in a couple of minutes on my computer, but the resulting file is about an hour and a half of listening time.

You can then listen to the .mp3 on your computer or your portable player (think: iPod).
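If you're curious how a spoken-word .mp3 of that length stays so small, the arithmetic is easy to sketch. The figures below are the rough ones from this post (about 17 megabytes, about 90 minutes), so treat the result as a ballpark. The function names and the 32 kbps setting in the second example are hypothetical, picked only for illustration, and I'm counting a megabyte as 1,000,000 bytes.

```python
def average_bitrate_kbps(size_megabytes, duration_minutes):
    """Average bitrate in kilobits per second (1 MB taken as 1,000,000 bytes)."""
    bits = size_megabytes * 1_000_000 * 8
    seconds = duration_minutes * 60
    return bits / seconds / 1000


def file_size_megabytes(bitrate_kbps, duration_minutes):
    """Approximate file size in MB for a constant-bitrate recording."""
    bits = bitrate_kbps * 1000 * duration_minutes * 60
    return bits / 8 / 1_000_000


if __name__ == "__main__":
    # Rough figures from the post: ~17 MB for ~90 minutes of speech.
    print(f"{average_bitrate_kbps(17, 90):.1f} kbps")   # 25.2 kbps

    # Hypothetical setting: the same 90 minutes encoded at 32 kbps.
    print(f"{file_size_megabytes(32, 90):.1f} MB")      # 21.6 MB
```

So the encyclical file works out to roughly 25 kilobits per second, a low bitrate that is fine for speech but that you wouldn't want for music.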

Incidentally, if you haven’t yet joined the .mp3 revolution then you should know that you probably already have joined it without realizing it. Y’see: Virtually every computer sold these days already plays .mp3s. Windows Media Player, QuickTime, iTunes, RealPlayer (WARNING! Evil software application!), and countless others all play .mp3s. Since virtually every computer sold these days comes with at least one of these programs pre-loaded, you may well have clicked on a web audio link and heard an .mp3 file without even realizing you were listening to one.

Which is a long-winded way of saying: Don’t be intimidated by .mp3s if you haven’t consciously used them yet. Unless you bought your computer back in the Cenozoic Era, you’ve already got what you need to listen to them, so go ahead and start using them consciously.

Practice by clicking the above link to "The Raven."

So: Hope that helps, and happy text-to-speech-ing!

Author: Jimmy Akin

Jimmy was born in Texas, grew up nominally Protestant, but at age 20 experienced a profound conversion to Christ. Planning on becoming a Protestant seminary professor, he started an intensive study of the Bible. But the more he immersed himself in Scripture the more he found to support the Catholic faith, and in 1992 he entered the Catholic Church. His conversion story, "A Triumph and a Tragedy," is published in Surprised by Truth. Besides being an author, Jimmy is the Senior Apologist at Catholic Answers, a contributing editor to Catholic Answers Magazine, and a weekly guest on "Catholic Answers Live."

16 thoughts on “Text-To-Speech”

  1. I also use TextAloud with the AT&T voices and you’re right, the technology has come a long way. I have listened to a bunch of e-texts and books and am pretty amazed at how little it stumbles when it comes to words it doesn’t recognize. Even with SF and Fantasy with made-up names it does a credible job of interpreting them.
    Your readers might like to know that the AT&T voices cost extra and are not included with the basic program. The included free voices are acceptable, but the AT&T Natural Voices are definitely worth the added cost.

  2. which sound so good that I suspect they are reverse-engineered from individual people
    I could be wrong, but I think the voices in these types of programs have always been reverse-engineered from human voices. I think it was always just a matter of how they get processed (both in the reverse-engineering and when making new words from the sounds you have).
    Hey, anybody remember Dr. Sbaitso? He was cool (talking chatter box psychiatrist from back in the days of 8-bit sound cards.)

  3. The AT&T system has a huge database of prerecorded utterances from a human speaker, which it chops up and recombines to form new utterances. So I believe some common phrases might be completely canned. I don’t know the nuts and bolts of how the voices in the old systems got synthesized, but they didn’t have enough memory to store even a minute of audio, so they must have done something else: convert the words into sequences of phonemes, and then maybe the phonemes were recorded or maybe they were synthesized too.

  4. Most Linux boxen no longer play mp3s by default because of the legality of that being fuzzy. I’m not nitpicking; I just wanted to point out the stupidity of that. It doesn’t hold anybody back from playing them–SuSE allows an update to remedy the situation even though they won’t put the software on their CDs, and there are dozens of applications for Linux that can play them and/or convert them, including a less-evil but CPU-hogging version of Real Player for Linux, as well as cool open source programs.
    I prefer .ogg files; they’re an open source format and sound better anyway. Some people even install a special Linux-based “OS” on their iPods for extra functionality and to play their .ogg files. My husband SO wouldn’t let me try to do that, tho! 😉
    Anyone in the open source world might want to try out “festival” (text-to-speech application). I played with it for the first time last night because of this article, just to see how it was. It also comes with a text2wave component to record a .wav. I’m not done researching all of the possible front ends and tie-in programs, but I imagine there are dozens.
    I had festival render The Raven for me last night and it wasn’t too bad. Jimmy’s is probably better overall, though festival didn’t stumble on some consonant combinations that TextAloud did (e.g. the “mb” in “chamber” sounds better).
    For the curious/geeky types:
    Festival Sample MP3
    It always amazes me, the sheer number of open source projects out there!

  5. what’s evil about Real’s player? i use it to watch EWTN TV-feeds all the time…
    I just remember it being obnoxious, like Quicktime, under Windows, with its Start Center and nags and connections behind your back. In Linux it’s not evil; it just …….Buffering………
    …hogs a ton of CPU.
    MPlayer is hands down the best app I’ve tried in any OS–it can play everything I throw at it, and it comes with a browser plugin, and Mencoder that can convert just about any video format under the sun. I’d like to see Windows users have a version comparable to that for Linux someday. There was already an attempt to get it out of production, but it lives on, since the EU wisely decided that software patents are ridiculous. (Under them, you can patent something like a Progress Bar or a File Manager, meaning that even if you wrote your own Progress Bar from scratch, you were violating someone’s Progress Bar patent. And yes, someone patented the Progress Bar already. If you have any apps with a progress bar, and I know you all probably do, you’re technically violating a patent.)

  6. Oh, here’s something else I hadn’t seen. This is cross-platform (works with Java) and also integrates with Firefox by way of a Firefox extension (downloaded separately). It’s based on FreeTTS and is open source.
    CLC-4-TTS
    It should run on any system that has Java, from the looks of it. (…off to check it out)

  7. MPlayer is definitely a great program. I do tend to use Xine a lot more though since I like it better for DVD playback (it supports DVD menus).
    As a matter of fact Xine is catching up with MPlayer if you count all the plugins that are available for it.

  8. Yea, Xine is very good too. But I have an Athlon 64-bit machine, so to make it play some things, I’d need a 32-bit version on my 64-bit machine, and installing the 32-bit version would break too many other 64-bit programs. I was able to compile Mplayer to be 32-bit without breaking anything else that I’m aware of. I thought maybe extending my Xine with pitfdll (either 32-bit or 64-bit) might bring luck, but no joy so far. Next time I install an OS, I’m just going to spare myself and install a 32-bit version. They don’t seem to run much slower on here, anyway. 🙂

  9. Exactly! I debated whether to go with 32 or 64 bits and I finally went with 32 bits for better compatibility. I don’t think 64 bits is justified for most people.
    As for Xine, it is possible to compile it for a 64-bit machine without breaking anything… but it is a crippled version that wouldn’t be worth the trouble. The developers are working on it though.
    Anyway, I think I’ve hijacked this comment thread long enough.

  10. While TTS is fast, it’s been a few days now. Has anyone done a proper recording with a human reading yet? If not, I’ve been contemplating doing a “Deus Caritas Est” podcast…

  11. Yea, I’d not recommend 64-bit machines even to most Windows users. Not to hijack the combox but this is a good place to say it. I tried to tell my brother how unhappy I was with it but he bought one anyway, and now he’s frustrated too, even using 64-bit Windows. Lots of vendors just don’t make 64-bit drivers for hardware, for one thing. He says, “At least I’m ready for the future!” and he sucked it up, and put a 32-bit Windows and Linux on it.
    The good thing is that you *can* put a 32-bit OS on it if you want, and not have these problems.
    The good thing about more people using 64-bit machines is, eventually the drivers and software will come, by virtue of demand.
    I like the speed and I’m willing to pioneer and be counted among the 64-bit users and ask for 64-bit software and drivers, and speed along the conversion of the world to 64-bit, assuming it’ll happen. If you’re like that, and you don’t mind the inconveniences, go for it. But if it were my mom or my cousin Joe Six Pack, I’d recommend them a fast 32-bit machine. My husband has one just as fast as my machine. And I do think I’ll put a 32-bit OS back on here next time I have a hard drive failure–at least a dual-boot. 🙂

  12. Thanks to Jimmy, I’ve been using the trial version of TextAloud recently. I use it for readings from a doctoral class. I stick the files on my iPod and listen on the way to and from work (2 hours of driving each day). It’s been great so far!

  13. I’m with ya, scooter. I had a $50 gift certificate from Amazon, so I sent off for Crystal and Mike. I am mentally salivating over the endless possibilities: the fathers, catechism, online classics, etc. I spend a lot of time in transit as well.
