
Look Who’s Talking: Android Edition

Adding speech to Android apps isn't nearly as difficult as you would think. "Text to Speech" is a snap with the android.speech.tts package.


Text to Speech is but one of many bells and whistles available in the increasingly capable Android platform, built right into the Software Development Kit, or SDK.

In order to include Text to Speech features, an application imports a couple of classes from the android.speech.tts package:

import android.speech.tts.TextToSpeech;
import android.speech.tts.TextToSpeech.OnInitListener;

The TextToSpeech import is the class we’ll use, and OnInitListener is a Java interface that must be implemented in order to initialize an instance of the TextToSpeech class.

The TextToSpeech constructor takes a “Context”, which our Activity satisfies, so we can pass “this”. It also takes an OnInitListener, and because our class implements that interface, we can get away with passing “this” again.

	tts = new TextToSpeech(this,this);

Implementing OnInitListener is straightforward: just add a method named onInit.

	public void onInit(int status) {
		Log.i(tag,"onInit [" + status + "]");
	}
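
The status value is worth more than a log line. The engine starts up asynchronously, and calls to speak made before onInit reports success may simply fail. Here is a minimal sketch, in plain Java so it runs off-device, of one way to hold text until the engine is ready. The Speaker interface and the PendingSpeech class are hypothetical names invented for this illustration, and the SUCCESS constant mirrors TextToSpeech.SUCCESS.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the single TextToSpeech call this sketch needs. */
interface Speaker {
    void speak(String text);
}

/** Holds text until the engine reports that it is ready (illustrative only). */
class PendingSpeech {
    // Mirrors TextToSpeech.SUCCESS so the sketch compiles without the Android SDK.
    static final int SUCCESS = 0;

    private final Speaker speaker;
    private final List<String> pending = new ArrayList<>();
    private boolean ready = false;

    PendingSpeech(Speaker speaker) {
        this.speaker = speaker;
    }

    /** Call this from the Activity's onInit(int status). */
    void onInit(int status) {
        ready = (status == SUCCESS);
        if (ready) {
            // Speak everything that was requested while the engine was starting.
            for (String text : pending) {
                speaker.speak(text);
            }
            pending.clear();
        }
    }

    /** Speaks immediately when ready, otherwise remembers the text for later. */
    void say(String text) {
        if (ready) {
            speaker.speak(text);
        } else {
            pending.add(text);
        }
    }
}
```

In an Activity, the Speaker could be a one-line wrapper around the tts instance, and the Activity's own onInit would forward its status to PendingSpeech.onInit.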

Getting some speech out of your Android device is as simple as passing a String to the “speak” method of an instance of the TextToSpeech class.

    	tts.speak("Say something here", TextToSpeech.QUEUE_FLUSH, null);

The speak method takes three arguments: the string to speak, a flag indicating how the string should be enqueued, and an optional HashMap of parameters for managing how the audio is handled. We pass null here to accept the defaults.

The QUEUE_FLUSH value for the queueMode parameter tells Android to remove anything that may be currently queued for speaking and immediately speak the provided text. The alternative value is QUEUE_ADD, which adds the string to the end of the existing queue of string fragments to be “spoken”.
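
To make the difference between the two queue modes concrete, here is a toy model in plain Java. It is not how Android implements the queue, only a sketch of the behavior described above. SpeechQueueModel is a hypothetical class, and its two constants copy the values of TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD so that it runs without the SDK.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of the speech queue; not the Android implementation. */
class SpeechQueueModel {
    // Same values as TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.
    static final int QUEUE_FLUSH = 0;
    static final int QUEUE_ADD = 1;

    private final Deque<String> queue = new ArrayDeque<>();

    /** Enqueues text, first dropping anything already waiting when flushing. */
    void speak(String text, int queueMode) {
        if (queueMode == QUEUE_FLUSH) {
            queue.clear();
        }
        queue.addLast(text);
    }

    /** Returns a copy of the text still waiting to be spoken, oldest first. */
    List<String> pending() {
        return new ArrayList<>(queue);
    }
}
```

The model leaves out one thing the real engine does: QUEUE_FLUSH also interrupts whatever is being spoken at that moment, not just the items waiting behind it.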

The TextToSpeech reference provides more information on the available “knobs and dials” for controlling speech on the platform. You can read the details of the class for yourself, but some things to consider include:

  • You can select from a handful of different “languages” — you’re not limited to northern California-speak.
  • You can provide and register your own sound resources for particular snippets or phrases of text.
  • You can even say nothing with the playSilence method!

Let’s have a look at a concrete example.

Sample application

If the first programming language you learned was Java, you may be too young to recall a movie about a high school student who stumbles upon a government computer and nearly causes a global nuclear conflict. One of the classic lines in that movie comes when the computer asks, “Shall we play a game?” So I thought it would be appropriate to name this application WarGames, after the movie. By the way, the movie came out in 1983, so if you recall the movie and you’re reading this article, your first programming language probably was not Java!

We want the application to speak any text we provide, so we’ll have two user interface elements for this application: an EditText and a Button. Here is the layout file, main.xml:

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical"
    android:layout_width="fill_parent"
    android:layout_height="fill_parent"
    >
<EditText
    android:layout_width="fill_parent"
    android:layout_height="wrap_content"
    android:text="Shall we play a game?"
    android:id="@+id/inputBox"
    />
<Button
    android:layout_width="fill_parent"
    android:layout_height="wrap_content"
    android:text="Speak Now!"
    android:onClick="speakNow"
    />
</LinearLayout>

Note that the text defaults to the familiar “Shall we play a game?” phrase, a clip of which is available from Movie Sounds Central.

Shall we play a game?

Here is the code for the one and only Activity of the application, WarGamesActivity.

package com.navitend.lm.wargames;

import android.app.Activity;
import android.os.Bundle;
import android.view.View;
import android.speech.tts.TextToSpeech;
import android.speech.tts.TextToSpeech.OnInitListener;
import android.util.Log;
import android.widget.EditText;

public class WarGamesActivity extends Activity implements OnInitListener{

    private String tag = WarGamesActivity.class.getSimpleName();
    private EditText et = null;
    private TextToSpeech tts = null;

    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        et = (EditText) findViewById(R.id.inputBox);
        tts = new TextToSpeech(this,this);

    }

    @Override
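    protected void onDestroy() {
        // Stop any speech in progress and release the engine before the Activity goes away.
        if (tts != null) {
            tts.stop();
            tts.shutdown();
        }
        super.onDestroy();
    }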

    public void speakNow(View v) {
        // Read the text once, log it, then hand it to the speech engine.
        String text = et.getText().toString();
        Log.i(tag, "speakNow [" + text + "]");
        tts.speak(text, TextToSpeech.QUEUE_FLUSH, null);
    }

    // Called by the speech engine once initialization has finished.
    @Override
    public void onInit(int status) {
        Log.i(tag, "onInit [" + status + "]");
    }

}

Some things to note about this code:

  1. We import the TextToSpeech classes.
  2. We implement the OnInitListener interface in the class definition and provide the required onInit() method.
  3. The tts variable is an instance of the TextToSpeech class.
  4. tts is initialized in the onCreate method and cleaned up in the onDestroy method.
  5. The variable “et” is used to gain access to the EditText widget which allows the user to provide text.
  6. The speakNow method invokes the speech engine to speak our desired text.

Now for a little fun.

The text-to-speech engine is pretty good, but keep in mind that not everything will be spoken cleanly. For example, when I ran the application for my son, he said, “How about supercalifragilisticexpialidocious?” No, I did not spell check that and you’re welcome to do so for me…

Well, I tried it and the result wasn’t all that great.

Mary Poppins, where are you?

So, just for fun, I broke the word up a bit and the results were much better. That is something to keep in mind when building your own speech-enabled applications.

Creative license may be required
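
If you find yourself breaking up words by hand more than once, a small lookup table can do it for you before the text reaches the engine. Here is a rough sketch in plain Java. The Respeller class is a hypothetical helper, and the particular way the word is split in the usage note below is only a guess for illustration, not a tested recipe; you would tune each respelling by ear.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that swaps hard-to-pronounce words for respelled versions. */
class Respeller {
    // Word as typed -> version that the speech engine handles better.
    private final Map<String, String> respellings = new LinkedHashMap<>();

    void add(String word, String spokenForm) {
        respellings.put(word, spokenForm);
    }

    /** Returns the text with every registered whole word replaced, ignoring case. */
    String prepare(String text) {
        String result = text;
        for (Map.Entry<String, String> entry : respellings.entrySet()) {
            Pattern wholeWord = Pattern.compile(
                    "\\b" + Pattern.quote(entry.getKey()) + "\\b",
                    Pattern.CASE_INSENSITIVE);
            result = wholeWord.matcher(result)
                    .replaceAll(Matcher.quoteReplacement(entry.getValue()));
        }
        return result;
    }
}
```

For example, after registering "supercalifragilisticexpialidocious" with the spoken form "super cali fragilistic expiali docious", the speakNow method could pass the output of prepare to tts.speak instead of the raw EditText contents.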
