Speech recognition and synthesis


Bots that make and accept calls use automatic speech recognition and synthesis:

  • Automatic Speech Recognition (ASR) is the process of translating speech to text.
  • Text-To-Speech (TTS), or speech synthesis, is the process of generating speech from written text.

When creating a phone channel, you can do either of the following:

  • Select one of the ASR/TTS providers supported by Just AI.
    You can then customize speech recognition and text-to-speech settings in JAICP: select a model for recognition, a specific voice for speech synthesis, etc.

  • Create a connection using your own account registered with the ASR/TTS provider.

    If you use your own connection, the Just AI ASR/TTS limits do not apply to you.

Then, in the script, use the “a” tag or the $reactions.answer method to generate the bot’s replies.
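
As a rough illustration of the $reactions.answer method mentioned above, the sketch below sends a reply from JavaScript. In JAICP the $reactions object is provided by the platform. The stub defined here is only an assumption so that the snippet can run outside JAICP, and sayGreeting is a hypothetical helper, not part of any API.

```javascript
// Stand-in for the platform-provided $reactions object (an assumption, so the
// sketch can run outside JAICP). It only records the replies passed to it.
var replies = [];
var $reactions = {
    answer: function (text) {
        replies.push(text);
    }
};

// Hypothetical helper: in a real script this call would sit inside a script block.
function sayGreeting(name) {
    $reactions.answer("Hello, " + name + "! How can I help you?");
}

sayGreeting("Anna");
console.log(replies[0]); // prints "Hello, Anna! How can I help you?"
```

In a phone channel, the text passed to the method is what the selected TTS provider synthesizes into speech.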

Speech synthesis markup

To make the bot’s speech more expressive, you can use speech synthesis markup. JAICP supports Speech Synthesis Markup Language (SSML), which lets you adjust the tone, pronunciation, speed, and volume of speech, among other things. Learn more about SSML in Speech synthesis markup.
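
To give a feel for what SSML markup looks like, here is a minimal sketch that builds a marked-up reply string in JavaScript. The break and prosody elements come from the W3C SSML specification. Whether a particular tag is honored depends on the TTS provider you selected, so treat this as an assumption to check against the provider’s documentation. The helpers withPause and withRate are hypothetical names introduced only for this example.

```javascript
// Hypothetical helper: joins two phrases with an SSML pause between them.
function withPause(first, second, pauseMs) {
    return first + ' <break time="' + pauseMs + 'ms"/> ' + second;
}

// Hypothetical helper: wraps text in an SSML prosody element to change the speaking rate.
function withRate(text, rate) {
    return '<prosody rate="' + rate + '">' + text + '</prosody>';
}

var reply = withPause(
    "Your order is confirmed.",
    withRate("Thank you for calling!", "slow"),
    500
);
console.log(reply);
// prints: Your order is confirmed. <break time="500ms"/> <prosody rate="slow">Thank you for calling!</prosody>
```

The resulting string could then be used as the text of a bot reply.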

Speech synthesis with variables

You can also use speech synthesis with variables when the bot’s replies need to include context-dependent values that come up throughout the dialog. For more information, see Speech synthesis with variables.