Dejan Lukić

How to Create Multilingual Voice Call Transcription using Twilio Studio

Originally written for Twilio.

Hello, Hola, Bonjour, Привет, Zdravo, Ahoj, reader! In this guide, you will learn how to handle caller input using Twilio Studio, then, based on the selected digits, transcribe and translate what the caller said.

You will use Twilio Studio to build a call flow that transcribes and translates what the caller says, so that language barriers no longer get in the way of handling multilingual calls.

Imagine a scenario where you receive voice calls from individuals speaking different languages. Understanding and documenting the details of these calls accurately can be daunting. However, with Twilio Studio and the Google Translation API, you can effortlessly bridge language gaps and gain valuable insights from your voice interactions.

Prerequisites

To get started, you will need the following:

Create Twilio Functions Services

In this section, you will create two Twilio Functions to handle call recording and voice transcription. Twilio Functions is a serverless environment, enabling you to deploy backend functions without dealing with servers.

Voice recording Function Service

Head to the Functions and Assets section in the Twilio Console to create a new Functions service. Press the Create Service button. Services are containers for your environments.

In the Service Name field, enter "translation-service". Then, press the Next button, which will take you to the Service editor.

Press the Add + button at the top left of the page, and from the dropdown menu, select Add Function. Name the Function "/call-handler".

The code editor on the right will populate with boilerplate code. Remove it and replace it with the following:

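exports.handler = function (context, event, callback) {
   // Build the TwiML voice response returned to Twilio
   const twiml = new Twilio.twiml.VoiceResponse();
   console.log('[Event] Call recording started');
   // Record the caller and send the transcription to /call-processing
   twiml.record({
      transcribeCallback: '/call-processing'
   });
   return callback(null, twiml);
};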

Press the Save button underneath to save the Function.

This Function uses the <Record> verb, which records the caller's voice and passes the transcription of the call to the /call-processing route through the transcribeCallback attribute.
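The /call-processing Function reads event.selectedLanguage, so the digit the caller pressed has to travel with the transcription callback. One possible way to do that is to append it to the callback path as a query parameter. This is only a sketch: the helper name buildTranscribeCallback is hypothetical, and it assumes that the Studio Flow passes the digit to /call-handler as a parameter named selectedLanguage and that query parameters on the callback URL appear on the event object of the receiving Function.

```javascript
// Hypothetical helper: attach the caller's digit to the callback path
// so the receiving Function can read it as event.selectedLanguage.
function buildTranscribeCallback(path, selectedLanguage) {
  // URLSearchParams takes care of URL-encoding the value
  const params = new URLSearchParams({
    selectedLanguage: String(selectedLanguage),
  });
  return `${path}?${params.toString()}`;
}

console.log(buildTranscribeCallback('/call-processing', '2'));
// prints "/call-processing?selectedLanguage=2"
```

Inside /call-handler, the returned string could be used as the transcribeCallback value in place of the fixed path.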

Transcription and translation Function Service

Press the Add + button at the top left of the page again and name this new Function "/call-processing".

The code editor on the right will populate with boilerplate code. Remove it and replace it with the following:

require('dotenv').config();
const axios = require('axios');
const GOOGLE_API_KEY = process.env.GOOGLE_API_KEY;
const GOOGLE_TRANSLATE_ENDPOINT = 'https://translation.googleapis.com/language/translate/v2';

Now, add a languageMap object to the same Function. It maps each digit the caller can enter to the ISO 639-1 language code of the translation target.

const languageMap = {
   1: 'sr', // Digit 1: Serbian
   2: 'fr', // Digit 2: French
   3: 'es', // Digit 3: Spanish
};
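If the caller presses a digit that is not in the map, or no digit reaches the Function at all, the lookup returns undefined and the translation request would be sent without a valid target. The sketch below shows one way to guard against that. The function name resolveLanguage and the fallback to English ('en') are assumptions made for illustration, not part of the original Function.

```javascript
const languageMap = {
  1: 'sr', // Serbian
  2: 'fr', // French
  3: 'es', // Spanish
};

// Hypothetical helper: look up the target language for the pressed digit.
// Object keys are strings in JavaScript, so both 2 and '2' match the entry.
function resolveLanguage(digits, fallback = 'en') {
  const known = Object.prototype.hasOwnProperty.call(languageMap, digits);
  return known ? languageMap[digits] : fallback;
}

console.log(resolveLanguage('2')); // prints "fr"
console.log(resolveLanguage('9')); // prints "en"
console.log(resolveLanguage(undefined)); // prints "en"
```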

Lastly, add a Function handler that will perform the translation.

exports.handler = async (context, event, callback) => {
   console.log('[Event] Transcription started.');
   // Look up the target language for the digit the caller pressed
   const language = languageMap[event.selectedLanguage];
   // Translate only once Twilio reports a completed transcription
   if (event.TranscriptionStatus === 'completed') {
      const transcription = event.TranscriptionText;
      console.log(`[Transcription] ${transcription}`);
      try {
         // Send the transcription to the Google Translation API
         const response = await axios.post(
            `${GOOGLE_TRANSLATE_ENDPOINT}?key=${GOOGLE_API_KEY}`,
            {
               q: transcription,
               target: language,
            },
            {
               headers: {
                  'Content-Type': 'application/json',
               },
            }
         );
         console.log(
            `[Translated transcription - ${language}] ${response.data.data.translations[0].translatedText}`
         );
         return callback(null);
      } catch (error) {
         console.error(error);
         return callback(error);
      }
   }
   // End the invocation when there is nothing to translate
   return callback(null);
};
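The handler above reaches straight into response.data.data.translations[0].translatedText, which throws if the response has an unexpected shape. The following sketch separates building the request body from reading the response so that both parts can be tried locally without calling the API. The helper names are hypothetical, and the format: 'text' field is an assumption added here (the request in this guide omits it); it asks the v2 API to treat the input as plain text rather than HTML.

```javascript
// Hypothetical helper: build the JSON body for a translation request.
// `format: 'text'` is an addition for illustration, not part of the guide.
function buildTranslateBody(text, target) {
  return { q: text, target, format: 'text' };
}

// Hypothetical helper: read the first translated string from a response
// body, returning null instead of throwing when the shape is unexpected.
function extractTranslation(body) {
  const translations = body && body.data && body.data.translations;
  if (!Array.isArray(translations) || translations.length === 0) {
    return null;
  }
  return translations[0].translatedText ?? null;
}

// Sample body shaped like the one the handler reads from response.data
const sampleBody = { data: { translations: [{ translatedText: 'Bonjour' }] } };

console.log(buildTranslateBody('Hello', 'fr'));
console.log(extractTranslation(sampleBody)); // prints "Bonjour"
console.log(extractTranslation({})); // prints null
```

With axios, the object passed to extractTranslation would be response.data.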

Press the Save button underneath to save the Function.

Add dependencies

On the left panel in the editor, under Settings, press Dependencies.

In "Module", enter "axios", and in "Version", enter "latest". Then hit Add, and you should see "Adding dependency axios@latest" in the logs.

Twilio Console showing Dependencies section

Add environment variables

Under the same panel, press Environment Variables. Here, you will add your Google Cloud Translation API key.

In "Key" enter "GOOGLE_API_KEY", and in "Value" paste your API key.

Twilio Console showing Environment Variables
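Twilio Functions also exposes environment variables on the context object that is passed to each handler. As a sketch of how a missing key could be caught early, the hypothetical helper below reads GOOGLE_API_KEY from a context-like object and fails with a readable message when it is absent. The fakeContext object only stands in for the real context so that the example runs on its own.

```javascript
// Hypothetical helper: read the API key from the Function's context
// and fail with a clear message if it was never configured.
function getGoogleApiKey(context) {
  const key = context.GOOGLE_API_KEY;
  if (typeof key !== 'string' || key.trim() === '') {
    throw new Error('GOOGLE_API_KEY is not set in Environment Variables');
  }
  return key.trim();
}

// Stand-in for the context object Twilio passes to a handler
const fakeContext = { GOOGLE_API_KEY: 'example-key' };

console.log(getGoogleApiKey(fakeContext)); // prints "example-key"
```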

Deploy the Functions

Press Deploy All on the bottom-left part of the page. The building process will start, and you will see the "Deployed to environment" message in the logs if no issues arise.

Build Status shown in the Twilio Console

Create a new Twilio Studio Flow

You'll need to create a Twilio Studio Flow to handle caller input and call the previously created Functions. A Flow is a workflow built from connected widgets that implements one or more use cases.

Open the Studio section in the Console and press Create new Flow on the right-hand side.

Assign a name for the Flow, for example, "Language Selector" and click Next. Then, select Start from scratch.

You will be presented with the Studio Editor.

Twilio Studio Editor

From the Widget Library on the right, under Voice, drag and drop Gather Input on Call onto the canvas. Then, connect the Trigger widget to gather_1 by pressing Incoming Call and drawing a line to the gather_1 input.

A Studio Flow showing the initial gather widget

From the Widget Library, under Tools & Execute Code, drag and drop Run Function beneath the gather_1 widget and connect the User Pressed Keys transition with the input of function_1.

A Studio Flow showing the gather and function widgets

If the Function fails to execute, you can send the caller back to the menu by connecting the Fail transition to the input of gather_1, which creates a loop.

A Studio Flow showing the gather and function widget

Now, to set up the blocks, press gather_1 and enter the following within the Text to say input on the right side:

Hi, Twilio! Press 1 for Serbian, press 2 for French, press 3 for Spanish. Thanks!

This will be played when the caller joins the call. Next, you can select the original language of the call and the voice that will be played. Press Save when finished.
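The spoken menu and the languageMap object in /call-processing have to agree on which digit means which language. As an optional sketch, both could be generated from a single list so they cannot drift apart. The menu array and the helper names buildPrompt and buildLanguageMap are hypothetical; in this guide the prompt is typed into the widget by hand.

```javascript
// Hypothetical single source of truth for the language menu
const menu = [
  { digit: 1, name: 'Serbian', code: 'sr' },
  { digit: 2, name: 'French', code: 'fr' },
  { digit: 3, name: 'Spanish', code: 'es' },
];

// Build the sentence read to the caller by the Gather widget
function buildPrompt(options) {
  const choices = options
    .map(({ digit, name }) => `press ${digit} for ${name}`)
    .join(', ');
  // Capitalize the first letter of the list of choices
  const sentence = choices.charAt(0).toUpperCase() + choices.slice(1);
  return `Hi, Twilio! ${sentence}. Thanks!`;
}

// Build the digit-to-language-code map used by the translation Function
function buildLanguageMap(options) {
  return Object.fromEntries(options.map(({ digit, code }) => [digit, code]));
}

console.log(buildPrompt(menu));
console.log(buildLanguageMap(menu));
```

The first line printed can be pasted into the Text to say input, and the second shows the matching map.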

Gather widget configuration in Twilio Studio

Lastly, you need to set up the Function. To do so, do the following:

  1. Press the function_1 widget.
  2. Under Service, select translation-service.
  3. Under Environment, select ui.
  4. Under Function, select /call-handler.
  5. Hit the Save button.

That's it for the Flow setup. Click Publish at the top to save your Flow. Next, you must configure the phone number so that incoming calls trigger the Flow.

Configure the Twilio phone number

In a new tab, open Twilio phone numbers in the Console. Select the phone number you want to use and press on it to open the Configure tab.

In the Voice Configuration section, configure it with the following:

Scroll to the bottom and click Save configuration. The phone number is now configured to execute the Flow on all incoming calls.

Test the app

Head back to the Functions editor and press the Live logs off toggle on the right-hand side to enable live logs.

Call the Twilio phone number you configured for the Flow from your own phone. Note that you will incur outbound call fees and, if you are abroad, roaming fees.

After the trial message, you'll hear the message you've set in the Gather Input widget. Afterward, a beep will play, after which you can speak.

Your original and translated transcription will take a few seconds to appear in the Console.

Call transcription in Twilio Console

That's it. Честитам! Congratulations! Félicitations! ¡Felicidades! That's how you translate transcriptions with Twilio Functions and handle caller input with Twilio Studio Flows.

Warning: Recording calls is illegal in some jurisdictions. Some may require the prior consent of all participating parties. You must adhere to all applicable laws regarding call recording. This is not legal advice, and Twilio is not responsible for your actions. Please consult a lawyer before proceeding.
