How to Create Multilingual Voice Call Transcription using Twilio Studio
Originally written for Twilio.
Hello, Hola, Bonjour, Привет, Zdravo, Ahoj, reader! In this guide, you will learn how to handle caller input using Twilio Studio, then, based on the selected digits, transcribe and translate what the caller said.
You will use Twilio Studio to collect the caller's language choice and Twilio Functions to record the call, transcribe it, and translate the transcription into the chosen language, so that a language barrier no longer stands in the way of understanding a call.
Imagine a scenario where you receive voice calls from individuals speaking different languages. Understanding and documenting the details of these calls accurately can be daunting. However, with Twilio Studio and the Google Translation API, you can effortlessly bridge language gaps and gain valuable insights from your voice interactions.
Prerequisites
To get started, you will need the following:
Create Twilio Functions Services
In this section, you will create two Twilio Functions to handle call recording and voice transcription. Twilio Functions is a serverless environment, enabling you to deploy backend functions without dealing with servers.
Voice recording Function Service
Head to the Functions and Assets section in the Twilio Console to create a new Functions service. Press the Create Service button. Services are containers for your environments.
In the Service Name field, enter "translation-service". Then, press the Next button, which will take you to the Service editor.
Press the Add + button at the top left of the page, and from the dropdown menu, select Add Function. Name the Function "/call-handler".
A code editor on the right will populate with a boilerplate. Remove any code inside and replace it with the following:
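exports.handler = function (context, event, callback) {
  // Build the TwiML voice response returned to Twilio
  const twiml = new Twilio.twiml.VoiceResponse();
  console.log('[Event] Call recording started');
  // Record the caller; Twilio sends the transcription to /call-processing
  twiml.record({
    transcribeCallback: '/call-processing',
  });
  return callback(null, twiml);
};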
Press the Save button underneath to save the Function.
This Function uses the <Record> verb, which records the caller's voice and passes the transcription of the call to the /call-processing route through the transcribeCallback attribute.
Transcription and translation Function Service
Press the Add + button at the top left of the page, select Add Function, and name this new Function "/call-processing".
A code editor on the right will populate with a boilerplate. Remove any code inside and replace it with the following:
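const axios = require('axios');
// Environment variables set in the Console are available to the Function,
// so dotenv (which is not added as a dependency in this guide) is not required
const GOOGLE_API_KEY = process.env.GOOGLE_API_KEY;
const GOOGLE_TRANSLATE_ENDPOINT = 'https://translation.googleapis.com/language/translate/v2';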
Now, add a languageMap object to the same Function. It maps each digit the caller can enter to the ISO 639-1 language code of the translation target.
const languageMap = {
  1: 'sr', // Serbian
  2: 'fr', // French
  3: 'es', // Spanish
};
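Because the digit arrives as a request parameter, it may be missing or outside the map. The following is a minimal sketch of a guarded lookup; resolveLanguage is a hypothetical helper that is not part of the original Function, and returning null for unknown input is an assumption about how you might want to handle that case.

```javascript
// Same mapping as in the Function above.
const languageMap = {
  1: 'sr', // Serbian
  2: 'fr', // French
  3: 'es', // Spanish
};

// Hypothetical helper: returns the language code for a digit, or null when
// the digit is missing or not one of the offered options.
function resolveLanguage(digit) {
  // Request parameters usually arrive as strings; object keys are strings
  // as well, so '1' and 1 resolve to the same entry.
  const key = String(digit ?? '').trim();
  // hasOwnProperty avoids matching inherited names such as 'toString'.
  return Object.prototype.hasOwnProperty.call(languageMap, key)
    ? languageMap[key]
    : null;
}
```

With a helper like this, the handler could skip the translation request when the result is null instead of sending an undefined target.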
Lastly, add a Function handler that will perform the translation.
exports.handler = async (context, event, callback) => {
  console.log('[Event] Transcription started.');
  // Target language chosen by the caller, looked up by the digit pressed
  const language = languageMap[event.selectedLanguage];
  // Nothing to translate unless Twilio reports a completed transcription
  if (event.TranscriptionStatus !== 'completed') {
    return callback(null);
  }
  const transcription = event.TranscriptionText;
  console.log(`[Transcription] ${transcription}`);
  try {
    // Send the transcription to the Google Translation API
    const response = await axios.post(
      `${GOOGLE_TRANSLATE_ENDPOINT}?key=${GOOGLE_API_KEY}`,
      { q: transcription, target: language },
      { headers: { 'Content-Type': 'application/json' } }
    );
    console.log(
      `[Translated transcription - ${language}] ${response.data.data.translations[0].translatedText}`
    );
    return callback(null);
  } catch (error) {
    console.error(error);
    return callback(error);
  }
};
Press the Save button underneath to save the Function.
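The handler reads the translation from response.data.data.translations[0].translatedText, which throws if the response has an unexpected shape. Below is a minimal sketch of two hypothetical helpers (buildTranslateRequest and extractTranslation are illustrative names, not part of the original code). It assumes the v2 response shape used in the handler, and it sets format to 'text' so that the API treats the input as plain text rather than HTML.

```javascript
// Hypothetical helper: builds the JSON body for the Translation API v2 call.
function buildTranslateRequest(text, target) {
  return {
    q: text,        // text to translate
    target,         // target language code, e.g. 'fr'
    format: 'text', // treat the input as plain text instead of HTML
  };
}

// Hypothetical helper: pulls the first translated string out of a response
// body, or returns null when the expected fields are missing.
function extractTranslation(body) {
  const translations = body && body.data && body.data.translations;
  if (!Array.isArray(translations) || translations.length === 0) {
    return null;
  }
  return translations[0]?.translatedText ?? null;
}
```

With axios, the parsed response body is response.data, so the handler would call extractTranslation(response.data) and log a fallback message when it returns null.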
Add dependencies
On the left panel in the editor, under Settings, press Dependencies.
In "Module" enter "axios", and in "Version" enter "latest". Then, hit Add, and you should see "Adding dependency axios@latest" in the logs.
Add environment variables
In the same panel, press Environment Variables. Here, you will add your Google Cloud Translation API key.
In "Key" enter "GOOGLE_API_KEY", and in "Value" paste your API key.
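Twilio Functions also expose environment variables on the context object passed to the handler. As a minimal sketch, the hypothetical helper below (getGoogleApiKey is an illustrative name) reads the key from context and fails early with a readable message when it is missing, rather than letting the translation request fail later with an authentication error.

```javascript
// Hypothetical helper: reads the API key from the Function's context and
// throws a descriptive error when it has not been configured.
function getGoogleApiKey(context) {
  const key = context && context.GOOGLE_API_KEY;
  if (typeof key !== 'string' || key.trim() === '') {
    throw new Error(
      'GOOGLE_API_KEY is not set. Add it under Environment Variables and redeploy.'
    );
  }
  // Trim stray whitespace that can sneak in when pasting the key.
  return key.trim();
}
```

Inside the handler, this could be called as getGoogleApiKey(context) before building the request URL.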
Deploy the Functions
Press Deploy All on the bottom-left part of the page. The building process will start, and you will see the "Deployed to environment" message in the logs if no issues arise.
Create a new Twilio Studio Flow
You'll need to create a Twilio Studio Flow to handle caller input and call the previously created Functions. A Flow is a workflow built from connected widgets (logic blocks) that handles one or more use cases.
Create a new Flow from the Studio section in the Console. Press Create new Flow from the right-hand side of the Console.
Assign a name for the Flow, for example, "Language Selector" and click Next. Then, select Start from scratch.
You will be presented with the Studio Editor.
From the Widget Library on the right, under Voice, drag and drop Gather Input on Call onto the canvas. Then, connect the Trigger widget to gather_1 by pressing Incoming Call and drawing a line to the gather_1 input.
From the Widget Library, under Tools & Execute Code, drag and drop Run Function beneath the gather_1 widget and connect the User Pressed Keys transition with the input of function_1.
To handle the case where the Function fails to execute, you can create a loop by connecting the Fail transition to the input of gather_1.
Now, to set up the blocks, press gather_1 and enter the following within the Text to say input on the right side:
Hi, Twilio! Press 1 for Serbian, Press 2 for French, press 3 for Spanish. Thanks!
This will be played when the caller joins the call. Next, you can select the original language of the call and the voice that will be played. Press Save when finished.
Lastly, you need to set up the Function. To do so, do the following:
- Press on the function_1.
- Under Service, select translation-service.
- Under the Environment, select ui.
- Under the Function, select call-handler.
- Hit the Save button.
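The /call-processing Function reads event.selectedLanguage, so the digit collected by gather_1 has to reach it somehow. One possible way to wire this up is sketched below. It assumes you add a Function Parameter to function_1 with the hypothetical key digits and the value {{widgets.gather_1.Digits}}, and that query string parameters on the transcribeCallback URL show up on the event object of the receiving Function. Both are assumptions to verify in your own setup; buildTranscribeCallback is an illustrative helper name.

```javascript
// Hypothetical helper: appends the caller's digit to the callback route as a
// query parameter named selectedLanguage, matching what /call-processing reads.
function buildTranscribeCallback(digits) {
  const params = new URLSearchParams({ selectedLanguage: String(digits) });
  return `/call-processing?${params.toString()}`;
}

// Possible use inside /call-handler (assumes event.digits is set by the
// Function Parameter described above):
//   twiml.record({ transcribeCallback: buildTranscribeCallback(event.digits) });
```

For a caller who pressed 2, the helper produces /call-processing?selectedLanguage=2.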
That's it for the Flow setup. Click Publish at the top to save your Flow. Next, you must configure the phone number so that incoming calls trigger the Flow.
Configure the Twilio phone number
In a new tab, open Twilio phone numbers in the Console. Select the phone number you want to use and press on it to open the Configure tab.
In the Voice Configuration section, set the following:
- For Configure With, select Webhooks, TwiML Bins, Functions, Studio, or Proxy.
- For A Call Comes In, select Studio Flow; next to it, choose Language Selector.
Scroll to the bottom and click Save configuration. The phone number is now configured to execute the Flow on all incoming calls.
Test the app
Head back to the Functions editor and press the Live logs off toggle on the right-hand side to enable live logs.
Call the Twilio phone number you configured for the Flow from your phone. Note that you will incur outbound call fees and, if you are abroad, roaming fees.
After the trial message, you'll hear the message you've set in the Gather Input widget. Afterward, a beep will play, after which you can speak.
Your original and translated transcription will take a few seconds to appear in the Console.
That's it. Честитам! Congratulations! Félicitations! ¡Felicidades! That's how you translate transcriptions with Twilio Functions and handle caller input with Twilio Studio Flows.
Warning: Recording calls is illegal in some jurisdictions. Some may require the prior consent of all participating parties. You must adhere to all applicable laws regarding call recording. This is not legal advice, and Twilio is not responsible for your actions. Please consult a lawyer before proceeding.