Voice AI is becoming an inseparable part of the automation and digital transformation agenda for modern enterprises that want to stay relevant. These enterprises want to offer personalized solutions that keep pace with today’s ever-changing consumer behavior, demands and expectations.
As a pioneer in the conversational voice AI space, Yellow.ai has enabled customers across industries to create highly differentiated, meaningful and fulfilling experiences for their consumers.
The use cases range from customer support, customer engagement and conversational commerce (from the CX point of view) to HR automation and ITSM automation (from the EX point of view).
Yellow.ai is committed to continuous research and development in conversational voice AI and strives to bring in state-of-the-art, cool features that further strengthen and elevate the experiential dimension of customer conversations with our dynamic AI agents.
The Yellow.ai Voice AI Product team has released several key feature upgrades to the existing voice AI offering, all focused on humanizing the voice interactions that dynamic AI agents have with customers.
Before we take a deep dive into those specific features, let’s understand what humanization of voice interactions is and why it’s being sought out.
What is the humanization of voice interactions?
We all prefer interacting with humans over machines. But chatbots come with the significant advantage of being available 24/7, across regions and across languages. They have the ability to quickly fetch information with a click, as well as remember a whole lot of context and information about each and every customer.
Now imagine a scenario where the chatbot sounds human-like and you can converse with it in a free-flowing, natural way, speaking and being understood as humans are. What if the chatbot knows when to start, pause, listen, stop and handle interruptions during a conversation? What if the chatbot is able to understand the sensitivity of the conversation (good or bad) and respond with empathy or reassurance accordingly? That is humanization of voice interactions for us.
Here’s a quick snapshot of some of the cool new features from Yellow.ai Dynamic Voice AI agents that both our existing and new customers can quickly tap into.
- Configuring pre- and post-speech pause duration
Customers speak in diverse ways: they pause, and they use fillers like “ah” and “hmm” during conversations with AI agents. The AI agent needs to be intelligent enough to recognize when the customer has finished responding before it decides the next course of action, fetches further information or asks a pertinent follow-up question. The length of silence the agent waits for after the customer stops speaking is what we call the post-speech pause duration.
For example, suppose the AI agent asks the customer, “Hi Jack, can you please describe when and how this laptop got broken?” The customer will answer in natural language, and the AI agent should be able to tell when the response is complete before it starts responding.
Similarly, there is a pre-speech pause duration, during which the AI agent waits and allows the customer to think before answering the question it asked. If the customer takes too long to answer, the agent can repeat the question or decide on another appropriate course of action.
The pre- and post-speech pause durations are vital to making the interactions between customers and dynamic voice AI agents as natural and human-like as possible.
Yellow.ai’s upgraded Voice AI platform gives our customers the ability to customize and configure both the pre- and post-speech pause durations depending on their business needs and customer personas.
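To make the two pause settings concrete, here is a minimal Python sketch of how such a configuration could drive turn detection. It is an illustration only, not Yellow.ai’s implementation: the names `PauseConfig` and `detect_turn_end`, the default values, and the idea of feeding per-frame speech/silence flags from a voice activity detector are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class PauseConfig:
    """Hypothetical settings; the names and defaults are illustrative only."""
    pre_speech_ms: int = 3000   # how long to wait for the customer to start
    post_speech_ms: int = 800   # silence after speech that ends the turn
    frame_ms: int = 20          # duration of one audio frame


def detect_turn_end(frames: Iterable[bool], cfg: PauseConfig) -> Tuple[str, int]:
    """Scan speech/silence flags (True = speech) and report how the turn ends.

    Returns (event, elapsed_ms), where event is "no_input", "end_of_speech"
    or "stream_ended".
    """
    elapsed = 0
    silence = 0
    heard_speech = False
    for is_speech in frames:
        elapsed += cfg.frame_ms
        if is_speech:
            heard_speech = True
            silence = 0          # any speech resets the silence counter
            continue
        silence += cfg.frame_ms
        if not heard_speech and silence >= cfg.pre_speech_ms:
            return ("no_input", elapsed)        # customer never started
        if heard_speech and silence >= cfg.post_speech_ms:
            return ("end_of_speech", elapsed)   # customer has finished
    return ("stream_ended", elapsed)
```

In this sketch a short mid-sentence pause (shorter than `post_speech_ms`) does not end the turn, while a `"no_input"` result is the point where an agent might repeat its question.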
- Pick your choice of voice and language for your AI agent
In today’s business scenario, customer needs and personas are diverse. It does not make sense to use the same robotic, neutral voice for every response, because that makes the AI agent sound monotonous and less human-like.
Businesses should personalize the voice tonalities of their voice AI agents. For example, business A might prefer a female voice to respond to their audience, whereas business B might prefer a male voice.
With Yellow.ai’s upgraded Voice AI platform, a business can pick and choose the voice and language for its AI agents. The options are diverse: natural male or female human voices, in a specific language, with different tonalities such as cheerful, empathetic, gentle and serious.
This enables our customers to have more personalized and very natural human-like voice interactions and conversations with their end consumers.
Audio samples: US English male, UK English female
- Interruption handling
As humans, we tend to interrupt each other when we converse. We are able to pause, listen to what the other person is saying and respond accordingly. That makes conversations more natural and engaging.
When a customer converses with a typical voice AI agent, interrupting with a question feels unnatural, because the agent simply continues with the response it had intended to give.
The expectation from the customer is that when they interrupt while the voice AI agent is speaking, it should stop to listen to the customer and respond accordingly. The Yellow.ai voice platform has been upgraded with an ability to handle interruptions from the customer during a response.
The voice AI agent is always listening to what the customer has to say, even while the agent itself is speaking. That allows the AI agent to filter out background noise and react only to actionable interruptions by the customer. This ensures that the conversation is not unnecessarily interrupted by background or environmental disturbances.
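The distinction between an actionable interruption and background noise can be sketched as a simple gate. This is a hedged illustration under stated assumptions, not the platform’s actual logic: `AudioEvent`, `should_interrupt` and the 300 ms threshold are hypothetical, and the sketch assumes an upstream voice activity detector and speech recognizer already supply the fields shown.

```python
from dataclasses import dataclass


@dataclass
class AudioEvent:
    """Hypothetical summary of audio heard while the agent is speaking."""
    is_speech: bool    # assumed output of a voice activity detector
    duration_ms: int   # how long the sound lasted
    transcript: str    # assumed output of a speech recognizer ("" if none)


def should_interrupt(event: AudioEvent, min_speech_ms: int = 300) -> bool:
    """Return True only for sounds that look like a real customer interruption."""
    if not event.is_speech:
        return False                      # environmental noise: keep talking
    if event.duration_ms < min_speech_ms:
        return False                      # too short to be actionable
    return bool(event.transcript.strip())  # speech must contain words
```

An agent loop could call `should_interrupt` for each detected sound and stop playback only when it returns `True`.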
Audio samples: conversational intent change, environmental interruption
- Custom models for decoding alphanumeric input accurately
When businesses capture inputs from users or prospects as part of voice AI–enabled conversation, there are situations where alphanumeric characters are part of the response.
For example, if the user has to spell a flight booking ID such as ABQI3215C, it must be captured accurately because the AI agent may need to fetch pertinent information based on that input. We have observed instances where alphanumeric inputs were captured incorrectly, which led to an erroneous response.
Our voice AI experts have developed a way of designing custom models for capturing an alphanumeric response from a user. For example, if the system has to capture a PAN ID as a response, it is aware that it will be five letters, followed by four digits and a single letter.
This way the voice AI agent can improve the accuracy in capturing alphanumeric details. This feature will also enhance the capture accuracy of alphanumeric details when presented in slang or dialects of different languages.
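The value of knowing the expected format can be illustrated with a small Python sketch that uses the PAN layout described above (five letters, four digits, one letter) to correct a recognized string. This is not the custom model itself, only a post-processing illustration; the function name `normalize_pan` and the letter/digit confusion table are assumptions made for the example.

```python
import re
from typing import Optional

# Five letters, four digits, one letter, as described in the text.
PAN_PATTERN = re.compile(r"^[A-Z]{5}[0-9]{4}[A-Z]$")

# Hypothetical look-alike / sound-alike pairs, for illustration only.
DIGIT_TO_LETTER = {"0": "O", "1": "I", "5": "S", "8": "B"}
LETTER_TO_DIGIT = {letter: digit for digit, letter in DIGIT_TO_LETTER.items()}


def normalize_pan(transcript: str) -> Optional[str]:
    """Use the known PAN layout to repair a recognized string, or return None."""
    chars = [c for c in transcript.upper() if c.isalnum()]
    if len(chars) != 10:
        return None
    fixed = []
    for position, char in enumerate(chars):
        expects_digit = 5 <= position <= 8
        if expects_digit and char in LETTER_TO_DIGIT:
            char = LETTER_TO_DIGIT[char]   # e.g. "O" heard where a digit belongs
        elif not expects_digit and char in DIGIT_TO_LETTER:
            char = DIGIT_TO_LETTER[char]   # e.g. "0" heard where a letter belongs
        fixed.append(char)
    candidate = "".join(fixed)
    return candidate if PAN_PATTERN.match(candidate) else None
```

A `None` result is a natural point for the agent to ask the user to repeat the ID instead of proceeding with a wrong value.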
Audio sample: playback confirmation
- Recording pause-resume feature for sensitive customer information
Conversations between the customers and the voice AI agent are typically recorded for quality purposes, and these recordings are then analyzed to improve the efficacy of responses to customers in the future.
But there are some situations where a customer needs to share sensitive information. For example, the system may have sent a voice OTP, which the customer has to repeat for identification. Or there might be sensitive patient information that a healthcare system needs to capture as part of the conversation.
Because such information is subject to regulatory compliance and privacy requirements, our customers wanted to pause the recording while the user speaks sensitive information and resume it once the information has been captured. This helps our customers meet the privacy requirements of their end consumers when they converse with voice AI systems.
Our upgraded voice AI platform lets customers configure when call recording should be paused and when it should be resumed, so that sensitive information is redacted from the final recording.
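As a rough illustration of pause and resume, the Python sketch below replaces audio received while recording is paused with silence, so the sensitive span never reaches the stored recording. The class `CallRecorder` and its methods are hypothetical, and the sketch assumes signed 16-bit PCM audio, where zero-valued bytes represent silence; the actual platform may redact differently.

```python
class CallRecorder:
    """Hypothetical recorder that silences audio while recording is paused."""

    def __init__(self) -> None:
        self._chunks = []      # audio chunks in arrival order
        self._paused = False

    def pause(self) -> None:
        """Call before the customer speaks sensitive information."""
        self._paused = True

    def resume(self) -> None:
        """Call once the sensitive information has been captured."""
        self._paused = False

    def write(self, chunk: bytes) -> None:
        """Store a chunk, zero-filling it if recording is currently paused."""
        if self._paused:
            # Assumes signed PCM audio, where zero bytes are silence.
            chunk = bytes(len(chunk))
        self._chunks.append(chunk)

    def final_recording(self) -> bytes:
        """Return the stored audio with paused spans silenced."""
        return b"".join(self._chunks)
```

Keeping the silenced span at its original length preserves the call timeline, which is one possible design choice; dropping the paused chunks entirely would be another.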
Audio samples: actual recording, redacted recording
- Detecting and responding to automated answering machines
When sales and marketing teams run outbound voice campaigns, many times they face a scenario where the AI agent encounters an answering machine on the other end. In a normal scenario, the AI agent will assume that it’s speaking to a human and will continue to run its script based on the response provided. That’s like two bots talking to each other without any end purpose being served.
But our upgraded voice AI platform can detect an answering machine and differentiate it from a human response. Our voice agents can hang up a call once they detect an answering machine at the other end. Our future releases in this space aim to let our customers configure how voice AI agents respond in those situations.
Detection Without Answering Machine
With all these cool and innovative feature upgrades, the Yellow.ai voice AI platform is now more robust and dynamic, and it strives to make each and every conversation more natural, humanized, fulfilling and memorable!
Come see the tech in action!