Tech News : Wait For It …The OpenAI Voice Cloning Tool

Written by Pronetic

Pronetic is a leading provider of core IT support for ISO 27001, Cyber Essentials and Cyber Essentials Plus compliance.

April 3, 2024

OpenAI has announced the preview of its (two years in the making) ‘Voice Engine’ voice cloning tool, although there’s no firm release date yet. 

What Can It Do? 

OpenAI says Voice Engine uses “text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker.” It adds that this “small model” can create “emotive and realistic voices” from that one sample.

Two Years On 

Voice Engine was first developed almost two years ago, in late 2022. Since then, it has been used to power the preset voices available in OpenAI’s text-to-speech API, ChatGPT Voice and Read Aloud. ChatGPT Voice is the feature that lets users speak to ChatGPT and hear its responses spoken back. OpenAI’s text-to-speech (TTS) API is the service that converts text into natural-sounding speech, i.e. it uses AI models to produce speech that closely mimics human voices.
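For readers curious about what a call to that text-to-speech API looks like, here is a minimal Python sketch. It is not taken from OpenAI’s announcement. It assumes the publicly documented `/v1/audio/speech` endpoint, the `tts-1` model and the preset `alloy` voice; the helper name `build_tts_request` and the placeholder key are made up for illustration. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Assumption: the publicly documented endpoint for OpenAI's text-to-speech API.
TTS_URL = "https://api.openai.com/v1/audio/speech"


def build_tts_request(text, api_key, voice="alloy", model="tts-1"):
    """Build (but do not send) a request asking for `text` to be spoken.

    Hypothetical helper for illustration; `voice` must be one of the
    preset voices, not a cloned one.
    """
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_tts_request("Hello from a preset voice.", api_key="YOUR_API_KEY")

# Sending the request needs a real API key and network access, e.g.:
# with urllib.request.urlopen(request) as response:
#     audio_bytes = response.read()
```

Note that the caller only chooses from preset voices here; nothing in this request supplies an audio sample or clones a voice.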

Being Cautious 

Although the voice cloning tool has been powering other aspects of OpenAI’s voice command and text-to-speech features for almost two years, the announcement of Voice Engine itself has been delivered with more than a hint of caution. For example, OpenAI describes what it is sharing as just “preliminary insights and results from a small-scale preview.” OpenAI also admits it is deliberately taking a “cautious and informed approach to a broader release” because of the “potential for synthetic voice misuse”, e.g. deepfakes, where convincing fake audio recordings are used for fraud, impersonation, or spreading misinformation.

OpenAI says that it recognises that generating speech that resembles people’s voices “has serious risks, which are especially top of mind in an election year” and is “engaging with U.S. and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build.”

Also, testing partners for Voice Engine have had to agree to usage policies that prohibit the impersonation of another individual or organisation without consent or legal right. OpenAI is also asking partners to get explicit and informed consent from the original speaker and to disclose to their audience that the voices they’re hearing are AI-generated.  

To enable it to monitor and enforce these policies and requirements, OpenAI says it has implemented a set of safety measures, which include “watermarking to trace the origin of any audio generated by Voice Engine, as well as proactive monitoring of how it’s being used.”

What Now? 

Although OpenAI wants to announce the fact that it has developed a powerful AI voice cloning tool, it also wants to temper the disappointment about not releasing it yet by highlighting a few positive uses for Voice Engine. For example, in its recent announcement, OpenAI listed how Voice Engine could be used to:

  • Provide reading assistance to non-readers and children
  • Translate content like videos and podcasts (for creators and businesses)
  • Support people who are non-verbal (therapeutic applications).

OpenAI also highlights how Voice Engine could prove extremely useful for patients recovering their voice or for those people suffering from sudden or degenerative speech conditions, and for improving essential service delivery in remote settings, thereby reaching global communities. 

What Does This Mean For Your Business? 

With this being a very important election year for at least 64 countries (including the US, UK and India), each of the large AI companies is very reluctant to be named as the one that allowed misuse of its AI products and/or didn’t take the right precautions to prevent misuse. For example, just as Google has put restrictions on what its Gemini AI model will answer about elections for fear of it being misused, OpenAI has decided that now, without the right protections in place, is not the right time to release its two-years-in-the-making voice cloning tool.

OpenAI, therefore, is happy to let the world and OpenAI’s competitors know that it has an advanced AI ‘Voice Engine’ in the pipeline, but it isn’t prepared to take the risk of the tool and the company’s name being tarnished by misuse within the global arena of elections. It’s likely that we’ll see much more of this caution being exercised by AI companies releasing new features and products, particularly this year. 

For businesses and organisations, plus those in the health/therapy sectors hoping to make use of the powerful, value-adding capabilities of Voice Engine, it’s a case of waiting a bit longer. The danger, however, in the fast-moving field of AI is that while time passes (as testing and safety policies are being put in place), a competitor may release a new or updated powerful voice cloning tool in the meantime, thereby stealing some of Voice Engine’s thunder.

Even when Voice Engine is regarded as safe to release, this won’t prevent bad actors from attempting to misuse it, so it will be interesting to see whether it’s as well protected as OpenAI says it will be and what users are able to produce with it. Ultimately, OpenAI will want to get this tool out there, being used by as many people as possible as soon as possible, once this period of caution has passed.
