Artificial intelligence clones a person’s voice. It needs only 15 seconds


The start-up behind the ChatGPT chatbot is calling the new technology Voice Engine for now.

It is currently being tested by education and healthcare companies that have pledged not to use anyone’s voice without their consent. However, it will not reach ordinary users any time soon, because of concerns about how easily it could be misused.

OpenAI said on its blog that it first wants to open a discussion about releasing the technology responsibly and about how ready the public is for synthetic voices. Only then will it decide whether and when Voice Engine will be made available to ordinary people.

Sample of real and synthetic voice

A new weapon for cybercriminals?

The concern about possible abuse is justified. In recent years, so-called vishing has become extremely widespread. It is similar to phishing, but instead of e-mail, cybercriminals use social engineering during phone calls. Fraudsters present themselves to the people they call as, for example, bank employees who have discovered that the recipient’s bank account has been hacked, or as security experts who want to secure the recipient’s computer.

The ability to copy a real voice within a few seconds using AI would give cybercriminals a completely new weapon, one they would probably not hesitate to use in their attacks. Even today, they manage to forge the official phone numbers of specific institutions without much difficulty, a practice known as spoofing.

With the new technology, they could easily impersonate a relative or an acquaintance. Given how convincing the synthetic voice is, even seasoned users might not recognize such a fraud.

Fake president

As the American broadcaster NBC News pointed out, a fake voice already interfered in politics in January, when a robocall imitating President Joe Biden reached voters in New Hampshire. According to the station, the Voice Engine will definitely not be freely available during this US election year.

The Voice Engine, which the company developed back in 2022 but has only now shown to the world, can speak not only in the speaker’s native language but also in a number of others. This is intended, for example, to let podcast creators or businesses reach a larger audience around the world.

Interestingly, the synthetic voice keeps the speaker’s accent when it speaks a foreign language. For example, an English voice modeled on a French speaker speaks English with a typical French accent.


Photo: Bonnie Cash, Reuters

US President Joe Biden

A phenomenon called ChatGPT

Artificial intelligence has taken center stage with the development of ChatGPT. The chatbot can generate a variety of texts, including articles, essays, jokes and poetry, from simple queries. It learns to respond to user input and, like humans, learns from large amounts of data.

In March 2023, the more advanced GPT-4 model was made available. It is meant to provide safer and more useful answers and to pave the way for the spread of human-like technologies.

OpenAI offers the web application for free, but this applies only to the older version of the chatbot, GPT-3.5. The more advanced GPT-4, which is capable of more accurate and useful answers, is available only to subscribers. The subscription costs $20 per month, i.e. approximately CZK 453.

OpenAI, a start-up funded by Microsoft, is behind ChatGPT.

