
AI that simulates voices with three-second samples

The technology first analyses how a person sounds and then breaks that information into discrete components

Mathures Paul Published 12.01.23, 02:50 PM

At a time when the world is waking up to ChatGPT, AI is getting more powerful on the text-to-speech front. Microsoft researchers have announced a new AI model called VALL-E that can simulate a person’s voice when given a three-second audio sample. This comes days after Apple quietly launched (not in India) a feature called “digital narration” for its Books app, which lets users listen to written titles as audiobooks narrated by artificial intelligence, a feature that may upend the fast-growing audiobook market.

VALL-E generates discrete audio codec codes from text and acoustic prompts. The technology first analyses how a person sounds, breaks that information into discrete components, and then uses its training data to predict how that voice would sound beyond the three-second sample.
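Conceptually, the pipeline described above can be sketched as follows. This is not Microsoft's code: every function here is a hypothetical stand-in, meant only to show the shape of the approach, in which a three-second acoustic prompt is turned into discrete codec tokens, a language model conditioned on the text continues those tokens, and a decoder turns the full token sequence back into audio.

```python
# Illustrative sketch only -- hypothetical stand-ins, not VALL-E's actual components.
import numpy as np

def codec_encode(waveform: np.ndarray) -> np.ndarray:
    """Stand-in for a neural audio codec: maps a waveform to discrete token IDs."""
    # Hypothetical: take coarse frames and quantise amplitude into 256 bins.
    frames = waveform[::320]  # roughly 75 frames per second at 24 kHz
    return np.clip((frames + 1.0) * 127.5, 0, 255).astype(np.int64)

def language_model_continue(text: str, prompt_tokens: np.ndarray, n_new: int) -> np.ndarray:
    """Stand-in for the codec language model: predicts further codec tokens
    conditioned on the input text and the three-second acoustic prompt."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.integers(0, 256, size=n_new)  # placeholder sampling

def codec_decode(tokens: np.ndarray) -> np.ndarray:
    """Stand-in for the codec decoder: maps discrete tokens back to a waveform."""
    return (tokens.astype(np.float32) / 127.5) - 1.0

# Three seconds of "enrolment" audio at 24 kHz, plus the text to be spoken.
sample = np.random.default_rng(0).uniform(-1, 1, 3 * 24_000)
prompt_tokens = codec_encode(sample)
new_tokens = language_model_continue("Hello from a cloned voice.", prompt_tokens, n_new=500)
speech = codec_decode(np.concatenate([prompt_tokens, new_tokens]))
print(speech.shape)
```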


The upside of the technology is better text-to-speech accessibility for people with visual impairment. But what about a future where podcasts are streamed with guests that are AI models?

To keep things in check, Microsoft has not released VALL-E’s code publicly for people to experiment with.
