VALL-E is an Android application built around an artificial intelligence model from Microsoft Corporation. The algorithm can imitate human speech with remarkable accuracy, and a voice recording only three seconds long is enough for it to reproduce a speaker's voice.
Principle of operation
Like ChatGPT, this platform is an AI-based algorithm. It was trained on the LibriLight dataset, which contains more than 60 thousand hours of English speech.
Unlike standard speech synthesis methods, this model does not work with sound waves directly. The neural network analyzes the features of a person’s speech and breaks it into discrete tokens. This allows it to generate new speech in the same voice, going beyond what was said in the three-second sample.
The algorithm can also simulate various emotions, such as anger, joy and disgust. It can reproduce the acoustic environment of the recording as well.
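To make the idea of turning speech into tokens more concrete, here is a minimal sketch in Python. It is not Microsoft's code and not the real codec: it uses a hand-picked codebook and a single loudness value per frame, whereas a real neural codec learns both its features and its codebooks. The sample rate, frame size, codebook values and all function names below are assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 24_000  # assumed sample rate in Hz
FRAME_SIZE = 320      # assumed samples per frame (75 frames per second)
CODEBOOK = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical hand-picked codebook


def make_ramped_tone(freq_hz, seconds):
    """Stand-in for a voice sample: a sine tone that gets steadily louder."""
    n = int(seconds * SAMPLE_RATE)
    return [
        (i / n) * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n)
    ]


def frame_rms(samples):
    """Split the signal into non-overlapping frames; return each frame's RMS."""
    features = []
    for start in range(0, len(samples) - FRAME_SIZE + 1, FRAME_SIZE):
        frame = samples[start:start + FRAME_SIZE]
        features.append(math.sqrt(sum(x * x for x in frame) / FRAME_SIZE))
    return features


def tokenize(features, codebook):
    """Replace each feature with the index of the nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: abs(codebook[k] - f))
        for f in features
    ]


def detokenize(tokens, codebook):
    """Turn token indices back into approximate feature values."""
    return [codebook[t] for t in tokens]


samples = make_ramped_tone(440.0, 3.0)   # a three-second "recording"
features = frame_rms(samples)            # one number per frame
tokens = tokenize(features, CODEBOOK)    # one integer token per frame
restored = detokenize(tokens, CODEBOOK)  # lossy reconstruction of the features

print(len(samples), "samples ->", len(tokens), "tokens")
print("first tokens:", tokens[:5], "last tokens:", tokens[-5:])
```

The point of the sketch is only that a continuous signal becomes a short sequence of integers, which a language-model-style network can then predict token by token. The reconstruction is lossy: each value is rounded to the nearest codebook entry.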
Availability
At the time of writing this review, Microsoft has not released the algorithm publicly for experimentation, because there is a high risk that attackers could misuse the neural network. Users can listen to demonstration samples of speech simulation and evaluate the capabilities of the neural network on the official website.
Features
- the algorithm is capable of imitating human speech with high accuracy;
- it can reproduce the acoustic environment and emotional coloring of speech;
- EnCodec technology was used when creating the model;
- compatible with current Android versions;
- free to download and use.