Udio

Udio is a generative artificial intelligence model that produces music based on simple text prompts. It can generate vocals and instrumentation. Its free beta version was released publicly on April 10, 2024. Users can pay to subscribe monthly or annually to unlock more capabilities such as audio inpainting.

Founded in December 2023 by a team of former researchers for Google DeepMind headed by Udio's CEO, David Ding, the program received financial backing from the venture capital firm Andreessen Horowitz and musicians will.i.am and Common, among others. Critics praised its ability to create realistic-sounding vocals while others raised concerns over the possibility that its training data contained copyrighted music.

History
Udio was created in December 2023 by a team of four former Google DeepMind researchers, including Udio's CEO David Ding, Conor Durkan, Charlie Nash, and Yaroslav Ganin, together with Andrew Sanchez, under the name Uncharted Labs. The venture capital firm Andreessen Horowitz; the music distributor UnitedMasters; musicians will.i.am, Tay Keith, and Common; investor Kevin Wall; Instagram cofounder Mike Krieger; and DeepMind researcher Oriol Vinyals all provided financial backing for Udio, which raised $10 million in seed funding (in addition to $8.5 million raised previously). It spent several months in a closed beta phase before its public beta release on the Udio website on April 10, 2024. It allows users to generate 600 songs per month for free. Sanchez described it as "enabl[ing musicians] to create great music and... to make money off of that music in the future". Udio's release followed the releases of other text-to-music generators such as Suno AI and Stable Audio.

Udio was used to create "BBL Drizzy" by Willonius Hatcher, a parody song that went viral during the Drake–Kendrick Lamar feud, with over 23 million views on Twitter and 3.3 million streams on SoundCloud in its first week.

Capabilities
Udio bases the songs it creates on text prompts, which can specify a genre (including barbershop quartet, country, classical, hip hop, German pop, and hard rock, among others), lyrics, story direction, and other artists whose sound the song should resemble. Its lyrics are created with a large language model (LLM), while the process used to generate the music itself has not been disclosed. The program generates two songs from each prompt, and users can "remix" their songs with further text prompts. Songs are first generated as pieces roughly 30 seconds long and can be extended in additional 30-second increments. Paying subscribers can access advanced functionality such as audio inpainting.

Reception
Mark Hachman, the senior editor of PC World, compared Udio to AI art generators and praised its ability to turn "a few rather poor lyrics" into a "rather catchy" song, also calling the vocals it generated "incredibly realistic and even emotional". Sabrina Ortiz of ZDNET described the songs it generated as being "impressive" and sounding "as though they were produced professionally". She also called them "fuller and richer" than those of other text-to-music generators and said that Udio had "more personalization options" than its competitors. Tom's Guide's Ryan Morrison wrote that Udio had "an uncanny ability to capture emotion in synthetic vocals" and was the only AI music generator "to have captured the passion, pain and spirit of a vocal performance". He added that the program was geared toward "people with no or minimal musical ability". Brian Hiatt of Rolling Stone wrote that Udio was "more customizable but also perhaps less intuitive to use" than Suno AI and added that "some early users have suggested that on average, Udio's output may sound crisper than Suno's".

For Ars Technica, Benj Edwards wrote that Udio's generation capability was imperfect and "less impressive" than Suno AI's, noting that its songs were substantially shorter than Suno AI's. He also called the songs it produced "half-baked and almost nightmarish". In response to the company's announcement of Udio's beta release on Twitter, Telefon Tel Aviv member Joshua Eustis tweeted that Udio was "an app to replace musicians" and called into question the data that it used. Udio has also been criticized online as "soulless" and for having the potential to create audio deepfakes. Lucas Ropek of Gizmodo stated that Udio was "full of acoustical nonsense" and that its songs were "extraordinarily bad".

Copyright concerns
Critics of Udio have questioned what data was used to train it and whether that data included copyrighted music. Rolling Stone wrote that there was "substantial reason to believe" that both Udio and Suno AI were trained with copyrighted music, while Benj Edwards of Ars Technica wrote that Udio's training data was "likely filled with copyrighted material". Udio does not directly recreate copyrighted songs when prompted to do so. Ding has stated that Udio has "extensive automated copyright filters" and that the company is "continually refining [its] safeguards". Stability AI took a different approach with Stable Audio 2.0, using an explicitly licensed dataset of music called AudioSparx.

In June 2024, a lawsuit led by the Recording Industry Association of America was filed against Udio and Suno, alleging widespread infringement of copyrighted sound recordings. The lawsuit sought to bar the companies from training on copyrighted music, as well as damages of up to $150,000 per work for infringements that had already taken place.