Imagine typing, “soft electric-guitar bed, slow build, 90 seconds,” and getting a finished backing track that fits a vocal take or a short video clip. That’s the picture painted in a recent report: OpenAI — the company behind ChatGPT and the Sora video tool — is exploring a music-generation product that would accept text and audio prompts and return original instrumental or vocal music. The Information’s reporting says the company has even worked with students from the Juilliard School to annotate musical scores that could feed training data for the model.
If that sounds familiar, there’s a reason. OpenAI has generated music before. In 2020, it released Jukebox, a research model that produced raw audio music (including rudimentary singing) and released code and samples so people could experiment. What’s notable about the current reporting is that OpenAI may be trying to turn this kind of capability into a polished, product-grade feature — something that could be embedded in creator tools or integrated into its existing apps.
Why Juilliard? Why human annotation?
Machine learning researchers face two big choices when building generative music models: train on unlabeled audio at massive scale, or curate labeled, structured data that teaches the model about notation, instrumentation and musical form. The Information’s sources say Juilliard students were enlisted to annotate scores, essentially turning sheet music into training signals the model can learn from. That approach could improve control: instead of producing a statistical collage of sounds, the system might understand “guitar strum pattern,” “sforzando,” or “violin countermelody” in a way that makes outputs far more usable to musicians and filmmakers.
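The report doesn’t describe what that annotation pipeline actually looks like, but a toy sketch helps show why labels matter for control. In the illustrative Python below, the `ScoreAnnotation` fields and the `to_training_pair` helper are assumptions for the sake of example, not anything OpenAI or Juilliard has described; the point is simply how a labeled passage of sheet music could become a (text prompt, symbolic target) pair that teaches a model to map descriptive language onto musical structure.

```python
# Hypothetical sketch only: field names and format are illustrative assumptions,
# not a real schema from OpenAI or Juilliard.
from dataclasses import dataclass

@dataclass
class ScoreAnnotation:
    """One annotated passage of sheet music, as a human annotator might label it."""
    instrument: str   # e.g. "violin", "electric guitar"
    technique: str    # e.g. "arco", "strummed", "pizzicato"
    dynamics: str     # e.g. "sforzando", "piano"
    role: str         # e.g. "melody", "countermelody", "accompaniment"
    notes: list[str]  # simplified pitch:duration tokens, e.g. "E5:8"

def to_training_pair(ann: ScoreAnnotation) -> tuple[str, str]:
    """Turn an annotation into a (text prompt, symbolic target) pair a model could learn from."""
    prompt = f"{ann.dynamics} {ann.instrument} {ann.role}, {ann.technique}"
    target = " ".join(ann.notes)
    return prompt, target

# Example: a labeled violin countermelody becomes a controllable text-to-music signal.
pair = to_training_pair(ScoreAnnotation(
    instrument="violin", technique="arco", dynamics="sforzando",
    role="countermelody", notes=["E5:8", "F#5:8", "G5:4"],
))
print(pair)  # ('sforzando violin countermelody, arco', 'E5:8 F#5:8 G5:4')
```

The value of this kind of pairing is that prompts like “sforzando violin countermelody” stop being vague mood words and start corresponding to concrete musical structure the model has actually seen labeled.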
The competitive landscape is loud — and legally noisy
OpenAI wouldn’t be entering an empty field. Startups and established audio companies have been racing to commercialize music generation. ElevenLabs, known for voice synthesis, launched “Eleven Music” this year — a studio-grade offering that promises multi-language vocals and granular editing from text prompts. And smaller players like Suno have drawn big attention (and legal scrutiny).
Those legal fights matter. Major labels and collecting societies have sued AI music firms, alleging copyrighted recordings were used without permission to train models, a legal question that is still very much alive and likely to shape how any new OpenAI product is built, licensed and distributed. If a tool is trained on unlicensed recordings, rights holders will push back in court; if it’s trained on licensed or purpose-built data such as annotated scores, the legal risk shrinks but the cost and complexity grow.
Spam, deepfakes, and the Velvet Sundown wake-up call
There’s another practical problem: scale and abuse. Streaming platforms are already dealing with a flood of low-quality, mass-uploaded tracks, many made with AI, that try to game royalties or search algorithms. Spotify recently said it removed around 75 million spam tracks over the past year as AI lowered the barrier to creating audio content en masse. And a high-profile case this summer, an AI-made act known as “The Velvet Sundown” that amassed millions of streams before its synthetic origins were scrutinized, showed how quickly synthetic music can spread through listener ecosystems and the kind of verification and transparency headaches platforms will face.
That episode underlines a broader tension: generative music can empower creators, but it can also be weaponized for impersonation, flooding, or deceptive monetization. How companies detect and disclose AI-made content, and how platforms prevent abuse, will be core questions for any product rollout.
Use cases — practical, not just theoretical
If OpenAI’s team is serious about the product, the early use cases will be straightforward and practical: generating accompaniment for demos (guitar beds under a vocal), producing short soundtracks for videos or games, or helping indie creators who don’t have budgets for session musicians. For editors and content creators, an on-demand source of music that fits a target length and can be tweaked by mood, tempo and instrumentation is extremely attractive. The devil, as always, will be in the licensing, attribution and revenue-share rules.
What the music business wants — and fears
Labels and songwriters want two things: rights and clarity. They’ve already sued to defend their catalogs, and many in the industry argue that machine learning models trained on copyrighted sound recordings should either be licensed or restricted. Some newer entrants are trying to navigate that by negotiating licensing deals up front, an approach that reduces legal risk but increases cost. Meanwhile, artists and advocates insist that any system that can mimic or replace human performers should come with guardrails, from vocal-deepfake detection to mandatory disclosure that music was AI-assisted.
So — is OpenAI actually shipping something?
Reports describe the work as exploratory, and OpenAI has not publicly confirmed its plans. The company has the research lineage (Jukebox) and the engineering muscle, and it has already been building audio products and multimodal tools. But turning research models into products that are safe, legally sound and appealing to creators is a hard, expensive process, one that will involve licensing, user controls and perhaps new industry standards for disclosure.
Generative music is moving from lab experiments to real products, and OpenAI — with big resources and prior work in audio — looks likely to be a major player if it decides to go all in. That prospect is energizing for creators who want faster, cheaper ways to make soundtracks and accompaniments — and worrying for rights holders and platforms trying to keep ecosystems honest. Expect the next few months to bring more reporting, industry statements and — almost certainly — legal and technical debates over how to make AI music that’s useful and fair.
