Elon Musk, the CEO of Tesla and SpaceX, has threatened to sue Microsoft over allegations that the tech giant illegally used Twitter’s data to train its artificial intelligence model. Musk made the threat in a tweet responding to media reports that Microsoft planned to drop Twitter from its advertising platform, which lets advertisers manage all of their social media accounts in one place.
“They trained illegally using Twitter data. Lawsuit time,” Musk tweeted. No lawsuit has been filed so far. Twitter has not commented on the matter, and a Microsoft representative declined to comment.
Data ownership is becoming an increasingly contentious issue in generative AI: Big Tech firms are working to develop cutting-edge AI models like OpenAI’s GPT, while data owners seek either to block the use of their content or to charge for it. Microsoft develops its own large language models (LLMs) and also sells access to OpenAI’s models. Microsoft has invested a reported $10 billion in OpenAI in an unusually structured deal.
LLMs such as GPT require terabytes of data for training, much of which is scraped from websites like Reddit, Stack Overflow, and Twitter. Training data from social networks is valuable because it captures informal, back-and-forth conversations. As these new AI models move from research labs and universities into the corporate world, the owners of the data are starting to make demands.
For instance, Reddit recently announced that it would charge companies for access to its programming interface, which is used to feed the conversations among Redditors into AI training software. Similarly, Universal Music Group said that training AI on its artists’ music would represent “both a breach of our agreements and a violation of copyright law.” The statement came in response to a viral video of a song whose creator claimed to have used AI to imitate the rapper Drake.
Meanwhile, stock photo provider Getty Images is suing Stability AI, the company behind Stable Diffusion, alleging that it copied Getty’s content to train its AI image generator. Musk himself has been vocal on this issue: he has announced plans to build his own large language model, called TruthGPT, at one of his companies, and he said in December that Twitter would “pause” OpenAI’s access to its database.