Google has reached an agreement with social media platform Reddit to gain access to vast amounts of data to train its artificial intelligence systems, the companies announced Thursday.
Under the deal, Reddit will provide Google with real-time access to content posted on its site through its public API. This will give Google’s AI researchers an efficient way to tap into Reddit’s massive trove of discussions, images, videos and other user-generated content.
Financial terms of the partnership were not disclosed, but it comes just weeks after reports emerged of Reddit securing a $60 million agreement to supply content to an unnamed AI firm, widely believed to be Google.
For Google, the Reddit deal promises to significantly expand its pool of training data at a time when AI is becoming central to products like Google Search, Maps and its personal assistant technology. Machine learning models depend on huge datasets to learn patterns and relationships.
But the arrangement is raising red flags among privacy advocates already concerned about the consolidation of data by tech giants. Reddit discussions often contain personal anecdotes and confidential information shared under pseudonyms or anonymously.
Google has said it applies techniques like differential privacy and data coarsening to protect people’s identities. But experts say truly anonymizing data at scale remains a challenge.
For Reddit, the Google deal provides a lucrative source of revenue and potentially increased visibility in search results. But it also renews tensions with users over commercializing personal information posted on the platform.
In the past, Reddit has resisted pressure to open up its data, even temporarily blocking Google’s crawlers over concerns that Google was monetizing Reddit content. Thursday’s announcement has already stirred backlash on some subreddits.
The partnership underscores the powerful appeal of Reddit’s crowdsourced discussions for training AI, which some liken to digital oil. Major tech firms are racing to tap into new sources of data to stay ahead in the AI race.
But as companies trade more in data, ordinary users are left in the dark about how their intimate conversations and creative expressions are packaged up and fed into black-box algorithms, often without their knowledge or consent.
