When Microsoft unveiled its new AI helper called Edge Copilot for its Edge web browser this week, one feature that grabbed headlines was the ability to automatically generate text summaries of online videos. However, as impressive as this capability may sound at first, there are some key caveats to what Edge Copilot can accomplish.
As explained by Mikhail Parakhin, Microsoft’s CEO of advertising and web services, Edge Copilot can summarize a video only if that video either has subtitles or has already been “pre-processed” by Microsoft’s AI systems. The assistant cannot watch and comprehend raw video in real time the way a human can.
“In order for it to work, we need to pre-process the video. If the video has subtitles – we can always fallback on that, if it does not and we didn’t preprocess it yet – then it won’t work,” Parakhin wrote in response to questions about the feature.
In essence, rather than truly summarizing video content, what Edge Copilot does is summarize the text transcript of a video, whether that transcript was added manually via subtitles or auto-generated by Microsoft’s speech recognition software. So while the result to the user may appear like an AI-powered video summary, the underlying technique is more text-based than video-based in nature.
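The fallback Parakhin describes can be pictured with a minimal sketch. This is an illustration only, not Microsoft's implementation: the `Video` class, its fields, and the functions `pick_transcript`, `summarize_text`, and `summarize_video` are all hypothetical names, the order in which the two text sources are checked is an assumption, and the "summarizer" is a trivial stand-in that simply truncates text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Video:
    # Hypothetical fields: text from subtitles, and text produced by
    # an earlier pre-processing pass (for example speech recognition).
    subtitles: Optional[str] = None
    preprocessed_transcript: Optional[str] = None


def pick_transcript(video: Video) -> Optional[str]:
    """Return whatever text is available for the video, or None."""
    # Assumption: prefer the pre-processed transcript, then fall back
    # on subtitles. The source does not specify the order.
    if video.preprocessed_transcript:
        return video.preprocessed_transcript
    if video.subtitles:
        return video.subtitles
    return None


def summarize_text(text: str, max_words: int = 12) -> str:
    """Stand-in for a real summarizer: keep only the first few words."""
    return " ".join(text.split()[:max_words])


def summarize_video(video: Video) -> Optional[str]:
    """Summarize the video's text, or return None if there is no text."""
    transcript = pick_transcript(video)
    if transcript is None:
        # No subtitles and no pre-processing: nothing to summarize.
        return None
    return summarize_text(transcript)
```

In this sketch a video with neither subtitles nor a pre-processed transcript yields `None`, which mirrors the "then it won’t work" case in Parakhin's description. The video frames themselves are never examined.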
This nuance became apparent when designer Pietro Schirano posted a demonstration of Edge Copilot summarizing a YouTube video about the trailer for the forthcoming Grand Theft Auto VI video game. Copilot quickly generated a coherent text summary, but in this case the video already included both machine-generated subtitles from YouTube and a user-created transcript. It was unclear whether Copilot could have achieved the same feat with a video lacking subtitles.
When asked whether Edge Copilot could summarize most publicly available YouTube videos without pre-processing, Parakhin gave a brief, tentative answer: “Should work for most videos.” The reply suggests that the feature may work on many videos but could be less reliable on those that lack subtitles.
The subtleties around Edge Copilot’s video summarization capabilities underscore how AI systems that seem intelligent on the surface can still be significantly constrained by the data they require. They also highlight the machine learning arms race unfolding between Microsoft and leading rivals like Google. Just last month, Google announced enhancements to YouTube summarization in its own AI chatbot, Bard.
As for Edge Copilot, Parakhin readily admits the tool remains a work in progress, posting from an airplane this week that the team continues “adding ability for Edge Copilot to use information in videos.” So while Copilot’s video smarts face limitations today, Microsoft is invested in enhancing them over time. For now, though, viewers hoping to leverage AI for digesting video content may need to lower their expectations around what Copilot can realistically deliver absent manually added subtitles.
