Few problems vex AI researchers more than long-form video. While today’s models can summarize a TikTok clip or generate a short film script, they buckle under hours of continuous footage, making them all but useless to security firms, marketers, and other video-heavy businesses. Enter Memories.ai, a San Francisco startup founded by two ex-Meta Reality Labs researchers, Dr. Shawn Shen and Ben (Enmin) Zhou, on a mission to give machines what humans take for granted: visual memory.
Shen built his chops pursuing a PhD and running experiments at Meta Reality Labs; Zhou cut his teeth shipping machine-learning systems at Meta. Together they sketched out what they call the world’s first Large Visual Memory Model (LVMM), capable of ingesting and reasoning over up to 10 million hours of footage—orders of magnitude beyond the one- or two-hour limit of existing AI pipelines. “All top AI companies, such as Google, OpenAI, and Meta, are focused on producing end-to-end models,” Shen told TechCrunch, “but these models often have limitations around understanding video context beyond one or two hours.”
On July 24, 2025, Memories.ai announced it had raised $8 million in a seed round led by Susa Ventures, with participation from Samsung Next, Fusion Fund, Crane Ventures, Seedcamp, and Creator Ventures. Originally targeting $4 million, the round was oversubscribed in less than a month, a sign of strong investor appetite for the company’s long-context pitch. “There’s a gap in the market for long-context visual intelligence,” said Misha Gordon-Rowe of Susa Ventures. “Shen is obsessed with pushing boundaries of video understanding and intelligence.”
Wilson Sonsini Goodrich & Rosati, the prominent tech law firm, advised on the deal. Its team, including Lang Liu and Alex Youssef, shepherded the company through the paperwork, setting Memories.ai up for rapid growth across multiple jurisdictions, from San Francisco to Shanghai.
How LVMM works
At its core, Memories.ai’s LVMM is built around a multi-layered pipeline:
- Noise removal & compression: Raw footage passes through denoising filters, then a compression layer slices out irrelevant frames—so the system only stores “signal.”
- Indexing & tagging: The cleaned data is tokenized into searchable chunks with natural-language tags and segments, enabling queries like “show me all red cars in the past 24 hours.”
- Aggregation & reporting: Finally, an analytics layer collates insights, generating summary reports or dashboards that highlight trends and anomalies.
This architecture sidesteps the need to load entire clips into memory, dramatically speeding up queries without sacrificing context. “Instead of processing clips in isolation,” TechCrunch notes, “Memories.ai captures, stores, and structures visual data over time, allowing AI models to retain context, recognize patterns, and compare new footage against past events.”
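Memories.ai hasn’t published the internals of this pipeline, but the three stages map naturally onto a classic compress-index-query design. The Python sketch below is purely illustrative, with hypothetical names throughout: tag sets stand in for learned embeddings, near-duplicate frames are dropped at ingest (the “compression” stage), and queries filter the surviving index by tags and time window.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the LVMM stages described above. A real
# system would use learned denoisers, video embeddings, and a vector
# index; plain tag sets keep the control flow visible.

@dataclass
class Frame:
    timestamp: float   # seconds from the start of the stream
    tags: set[str]     # e.g. {"car", "red", "lobby"}

@dataclass
class VideoIndex:
    chunks: list[Frame] = field(default_factory=list)

    def ingest(self, frames: list[Frame], min_gap: float = 1.0) -> None:
        """Compression + indexing: keep only frames that add new signal."""
        last: Frame | None = None
        for f in sorted(frames, key=lambda fr: fr.timestamp):
            # Drop near-duplicates: same tags, arriving within min_gap seconds.
            if last and f.tags == last.tags and f.timestamp - last.timestamp < min_gap:
                continue
            self.chunks.append(f)
            last = f

    def query(self, required_tags: set[str], start: float, end: float) -> list[Frame]:
        """Answer 'show me all X between t0 and t1' style queries."""
        return [
            c for c in self.chunks
            if start <= c.timestamp <= end and required_tags <= c.tags
        ]

if __name__ == "__main__":
    index = VideoIndex()
    index.ingest([
        Frame(10.0, {"car", "red"}),
        Frame(10.2, {"car", "red"}),   # near-duplicate, dropped at ingest
        Frame(95.0, {"car", "blue"}),
        Frame(300.0, {"car", "red"}),
    ])
    # "Show me all red cars" over a one-hour window:
    for hit in index.query({"car", "red"}, start=0, end=3600):
        print(f"{hit.timestamp:>6.1f}s  {sorted(hit.tags)}")
```

Each stage in production would be far heavier, but the flow shown here (compress, then index, then query the index) is what lets a system answer questions without reloading raw footage.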
While Susa Ventures led the round, Samsung Next’s involvement carries a distinct consumer angle. Their investment thesis centers on Memories.ai’s unique ability to run heavy video analysis on-device, reducing—or even eliminating—the need to upload sensitive footage to the cloud. “One thing we liked about Memories.ai is that it could do a lot of on-device computing,” explained Sam Campbell of Samsung Next. “This can unlock better security applications for people apprehensive of putting security cameras in their house because of privacy concerns.”
Industry watchers expect Samsung may eventually bake LVMM capabilities into its Galaxy AI suite—unlocking features like instant video search, highlight reels, and context-aware alerts right on smartphones or home monitoring devices. No official product plans have been announced, but the deal certainly signals what might be on the horizon for Samsung’s consumer lineup.
Who’s using it today
From brand managers tracking viral trends on Instagram Stories to security teams scanning months of CCTV, Memories.ai already has paying pilots in two core verticals:
- Marketing: Agencies upload their social video libraries to sift for emerging motifs—colors, logos, settings—that resonate most with audiences. The platform even offers tools to help create new clips based on identified trends.
- Security: Firms query past footage to flag “suspicious” behavior—such as loitering vehicles or unauthorized entry—using pattern-recognition models that learn what “normal” looks like over weeks or months.
Currently, companies must upload video batches manually, but Shen says future updates will support seamless folder syncing and live-stream analysis. Imagine asking, “What unusual activity occurred at our lobby between 2 a.m. and 4 a.m. last Thursday?” and getting a concise, time-stamped report in seconds.
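Memories.ai has not published a public SDK, so the snippet below is a hypothetical sketch of what such a query could look like from the developer’s side. Every class, method, and field name is invented for illustration, and the stub returns canned data purely to show the shape of a time-stamped report.

```python
from datetime import datetime

# Hypothetical client sketch; no real Memories.ai API is implied.

class VideoMemoryClient:
    """Stand-in for a client that queries an indexed video archive."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def ask(self, question: str, start: datetime, end: datetime) -> list[dict]:
        # A real backend would run retrieval over the video index; this
        # stub returns a canned result to show the report structure.
        return [{
            "timestamp": datetime(2025, 7, 24, 2, 47),
            "event": "person loitering near entrance for 11 minutes",
            "confidence": 0.83,
        }]

client = VideoMemoryClient(api_key="demo")
for event in client.ask(
    "What unusual activity occurred at our lobby?",
    start=datetime(2025, 7, 24, 2, 0),
    end=datetime(2025, 7, 24, 4, 0),
):
    print(event["timestamp"], "-", event["event"])
```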
Memories.ai isn’t alone in chasing persistent video memory. Startups like TwelveLabs tout similar video-understanding APIs, while “memory-layer” hopefuls such as mem0 and Letta explore context retention, albeit mostly for text or short clips. Google’s internal labs and Meta itself are also rumored to be prototyping long-form video models. But none match the LVMM’s advertised 10-million-hour capacity, giving Memories.ai a potential technological moat, provided it can scale reliably.
With a lean team of about 15 engineers and researchers, the company plans to use its fresh $8 million to:
- Recruit top talent across ML, MLOps, and data engineering
- Harden on-device SDKs for mobile and embedded systems
- Advance search and summarization features (e.g., auto-generated storyboards)
- Explore new applications in robotics, autonomous vehicles, and augmented reality
As video continues its reign as the world’s dominant content medium, the need for AI that “remembers” past footage will only grow. For now, Memories.ai occupies a small yet rapidly expanding niche, armed with deep expertise and well-heeled backers who believe context is the next frontier in AI.
