GadgetBond


Google’s Gemma 3n AI can run locally on smartphones with 2GB RAM

Gemma 3n is a lightweight but capable AI model from Google that can be deployed locally for private, fast, and on-device processing.

By Shubham Sawarkar, Editor-in-Chief
Jun 27, 2025, 7:51 AM EDT
A logo of the Google Gemma AI model. Image: Google

When Google first unveiled its Gemma 3 family of AI models earlier this year, it hinted at a future where powerful language understanding wouldn’t be confined to data centers. On Thursday, the company delivered on that promise by releasing Gemma 3n—an open-weight model refined for on-device use and capable of running on as little as 2GB of RAM. This means developers can deploy a sophisticated large language model (LLM) directly on smartphones and other low-resource hardware, making AI more accessible, private, and responsive than ever before.

In May, Google teased the “Nano” aspirations of Gemma 3n at its I/O developer conference, promising an AI small enough to fit in your pocket. The latest release confirms those ambitions: despite raw parameter counts of 5 billion (E2B) and 8 billion (E4B), Gemma 3n behaves like a 2 billion- or 4 billion-parameter model in terms of memory footprint—just 2GB and 3GB of RAM, respectively. By offloading less-critical weights, its per-layer embeddings, to the CPU, Gemma 3n keeps its active memory lean, ensuring fast, local inference even on modest hardware.

At the heart of Gemma 3n lies Google’s Matryoshka Transformer, or MatFormer. Inspired by Russian nesting dolls, this architecture trains a larger model (E4B) alongside a nested smaller one (E2B), sharing weights where it counts and dynamically switching between the two during inference. The result is what Google calls a “mobile-first architecture,” where a single model package serves both high-performance and lightweight use cases without compromising quality.
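The nesting idea can be sketched in a few lines of Python: the small model's weights are a leading slice of the large model's weights, so one stored parameter set serves both paths. This is a conceptual illustration only; the dimensions and routing below are made up, not Google's implementation.

```python
# Toy sketch of the Matryoshka ("nested model") idea behind MatFormer:
# the small model reuses a leading slice of the large model's weights.

def matvec(w, x):
    """Plain matrix-vector product over nested lists."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

# One shared weight matrix; its 2x2 top-left block doubles as the small model.
W_LARGE = [
    [1.0, 2.0, 0.5, 0.0],
    [0.0, 1.0, 1.0, 0.5],
    [0.5, 0.0, 1.0, 2.0],
    [1.0, 1.0, 0.0, 1.0],
]

def forward(x, use_small=False):
    """Run either the full model or the nested sub-model on input x."""
    if use_small:
        w = [row[:2] for row in W_LARGE[:2]]  # nested slice, weights shared
        return matvec(w, x[:2])
    return matvec(W_LARGE, x)

x = [1.0, 1.0, 1.0, 1.0]
print(forward(x))                  # large (E4B-like) path
print(forward(x, use_small=True))  # nested small (E2B-like) path
```

In the actual MatFormer design, per the paper, the nesting applies primarily to the feed-forward hidden dimension inside each transformer block rather than to a single matrix, but the weight-sharing principle is the same.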

A critical innovation enabling Gemma 3n’s efficiency is Per-Layer Embeddings (PLE). Traditional transformers load all parameters into accelerator memory, but PLE keeps only the most essential ones in fast memory (VRAM), shuffling the rest in and out via the CPU. Coupled with activation quantization and key-value (KV) cache sharing, this approach slashes RAM requirements and speeds up response times: on mobile devices, Gemma 3n is roughly 1.5× faster than its predecessor, Gemma 3 4B, while delivering superior output quality.
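A toy sketch of the memory pattern this enables, under the simplifying assumption that only the core weights plus one per-layer table need to be resident in fast memory at a time:

```python
# Conceptual sketch of the Per-Layer Embeddings (PLE) memory pattern:
# core weights stay resident in fast memory, while each layer's embedding
# table is streamed in from host memory only while that layer runs.
# The names and the two-tier split are illustrative, not Google's code.

def run_model(n_layers):
    """Simulate a forward pass and return the peak fast-memory occupancy
    (counted in resident weight groups)."""
    resident = {"core"}  # core transformer weights, always loaded
    peak = 0
    for i in range(n_layers):
        resident.add(f"ple_{i}")      # stream this layer's embeddings in
        peak = max(peak, len(resident))
        resident.discard(f"ple_{i}")  # evict once the layer finishes
    return peak

print(run_model(8))  # peak occupancy: core + one PLE table = 2
```

However many layers the model has, the peak stays at "core plus one table", which is why the effective footprint tracks the active parameter count rather than the raw one.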

Gemma 3n isn’t just a text model—it’s multimodal. It natively ingests images, audio, video, and text, though it currently generates text only. It’s also a global citizen, supporting 140 languages for text inputs and 35 languages when working with multimodal data, making it a versatile tool for developers worldwide. Best of all, Google has released both the model weights and a “cookbook” of recipe-style instructions for fine-tuning and deployment under its Gemma license, allowing academic researchers and commercial teams alike to build on the model.

Not content to ship one-size-fits-all models, Google is also releasing MatFormer Lab—a toolkit that lets developers experiment with different nesting depths, parameter allocations, and quantization settings to craft custom-sized Gemma derivatives. Whether you need a hyper-lean 1B-equivalent model for a microcontroller or a beefier 6B-equivalent engine for a laptop, MatFormer Lab provides a playground to tune for specific latency, memory, and accuracy trade-offs.

You don’t need a PhD to start tinkering with Gemma 3n. The full model line is available on Hugging Face and via Google’s Kaggle listings. For a zero-install trial, head to Google AI Studio, where you can spin up inference jobs or even deploy directly to Cloud Run. Want to see it in action? Google provides sample notebooks, Docker images, and command-line scripts to help you integrate Gemma 3n into chatbots, virtual assistants, or edge-AI demos within minutes.
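As a minimal sketch, loading the E2B variant through the Hugging Face `transformers` library might look like the following. The model id and generation settings are assumptions; check the Hugging Face listing for exact names, gated-access terms, and hardware requirements.

```python
# Sketch: running Gemma 3n (E2B) via Hugging Face transformers.
# Assumed model id: "google/gemma-3n-E2B-it" -- verify on huggingface.co.

def build_chat(prompt: str) -> list[dict]:
    """Format a single-turn conversation in the messages schema that
    transformers chat pipelines accept."""
    return [{"role": "user", "content": prompt}]

if __name__ == "__main__":
    # Requires: pip install transformers accelerate (and model access).
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="google/gemma-3n-E2B-it",  # assumed id for the E2B instruct model
        device_map="auto",               # place weights on GPU/CPU automatically
    )
    out = pipe(build_chat("Explain on-device inference in one sentence."),
               max_new_tokens=64)
    print(out[0]["generated_text"])
```

The first run downloads the weights, so expect a delay; after that, inference is fully local.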

