Microsoft Edge Embeds On-Device AI Model, Adds Translation and Speech APIs
Microsoft is embedding a new small language model, software designed to process and generate text locally on a device, directly into its Edge browser, alongside on-device translation and speech recognition tools that reduce the need to send data to remote servers.
The new model, called Aion-1.0-Instruct, replaces Phi-4-mini, which Microsoft began testing in Edge last year through the browser’s Prompt and Writing Assistance APIs.
Rather than requiring every web application to ship its own model or call an external service, Edge exposes a built-in model that websites can access via standard browser APIs for tasks such as rewriting text, summarizing content, or offering writing assistance.
Aion-1.0-Instruct is currently in preview, available only in the Canary and Dev channels of Edge behind a feature flag. Microsoft said the model runs on a wider range of hardware than larger cloud-based models, including machines that would struggle under heavier workloads.
That positioning aligns with Microsoft’s broader push around Copilot+ PCs and “AI PCs,” where locally run models are meant to respond instantly without taxing laptop hardware.
Translation and Language Detection
Edge is also gaining two task-specific models built into the browser itself. A Language Detector API identifies the language of a given piece of text, while a Translator API converts text between more than 145 languages — both running entirely on-device.
Web applications and browser extensions can call both APIs directly from JavaScript, the programming language that powers interactive web features.
In practice, a site could detect that a user is typing in Hindi and maintain a translated English version of the message in a separate field, all without sending individual keystrokes to an external translation service. Microsoft said these features ship in Edge version 148.
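The Hindi-to-English scenario can be sketched roughly as follows. This is an illustration under stated assumptions, not Microsoft's implementation. The `DetectorLike` and `TranslatorLike` interfaces and the `TranslatorFactory` helper are hypothetical stand-ins, loosely modeled on a detector that returns ranked guesses and a translator that returns a string. The 0.5 confidence threshold is arbitrary.

```typescript
// Hypothetical minimal shapes; the real browser objects may differ.
interface DetectionResult {
  detectedLanguage: string; // language tag such as "hi" or "en"
  confidence: number;       // assumed to be in the range 0..1
}

interface DetectorLike {
  detect(text: string): Promise<DetectionResult[]>;
}

interface TranslatorLike {
  translate(text: string): Promise<string>;
}

// Assumed helper: returns a translator for one source/target pair.
type TranslatorFactory = (
  source: string,
  target: string,
) => Promise<TranslatorLike>;

// Detect the language of `text`. If it is confidently not the target
// language, return a translated copy; otherwise return it unchanged.
async function mirrorMessage(
  text: string,
  detector: DetectorLike,
  makeTranslator: TranslatorFactory,
  target = "en",
  minConfidence = 0.5, // arbitrary threshold for this sketch
): Promise<{ source: string | null; translated: string }> {
  if (text.trim() === "") {
    return { source: null, translated: text };
  }
  const guesses = await detector.detect(text);
  const best = guesses[0]; // assumes results are sorted by confidence
  if (!best || best.confidence < minConfidence) {
    return { source: null, translated: text };
  }
  if (best.detectedLanguage === target) {
    return { source: best.detectedLanguage, translated: text };
  }
  const translator = await makeTranslator(best.detectedLanguage, target);
  return {
    source: best.detectedLanguage,
    translated: await translator.translate(text),
  };
}
```

Because the detector and translator are passed in as arguments, the same function works with the browser's built-in objects where they exist or with a cloud-backed substitute where they do not. Nothing in the sketch itself sends text off the device.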
Microsoft highlighted the privacy and cost implications. Local translation generates no server-side logs and removes per-request billing that web developers currently incur when using cloud-based translation APIs.
For developers, it also eliminates an external dependency — they can rely on the browser’s own AI stack as long as users run a sufficiently recent version of Edge.
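Since the built-in path depends on the browser version, a site would likely check for the APIs before relying on them. The sketch below shows one way to do that. The global names `Translator` and `LanguageDetector` are assumptions based on how these APIs have been proposed for browsers, and the "cloud" fallback is a placeholder for whatever service a site already uses.

```typescript
type Backend = "on-device" | "cloud";

// `scope` is the global object to inspect, e.g. `globalThis` in a page.
// The property names checked here are assumed, not confirmed for Edge.
function pickTranslationBackend(scope: object): Backend {
  const hasTranslator = "Translator" in scope;
  const hasDetector = "LanguageDetector" in scope;
  // Use the built-in stack only when both pieces are present.
  return hasTranslator && hasDetector ? "on-device" : "cloud";
}
```

In a page this might be called as `pickTranslationBackend(globalThis)`. Taking the scope as a parameter keeps the check easy to exercise outside a browser.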
On-Device Speech Recognition
Edge is also integrating on-device speech recognition into the standard Web Speech API, again starting in Canary and Dev builds. The Web Speech API is a browser standard that lets websites access microphone input for dictation or voice commands.
Under Microsoft’s approach, audio routes to a local model first rather than immediately going to a server, bringing browser-based voice features closer in responsiveness to native desktop applications. Microsoft said cloud speech services remain an option, but framed the local path as faster and more private for basic scenarios.
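The local-first idea can be illustrated with a small conceptual sketch. It does not show how Edge routes audio internally, which Microsoft has not detailed here. The `Recognizer` interface is hypothetical, and falling back to the cloud only when the local model fails is an assumption made for the example.

```typescript
// Hypothetical recognizer shape used only for this illustration.
interface Recognizer {
  transcribe(audio: Float32Array): Promise<string>;
}

interface Transcript {
  text: string;
  path: "local" | "cloud"; // which recognizer produced the text
}

// Try the on-device recognizer first. If it throws and a cloud
// recognizer was supplied, use that instead (assumed policy).
async function transcribeLocalFirst(
  audio: Float32Array,
  local: Recognizer,
  cloud?: Recognizer,
): Promise<Transcript> {
  try {
    return { text: await local.transcribe(audio), path: "local" };
  } catch (err) {
    if (!cloud) throw err; // no fallback configured: surface the error
    return { text: await cloud.transcribe(audio), path: "cloud" };
  }
}
```

When the local recognizer succeeds, the audio never reaches the cloud recognizer, which is the privacy and latency argument in miniature.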
Browser AI Push Widens
The move extends a strategy Microsoft began last year when it introduced Phi-4-mini through Edge’s Prompt API, effectively turning the browser into a host that any website could borrow AI capabilities from.
Google has pursued a similar path with its Chrome browser, building Gemini Nano — its own on-device language model — directly into Chrome for use by websites via experimental APIs.
Both companies are betting that shifting AI processing into the browser itself, rather than relying on cloud round-trips for every request, will become a standard part of web development.
