What “offline” actually means in 2026
Most Mac dictation apps that advertise “offline mode” in 2026 mean one of two things: either transcription runs on your Mac but the AI cleanup pass routes through a cloud LLM, or the whole product still needs an internet connection but caches gracefully when you're briefly offline. Both are reasonable engineering tradeoffs. Neither is fully offline.
Fully offline means a specific architectural shape: audio is captured into RAM only, transcribed by a model running on your CPU/GPU, processed by an on-device LLM for cleanup, and typed into your editor, with none of your audio or text ever leaving the machine. You can turn the Wi-Fi off, fly to another country, or stand inside a Faraday cage, and dictation keeps working with the same quality.
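One way to picture that shape, and to check it, is to run the whole pipeline with networking disabled and confirm it still produces text. The sketch below is illustrative only: the four stage functions are hypothetical stand-ins, not Speechcap's code, and the guard only catches sockets opened from Python, so it is a smoke test rather than proof.

```python
import socket
from contextlib import contextmanager


@contextmanager
def network_disabled():
    """Make any attempt to open a Python socket raise while the block runs."""
    original = socket.socket

    def blocked(*args, **kwargs):
        raise RuntimeError("network access attempted during offline run")

    socket.socket = blocked
    try:
        yield
    finally:
        socket.socket = original  # always restore the real socket class


# Hypothetical stand-ins for the four stages. A real app would call its
# audio API, speech model, cleanup model, and text-injection API here.
def capture_audio() -> bytes:
    return b"\x00" * 16  # audio held in memory only, never written to disk


def transcribe(audio: bytes) -> str:
    return "um so the meeting is at three"


def clean_up(raw: str) -> str:
    return "So the meeting is at three."


def type_into_editor(text: str, sink: list) -> None:
    sink.append(text)


def dictate(sink: list) -> None:
    audio = capture_audio()
    raw = transcribe(audio)
    cleaned = clean_up(raw)
    type_into_editor(cleaned, sink)


editor = []
with network_disabled():
    dictate(editor)  # would raise if any stage opened a socket
print(editor)
```

If any stage tried to reach the network, the run would fail loudly instead of silently degrading, which is the property the architecture is supposed to provide.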
The distinction matters because privacy policies are promises and architecture is a guarantee. A promise can change with a new TOS revision, a data breach, a subpoena, an acquisition, or a quiet misconfiguration. Architecture stays put. For work that touches client privilege, patient data, regulated information, or NDA-bound source code, the architectural question is the one the compliance team actually asks.
Which Mac dictation apps actually run fully offline?
As of mid-2026, the answer is uncomfortably short:
| App | Transcription on-device | AI cleanup on-device | Truly offline? |
|---|---|---|---|
| Speechcap | Yes (Pro) | Yes (Pro) · IBM Granite 4 Micro | Yes — both stages |
| Superwhisper | Yes | No — typically routes through OpenAI | Partial |
| MacWhisper | Yes (file transcription) | Partial (post-process pass) | Partial · different category |
| Wispr Flow | No — cloud | No — cloud | No |
| Willow Voice | No — primarily cloud | No — cloud | No |
| Apple Dictation | Optional (lower quality) | No AI cleanup at all | Partial · accuracy drops |
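The last column of the table follows a simple rule: an app is fully offline only if both stages run on-device, not offline if neither does, and partial otherwise. The sketch below encodes that rule. The three-level `Local` enum is our own simplification, and mapping Apple Dictation's "Optional" transcription to `PARTIAL` is an assumption made for illustration.

```python
from enum import Enum


class Local(Enum):
    YES = "yes"
    PARTIAL = "partial"
    NO = "no"


def offline_verdict(transcription: Local, cleanup: Local) -> str:
    """Derive the 'Truly offline?' column from the two stage columns."""
    if transcription is Local.YES and cleanup is Local.YES:
        return "Yes"
    if transcription is Local.NO and cleanup is Local.NO:
        return "No"
    return "Partial"


# (transcription on-device, AI cleanup on-device), read off the table above.
APPS = {
    "Speechcap (Pro)": (Local.YES, Local.YES),
    "Superwhisper": (Local.YES, Local.NO),
    "MacWhisper": (Local.YES, Local.PARTIAL),
    "Wispr Flow": (Local.NO, Local.NO),
    "Willow Voice": (Local.NO, Local.NO),
    "Apple Dictation": (Local.PARTIAL, Local.NO),  # "Optional" treated as partial
}

for name, (stt, llm) in APPS.items():
    print(f"{name}: {offline_verdict(stt, llm)}")
```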
What stays on your Mac with Pro
With Speechcap Pro's on-device engine selected (Settings → Transcription → On-device), the following data classes never reach our servers:
- Audio. Captured into RAM only: never written to disk, and never uploaded to our servers or to a third-party transcription provider. Discarded after transcription.
- Transcripts. The raw Whisper output is processed in-memory by the on-device cleanup model. The cleaned text is typed into your editor and saved to a local history database on your Mac. No cloud copy exists.
- AI cleanup prompts. The prompts wrapping your transcript for the cleanup model never leave the machine: neither the system prompt nor your transcript is sent to any LLM provider.
- Custom vocabulary. Your personal word list (names, acronyms, jargon) lives in a local file. Cloud sync is opt-in and clearly labelled; toggle it off in Settings → Vocabulary if you want true air-gap operation.
- Dictation history. Every transcribed session is searchable in the app's local History tab. It's a local SQLite database; nothing leaves your Mac.
The only network calls Speechcap makes in Pro on-device mode are license verification (an intermittent ping with your account ID, no content) and optional crash diagnostics if you haven't disabled them. Both are toggleable in Settings.
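To make the "local SQLite database" point concrete, here is a minimal sketch of what a local, searchable history store can look like. The table name, columns, and helper functions are hypothetical and chosen for illustration; they are not Speechcap's actual schema. The demo uses an in-memory database, where a real app would open a file on local disk.

```python
import sqlite3

# In-memory database for the demo; a real app would open a local file instead.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE history ("
    "id INTEGER PRIMARY KEY, created_at TEXT NOT NULL, cleaned_text TEXT NOT NULL)"
)


def save_session(created_at: str, cleaned_text: str) -> None:
    """Store one dictation session; the connection context commits on success."""
    with db:
        db.execute(
            "INSERT INTO history (created_at, cleaned_text) VALUES (?, ?)",
            (created_at, cleaned_text),
        )


def search(term: str) -> list:
    """Return cleaned transcripts containing the term, oldest first."""
    rows = db.execute(
        "SELECT cleaned_text FROM history WHERE cleaned_text LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


save_session("2026-06-01T09:00:00", "Draft the indemnity clause for the Acme contract.")
save_session("2026-06-01T09:05:00", "Remind me to call the pharmacy.")
print(search("contract"))
```

Everything here is a file read or write on the same machine; no step involves a server, which is why search keeps working with Wi-Fi off.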
Who this is for
Offline dictation isn't about being a stereotypical privacy maximalist. It's about specific workflows where the architectural answer is the only acceptable one.
Lawyers and legal professionals
Attorney-client privilege survives only if privileged content isn't shared with third parties. Dictating draft contracts, depositions, or case notes through a cloud dictation app risks waiving privilege depending on jurisdiction and the cloud provider's subprocessor chain. On-device dictation removes that exposure entirely.
Healthcare and medical practice
We're not yet HIPAA-certified; that's a Q3 2026 target for Speechcap Pro. But the architecture (no audio uploaded, no PHI transcripts in our systems) is the foundation that certification will eventually attest to. Practitioners who can't wait for the certification often use Speechcap today on the basis that no PHI reaches our servers in the first place.
Journalists and investigative reporters
Source protection means assuming any cloud you touch is subpoena-able. Reporters dictating notes from a sensitive interview, drafts of a story with an anonymous source, or even just internal editorial messages benefit from architecture that doesn't create discoverable artifacts. Offline dictation produces nothing for a future legal demand to discover.
Security-conscious developers and founders
Dictating draft commit messages that mention unannounced features, PR descriptions for an unfiled patent, founder notes for a future fundraise — these are routine and routinely sensitive. On-device dictation lets you keep voice-as-input as a daily tool without growing the surface area of where your unannounced work exists.
The technical architecture, briefly
Two models do the work, both running on your Mac:
Whisper Large v3 for transcription
Speechcap uses OpenAI's open-source Whisper Large v3 model for speech recognition. On M-series Apple Silicon, transcription runs at 4–8× realtime, so a 30-second dictation transcribes in roughly 4 to 8 seconds. The model is ~1.5 GB on disk, memory-mapped at runtime, and unloaded when the app is idle. In our internal testing, accuracy on typical English (including accented variants) lands at 95–98%.
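A realtime factor converts directly into wait time: divide the length of the audio by the factor. The short sketch below works through the 30-second case at both ends of the quoted 4–8× range. The function name is ours, and the figures cover transcription only, not the cleanup pass.

```python
def transcription_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Time to transcribe a clip when the model runs at N times realtime."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


slow = transcription_seconds(30, 4)  # slow end of the range: 7.5 s
fast = transcription_seconds(30, 8)  # fast end of the range: 3.75 s
print(f"30 s of audio: {fast:.2f} to {slow:.2f} s to transcribe")
```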
IBM Granite 4 Micro for cleanup
The AI cleanup pass — removing fillers, fixing punctuation, applying context-aware formatting, handling in-flight transforms — runs through a 3-billion-parameter on-device LLM. We ship IBM's Granite 4 Micro because it has the best speed-to-quality tradeoff we've measured for the cleanup task on consumer Apple hardware. The model is ~2 GB, loaded on first use, cached for the session.
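"Loaded on first use, cached for the session" is a lazy-loading pattern, and it is where the cold-start delay on the first dictation comes from. A minimal sketch, assuming a hypothetical loader in which a short sleep stands in for reading the model weights:

```python
import time
from functools import lru_cache

LOAD_COUNT = 0  # how many times the expensive load actually ran


@lru_cache(maxsize=1)
def get_cleanup_model() -> dict:
    """Load the model on the first call; later calls return the cached object."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    time.sleep(0.05)  # stand-in for reading ~2 GB of weights from disk
    return {"name": "cleanup-model-placeholder"}


def timed_get():
    """Fetch the model and report how long the fetch took."""
    start = time.perf_counter()
    model = get_cleanup_model()
    return model, time.perf_counter() - start


_, cold = timed_get()  # pays the load cost
_, warm = timed_get()  # served from the cache
print(f"cold: {cold:.3f}s, warm: {warm:.6f}s, loads: {LOAD_COUNT}")
```

The first call pays the full load cost and every later call in the same session is effectively free, which matches the behaviour described above.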
Hardware requirements
Both models run comfortably on any Apple Silicon Mac (M1 and newer). On Intel Macs, performance is noticeably slower — usable but not snappy. Memory: 16 GB Macs run both models in parallel without pressure; 8 GB Macs work but you'll notice the swap pressure if you have many other apps open. Storage: ~4 GB for both models combined; downloaded once on first launch.
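The memory and storage guidance is simple arithmetic on the figures quoted in this article: ~1.5 GB and ~2 GB on disk for the two models, and ~3 GB of RAM while actively dictating (the figure given in the FAQ). The sketch below is a rough budget only. The "headroom" it prints still has to cover macOS itself and every other open app, which is why 8 GB feels tight.

```python
WHISPER_DISK_GB = 1.5  # Whisper Large v3, as quoted in this article
CLEANUP_DISK_GB = 2.0  # IBM Granite 4 Micro, as quoted in this article
RUNTIME_RAM_GB = 3.0   # both models while actively dictating (FAQ figure)


def disk_total_gb() -> float:
    """Combined on-disk size of both models."""
    return WHISPER_DISK_GB + CLEANUP_DISK_GB


def ram_headroom_gb(total_ram_gb: float) -> float:
    """RAM left for macOS and other apps while dictating."""
    return total_ram_gb - RUNTIME_RAM_GB


print(f"disk: {disk_total_gb():.1f} GB (the article rounds this up to ~4 GB)")
for ram in (8, 16, 24):
    print(f"{ram} GB Mac: {ram_headroom_gb(ram):.1f} GB left while dictating")
```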
What you trade by going fully offline
Honest tradeoffs, because no architectural choice is free:
- Cold-start latency. On the very first dictation after launching Speechcap, the cleanup model takes 2–3 seconds to load. Subsequent dictations are instant. Cloud apps don't have this because they keep models warm on shared infrastructure.
- Slightly less polished cleanup on complex prompts. A 3B-parameter local model isn't GPT-4-class. For typical dictation cleanup (fillers, punctuation, light reformatting) the result is indistinguishable. For very intricate transforms (e.g. “rewrite this in the style of Hemingway”) you can occasionally tell the difference.
- Battery cost. Running both models on-device draws more battery than a network call would. On a MacBook Air, expect ~5–8% extra battery use across a full day of heavy dictation vs cloud mode. On a plugged-in Mac, zero impact.
- Disk space. ~4 GB used by the two models. Not nothing, but not significant relative to modern Mac storage.
For most workflows the tradeoffs are imperceptible. The privacy upside is permanent.
Pricing
On-device dictation isn't a separate product — it's a setting inside Speechcap Pro. Pro starts at $3/month with PPP-localised pricing in 89 markets ($6/month is the top tier). The free tier (2,000 words/month, cloud transcription) is generous enough to try the workflow before committing. There's also a 14-day Pro trial with no credit card.
Compared to other Mac dictation apps, Speechcap Pro at its $6/month top tier is roughly half the price of Wispr Flow and Willow Voice ($12–15/month) and undercuts Superwhisper's monthly tier ($8.49). The architectural difference (full on-device) plus the pricing difference is the central pitch.
Frequently asked questions
What does "offline dictation" actually mean for a Mac app?
It depends on the app. The honest definition: every stage of the dictation pipeline — audio capture, transcription, AI cleanup, and text injection into your editor — runs on the user's Mac with no cloud round-trip. Some apps run transcription locally but still send the transcript to a cloud LLM for cleanup; that's partially offline, not fully offline. Speechcap Pro runs both stages on your Mac; nothing leaves the device.
Which Mac dictation apps work without an internet connection?
On Pro, Speechcap runs entirely on your Mac, with Whisper transcription and AI cleanup both local. Superwhisper runs transcription locally, but its AI cleanup pass typically routes through a cloud LLM (OpenAI). MacWhisper runs file transcription on-device but has limited live-dictation features. Wispr Flow, Willow Voice, and Apple Dictation's cloud mode all require an internet connection. Apple Dictation has an on-device option but at noticeably lower accuracy.
Is on-device dictation as accurate as cloud-based dictation in 2026?
On modern Apple Silicon Macs, yes. Whisper Large v3 running locally on M1 / M2 / M3 / M4 produces 95–98% accuracy on typical English — comparable to cloud transcription. The accuracy gap that justified cloud-only dictation in 2019 has effectively closed. The remaining quality gaps are about cleanup quality (small on-device LLMs vs frontier cloud models), not transcription itself.
How much disk space and RAM does on-device dictation use?
Whisper Large v3 is about 1.5 GB on disk. The on-device cleanup model (IBM Granite 4 Micro) adds another ~2 GB. At runtime, models are memory-mapped and use ~3 GB of RAM when actively dictating. Models are unloaded when idle. On an 8 GB Mac that's tight; 16 GB is comfortable; 24 GB or more leaves room for everything else.
Can I dictate on a plane, in a SCIF, or behind a corporate firewall?
Yes — that's the whole point of full-offline mode. Toggle Wi-Fi off, hold your push-to-talk key, speak. Speechcap will transcribe, AI-clean, and type the result into whatever app you're focused on without any network access. The only thing you lose offline is cloud-synced custom vocabulary; the local list still works.
Why does on-device matter if the company has a privacy policy?
Privacy policies are promises. On-device architecture is a guarantee. A privacy-policy promise can be broken by a future policy change, a data breach, a subpoena, a misconfigured server, or an acquiring company with different priorities. On-device processing removes the data from those failure modes by design. For attorney-client privileged work, patient notes, NDA'd source code, and executive communications, the architectural answer matters more than the policy answer.