Aqua Voice vs Speechcap: contextual cloud, or private on-device.
Aqua Voice leads with a single claim: its Avalon model reads what's on your screen to pick the right vocabulary, and they cite 97.3% accuracy as the proof. Speechcap leads with a different claim: your voice never leaves your Mac. Both can be true; they're optimising for different problems. This is a pick-by-priority comparison.
Side-by-side
| | Aqua Voice | Speechcap |
|---|---|---|
| Platforms | Mac · Windows · iOS | Mac · Windows beta |
| Pricing | Free 1,000 words · Pro $8/mo · Team $12/mo · Enterprise | Free 2,000 words/month · Pro from $3/month (PPP-localised) |
| Transcription | Cloud (proprietary Avalon model) | On-device Whisper on Pro · cloud on Free |
| AI cleanup | Cloud, context-aware | On-device Granite 4 Micro on Pro · cloud on Free |
| Languages | 49 | 89 |
| Activation | Tap-to-toggle hotkey | Push-to-talk (hold) · toggle optional |
| Screen context awareness | Yes — reads visible app/text | No — app context only |
| Custom dictionary | Yes (5 entries free · unlimited Pro) | Yes, cloud-synced |
| Free tier | 1,000 lifetime words | 2,000 words / month, full quality |
| Privacy posture | Cloud-first; policy-based | Architecture-based; audio never leaves device on Pro |
Where Aqua Voice is honestly better
Cross-platform reach
Aqua ships on Mac, Windows, and iOS today. Speechcap is Mac-first with a Windows beta. If you split work across a MacBook and a Windows desktop, Aqua wins by default — Speechcap's beta isn't ready for primary use yet.
Context-aware accuracy in technical work
Avalon reads what's visible in your editor and tunes against it: function names, library imports, variable spellings. Speechcap's custom dictionary works, but you have to feed it manually. For developers who dictate code or technical prose, Aqua's screen-reading is a real edge.
Polished proprietary model
Aqua isn't running off-the-shelf Whisper. Avalon is their own model, which means updates ship on Aqua's schedule, not Whisper's. Their claimed 97.3% accuracy is plausible; in published benchmarks they often edge Whisper-large on streaming throughput.
Where Speechcap is honestly better
On-device cleanup, not just transcription
Most "on-device" dictation apps run Whisper locally but ship the transcript to a cloud LLM for cleanup. Speechcap runs both stages on-device on Pro — IBM Granite 4 Micro handles the cleanup in ~1.5s on an M1 Pro. Your dictation never reaches another computer.
Push-to-talk by default
Speechcap defaults to hold-a-key. The mic is on for exactly as long as your finger is on the key — no "I forgot to turn it off" failure mode. Aqua uses tap-to-toggle by default, which means you can leave it recording. We have an opinion on this (speechcap.com/blog/the-case-for-push-to-talk).
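The difference between the two activation modes can be sketched as a tiny state model. This is an illustration only, not either app's actual implementation; the class and function names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PushToTalkMic:
    """Hypothetical model: the mic is live only while the key is held."""
    recording: bool = False

    def key_down(self) -> None:
        self.recording = True

    def key_up(self) -> None:
        # Releasing the key always stops recording; there is no state to forget.
        self.recording = False


@dataclass
class ToggleMic:
    """Hypothetical model: each tap flips the recording state."""
    recording: bool = False

    def key_down(self) -> None:
        self.recording = not self.recording

    def key_up(self) -> None:
        # Releasing the key changes nothing; a second tap is needed to stop.
        pass


def press_and_release(mic) -> bool:
    """Press and release the hotkey once; report whether the mic is still live."""
    mic.key_down()
    mic.key_up()
    return mic.recording


print(press_and_release(PushToTalkMic()))  # False: mic is off after release
print(press_and_release(ToggleMic()))      # True: mic stays on until tapped again
```

In the toggle model, the mic's state after a key release depends on how many taps came before, which is where the "left it recording" case comes from. In the hold model, a released key always means the mic is off.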
Pricing per market
Aqua Pro is $8/month, billed annually. Speechcap is $3–6/month with PPP localisation across 89 markets, so students and users in India, Brazil, Vietnam, and Egypt pay much less. If you're outside the US/EU, Speechcap is meaningfully cheaper.
89 languages vs 49
Speechcap supports 89 dictation languages via Whisper Large v3. Aqua covers 49. For non-English speakers, that gap matters more than it looks — Whisper handles Hindi, Tamil, Bengali, Marathi, and Vietnamese well; Aqua's coverage is thinner on South Asian and Southeast Asian languages.
Who should pick which
Pick Aqua Voice if:
- You dictate across Mac, Windows, and iOS daily.
- Most of your dictation is code or domain jargon, where Aqua's screen-awareness helps.
- You're fine with cloud transcription and trust the privacy policy.
- You want a US/EU subscription with team plans and SSO.
Pick Speechcap if:
- You handle NDA material, regulated data, or sensitive client work.
- You want both transcription AND AI cleanup on-device, not just one.
- You speak a non-English language Whisper covers better than Avalon.
- You're outside the US/EU and want PPP-localised pricing.
- You want the mic to be off whenever the key is released, with no recording state to forget.
Sources & further reading
- Aqua Voice: Official site. Reference for pricing tiers, accuracy claims, and platform coverage.
- OpenAI Whisper model card. Speechcap's transcription engine; multi-language coverage reference.
Frequently asked questions
Is Aqua Voice or Speechcap more accurate?
It depends on the input. For English dictation with technical jargon visible on screen, Aqua's Avalon model often wins because it tunes against context. For non-English dictation or general everyday English, Whisper Large v3 (Speechcap's engine) is competitive or better. Both are well above the 88% Apple Dictation baseline; the meaningful axes are privacy and language coverage, not accuracy points.
Does Aqua Voice run on-device?
No. Aqua's Avalon model is cloud-hosted — that's how it gets screen-context awareness and pushes accuracy updates without app updates. If your work requires audio that never leaves your computer (NDA, healthcare, legal, regulated data), Aqua isn't the right fit. Speechcap Pro's on-device mode runs both transcription and AI cleanup locally.
Which is cheaper, Aqua Voice or Speechcap?
Speechcap is cheaper in most markets. Aqua's Pro plan is $8/month billed annually. Speechcap is $3–6/month with PPP-localised pricing; students and users in India, Brazil, and Vietnam pay near the lower end. Both have free tiers: Aqua's is 1,000 lifetime words, Speechcap's is 2,000 words per month at full quality.
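To put those figures on a yearly footing, here is a short worked calculation. It uses only the prices and free-tier limits quoted in this comparison; the helper names are illustrative, and the free-tier total assumes Speechcap's monthly quota simply resets each month.

```python
AQUA_PRO_MONTHLY_USD = 8                 # billed annually
SPEECHCAP_PRO_MONTHLY_USD = (3, 6)       # PPP-localised range (low, high)
AQUA_FREE_LIFETIME_WORDS = 1_000
SPEECHCAP_FREE_WORDS_PER_MONTH = 2_000


def annual_cost(monthly_usd: int) -> int:
    """Yearly cost in USD for a given monthly price."""
    return monthly_usd * 12


def free_words_available(months: int) -> tuple[int, int]:
    """Total free-tier words on offer over `months` months: (Aqua, Speechcap).

    Aqua's allowance is a one-off lifetime cap, so it does not grow with time.
    Speechcap's is assumed to reset monthly (an assumption of this sketch).
    """
    return AQUA_FREE_LIFETIME_WORDS, SPEECHCAP_FREE_WORDS_PER_MONTH * months


aqua_year = annual_cost(AQUA_PRO_MONTHLY_USD)
speechcap_year = tuple(annual_cost(p) for p in SPEECHCAP_PRO_MONTHLY_USD)

print(aqua_year)                 # 96
print(speechcap_year)            # (36, 72)
print(free_words_available(12))  # (1000, 24000)
```

At these list prices the yearly gap is $24 to $60, depending on where in the $3–6 range a given market falls.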
Does Aqua Voice work on Windows?
Yes — Aqua ships on Mac, Windows, and iOS. Speechcap is Mac-first; Windows is in beta. If cross-platform reach matters more than on-device privacy, Aqua is the better pick today.
Why does Speechcap support more languages?
Speechcap uses Whisper Large v3 directly, which OpenAI trained on 99 languages. After filtering for production-quality output, Speechcap exposes 89 of them — including strong support for Hindi, Tamil, Bengali, Marathi, Vietnamese, and other South/Southeast Asian languages. Aqua's Avalon supports 49 languages and tends to focus on European-language coverage.