Slashdot

Canonical's Upcoming AI Tool: Talk to Ubuntu Instead of Typing

This week Ubuntu Desktop's director of engineering announced that Canonical is bringing speech-to-text dictation to Ubuntu Desktop, aiming for an experience "that feels like a natural part of the desktop while respecting user privacy and running entirely on local hardware."

"Speech recognition has become a common feature on modern platforms, and we think it should be a first-class experience on Ubuntu Desktop as well."

Myna: The Initial Dictation Tool

For Ubuntu 26.10, the initial version of Myna is expected to be a desktop dictation tool built around GNOME on Wayland, with a push-to-talk mechanism controlling when the microphone accepts input. Using it means holding a hotkey, speaking, and letting go. A small activity indicator shows while it is listening, and the transcribed text lands wherever the cursor was sitting when dictation started.

Architecture and Privacy

Recognition itself happens inside a sandboxed component called the Canonical Inference Snap, while a Speech Orchestrator manages the session and an Audio Adapter handles whatever the microphone picks up, denoising and chunking it before it ever reaches the model.
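As a rough illustration of the "denoising and chunking" step described above, here is a minimal Python sketch. It is not Canonical's implementation: the function names (`noise_gate`, `chunk`, `adapt`), the amplitude threshold, and the chunk size are all hypothetical, and a simple noise gate stands in for whatever denoising the real Audio Adapter performs.

```python
from typing import Iterator, List


def noise_gate(samples: List[int], threshold: int = 500) -> List[int]:
    """Crude stand-in for denoising: silence samples quieter than the threshold."""
    return [s if abs(s) >= threshold else 0 for s in samples]


def chunk(samples: List[int], chunk_size: int) -> Iterator[List[int]]:
    """Split samples into consecutive chunks; the last chunk may be shorter."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def adapt(samples: List[int], chunk_size: int = 4, threshold: int = 500) -> List[List[int]]:
    """Hypothetical adapter step: denoise first, then chunk for the model."""
    return list(chunk(noise_gate(samples, threshold), chunk_size))


if __name__ == "__main__":
    # Six made-up PCM sample values; quiet ones are zeroed before chunking.
    print(adapt([100, 900, -50, -1200, 30, 700], chunk_size=4))
```

The point of the ordering is that the recognition model only ever sees cleaned, fixed-size pieces of audio rather than the raw microphone stream.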

Speech recognition will happen locally, and an internet connection is not needed once the appropriate model is installed. Audio is not retained either: it is held in a small in-memory buffer that is discarded the moment the session ends.
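The buffer behavior described above can be sketched in Python as a session object that holds audio only in memory and clears it when the session closes. This is an assumption-laden illustration, not Myna's design: the class name `DictationSession`, its methods, and the `max_chunks` limit are hypothetical.

```python
from collections import deque


class DictationSession:
    """Hypothetical push-to-talk session with a bounded, in-memory audio buffer."""

    def __init__(self, max_chunks: int = 64) -> None:
        # A bounded deque keeps the buffer small: the oldest chunk is dropped when full.
        self._buffer = deque(maxlen=max_chunks)
        self.active = False

    def __enter__(self) -> "DictationSession":
        self.active = True  # hotkey pressed: start accepting audio
        return self

    def push(self, chunk: bytes) -> None:
        if not self.active:
            raise RuntimeError("session is not active")
        self._buffer.append(chunk)

    def buffered(self) -> int:
        return len(self._buffer)

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Hotkey released: discard all audio, even if an error occurred.
        self._buffer.clear()
        self.active = False
        return False


if __name__ == "__main__":
    with DictationSession(max_chunks=2) as session:
        for piece in (b"a", b"b", b"c"):
            session.push(piece)
        print(session.buffered())  # at most max_chunks are held
    print(session.buffered())      # nothing remains after the session ends
```

Using a context manager ties the buffer's lifetime to the session, so the discard step cannot be skipped by an early return or an exception.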

What's Not Included

Features like dictation into password fields, wake words, continuous listening, voice assistants, voice commands, translation, speaker identification, and automatic language detection are all off the table.

Feedback and Community Response

Canonical is looking for feedback before the specs for Myna are finalized, especially from people who already rely on dictation or assistive tools on Linux.

Community reactions have been mixed:

  • Some users question the relevance for those who don't use GNOME, Wayland, or Snap packages.
  • Others note that speech-to-text tools are already available via third-party applications and API connections.
  • Several commenters highlight the value for users with disabilities or repetitive strain injuries, with one user sharing: "I use it because I've been writing so many long prompts that I developed relatively severe tendonitis in my left arm."
  • One user asks where to provide feedback, noting the article is "chock-full of links, but not one for that."
  • Another commenter expresses hope: "I'm all for a speech to text feature. I've wanted one for years. But, it has to not suck."
