Digital Dictation On Linux

posted 2 May 2013, 04:00 by Alistair Hamilton   [ updated 23 Jul 2015, 01:47 ]
I was looking for Linux-compatible digital dictation software recently and came up short. There's a host of products available for Windows, but precious little for Linux. NCH, the makers of Express Scribe, did have a Linux version for a while, but I am reliably informed that it is no longer supported (* see update at the bottom of this article). Their Windows product does run under Wine, but I found it problematic in use. Specifically, its system-wide hot keys wouldn't work.

My efforts to track down suitable, dedicated software proved fruitless, though they did get me thinking...

In its basic form, a digital dictation system only needs to replicate the functionality of the old-style Dictaphone and tape playback used by countless typists worldwide. In other words, all it requires is a mechanism to record the dictation and a method to replay it so that a typist can transcribe it into a document.

These things are readily available. Indeed, you probably have what you need already.

Even the most basic modern mobile phones have voice recorder functionality. There's your Dictaphone substitute right there, without the need for any additional software. Not only that, most phones will then let you email the dictation file to your typist, wherever she/he is.

The file is nothing more than an audio file, so all your typist needs is a means to play it. You'll be pleased to hear that almost any Linux-based audio player will do just that.

Traditionally, typists use a foot pedal to control the play, pause, wind and rewind functions of the recording. This is merely a historical relic from the days of analogue Dictaphones and tapes. It was simply the only way to transcribe the recording efficiently while keeping the typist's hands on the keyboard. Now, with media keyboards, or the ability to assign hot keys to a keyboard, there's no real reason to use a foot pedal. A proficient typist will soon get used to controlling the audio using the keyboard.

If you want to go to the expense of installing a foot pedal, by all means do so. Search the Internet for Linux-compatible models. I do not have access to one, so I have not tested one as part of this exercise. As you'll gather from the previous paragraph, I do not think it is necessary.

There are a few caveats that you must be aware of:
  1. Most phones record voice as AMR (Adaptive Multi-Rate ACELP Codec) files. This codec must be installed on the typist's Linux box before the files can be played directly.
  2. Assuming the codecs are installed, you may find your player doesn't recognise the file name extension and consequently refuses to load it. Clementine is a case in point. However, simply rename the file by changing the AMR extension to MP3 (note this isn't converting the file to MP3) and it loads and plays quite happily.
  3. You may wish to use global hot keys to control the playing of the file, even when the audio software does not have focus. Not all players can do this.
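The renaming trick in caveat 2 works because changing the extension leaves the file contents untouched. As a minimal sketch of that point, the Python below checks whether a file is really AMR by looking at its first few bytes rather than its name. The function name looks_like_amr is my own invention for illustration; the two header strings are the standard markers for narrowband and wideband AMR files.

```python
# Header strings that open single-channel AMR narrowband and wideband files.
AMR_MAGIC = (b"#!AMR\n", b"#!AMR-WB\n")


def looks_like_amr(path):
    """Return True if the file starts with an AMR header, whatever its extension."""
    with open(path, "rb") as handle:
        header = handle.read(9)  # the longest header above is 9 bytes
    return header.startswith(AMR_MAGIC)
```

Calling looks_like_amr on a recording that has been renamed from .amr to .mp3 should still return True, which is why the codec packages are needed even after the rename.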
I use Xubuntu, but the following principles should work on other distros:
  1. Using Menu -> System -> Synaptic Package Manager check that the following AMR packages are installed...
    libvo-amrwbenc0, libopencore-amrnb0 and libopencore-amrwb0
  2. Install one of the media players listed below if not already installed. These have been selected primarily due to their global hot key support and, of course, their ability to play AMR files.
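If you would rather check for the AMR packages from a terminal than through Synaptic, something along these lines may help. It is only a sketch, assuming a Debian or Ubuntu based system where the dpkg-query command is available; the helper names is_installed and missing_packages are my own, and the package names are simply those listed in step 1.

```python
import subprocess

# Package names taken from step 1 above.
REQUIRED = ["libvo-amrwbenc0", "libopencore-amrnb0", "libopencore-amrwb0"]


def is_installed(package):
    """Ask dpkg whether a package is installed (Debian/Ubuntu only)."""
    try:
        result = subprocess.run(
            ["dpkg-query", "-W", "-f=${Status}", package],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # dpkg-query is not present, so this is not a dpkg-based system.
        return False
    return result.returncode == 0 and "install ok installed" in result.stdout


def missing_packages(required, check=is_installed):
    """Return the packages from `required` that `check` reports as absent."""
    return [name for name in required if not check(name)]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All AMR packages are installed.")
```

Anything reported as missing can then be installed through Synaptic as described in step 1.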

Suggested Linux media players:
  1. Clementine
    • Can play AMR files but does not load them without first renaming the file extension to MP3.
    • Supports Global Keys allowing control of audio playback even though Clementine does not have focus.
    • Automated playlist importation allowing newly received recordings to be listed without user intervention.
    • Integrates with the volume control in the system tray.
    • Not installed in Xubuntu by default.
  2. Audacious
    • Plays AMR files natively when added to the playlist, though importing a directory seems to ignore AMR files unless they are renamed with an MP3 extension.
    • Supports Global Keys allowing control of audio playback even though Audacious does not have focus.
    • No automated playlist importation.
    • Integrates with the volume control in the system tray.
    • Not installed in Xubuntu by default.
  3. VLC
    • Plays AMR files natively.
    • Supports Global Keys allowing control of audio playback even though VLC does not have focus.
    • As far as I can see, the only one of the three that allows the user to change the playback speed. The lack of this in the other two may be an issue for some users.
    • No automated playlist importation.
    • Installed in Xubuntu by default.
Of the three listed, I'd prefer Audacious if it had an automated importation facility where one could point it at a folder and any new recording would be automatically added to the playlist. It has a simple, clear interface which would be ideal but for that one limitation.

As a result, my vote goes to Clementine. Its automated playlist updating, not to mention the ability for the user to delete a recording from disk directly from the application, makes this a winner. It is a pity that AMR files must be renamed with an MP3 extension before they get recognised in the playlist. However, that minor issue can easily be overcome by setting up a cron job to rename such files automatically on a regular basis, thus eliminating the need for any user intervention.
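The renaming job could be written in many ways. Here is one minimal Python sketch, not a tested production script. The function name rename_amr_files and the default folder ~/Dictation are placeholders of my own; point it at whatever folder your player watches. It deliberately skips any file whose MP3-named twin already exists, so nothing gets overwritten.

```python
import sys
from pathlib import Path


def rename_amr_files(folder):
    """Rename every *.amr file in folder to *.mp3 and return the new paths."""
    renamed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() != ".amr":
            continue
        target = path.with_suffix(".mp3")
        if target.exists():
            continue  # never overwrite an existing file
        path.rename(target)
        renamed.append(target)
    return renamed


if __name__ == "__main__":
    # Folder is passed on the command line; the default is only a placeholder.
    folder = Path(sys.argv[1] if len(sys.argv) > 1 else "~/Dictation").expanduser()
    if folder.is_dir():
        for new_path in rename_amr_files(folder):
            print(new_path)
```

Saved as, say, rename_amr.py, it could be run every five minutes with a crontab entry along these lines (both paths are hypothetical):

    */5 * * * * python3 /home/typist/bin/rename_amr.py /home/typist/Dictation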

One final piece of configuration that you may want to try is a rule within the typist's email program to have voice recorder attachments automatically saved to the folder pointed to by your media player, but I'll leave that for you to figure out.
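How you set up such a rule depends entirely on the email program, so I can't give a recipe. If your mail setup can hand a raw message to a script, though, the saving part might look something like the sketch below. It uses only Python's standard email module. The function name save_audio_attachments and the list of file extensions are assumptions of mine; adjust the extensions to whatever your phone actually produces.

```python
import email
from email import policy
from pathlib import Path

# Assumed extensions for voice recordings; adjust to suit your phone.
AUDIO_SUFFIXES = {".amr", ".3gp", ".m4a", ".wav", ".mp3"}


def save_audio_attachments(raw_message, folder):
    """Save audio attachments from a raw email (bytes) into folder."""
    message = email.message_from_bytes(raw_message, policy=policy.default)
    folder = Path(folder)
    saved = []
    for part in message.iter_attachments():
        filename = part.get_filename()
        if not filename:
            continue
        name = Path(filename).name  # drop any directory components
        if Path(name).suffix.lower() not in AUDIO_SUFFIXES:
            continue
        data = part.get_payload(decode=True)
        if data is None:
            continue  # nothing decodable in this part
        target = folder / name
        target.write_bytes(data)
        saved.append(target)
    return saved
```

The function expects the complete message as bytes, for example read from a file the mail program has saved, and returns the paths of the recordings it wrote.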


* Update 13 June 2013 - Although NCH claim they no longer support their Linux version of Express Scribe, the software is still available for download from their website. I had previously dismissed it because it always threw an error on startup. However, I've since revisited it, and the error is due to the installation process creating a hidden folder within the home directory for root rather than for the user. If you create a folder "/.nch/scribe" for each user and set that folder as the data folder within the Express Scribe options (Disk Usage tab), you'll be good to go.