Since early 2017, Threads has offered subscribers the choice of Google Speech or Speechmatics automatic speech recognition (ASR) to transcribe phone calls – and we shall soon be adding more ASR services. Apart from all the obvious benefits of automatically transcribing phone calls, what Threads does really well is to turn these services into practical tools for routine use.
As they stand, ASR services are no more than speech recognition engines, and although all the service providers have web front-ends for manually processing ad-hoc transcriptions, they are not practical for routine automatic use. Furthermore, most services do not present transcriptions in a particularly user-friendly way, making it difficult to determine whose speech has been transcribed and when they spoke. If calls last many minutes, or even hours, users need the ability to scrub through transcriptions to quickly locate the section of interest. ASR services generally do not offer this – but Threads does. Also, because Threads stores all calls, indexes them and collates them with other communications, it lets users search all their calls for keywords in exactly the same way as they search their emails. It is difficult to imagine how most business users would cope without being able to search their emails, and once they start using Threads, they soon wonder how they ever coped without being able to search their calls.
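To make the idea concrete, here is a minimal, purely illustrative sketch of searching timestamped transcript segments by keyword, with each hit pointing back to the speaker and the moment in the call where the word was spoken. This is not Threads' implementation; the data structures and names are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """One transcribed utterance: which call, who spoke, when, and what was said."""
    call_id: str
    speaker: str
    start_seconds: float
    text: str

def search_transcripts(segments, keyword):
    """Return every segment containing the keyword (case-insensitive),
    so a UI could scrub straight to that point in the call."""
    keyword = keyword.lower()
    return [s for s in segments if keyword in s.text.lower()]

# Illustrative data only.
segments = [
    Segment("call-001", "Alice", 12.4, "Can you confirm the invoice number?"),
    Segment("call-001", "Bob", 15.9, "Sure, the invoice number is 4471."),
    Segment("call-002", "Alice", 3.2, "Let's schedule the follow-up for Friday."),
]

for hit in search_transcripts(segments, "invoice"):
    print(f"{hit.call_id} @ {hit.start_seconds:5.1f}s  {hit.speaker}: {hit.text}")
```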
However, if your company's employees spend a large part of their working day on the telephone, then with ASR charges ranging from $2 to $10 per hour, costs can quickly mount. ASR is nowhere near as expensive as human transcription, but transcribing all telephone traffic can still get expensive.
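As a rough back-of-the-envelope illustration, the team size, call volume and per-hour rate below are assumptions for the sake of the arithmetic, not Threads pricing:

```python
# Hypothetical figures purely for illustration.
employees = 20          # staff regularly on the phone
hours_per_day = 3       # hours of calls per person per day
working_days = 22       # working days per month
rate_per_hour = 5.00    # USD per transcribed hour (mid-range of $2-$10)

monthly_cost = employees * hours_per_day * working_days * rate_per_hour
print(f"Approximate monthly ASR cost: ${monthly_cost:,.2f}")  # -> $6,600.00
```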
Previously with Threads, once an ASR service was enabled, all subsequent telephone calls were dispatched to the service for transcription. If the service was disabled, subsequent calls would no longer be transcribed.
With our new “transcription on demand” feature, users can choose which calls to transcribe. Where not every call requires transcription, this has the major benefit of reducing ASR costs.
The obvious downside is that users can only search for keywords in calls that have actually been transcribed, so with transcription on demand enabled, a much smaller set of phone calls will be searchable.
A further consideration is that, prior to transcribing a call, Threads stores a significant amount of additional metadata for each phone call. Indeed, this is the reason Threads is able to achieve recognition rates higher than could be obtained simply by submitting an audio file to an ASR service. This becomes a trade-off: normally, once a call has been transcribed, much of the metadata can be thrown away, leaving just a raw audio file for users to listen to, but if calls are to be transcribed on demand later, this larger amount of data must be kept until then.
With this new feature, we intend to allow calls to be transcribed up to 24 hours after they occurred. After this time, they may still be transcribed, but the recognition rates will be lower.
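Conceptually, the behaviour described above might look something like the sketch below. This is not Threads' code; the names, the retention rule and the decision points are all assumptions, used only to illustrate the trade-off between keeping the extra metadata and transcribing on demand.

```python
from datetime import datetime, timedelta, timezone

# Assumed window from the post: calls transcribed within 24 hours can still
# use the extra metadata, which gives the higher recognition rates.
FULL_METADATA_WINDOW = timedelta(hours=24)

def transcribe_on_demand(call_ended_at, has_full_metadata, now=None):
    """Decide how a requested transcription would be performed (illustrative only)."""
    now = now or datetime.now(timezone.utc)
    within_window = now - call_ended_at <= FULL_METADATA_WINDOW
    if has_full_metadata and within_window:
        return "transcribe with full metadata (higher recognition rate)"
    return "transcribe from raw audio only (lower recognition rate)"

# Example: a call that ended 30 hours ago falls outside the window.
ended = datetime.now(timezone.utc) - timedelta(hours=30)
print(transcribe_on_demand(ended, has_full_metadata=False))
```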
We think this is a great new feature, so do try it and let us know what you think.