Taking Note

5 questions for… Nuance – does speech recognition have a place in healthcare?

Speech recognition has, it feels, been on the brink of major success for decades. Rather than pose a set of generic “when will it be mainstream” questions, I was keen to catch up with Martin Held, Senior Product Manager, Healthcare at Nuance, to find out how things stand in this specific and highly relevant context.

  1. How do you see the potential for speech recognition in the healthcare sector?

Right now, the biggest gain will come from general documentation, enabling people to dictate instead of type, to get text out faster. In some areas of healthcare, things are pretty structured: you have to fill in forms electronically, with drop-down lists and so on. That’s not a primary application for speech, but for anything that requires free text, there’s no comparison or alternative. Areas where handwritten notes are entered into notes fields are a good application. Discharge notes can also be very wordy.

From a use case perspective, we’ve done analysis on how much time teams are spending on documentation and it’s huge: three quarters of medical practices are spending half of their time on documentation alone. In the South Tees emergency department, we did a study in which the use of speech recognition reduced documentation time by 40%. In another study, at Dukinfield, a smaller practice, introducing our technology enabled them to see 4 more patients per day (about a 10% increase).

  2. What has happened over the past 5 years in terms of performance improvements and innovation?

In these scenarios, it’s a question of “can it work, can it perform” across a range of input devices. General speech recognition has improved so much that we are in the upper 90% accuracy range straight out of the gate. None of our products now require training, thanks to new technology based on deep neural networks and machine learning.

In healthcare, we have also added cloud computing and changed the architecture: we put a lightweight client on the end-point machine or device, which streams audio to a back-end recognition server hosted in Microsoft Azure. We recently announced the general availability of Dragon Medical One, which provides cloud-based recognition.

Connectivity is still a big issue, particularly in mobile situations. A community nurse, for example, cannot always use recognition back in the car if the mobile signal is poor. We are looking at technology that could record, then transcribe later.
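
To make the "record, then transcribe later" idea concrete, here is a minimal Python sketch of a store-and-forward buffer. It is an illustration under stated assumptions, not a description of Nuance's design: the class name `DeferredDictation`, the `transcribe` callable and the `online` flag are all hypothetical stand-ins for a real recognition service and connectivity check.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class DeferredDictation:
    """Buffer audio clips while offline; transcribe them once a connection exists."""

    # Hypothetical stand-in for a call to a back-end recognition service.
    transcribe: Callable[[bytes], str]
    pending: Deque[bytes] = field(default_factory=deque)

    def record(self, audio: bytes) -> None:
        """Store a clip locally, whether or not there is a signal."""
        self.pending.append(audio)

    def flush(self, online: bool) -> List[str]:
        """Transcribe buffered clips in recording order when online."""
        results: List[str] = []
        if not online:
            return results
        while self.pending:
            # Peek first, so a clip is only dropped after it has been transcribed.
            text = self.transcribe(self.pending[0])
            self.pending.popleft()
            results.append(text)
        return results
```

In use, a client would call `record()` after each dictation and `flush(online=...)` whenever connectivity is checked. Clips stay queued while the signal is poor and come back in order once it returns.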

  3. How have you addressed the privacy and risk implications?

We are certified to connect to the N3 network, allowing NHS entities to connect in line with governance and privacy requirements such as patient confidentiality. Offering a service through the NHS N3 network requires an Information Governance Statement of Compliance and submission of the IG Toolkit through NHS Digital. This involves a relatively long and detailed certification process, covering disaster recovery, Nuance’s internal processes and practices, which employees have access, and so on.

We also offer input via the public Internet, since encryption and other technologies are secure enough for customers to connect this way. So, for example, we can use mobile phones as an input device. We are not trying to build mobile medical devices, as we know how difficult that is, but we are looking to replace the keyboard (which is not a medical device!).

As a matter of best practice, the doctor is still required to sign the discharge or confirm an entry in the electronic medical record system, whether it has been typed or dictated. Generated text is therefore always a reference, and that will need to stay the case. It will be more than five years before the computer can be seen as taking this responsibility from the doctor. Similarly, any advice can only be guidance.

  4. How do you see the market need for speech recognition maturing in healthcare?

Right now we’re still very much in an enablement situation with our customers, helping with their documentation needs. From a recognition perspective we can see the potential of moving from enablement to augmentation, making it simpler and hands-free, and moving to more of a virtual assistant approach for a single person. In the longer term, we have the potential to do that for multiple people at the same time, for example a clinician, parent and child.

We’re also looking at the coding side of things: categorising disease, treatment, length of stay and so on from patient documentation. Codes are used for multiple things: reimbursement from insurers, negotiation between GPs, primary care and secondary care about which services to provide in future, and negotiation between commissioner and trust on payment levels. In primary care, doctors do the coding, but in secondary care it’s done by a coder looking through a record after the patient has been discharged. If data is incomplete or non-specific, trusts can miss out on funding. Nuance already offers Natural Language Understanding-based coding products in the US, and these are being evaluated for the specifics of the healthcare market in the UK.

So we want to help turn documentation into something that can be easily analysed. Our technology can do more than recognise what you say: with natural language understanding we can analyse the text and match it against codes, potentially opening the door to offering prompts. For example, if a doctor diagnoses COPD, the clinician may need to ask whether the patient is a smoker, which will have a consequence for the code.
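
The COPD example can be sketched as a toy rule that matches note text against a code and raises a prompt when related detail is missing. This is a deliberately simple keyword match in Python, not natural language understanding and not Nuance's product. The rule table, the function name `suggest` and the follow-up wording are assumptions for illustration; J44.9 is used as the ICD-10 code for unspecified COPD.

```python
import re
from typing import Dict, List

# Illustrative rule table only: patterns, code choice and questions are assumptions.
RULES = {
    "copd": {
        "pattern": r"\b(copd|chronic obstructive pulmonary disease)\b",
        "code": "J44.9",  # ICD-10: chronic obstructive pulmonary disease, unspecified
        "follow_up": ("smok", "Is the patient a smoker?"),
    },
}


def suggest(note: str) -> Dict[str, List[str]]:
    """Return candidate codes for a note, plus prompts for missing detail."""
    text = note.lower()
    codes: List[str] = []
    prompts: List[str] = []
    for rule in RULES.values():
        if re.search(rule["pattern"], text):
            codes.append(rule["code"])
            keyword, question = rule["follow_up"]
            # Only prompt when the note says nothing about the follow-up topic.
            if keyword not in text:
                prompts.append(question)
    return {"codes": codes, "prompts": prompts}
```

Calling `suggest("Patient diagnosed with COPD.")` yields the candidate code together with the smoking prompt, while a note that already mentions smoking status yields the code alone. A real system would need far richer language handling, including negation and context.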

  5. How does Nuance see the next 5 years panning out, in terms of measuring success for speech recognition?

We believe speech recognition is ready to deliver a great deal of benefit to healthcare, gaining efficiency and freeing up clinical staff. In terms of the future, we recently showed a prototype of a virtual assistant that combines a lot of technologies, including biometrics, complete speech control, text analysis and meaning extraction, and also appropriate selection, so the machine can distinguish between a command and something I simply wanted to say.

This combination should make the reaction a lot more human. We call this conversational artificial intelligence. Another part of this is about making text to speech as human as possible, then combining that with cameras and microphones in the environment, so that you could, for example, point at something and say, “give me more information about this”. That’s all longer term, but the virtual assistant and video are things we are working on.

My take: healthcare needs all the help it can get

So, does speech recognition have a place? Over the past couple of decades of use, we have learned that we generally do not like talking into thin air, and particularly not to a computer. The main change over recent years, the reduction in training time, has done little to remove this psychological blocker, which means that speech recognition remains in a highly useful, yet relatively limited, niche of auto-transcription.

Turning specifically to the healthcare industry, a victim of its own science-led success: it is difficult to think of an industry vertical in which staff efficiency is more important. In every geography, potential improvements to patient outcomes are being stymied by a lack of funds, visible in waiting lists, bed shortages and so on, and by the weight of ever-increasing bureaucracy.

Even if speech recognition could knock one or two percentage points off the time taken to execute a clinical pathway, the overall savings could be massive. Greater efficiency also opens the door to higher potential quality, as clinicians can focus on ‘the job’ rather than the paperwork.

Looking to the future, the use of speech recognition beyond note-taking also links to the potential for improved diagnosis through augmented decision making and, indeed, improved patient safety, as technology provides more support to what is still a highly manual industry. This will take time, but our general habits are changing as the likes of Alexa and Siri make us more comfortable about talking to inanimate objects.

Overall, progress may be slow for speech recognition, particularly in healthcare, but it is heading in the right direction. One day, our lives might depend on it.


7 Responses to “5 questions for… Nuance – does speech recognition have a place in healthcare?”

  1. I don’t think speech recognition systems are new in the healthcare field, but the importance and prevalence of the technology have increased in medical practice in recent years, especially with the introduction of healthcare information technology.

  2. To be blunt here, voice recognition technology has been threatening to eliminate the need for traditional medical transcription for several years. However, the reality of the situation is that, while speech recognition technology has no doubt increased in importance in the healthcare industry, it is not capable of replacing a traditional medical transcriptionist.

  3. I’m all for the use of speech recognition in healthcare. I would be very happy to see a decrease in the time lost to all the paperwork done in hospitals. As you said, even a few saved minutes may change wait times in healthcare institutions and improve their efficiency.

  4. I wish these developers every success in making speech recognition work well with documentation, although in my own experience I can type at a keyboard faster and more accurately than I can get software to recognize what I say.

    My hunch is that speech recognition is better suited for task management for on-the-go physicians and nurses. Typing in tasks on a smartphone or tablet screen is far slower than at a keyboard and most of those tasks are so routine that a few key words could trigger something more complex. It could also prove a marvelous time saver.

    For instance, when we gave blood to a child with leukemia, there was a series of checks for a bad reaction, every fifteen minutes for an hour and so forth. I set a timer to remind myself, but that was clumsy, since the time intervals changed. Far better would be a Siri-like app into which I’d say, “Initiate blood protocol for Jones in B-307.” That would initiate a series of verbal reminders at the right times.

    Indeed, one way of thinking of this would be a Siri specifically adapted for hospital tasks and tweaked to fit the protocols of a specific hospital. Instead of looking for the closest Mexican restaurant, you’d be checking on a patient’s latest lab results. And on the go with a smartphone, speech recognition does save time.

  5. Yes it does.

    1. Make use of apps such as Google Assistant to text, contact or inform others of specific actions.
    2. Security – instead of PINs, codes and passwords, doctors can use their voice to access medical documents quickly and effectively.
    3. Point 2 can also be applied to patients.

    This just touches on what is possible with voice recognition in the medical field.