All about Captions!

What do they do, and how do they work?

Live Captions

In an age of fast and constantly changing television, keeping the viewers’ attention is more challenging than ever. With fast-paced dramas, news reports cutting from the newsroom to reporters in the field and, of course, the exciting world of live sports with dialogue flying a mile a minute, it’s sometimes a wonder that anyone can keep up with everything that is being said.

Now imagine turning that dialogue into text in real time! This is a tool which benefits 12 million people who are deaf or have hearing loss in the UK. Many deaf people rely on captions for their news and entertainment and to keep them connected and informed about the world around them.

It is this role that highly skilled Live Captioners carry out on a daily basis.

Fast and accurate live captions are generated in two ways:

Stenography
Voice recognition

Professional typists average in excess of 80 wpm, which is not even close to the 180-250 wpm spoken on an average television show, so ordinary typing cannot keep pace. Computer-generated captions are no match for the personal approach of having a real person apply themselves to the spoken word.
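To put the typing and speaking speeds quoted above side by side, here is a small Python sketch. The function name is made up for illustration, and it assumes a typist working flat out with no pauses or corrections, so the real-world gap would be wider still.

```python
def fraction_captured(typing_wpm: float, speech_wpm: float) -> float:
    """Share of spoken words a typist could capture, assuming no pauses."""
    return min(1.0, typing_wpm / speech_wpm)


TYPING_WPM = 80                 # figure quoted in the article
SPEECH_WPM_RANGE = (180, 250)   # range quoted in the article

for speech_wpm in SPEECH_WPM_RANGE:
    share = fraction_captured(TYPING_WPM, speech_wpm)
    print(f"Speech at {speech_wpm} wpm: typing covers about {share:.0%}")
```

On these figures, conventional typing would capture somewhere between roughly a third and a little under half of what is said.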

Stenographers use stenograph machines to capture dialogue. These machines have simplified keyboards with just 22 keys and no letters, shift keys or space bars. The keys on the stenograph machine represent syllables. Stenographers write words phonetically, based on the sounds they hear, one syllable at a time. By pressing keys at the same time in different sequences, they create the words. It is a little bit like playing chords on a piano while patting your head and rubbing your tummy at the same time!

A Stenographer will create their own shortcuts for commonly used words and phrases, building their own library of terms, technical vocabulary and names. You can imagine that captioning an event like the Olympics would require a catalogue of thousands of individual entries, such as events, athletes’ names and sports terms, and all must be entered manually.
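As a rough illustration of how a chord dictionary and a Stenographer's personal shortcuts might fit together, here is a minimal Python sketch. It is not how any particular stenography software works: the chord spellings, the dictionary entries and the function name are all invented for the example. Each chord (a group of keys pressed together) is written as a short string, and the lookup tries the longest run of chords first so that multi-chord entries win over single ones.

```python
# Illustrative entries only; real steno dictionaries hold many thousands.
BASE_DICTIONARY = {
    ("-T",): "the",
    ("KAT",): "cat",
}

# A captioner's own additions, e.g. prepared ahead of a sports event.
PERSONAL_DICTIONARY = {
    ("SKWRAF", "HREUPB"): "javelin",
}


def translate(chords, dictionary, max_len=3):
    """Turn a sequence of chords into text using longest-match lookup."""
    words = []
    i = 0
    while i < len(chords):
        # Try the longest group of chords first, then shorter ones.
        for length in range(min(max_len, len(chords) - i), 0, -1):
            key = tuple(chords[i:i + length])
            if key in dictionary:
                words.append(dictionary[key])
                i += length
                break
        else:
            # No entry found: show the raw chord so the gap is visible.
            words.append(chords[i])
            i += 1
    return " ".join(words)


combined = {**BASE_DICTIONARY, **PERSONAL_DICTIONARY}
print(translate(["-T", "SKWRAF", "HREUPB"], combined))
```

Without the personal entry, the two chords for "javelin" would come out untranslated, which is why building the library before an event matters so much.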

It’s almost impossible to capture some dialogue verbatim due to the speed at which television can move. Most of the errors that do occur are due to technical issues rather than human error, but captioners are only human, and even at 99% accuracy, mistakes can happen.

The use of voice recognition is an entirely different approach to the process. The professionals that perform this work are called Respeakers.

Respeakers listen to the audio and simply repeat what they hear into highly sophisticated voice recognition software, which is trained to their particular voice. Thanks to this highly focused training, errors are rare, but the system is not flawless. If the software doesn’t recognise a word, or the Respeaker mixes up their words, it will usually choose the word that sounds most similar, which is often incorrect. Respeakers build their own dictionary and library of words that they may need ahead of time.
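The "closest match" behaviour described above can be imitated with a toy Python example. This is only an analogy: real voice recognition compares sounds, whereas this sketch compares spellings using the standard library's difflib, and the word lists and function name are made up. It assumes the vocabulary is never empty.

```python
from difflib import get_close_matches


def recognise(heard: str, vocabulary: list[str]) -> str:
    """Return the vocabulary word most similar to what was heard."""
    # cutoff=0.0 forces a guess even when nothing is a good match,
    # mimicking software that always outputs some word.
    return get_close_matches(heard, vocabulary, n=1, cutoff=0.0)[0]


vocabulary = ["marathon", "decathlon"]

# "triathlon" is not in the vocabulary, so the guess is a different word.
print(recognise("triathlon", vocabulary))

# Once the word is added ahead of time, it is matched exactly.
print(recognise("triathlon", vocabulary + ["triathlon"]))
```

The first call can only return a word already in the list, so it is bound to be wrong; the second shows why Respeakers prepare their dictionaries before going on air.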

Who would use this service?

Captioning can be a really useful addition to any video content that a deaf person needs to access. Live captions are great if the content needs to be immediately accessible, or they can be added afterwards.

We would always recommend you ask the deaf person what service they would prefer, as this could change depending on the setting.

So, could technology replace the interpreter?

The answer is simply no, and for many reasons.

It is important to understand that English, as we understand it, is not always the first language of deaf people; it is usually British Sign Language (BSL). For a deaf person who uses only BSL, spoken English is often regarded as a second language.

English has a number of idiosyncrasies that make it very hard to learn as a second language. Add to that the challenges that deaf students face throughout their school careers, and deaf adults are estimated to have an average reading age of around 9-10 (or years 4-5 at school)*. So captions aren’t always the most accessible form of communication.

Although captioning is highly accurate now, glitches can happen with any technology, and this can deny people critical information, a challenge that has been very current during the Covid-19 pandemic. Captions also cannot convey tone of voice or urgency, whereas a professional sign language interpreter can do that in real time.

There is an added human element when using a BSL Interpreter. The Interpreter brings much more emotion and, therefore, additional meaning and context to what is said. This is done through facial expressions and body movements.

In a presentation or meeting, having an interpreter present can make a deaf person feel more included in the discussion. It allows for a two-way conversation through the interpreter. It also empowers deaf people to ask questions and express their opinions, which is impossible using captions. When there is any confusion about the correct pronunciation of a word, working with an interpreter will mitigate this. They can also convey the correct pronunciation to the deaf person.

We recommend both. The deaf person can then access language through the Interpreter. Then on any pre-recorded information, they also have access to the captions. This offers true inclusion for all.

It is important to provide a service that matches the needs of the deaf person – so don’t assume; ask them!


Most young people now watch TV with the subtitles on

18-24yr olds: 61%
25-49yr olds: 31%
50-64yr olds: 13%
65+yr olds: 22%

Source: YouGov (@YouGov), February 24, 2023

Reference: https://limpingchicken.com/2023/02/24/almost-two-thirds-of-young-people-watch-tv-shows-and-movies-with-subtitles-yougov-survey-finds/