Transcription - How to Convert Audio to Text - The Ultimate Guide

What is Transcription?

Transcription is the process of converting the spoken word into text. This service is performed by a 'transcriber', who is specially trained to type audio into text at rapid speed.

A transcriber is a professional touch typist who listens to recorded audio and types what they hear, ensuring correct spelling and use of grammar. They also insert paragraph breaks wherever necessary within the transcript.

The transcript itself could be an exact word-for-word document (also known as verbatim), or a tidied version of the dialogue. Transcribers produce the latter by removing things like ‘um’ and ‘ah’ that are heard but are not necessary for comprehending what is said.
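As a rough illustration of the difference between a verbatim and a tidied transcript, the sketch below strips a few filler words from a line of text. The filler list and the function name `tidy` are assumptions made for this example; a human transcriber makes these judgments by ear and context, not with a fixed word list.

```python
import re

# Hypothetical filler list for illustration only; a real style guide
# would define which sounds count as fillers.
FILLERS = ("um", "uh", "ah", "er")

# Match a filler as a whole word, plus any comma/full stop and spaces after it.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(FILLERS) + r")\b[,.]?\s*",
    flags=re.IGNORECASE,
)


def tidy(verbatim: str) -> str:
    """Return a tidied copy of a verbatim line with filler words removed."""
    cleaned = _FILLER_RE.sub("", verbatim)
    # Collapse any doubled spaces left behind by the removal.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(tidy("Um, so the report is, ah, due on Friday."))
```

Note that this simple version does not restore capitalization or repair punctuation after a removal, which is part of what a transcriber does when tidying a transcript by hand.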

 
The audio is typically recorded using a dictaphone, digital voice recorder or any other recording device which can produce a digital audio file. 

Transcription vs. Translation

Despite being similar, transcription and translation are different services and require different approaches to deliver. Transcription is the recording of all spoken dialogue as text for later review and distribution.

Translation is converting audio or text from one language to another. It requires another layer of skill in linguistics and languages, which a transcriber may not have. 

 

A brief history of transcripts

Transcription is a process which has existed for centuries and has endured as one of the oldest forms of documentation. As long as the written word has been recorded, reproduced and stored, transcription has been a valuable practice.

 

Ancient Times

Ancient historical documentation was made possible by the scholars (or scribes) committed to transcribing (and duplicating) as much information as possible. This was achieved either in hieroglyphics, Latin, or Ancient Greek depending on the period. Most of the transcribed knowledge was typically religious stories, historical events, records and scientific observations.

 

 

16/17th Century

The invention of the printing press in 1439 changed the way we record, write, and reproduce literature. For a brief period, scribing as a profession fell out of fashion as the demand for skilled labor was reduced by this primitive automation. 

It was not until the 17th Century that English-language shorthand was developed. By then, scribes were tasked mainly with writing manuscripts or other documents.

 

19/20th Century

The 19th Century saw the emergence of the typewriter, once again changing the face of transcription. With speed and clarity now more easily achievable, stenographers and typists became quite common. 

Advances in technology brought with them word processing and digital storage, which made the process of transcribing media more efficient and widespread.

 

Why use transcripts?

1. Accessibility and Inclusion 

People who are Deaf or Hard of Hearing are often unable to engage with videos or content without an accompanying transcript of what is said.

A transcript relays more than just speech. It should also include identification of speakers and essential non-speech sounds like laughter, silence or other diegetic sounds.

 

2. Social Media Outreach

Many users prefer to consume video or text content over audio. As a result, transcripts are a great means of sharing content.

Any content that requires audio is made more shareable when converted into text. Having access to text versions of your content means a large database of possible content that can be re-packaged and shared on social media, expanding the outreach and visibility of your content. 

 

3. Search Engine Optimization

Search engine optimization serves businesses that want to maximize their online outreach and social media presence. It may surprise you, but providing transcripts can help extend the reach of your content.

When it comes to content, whether it be videos or podcasts, transcribing the audio allows another avenue of consumption. Providing text on the same web page as your content gives consumers more options: whether people miss certain words, struggle with the language, or simply prefer to read, transcripts increase the chance of engagement and potential clicks.

 

Transcription becomes more important when you consider that Google, like most search engines, can't read video or audio data. It relies on text when calculating what relevant content appears when searching keywords.

Transcripts have the potential to contain certain keywords which summarize your content. Keywords are the bread and butter when driving clicks to your websites, so the more relevant keywords the better. This is especially important for podcasts, whether it be for interviews or other discussion.

 

Who Benefits from Transcripts?

People who benefit from transcripts include:

  • Deaf/Hard of Hearing people: Transcripts give people who cannot hear your content the ability to access it. By reading audio as text, those who struggle to hear audio are able to engage with, and possibly share, content they otherwise would not have been able to access.
  • People with developmental disorders: Those with ASD (Autism Spectrum Disorder) may find it hard to engage with speakers as they struggle to focus on what is being communicated or to understand extended metaphors and meaning. This also applies to those with dyslexia, who may struggle to read on-screen captions or subtitles in the time available. Simple text assists with comprehension, as some people are simply better at taking in information through text than through speech. This is especially true in school, university, or corporate settings. Transcripts allow individuals to read content in their own time.
  • Non-Native Language Speakers: For those who are using their non-native language, transcripts can make comprehension much easier. Georgia Tech found that people who read transcripts while following along with lectures, meetings, or other content had greater comprehension than those who didn't.

 

What can be transcribed?

Any audio content can be transcribed. The most popular demand for transcripts comes from:

  • Facebook/YouTube Videos
  • Corporate Meetings
  • Interviews
  • Seminars
  • Legal Proceedings
  • Podcasts 
  • University Lectures
  • Family Events

You can contact us to find out more here.

 

How are transcripts made?

Manual Transcription

Manual transcription involves a human doing all the transcription work, using specific software or machines to type in real time. People who perform this task are usually called stenographers, named after the stenograph machine. A stenograph is a unique keyboard with its keys arranged in such a way as to maximize typing speed.

To this day, human transcribers remain the most accurate method of transcribing audio, especially if it is live.

 

Independent Providers 

There are numerous companies and service providers who offer services ranging from captions to transcripts, and each uses a variety of techniques. Here at Ai-Media, our transcribers or ‘scripters’ use specialized foot pedals to start, stop and rewind recorded audio. The ‘scripters’ then type out what is spoken, as they hear it, into online software. The script is then saved as a Microsoft Word document. We provide high quality transcripts from US$1/minute.

 

Who can work in Transcription?

Transcribers need to have an excellent grasp of the English language (or whichever language is being transcribed), good grammar and basic computer literacy. Modern software and technology have streamlined the process, making the act of typing audio quicker and more accurate.

Because the job is not physically demanding, transcribers can work from home or in an office space.

 

What transcript service do I need?

Standard Transcription

This is the most basic and fastest form of transcription. Transcribers will create a basic transcript and review it for any immediate errors. This option suits those requiring a cost-effective and fast transcript. The accuracy is good, but you might find yourself needing to edit the spelling of names or terms that were unknown to the transcriber.

 

'Second Pass' Transcription

After the initial transcript is created, the review process is handled by our most qualified transcribers. Here, the transcript is thoroughly reviewed to ensure an almost perfect quality. This takes slightly longer to deliver and is more expensive. This option is usually available for customers who require a high degree of accuracy such as universities, corporations, and seminars.

 

What determines transcript quality?

Audio Quality

The most important factor when determining transcript quality is undoubtedly the audio itself. Audio quality affects what is and isn’t heard during the transcription process. Software or well-trained transcribers need clear, concise and uninterrupted speech in order to transcribe with maximum accuracy.

The most common audio issues include low volume, unclear or distorted recordings, speakers with heavy accents or speakers who are mumbling. These can all lead to 'inaudibles' within the transcript. Low quality audio also takes the transcriber longer to work through, so the transcript may take longer than usual to deliver and may incur an additional cost.

 

Subject matter

In some cases, lectures, seminars or other meetings which discuss complicated subject matter can impact quality. Medical lectures, language classes, and other classes with complicated terminology can result in misspellings, misinterpretations, or inaccurate phrases. Whether these are detected depends on the skill of the transcriber.

Handy tip: You can supply your transcriber with a list of known terms, names and jargon so they can transcribe each term correctly and you won't have to edit the transcript yourself later.
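If you do end up reviewing a transcript yourself, the same list of known terms can help you spot likely misspellings. The sketch below is one possible approach, assuming the transcript and the term list are available as plain text; the function name `flag_possible_misspellings` and the similarity cutoff of 0.8 are choices made for this example, not part of any transcription service.

```python
import difflib
import re


def flag_possible_misspellings(transcript, known_terms, cutoff=0.8):
    """Map words in the transcript to the known term they closely resemble.

    Words that already match a known term exactly (ignoring case) are skipped.
    """
    known_lower = {term.lower(): term for term in known_terms}
    suggestions = {}
    for word in set(re.findall(r"[A-Za-z][A-Za-z'-]*", transcript)):
        lowered = word.lower()
        if lowered in known_lower:
            continue  # spelled as supplied, nothing to flag
        close = difflib.get_close_matches(
            lowered, list(known_lower), n=1, cutoff=cutoff
        )
        if close:
            suggestions[word] = known_lower[close[0]]
    return suggestions


terms = ["acetaminophen", "ibuprofen"]
text = "The patient was given acetaminophin and ibuprofen."
print(flag_possible_misspellings(text, terms))
```

This only compares single words, so multi-word names would need extra handling, and every suggestion should still be checked against the audio before the transcript is changed.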

 

Skill of the Transcriber

Finally, the skill of those transcribing the audio is a determining factor. All transcribers are highly trained, but human work always carries a risk of error. Errors usually include minor spelling mistakes, incorrect punctuation, mishearings and so on. To avoid them, more time is spent in the editing phase, along with a review process that involves another individual checking the transcript.

 

How to ensure high quality audio?

Use the Correct Equipment 

Microphone: When recording an event, lecture, seminar, or other event that needs transcribing, the microphone is a crucial piece of equipment that can be the difference between high and low quality audio.

There are three main types of microphones: dynamic, condenser, and ribbon. Each of these specializes in a different type of sound, so it's best to know which category is right for you.  When looking to buy/use your microphone, these questions should help you find the right one:

  • How many speakers will there be?
  • Where will you be recording?
  • What will the background noise level be?
  • What direction is the audio coming from?

Setup: The recording space and setup are essential to producing quality output. If you have access to a large room with high ceilings, soundproof walls, and concrete floors, this is the ideal environment to record in. If this is not the case, it’s not the end of the world; simply find a quiet space that is not too empty or echo-y.

To make the space more suitable for recording, you can hang blankets or place a sound booth around your microphone to reduce the sound entering the room or bouncing off the walls. Furniture or specialized acoustic foam wall panels also help dampen sound waves and prevent echo.


 

PC Software: If you have the time and resources, you may want to make edits before finalizing your recording. Edits can shorten the audio to a more manageable length, as well as remove mistakes, re-takes or unnecessary dialogue. While there is paid software available, there are also free programs including Avid’s Pro Tools First, Audacity or, if you’re using a Mac, GarageBand. These programs can be downloaded straight to your computer and used to tweak your audio.

 

Guidelines and Principles: WCAG 2.0 was developed by the World Wide Web Consortium and outlines guidelines for making digital content accessible to everyone, primarily people with disabilities such as deafness or blindness. In the United States, accessibility is also regulated under the Americans with Disabilities Act (ADA), with the Department of Justice outlining standards for accessible design. Within these guidelines, four principles are used to meet basic industry standards:

  • Perceivable: Content must be presentable to users in ways they can perceive.
  • Operable: The interface must be accessible to and easily operated by users.
  • Understandable: Content must be clear and understandable to the user.
  • Robust: Content must be interpreted reliably by a variety of user agents, including assistive technologies.

 
