ProComm Blog

Voice Over Studio Terms and Definitions

Good communication is a key to success in all relationships. Often, miscommunication and misunderstandings occur because people simply don’t know how to “speak the same language”. I’m not talking about the difference between English and Chinese. I’m talking about technical language. Nearly every type of business has its own name for everything it does. The recording and voiceover world is no different.

In an effort to overcome the “language barrier” that often exists between engineers, directors, clients and talent, my friends and I at ProComm Voices have put together a glossary of voice over studio terms that everyone in this business should know.

While individual studios and production companies may also have unique terminology that is used internally, this list will provide you with key terms that are fairly universal throughout the industry.



ADR – Automated Dialogue Replacement, also referred to as “looping”. The process of re-recording dialogue for an on-camera talent in the studio, in sync with the picture, to replace the audio recorded during filming.

Compression – The use of an audio processor to control audio dynamics (loudness and softness) on a piece of audio. It can be applied to individual parts as well as to an overall production.
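To make the idea concrete, here is a minimal sketch of what a compressor does to the numbers inside an audio file. It is not how any particular studio plug-in works: the function name `compress` and the default threshold and ratio are my own assumptions, and real compressors also smooth their gain changes over time (attack and release), which this sketch leaves out.

```python
import math

def compress(samples, threshold_db=-20.0, ratio=4.0):
    """Apply static downward compression to samples in the range -1.0..1.0."""
    out = []
    for x in samples:
        if x == 0.0:
            out.append(0.0)  # silence has no level to measure
            continue
        level_db = 20.0 * math.log10(abs(x))  # sample level in dBFS
        if level_db > threshold_db:
            # Above the threshold, every `ratio` dB of input yields 1 dB of output.
            target_db = threshold_db + (level_db - threshold_db) / ratio
            gain = 10.0 ** ((target_db - level_db) / 20.0)
            out.append(x * gain)
        else:
            out.append(x)  # below the threshold: left untouched
    return out

loud_and_soft = [0.05, -0.5, 1.0]
compressed = compress(loud_and_soft, threshold_db=-20.0, ratio=4.0)
print(compressed)
```

The soft sample passes through unchanged while the two loud ones are pulled down, so the gap between the loudest and softest parts shrinks.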

Data Compression – A process designed to reduce the transmission bandwidth requirement of digital audio streams and the storage size of audio files.

De-breathing – The process of removing all breaths from a vocal performance.

Editing – The process of removing unwanted portions of audio, leaving only the portion that will be used in the final production. May or may not include de-breathing.

Equalization (EQ) – The use of an audio processor to boost or cut specific frequency ranges within a sound, for example reducing low-end rumble or adding high-end clarity.

Audio File Formats – Common uncompressed audio file types used in audio production are AIFF and WAV. Compressed audio files are typically MP3.
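If you ever need to check what is actually inside a WAV file, Python's built-in `wave` module can read the header. The sketch below writes one second of silence into memory and reads it back, so it runs without any file on disk. The specific format (mono, 16 bit, 44.1kHz) is just an assumption chosen for illustration.

```python
import io
import wave

# Write one second of silence as a mono, 16 bit, 44.1kHz WAV into memory.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)       # mono
    writer.setsampwidth(2)       # 2 bytes per sample = 16 bit
    writer.setframerate(44100)   # 44.1kHz
    writer.writeframes(b"\x00\x00" * 44100)

# Read the header back, as you would for a file received from a studio.
buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    channels = reader.getnchannels()
    bit_depth = reader.getsampwidth() * 8
    sample_rate = reader.getframerate()
    seconds = reader.getnframes() / sample_rate

print(channels, bit_depth, sample_rate, seconds)
```

For a real file, pass its path to `wave.open` instead of the in-memory buffer. Note that the `wave` module handles uncompressed WAV only, not AIFF or MP3.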

ISDN – Integrated Services Digital Network – A communication standard that allows voice (as well as video and data) to be transmitted digitally from one studio to another over telephone lines, in real time and with high fidelity.

Limiting – The use of an audio processor to keep audio from exceeding a certain level or threshold as determined by the engineer.

Maximization or Maximize – A mastering process that includes the use of an audio processor to bring audio up to a maximum level as determined by the engineer.
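The simplest version of this idea is peak normalization: find the loudest peak and turn everything up by the same amount until that peak reaches the chosen level. The sketch below shows only that step. The name `maximize` and the -1 dB ceiling are assumptions for illustration, and a real mastering maximizer usually combines this kind of gain with limiting.

```python
def maximize(samples, ceiling_db=-1.0):
    """Scale samples (-1.0..1.0) so the loudest peak sits at ceiling_db dBFS."""
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    ceiling = 10.0 ** (ceiling_db / 20.0)  # convert dBFS to linear amplitude
    gain = ceiling / peak
    return [x * gain for x in samples]

quiet_take = [0.1, -0.25, 0.2]
maximized = maximize(quiet_take, ceiling_db=-1.0)
print(maximized)
```

Every sample is multiplied by the same gain, so the performance keeps its original dynamics and simply gets louder overall.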

Mix – A fully produced, finished or broadcast-ready audio presentation that may include voice, music, sound effects, all necessary processing and maximization. Usually provided in stereo unless the final format is mono only (such as a phone system).

Mixing – The process of manipulating and combining multiple audio signals or elements to create a final audio production or mix.
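At the level of the raw numbers, combining elements means adding them together sample by sample, with each element turned up or down by its own gain. Here is a minimal sketch under that assumption. The function name `mix`, the tiny three-sample "tracks", and the gain values are all made up for illustration, and a real mix involves far more than a straight sum.

```python
def mix(tracks, gains):
    """Sum equal-length tracks sample by sample, each scaled by its own gain."""
    length = len(tracks[0])
    if any(len(track) != length for track in tracks):
        raise ValueError("all tracks must have the same length")
    mixed = []
    for i in range(length):
        total = sum(track[i] * gain for track, gain in zip(tracks, gains))
        # Keep the result inside the legal -1.0..1.0 range.
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed

voice = [0.5, -0.4, 0.3]
music = [0.2, 0.2, -0.2]
final_mix = mix([voice, music], gains=[1.0, 0.5])  # music at half level under the voice
print(final_mix)
```

Clamping the sum is only a safety net here. In practice an engineer sets the gains so the combined signal never hits that ceiling in the first place.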

Noise – Any sound that is undesirable or unwanted.

Processing – Any alteration of raw audio through the use of audio tools such as compression, equalization (EQ), maximization, or time-based/space-based effects (e.g. delay or reverb).

Raw Audio – Any recorded audio that is unedited and unprocessed. Delivery of “raw audio” means to provide clients with audio exactly as it was recorded.

Reverb – A space/time based effect that simulates an environment. All environments have an effect on a sound within that environment. Example: A voice heard in a stadium sounds different than a voice heard in a closet. Reverb can be used to simulate the sound of both environments. Some people refer to this as “echo”. (IMPORTANT NOTE: the use of reverb is very dependent on an overall production and therefore is rarely added unless a full mix is being produced.)

Sample Rate/Bit Depth – Sample rate is the number of samples captured per second; bit depth is the number of bits used to store each sample (often loosely called “bit rate”). Together they determine the amount of digital information used by the computer in the creation of an audio file. The higher the values, the higher the supposed quality of the file. However, beyond a certain point (for most people, above 44.1kHz/16 bit) the difference in sound quality is undetectable. Therefore, you mainly need to obtain or verify this information to ensure compatibility among files or systems.

44.1kHz/16bit = CD quality audio

48kHz/16bit = Video standard for audio
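To see how much digital information these settings really add up to, you can multiply the two numbers (samples per second and bits per sample) by the number of channels and the length of the recording. The helper name `pcm_size_bytes` below is my own, and the figures cover the raw audio data only; the file header adds a small amount on top.

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of the raw audio data in an uncompressed (PCM) file, in bytes."""
    bytes_per_sample = bit_depth // 8
    return int(sample_rate_hz * bytes_per_sample * channels * seconds)

# One minute of stereo audio at each of the two common settings.
cd_quality = pcm_size_bytes(44100, 16, channels=2, seconds=60)
video_standard = pcm_size_bytes(48000, 16, channels=2, seconds=60)
print(cd_quality, video_standard)
```

One minute of stereo comes to roughly 10.6 MB at 44.1kHz/16bit and about 11.5 MB at 48kHz/16bit, which is why uncompressed files are so much larger than MP3s.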

Slate – A recorded audio cue that identifies the audio that follows. Generally stated as “take one” (followed by the recorded VO), “take two” (followed by the recorded VO), “take three” and so on.

Stems – Individual elements of a mix provided separately. For example, voice, music, and sound effects are delivered as separate files instead of combined in a full mix.

Takes – Separate files of recorded audio. Each take is identified by its own file name and a slate.

Time Compression/Expansion – An electronic process that uses an algorithm to change the speed (tempo) of a signal while leaving its pitch intact.