Let’s say you’re a business owner who’s commissioned a podcast, or maybe you’re an executive producer who’s transitioned to audio from print and don’t have direct experience with the format.

Your producer’s just sent you the final version of an episode and, excited, you give it a listen. You think it sounds fine, but as your finger hovers over the ‘Publish’ button, you freeze…

…What if this is bad, and you just can’t hear it?

This sounds like a thing that shouldn’t happen, and it shouldn’t, but mistakes get made, errors get missed, and if yours is the last pair of ears to hear a piece before it goes out into the world, you need to be confident it’s ready.

So, here are five (really kind of embarrassing) issues I’ve heard in professionally produced podcasts. I’m not going to say where. My aim here isn’t to embarrass people, it’s to encourage better audio. But these are all fairly common, and they’re all avoidable.

EDIT: Since first publishing this, I’ve heard another avoidable clanger in a very high-profile podcast, and so you’re getting six – count them – six common issues to look out for…

Six issues to listen out for in your audio

1) Poor sound quality

This one’s obvious, but fundamental. It’s no fun struggling to understand what people are saying because the recording quality is poor. And even if you can understand them, if the recording sounds ugly, it’s not going to be an enjoyable experience.

At the end of the day, there’s only so much that can be done in post-production, and it’s always best to begin with well-recorded tape sourced with decent equipment by a producer who knows what they’re doing in the studio and the field.

That said, there are some pretty amazing fixes out there now, from AI voice enhancers to iZotope’s Spectral Recovery tool. If your podcast isn’t sounding great, it might be worth discussing options with the producer, but solutions can be expensive and time-consuming, and if used poorly they can make issues even worse, so always try to get that initial recording right.

Which brings us to the next issue…

2) Noise Gates

Simply put, a noise gate is a tool that cuts the output of a track when it falls below a certain volume threshold.

What a noise gate does (courtesy of Wikipedia)

In theory, this could be pretty useful for a podcast. You could set the gate to allow a speaker’s voice to pass through, but cut the signal when all it detects is the quieter noise of traffic in the background, or the annoying hum of a fridge.
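
If it helps to see the idea in code, here’s a minimal sketch of the simplest possible gate. It’s an illustration only: the function name, the threshold value and the sample numbers are all made up for this example, and a real gate works on a smoothed level with attack and release times rather than sample by sample.

```python
def noise_gate(samples, threshold=0.05):
    """Hard gate: pass samples at or above the threshold, mute everything quieter."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


# Hypothetical track: quiet fridge hum, then a louder voice, then hum again.
hum = [0.01, -0.02, 0.015, -0.01]
voice = [0.4, -0.6, 0.5, -0.3]
track = hum + voice + hum

gated = noise_gate(track)
# The voice passes through untouched; the hum on either side becomes pure silence.
```

That drop to absolute silence between phrases is exactly what makes a heavy-handed gate so noticeable.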

In practice, though, unless they’re used very subtly, noise gates on voiceovers are awful. Whenever the speaker pauses, the track cuts out entirely, which sounds unnatural and distracting.

It’s much better to have a little low-level room sound on a track than the awkward stop-and-start of a badly calibrated gate. If you hear this kind of thing on your podcast, ask the producer or engineer to try another approach like a filter, or duplicating the track with a gate on one version and some lowered background noise on the other.

3) The music is too loud

This is a straightforward one: if the music under a voice is so loud that tuning into the speaker takes effort, lower the music.
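
Your ears should always be the final judge, but a rough sanity check is possible in code. The sketch below compares the average (RMS) level of a voice track and a music track and works out how much to turn the music down. The 12 dB gap is an arbitrary figure I’ve picked for illustration, not a standard, and the function names are hypothetical. It also assumes neither track is pure silence.

```python
import math


def rms_db(samples):
    """Average (RMS) level of a non-silent track, in decibels."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)


def music_gain_for_headroom(voice, music, headroom_db=12.0):
    """Linear gain that puts the music headroom_db below the voice (never boosts)."""
    gap_db = rms_db(voice) - rms_db(music)
    change_db = min(0.0, gap_db - headroom_db)
    return 10 ** (change_db / 20)


# Hypothetical tracks at the same level: the music is fighting the voice.
voice = [0.5, -0.5] * 100
music = [0.5, -0.5] * 100

gain = music_gain_for_headroom(voice, music)
ducked = [m * gain for m in music]
```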

That said, go easy on your producer with this one. After listening to a section of speech dozens of times during an edit, it’s likely their memory of what’s been said will kick in automatically to compensate for any words that are obscured by music. This is more or less unavoidable, and one of the reasons it’s always good to have a second pair of ears on a piece.

4) Bad Cuts

If an editor’s cut one of your speakers mid-sentence, you’re probably going to catch it, but there are subtler cutting errors you might want to listen out for. Here are two that really grate:

What a breath looks like in waveform (courtesy of ResearchGate)

Cutting in the middle of a breath is something a lot of listeners might not consciously notice, but even if it’s not too obvious it can make your speaker sound unnaturally hurried, which is stressful to listen to. People tend to pause for a fraction of a second before and after a breath, so listen out for when part of that breath or pause is missing, and make sure to bring it back for a smooth, natural effect.

Badly cut music is something that should never happen, and yet I recently heard an example of this in a very high-profile podcast. When placing music under someone talking, an editor may need to cut or repeat some part of the music so that it’s the right length to fit the speech. When that music has a drum beat, it’s crucial that the rhythm is maintained across the cuts. Skipping a beat in the middle of a piece of music sounds super amateurish, and it’s something you should listen out for.

5) Overcompression

This one’s a bit more subtle, but it’s important.

I’m not going to go into the ins and outs of compression here, but it is a super useful tool. It lets you make your audio louder, clearer, and more uniform.

What overcompression looks like as a waveform compared to normal compression – gross. (Courtesy of mastering.com)

The problem is when compression is overused. You can spot this when everything in your podcast sounds like it’s happening at the same loudness and intensity – it all feels squashed and flat. That’s a bad thing, because it robs your audio of dynamic range (the importance of which I write about here), and makes it boring to listen to.
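
For the curious, here’s a toy sketch of why that squashing happens. It’s a deliberately simplified compressor: anything above a threshold gets scaled down by a ratio. Real compressors work in decibels with attack, release and make-up gain, and the names and numbers here are assumptions for illustration only.

```python
def compress(samples, threshold=0.2, ratio=4.0):
    """Scale down the part of each sample's level that exceeds the threshold."""
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(level if s >= 0 else -level)
    return out


def peak_range(samples):
    """Ratio between the loudest and quietest non-zero sample levels."""
    levels = [abs(s) for s in samples if s != 0]
    return max(levels) / min(levels)


# Hypothetical track: a quiet aside followed by a loud laugh.
track = [0.1, -0.1, 0.9, -0.9]

gentle = compress(track, ratio=2.0)
heavy = compress(track, ratio=20.0)
# The heavier the ratio, the closer the laugh sits to the aside in level.
```

Compare `peak_range(track)`, `peak_range(gentle)` and `peak_range(heavy)` and you’ll see the gap between loud and quiet shrinking as the ratio goes up.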

The amount of compression is partly a question of taste, but like so many post-processing effects it’s best to use it subtly. If your audio is sounding squashed or flat, or something about it is making your ears feel a bit fuzzy, have a chat with your engineer.

6) Bonus material: Tracks Are out of Sync

For most interviews you’ll be recording with two mics, and that means you’re going to come away with two tracks of audio.

Unless you’re recording speakers in different locations, or isolated booths, some of the sound from each is inevitably going to bleed into the other, which means the two tracks need to be perfectly synchronised in the edit.

What a two-channel interview looks like during an edit

Most of the time, a producer will cut the sections on each track when that speaker isn’t talking – this simplifies the edit and reduces the risk of things going wrong – but what about those fun, dynamic moments when the speakers are talking over each other, laughing, or having a fast back-and-forth?

There’s no catch-all solution here. The editor is just going to have to use what they can from the two channels to make things sound as natural as possible, and that’s often going to mean moving tracks around a little. This is where the issue comes in: you don’t want desynchronised artefacts from one channel bleeding into another, laughs cut in awkward places, or strange echoes where two channels have moved out of sync. This never fails to kill what would otherwise be a nice, lively moment in a piece of audio.
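
One way to check whether two tracks have drifted apart is cross-correlation: slide one track against the other and see which offset makes the shared sound line up best. The sketch below is a bare-bones version under my own assumptions (hypothetical function name, tiny made-up sample lists), not a description of what any particular editing software does.

```python
def best_lag(reference, other, max_lag):
    """Offset (in samples) of `other` relative to `reference` with the highest correlation."""
    def score(lag):
        total = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                total += r * other[j]
        return total

    return max(range(-max_lag, max_lag + 1), key=score)


# Hypothetical example: the same laugh picked up by both mics,
# but mic B's track has been nudged three samples late in the edit.
mic_a = [0, 0, 1.0, -0.8, 0.5, 0, 0, 0, 0, 0]
mic_b = [0, 0, 0, 0, 0, 1.0, -0.8, 0.5, 0, 0]

offset = best_lag(mic_a, mic_b, max_lag=5)
# A non-zero offset means the bleed on the two tracks no longer lines up.
```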

If you hear something like this, ask your producer if they can fiddle it into better shape. Sometimes it’s simply not possible and you’ll need to make a decision whether it’s worth keeping with a suboptimal edit, but it’s worth knowing your options.

Final thoughts

I wrote this post to give people without an audio background some tools to identify issues in their content, but you’re still going to need someone to fix them.

If you hear any of these, have a chat with your producer or engineer about ways you might improve the quality of what you’re putting out there. If you need any other help, or someone who can tell you whether your audio is measuring up and how to make it better, get in touch.

I love In Our Time on BBC Radio 4. It’s an institution: a panel of academic experts discussing their niches while Melvyn Bragg sasses them? Delightful.

But if you asked me to tell you something I’ve learned from an episode, I honestly don’t think I could. Why? Because the show has zero dynamic range.

What is dynamic range, and why does it matter to podcasting?

The human brain uses a tonne of energy – around 20% of the body’s overall consumption – and that means it looks for efficiencies where it can. If your brain thinks everything’s ticking along smoothly without any new events it needs to be vigilant about (approaching hungry tigers, etc.), it’ll conserve energy by going into autopilot, and that means it stops paying attention.

That’s why, no matter how much you want to learn about the Hanseatic League, the Irish Rebellion of 1798, or the Great Stink, after a few minutes of academics explaining them to Melvyn Bragg: Sass Machine in the same voices, volumes, and pitches, like it or not you’re going to stop listening.

Mixing up the dynamic range – the timbre, volume, and pace of your piece – is vital to making audio with impact, because that’s what keeps your audience engaged.

How can you keep people listening to your podcast?

There’s no hard and fast rule to working with dynamic range, and you should always follow your own tastes, but I find the sweet spot for holding an audience’s attention (or mine at least) is a little under two minutes.

This means that roughly every two minutes, you want something to change. It could be the person speaking, the energy in their voice, the music, sound effects, archive, or something else, but you can’t just keep alternating between the same two things. When the brain starts to sense a repeating pattern, it’s going to assume it doesn’t need to listen anymore.
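
If you’ve got a list of the moments where something changes in your episode, you can check this rule of thumb mechanically. This is just a sketch: the function name and the timings are invented for the example, and the 120-second limit is my own taste rather than a rule.

```python
def stale_stretches(change_times, episode_length, max_gap=120):
    """Return (start, end) spans in seconds where nothing changes for longer than max_gap."""
    points = [0] + sorted(change_times) + [episode_length]
    return [(a, b) for a, b in zip(points, points[1:]) if b - a > max_gap]


# Hypothetical 10-minute episode with changes at these points (in seconds).
changes = [90, 200, 500, 560]

flagged = stale_stretches(changes, episode_length=600)
# Only the five-minute stretch between 200 and 500 seconds gets flagged.
```

It can’t tell you whether you’re alternating between the same two things, of course. That part still needs ears.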

This is basically about musicality, which is central to interesting sound design and something I’m going to write about in a future blog. A lot of music is about establishing patterns and interrupting them – theme and variation – and the same is true for speech-based audio.

So, in terms of sound (because content and narrative are something else I’m going to write about in future posts):

Dynamic Range + Variety = Compelling Audio

What does strong dynamic range sound like?

To illustrate strong dynamic range that keeps an audience engaged, I’m going to pick apart the intro I made for an episode of the Guardian’s award-winning Today in Focus podcast.

(Listen to the full episode here)

Now, this clip is just over 2 minutes in itself, but being the introduction to an episode it needs to be especially arresting. There’s some kind of dynamic variation every 5 to 10 seconds here, which would be exhausting to listen to for a full 30 minutes, but might be appropriate for a hard-hitting intro, or a climactic sequence somewhere in your piece.

Here’s a rough breakdown of what’s happening:

0” – 8” Presenter Nosheen Iqbal is speaking without anything underneath. We’re around a 2/10 here.
8” – 48” Around 40 seconds of alternating speech and archive, changing every 5 to 10 seconds, over some droning music. A solid 5/10
48” – 1’08” Sooner or later that’s going to get boring. Your brain’s going to figure out the pattern and start switching off. So we change up the dynamic a little. A (thematically appropriate) clock starts beeping, and the droning music starts to rise up in the mix, some other sound effects come into play. Pushing up to maybe 8/10 here!
1’08” – 1’20” But careful! The ear’s only going to tune in to that kind of dense noise for so long. So after a brief climax the dynamic comes down again: a single note repeated on the piano, then Nosheen with nothing underneath her. Notice how there’s still some residual tension from the previous section. 3/10.
1’20” – 1’36” Now we’re mixing up the dynamic with some music again, but notice that this isn’t too full on. It’s got rhythm, but no beat. 6/10.
1’36” – 2’06” Now we take it up a notch, introduce a beat and return to our pattern, alternating between Nosheen and fairly urgent-sounding archive. 8.5/10
2’06” – 2’11” There’s a lot going on now and we’re nearly at the end, so just to give one last boost to Nosheen’s last line – the all-important title – we’re going to switch up the dynamic one last time, taking the music down a notch. 7/10 here, but not for long…
2’11” – 2’16” … after a few seconds, we crank everything back up for the final line, which is all the more emphatic for following a moment of relative calm. 9/10

Now see how these (very scientific) scores out of ten look plotted on a (very accurate) graph, and you get an idea what’s going on… It’s all a bit “Kurt Vonnegut explaining story structure,” right? You’re creating a dramatic arc with sound!
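
If you’d like to draw the shape yourself, here’s a quick sketch that turns the breakdown above into a rough text chart. The timings and scores are taken straight from the list, converted to seconds. The function name and the two-hashes-per-point scale are just choices I’ve made for the example.

```python
# (start, end, score) for each section of the intro, in seconds.
segments = [
    (0, 8, 2), (8, 48, 5), (48, 68, 8), (68, 80, 3),
    (80, 96, 6), (96, 126, 8.5), (126, 131, 7), (131, 136, 9),
]


def text_plot(segments):
    """Draw one row per section: its start time, then two '#' per point of intensity."""
    lines = []
    for start, _end, score in segments:
        label = f"{start // 60}'{start % 60:02d}\""
        lines.append(f"{label:>6} {'#' * int(score * 2)} {score}")
    return "\n".join(lines)


chart = text_plot(segments)
print(chart)
```

Turn your head sideways and you’ll see the same rise, dip, and final climb.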

So there you have it: speech, archive, music, and effects all brought together with dynamic range and variety in mind. The result: An intro that holds your attention, compels you to listen to what’s being said, and ultimately makes you want to stick around and hear the rest of the episode.

Again, you don’t want to sustain this kind of pitch and pace for a whole piece, but the same principles slowed down some will keep listeners tuned in.

How can you improve the dynamic range in your episodes?

Dynamic range doesn’t necessarily mean you need to pack your podcast with music, archive and sound effects. Many speech-only podcasts are brilliant, but they often rely on gifted presenters and great guests with plenty of chemistry to keep the dynamics varied and the energy up.

You can have a go at improving the dynamic range of your episodes yourself using some of the ideas above, but there’s no substitute for the trained ear and technical experience of a producer who can really make your audio sing (ahem).

Want help making sure your audience tunes into everything you’ve got to say? Get in touch.
