For Instructional Designers
What you'll accomplish
By the end of this guide, you'll be able to record your own narration and clean it up with Descript's AI in under 20 minutes, automatically removing filler words ("um", "uh"), background noise, and mouth sounds. You'll also be able to edit audio the same way you edit a Word document: delete a word from the transcript, and Descript removes it from the audio.
What you'll need
Go to descript.com → click Start for free → sign up with Google or email → download the Descript desktop app (the desktop app works best, though the web version also works).
What you should see: The Descript app dashboard with an option to start a new project.
Click New Project → name it after your course module (e.g., "HIPAA Module 2 Narration") → click Create.
What you should see: An empty project workspace with a timeline at the bottom and a blank transcription area in the center.
Click the Record button (microphone icon in the toolbar) → select your microphone from the dropdown → click Start Recording.
Read your script at a natural, slightly-slower-than-normal pace. Don't stop the recording to fix mistakes. Just pause, take a breath, and pick up again after the last sentence you read cleanly. Descript makes it easy to cut the mistakes out afterward.
Click Stop Recording when done.
What you should see: Your recording appears in the project timeline, and Descript automatically transcribes the audio (this takes 1–2 minutes).
Troubleshooting: If your microphone isn't detected, check System Preferences/Settings → Sound → Input on your computer.
After transcription completes, click the Edit tab at the top → find the Remove filler words option (usually in the Action bar or under the Edit menu) → select the filler words to remove: "um", "uh", "like", "you know" → click Remove.
What you should see: The transcript updates, highlighting where fillers were removed. The audio is edited automatically to match, so the recording plays back without them.
Click your audio clip in the timeline → in the right panel, find Studio Sound → toggle it on. Descript analyzes your recording and applies AI noise reduction, removing room echo, background hum, and mouth clicks.
What you should see: A processing indicator for a few seconds, then the cleaned-up audio replaces the original. Play it back; it should sound noticeably cleaner.
If you stumbled over a sentence or said something incorrectly, find that section in the text transcript → select the words you want to remove → press Delete.
What you should see: The corresponding audio is removed automatically, and the surrounding audio joins seamlessly.
This is the key Descript superpower: editing text = editing audio. You never need to use the timeline for basic edits.
When your narration sounds good, click Export → select Audio only → choose MP3 format → set quality to 192 kbps or higher → click Export.
What you should see: An MP3 file downloads to your chosen location.
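If you want a quick sanity check that the export used the quality you picked, the file size should roughly match bitrate × duration. The sketch below is only arithmetic, not anything Descript provides: the function name is hypothetical, and it assumes a constant-bitrate MP3 and ignores tags and headers, so expect the real file to differ slightly.

```python
def estimated_mp3_bytes(duration_seconds: float, bitrate_kbps: int = 192) -> int:
    """Approximate size of a constant-bitrate MP3, ignoring tags and headers."""
    # kbps means thousands of bits per second; divide by 8 to get bytes.
    bits = bitrate_kbps * 1000 * duration_seconds
    return int(bits / 8)


# Example: a 90-second slide narration exported at 192 kbps.
seconds = 90
size = estimated_mp3_bytes(seconds)
print(f"{seconds}s at 192 kbps is about {size / 1_000_000:.2f} MB")
```

A file that is much smaller than the estimate may have been exported at a lower quality setting.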
Open Storyline → navigate to the target slide → Insert → Audio → Audio from File → select your exported MP3. The audio is now attached to the slide timeline.
What you should see: An audio waveform appears at the bottom of your Storyline slide timeline, starting at the slide's beginning.
For checking timing per slide: After recording, Descript shows timestamps in the transcript. Use them to verify that each slide's narration stays within your target length (usually 45–90 seconds per slide for eLearning).
For re-recording a single word: Click the word in the transcript → click Record → Descript records just that word and replaces it seamlessly in the audio. No re-recording the whole slide.
For adding a pause: Click between two words in the transcript → press Option+P (Mac) or Alt+P (PC) → insert a silence of your specified duration. Useful for letting a visual animation complete before narration continues.
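If you narrate several slides in one recording, the per-slide timing check from the first tip can be done with a few lines of Python. This is a minimal sketch under stated assumptions: you type in the timestamp where each slide's narration starts (read by hand from the transcript), the timestamps are in mm:ss form, and the function names and sample values are hypothetical. The 45–90 second range is the target mentioned above.

```python
def to_seconds(stamp: str) -> int:
    """Convert an 'mm:ss' timestamp to whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def check_slide_timing(starts, end, low=45, high=90):
    """Return (slide, duration_seconds, status) for each slide.

    starts: list of (slide_name, 'mm:ss') where each slide's narration begins
    end:    'mm:ss' timestamp where the recording ends
    """
    marks = [to_seconds(stamp) for _, stamp in starts] + [to_seconds(end)]
    report = []
    for i, (name, _) in enumerate(starts):
        # A slide runs from its own start to the next slide's start.
        duration = marks[i + 1] - marks[i]
        if duration < low:
            status = "short"
        elif duration > high:
            status = "long"
        else:
            status = "ok"
        report.append((name, duration, status))
    return report


# Hypothetical timestamps, typed in by hand from the transcript.
slides = [("Slide 1", "00:00"), ("Slide 2", "01:05"), ("Slide 3", "01:35")]
for name, duration, status in check_slide_timing(slides, end="03:20"):
    print(f"{name}: {duration}s ({status})")
```

Slides flagged "long" are candidates for trimming in the transcript or splitting across two slides.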