Transcribe

Convert audio and video files to Korean text transcripts using OpenAI Whisper.

Quick Start

Run the transcription script with one or more media files:

bash

.venv/bin/python .claude/skills/transcribe/scripts/transcribe.py sources/audio.m4a

To transcribe a single file:

bash

.venv/bin/python .claude/skills/transcribe/scripts/transcribe.py sources/[FILE]

To transcribe several files in one run:

bash

.venv/bin/python .claude/skills/transcribe/scripts/transcribe.py sources/file1.m4a sources/file2.mp4

Use the --model flag to choose the Whisper model (tiny, base, small, medium, or large):

bash

.venv/bin/python .claude/skills/transcribe/scripts/transcribe.py sources/video.mp4 --model tiny
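For orientation, the sketch below shows roughly what a model choice maps to when calling the openai-whisper Python package directly. It is not the code in transcribe.py: the function name transcribe_korean and the "base" default are assumptions made for illustration, and the language="ko" argument is assumed from the stated goal of Korean transcripts.

```python
# Model names listed in this document.
WHISPER_MODELS = ("tiny", "base", "small", "medium", "large")


def transcribe_korean(path: str, model_name: str = "base") -> str:
    """Hypothetical helper: transcribe one media file to Korean text.

    The "base" default is an assumption, not a documented default of the script.
    """
    if model_name not in WHISPER_MODELS:
        raise ValueError(
            f"Unknown model {model_name!r}; expected one of {WHISPER_MODELS}"
        )

    # Imported lazily so the name check above works without Whisper installed.
    import whisper  # provided by the openai-whisper package

    model = whisper.load_model(model_name)
    # language="ko" asks Whisper to decode as Korean instead of auto-detecting.
    result = model.transcribe(path, language="ko")
    return result["text"].strip()
```

Validating the name before loading avoids waiting on a model download only to fail on a typo.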

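Model selection: smaller models (tiny, base) run faster but are less accurate; larger models (medium, large) are more accurate but slower.

Check the file's duration before transcribing so you can set user expectations: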

bash

ffmpeg -i sources/[FILE] 2>&1 | grep Duration
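If the duration is needed programmatically rather than by eye, the same check can be sketched in Python. The helper names parse_duration and probe_duration are hypothetical; the sketch assumes ffmpeg is on PATH and prints its usual "Duration: HH:MM:SS.xx" field to stderr.

```python
import re
import subprocess
from typing import Optional

# Matches the "Duration: HH:MM:SS.xx" field in ffmpeg's stderr output.
_DURATION_RE = re.compile(r"Duration:\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)")


def parse_duration(ffmpeg_output: str) -> Optional[float]:
    """Return the duration in seconds, or None if no Duration field is found."""
    match = _DURATION_RE.search(ffmpeg_output)
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def probe_duration(path: str) -> Optional[float]:
    """Run `ffmpeg -i PATH` and parse its stderr output."""
    # ffmpeg exits non-zero when no output file is given, so the return code
    # is deliberately not checked here.
    result = subprocess.run(
        ["ffmpeg", "-i", path], capture_output=True, text=True
    )
    return parse_duration(result.stderr)
```

A return value of None means the Duration field was missing or not a timestamp (for example, "N/A"), which the caller should treat as unknown length.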

Rough processing time estimates (base model):

For long files (>30 min), tell the user the expected wait time and suggest the tiny model for faster results.
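The long-file rule can be expressed as a small helper. This is only a sketch of the guidance above: the name suggest_model and the "base" fallback are assumptions, and the 30-minute threshold is taken directly from the text.

```python
LONG_FILE_SECONDS = 30 * 60  # the ">30 min" threshold from the guidance above


def suggest_model(duration_seconds: float, requested: str = "base") -> str:
    """Hypothetical helper: suggest "tiny" for long files, else keep the request."""
    if duration_seconds > LONG_FILE_SECONDS:
        return "tiny"
    return requested
```

For example, suggest_model(45 * 60) suggests the tiny model, while a 10-minute file keeps whatever model was requested.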

Generated files are saved to ./output/ with the naming pattern:

code

[first 30 chars of transcript]_YYYYMMDD_HHMMSS.txt

Example: 네 엉덕션을 미야 사는 신의 순놈을_20260108_183045.txt
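A minimal sketch of how such a name could be assembled is shown below. The function name build_output_name is hypothetical, and replacing characters that are unsafe in file names is an assumption; the actual script may sanitize differently or not at all.

```python
import re
from datetime import datetime


def build_output_name(transcript: str, when: datetime) -> str:
    """Build '[first 30 chars of transcript]_YYYYMMDD_HHMMSS.txt'."""
    # Take the first 30 characters, then trim any whitespace left at the edges.
    prefix = transcript.strip()[:30].strip()
    # Assumption: swap characters that are unsafe in file names for "_".
    prefix = re.sub(r'[\\/:*?"<>|\r\n]', "_", prefix)
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return f"{prefix}_{stamp}.txt"
```

Called with the transcript text from the example above and datetime(2026, 1, 8, 18, 30, 45), this reproduces the example file name.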

Supported formats:

Video: .mp4, .avi, .mov, .mkv, .flv, .wmv, .webm
Audio: .mp3, .wav, .m4a, .aac, .ogg, .flac, .wma
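To reject unsupported inputs before starting a long transcription, a file's extension can be checked against the lists above. The sketch below is illustrative only: media_kind is a hypothetical helper, and the case-insensitive matching is an assumption.

```python
from pathlib import Path
from typing import Optional

VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".flv", ".wmv", ".webm"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac", ".ogg", ".flac", ".wma"}


def media_kind(path: str) -> Optional[str]:
    """Return "video", "audio", or None for an unsupported extension."""
    # Assumption: extensions are compared case-insensitively.
    suffix = Path(path).suffix.lower()
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    return None
```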

transcribe.py is the main script for the transcription workflow.

Usage:

bash

.venv/bin/python .claude/skills/transcribe/scripts/transcribe.py <files...> [--model MODEL] [--output-dir DIR]
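The usage line above corresponds to a command-line interface along the following lines. This is an argparse sketch, not the script's actual parser: the "base" default model and the "./output" default directory are assumptions inferred from other parts of this document.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring: <files...> [--model MODEL] [--output-dir DIR]."""
    parser = argparse.ArgumentParser(
        prog="transcribe.py",
        description="Transcribe audio and video files to Korean text.",
    )
    parser.add_argument("files", nargs="+", help="one or more media files")
    parser.add_argument(
        "--model",
        choices=["tiny", "base", "small", "medium", "large"],
        default="base",  # assumption: the real default is not documented here
        help="Whisper model to use",
    )
    parser.add_argument(
        "--output-dir",
        default="./output",  # assumption: matches the documented output location
        help="directory for generated transcripts",
    )
    return parser
```

With this sketch, build_parser().parse_args(["sources/file1.m4a", "--model", "tiny"]) yields the file list, the chosen model, and the default output directory.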

Lower-level utility for single-file transcription. For standard workflows, use transcribe.py instead.