YouTube Transcript Skill
Smart YouTube transcription: subtitles first, Whisper fallback.
Strategy
- •Try existing subtitles (youtube-transcript-api)
  - •Fast, free, no API limits
  - •Works for auto-generated + manual captions
  - •Tries: English → Russian → any available
- •Try auto-generated subtitles (yt-dlp with cookies)
  - •Works even when YouTube blocks cloud IPs (Hetzner)
  - •Uses cookies + JS runtime + challenge solver
  - •Extracts VTT subtitles without downloading video
- •Fallback to Whisper (if no subtitles)
  - •Download audio via yt-dlp (with cookies)
  - •Transcribe via Groq Whisper API
  - •Auto-delete temp files
  - •Note: Fails when the downloaded audio file exceeds 25 MB (Whisper API upload limit)
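The ordering above amounts to a fallback chain: run each step in turn and stop at the first one that yields text. The sketch below shows that control flow only. The three strategy functions are hypothetical stand-ins with canned results, not the skill's actual code.

```python
from typing import Callable, List, Optional, Tuple

# A strategy takes a URL and returns transcript text, or None when it has nothing.
Strategy = Callable[[str], Optional[str]]


def first_successful(url: str, strategies: List[Tuple[str, Strategy]]) -> Tuple[str, str]:
    """Run strategies in order and return (name, text) from the first that succeeds."""
    errors = []
    for name, strategy in strategies:
        try:
            text = strategy(url)
        except Exception as exc:  # a failing step must not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if text:
            return name, text
        errors.append(f"{name}: no transcript")
    raise RuntimeError("all strategies failed: " + "; ".join(errors))


# Hypothetical stand-ins for the three real steps.
def existing_subtitles(url: str) -> Optional[str]:
    return None  # pretend the video has no subtitles


def auto_subtitles(url: str) -> Optional[str]:
    raise RuntimeError("Sign in to confirm you're not a bot")


def whisper_fallback(url: str) -> Optional[str]:
    return "We're no strangers to love."


CHAIN = [
    ("subtitles", existing_subtitles),
    ("auto-subs", auto_subtitles),
    ("whisper", whisper_fallback),
]

if __name__ == "__main__":
    print(first_successful("https://youtu.be/dQw4w9WgXcQ", CHAIN))
```

Collecting the per-step errors keeps the final failure message useful when every step fails.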
Usage
Basic extraction
yt-transcript <youtube-url> [language]
Parameters:
- •youtube-url: any YouTube URL format
- •language: (optional) language code for Whisper (e.g., en, ru, de)
  - •If omitted: Whisper auto-detects language
Examples:
# Auto-detect language (quote the URL so the shell does not interpret ? or &)
yt-transcript "https://youtube.com/watch?v=dQw4w9WgXcQ"

# Force Russian transcription (Whisper fallback only)
yt-transcript https://youtu.be/dQw4w9WgXcQ ru

# Short URL format
yt-transcript https://youtu.be/dQw4w9WgXcQ

# With timestamp (works)
yt-transcript "https://youtube.com/watch?v=dQw4w9WgXcQ&t=42"
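All of the URL forms in these examples carry the same 11-character video ID. The sketch below shows one way such an ID might be pulled out of them. The function name `extract_video_id` and the extra path prefixes it accepts (`shorts`, `embed`, `live`) are assumptions for illustration and may differ from what the script does.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> str:
    """Return the video ID from common YouTube URL shapes, or raise ValueError."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower().removeprefix("www.").removeprefix("m.")
    candidate = ""
    if host == "youtu.be":
        # Short form: https://youtu.be/<id>
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "music.youtube.com"):
        if parsed.path == "/watch":
            # Long form: the ID is the v= query parameter; t= and others are ignored
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        else:
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in ("shorts", "embed", "live"):
                candidate = parts[1]
    if not candidate:
        raise ValueError(f"not a recognised YouTube URL: {url}")
    return candidate


if __name__ == "__main__":
    print(extract_video_id("https://youtube.com/watch?v=dQw4w9WgXcQ&t=42"))
```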
Extract and send to Telegram
yt-transcript-send.sh <youtube-url> [target-id] [language]
Parameters:
- •youtube-url: YouTube URL
- •target-id: (optional) Telegram user/chat ID (default: [TELEGRAM_ID])
- •language: (optional) language code for Whisper
Examples:
# Send to default target
yt-transcript-send.sh "https://youtube.com/watch?v=..."

# Send to specific user
yt-transcript-send.sh "https://youtube.com/watch?v=..." @username

# With language override
yt-transcript-send.sh "https://youtube.com/watch?v=..." [TELEGRAM_ID] ru
What it does:
- •Extracts transcript (subtitles → Whisper fallback)
- •Saves to /tmp/youtube-transcript-<VIDEO_ID>.txt
- •Automatically sends file via Telegram
Output:
[1/3] Trying existing subtitles...
✓ Found existing subtitles
We're no strangers to love. You know the rules and so do I...
Or if no subtitles:
[1/3] Trying existing subtitles...
Subtitles not available: No transcripts found
[2/3] Subtitles unavailable, downloading audio...
[3/3] Transcribing with Whisper...
✓ Transcribed via Whisper
We're no strangers to love...
Dependencies
Installed:
- •✅ yt-dlp (system: /usr/local/bin/yt-dlp)
- •✅ youtube-transcript-api==1.2.4 (venv)
- •✅ Python 3.12 (system)
- •✅ Node.js (for JS signature solving)
Required:
- •GROQ_API_KEY environment variable
- •~/clawd/youtube_cookies.txt: fresh YouTube cookies (required!)
Check:
echo $GROQ_API_KEY                   # should output your API key
yt-dlp --version                     # should show 2026.01.29+
ls -lh ~/clawd/youtube_cookies.txt   # should exist and be recent
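The same checks can be run from Python before a job starts. This is a minimal sketch, not part of the skill: the function name `preflight` and the one-hour staleness threshold are assumptions chosen for illustration.

```python
import os
import time
from pathlib import Path
from typing import List, Mapping, Optional

# Assumption: treat a cookie file older than one hour as stale.
MAX_COOKIE_AGE_SECONDS = 3600


def preflight(env: Mapping[str, str], cookie_file: Path, now: Optional[float] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if not env.get("GROQ_API_KEY"):
        problems.append("GROQ_API_KEY not set")
    if not cookie_file.is_file():
        problems.append(f"cookie file missing: {cookie_file}")
    else:
        current = time.time() if now is None else now
        age = current - cookie_file.stat().st_mtime
        if age > MAX_COOKIE_AGE_SECONDS:
            problems.append(f"cookie file is {int(age // 60)} min old, consider re-exporting")
    return problems


if __name__ == "__main__":
    cookie_path = Path.home() / "clawd" / "youtube_cookies.txt"
    for problem in preflight(os.environ, cookie_path):
        print("WARNING:", problem)
```

Unlike `echo $GROQ_API_KEY`, this reports whether the key is set without printing it.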
Cookie freshness: YouTube cookies expire fast (minutes to hours). If extraction fails with "Sign in to confirm you're not a bot", re-export cookies:
- •Open Chrome incognito window
- •Go to youtube.com/robots.txt and log in
- •Export cookies via Get cookies.txt LOCALLY
- •Save to ~/clawd/youtube_cookies.txt
- •Close incognito window (don't reuse it!)
How It Works
Phase 1: Subtitle extraction
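from youtube_transcript_api import YouTubeTranscriptApi

video_id = "dQw4w9WgXcQ"  # the 11-character ID taken from the URL

ytt_api = YouTubeTranscriptApi()
# Ask for English first, then Russian; raises if neither is available
fetched = ytt_api.fetch(video_id, languages=['en', 'ru'])
# Join the caption snippets into one block of plain text
text = ' '.join(snippet.text for snippet in fetched)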
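The Strategy section describes the order as English → Russian → any available, while the `fetch` call shown here only names `en` and `ru`. The "any available" step could be expressed as a small selection function like the one below. `pick_language` is a hypothetical helper, and the list of available codes would have to come from the library's transcript listing, which is not shown.

```python
from typing import Optional, Sequence


def pick_language(available: Sequence[str], preferred: Sequence[str] = ("en", "ru")) -> Optional[str]:
    """Pick the first preferred language that exists, else whatever comes first."""
    for code in preferred:
        if code in available:
            return code
    # No preferred language found: fall back to any available transcript
    return available[0] if available else None


if __name__ == "__main__":
    print(pick_language(["de", "ru"]))
    print(pick_language(["de"]))
```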
Phase 2: Auto-generated subtitles via yt-dlp (fallback)
yt-dlp \
  --cookies ~/clawd/youtube_cookies.txt \
  --js-runtimes node:/usr/bin/node \
  --remote-components ejs:github \
  --write-auto-subs \
  --sub-langs en \
  --skip-download \
  -o subs \
  <youtube-url>
Why this works on Hetzner:
- •Uses fresh cookies to authenticate
- •JS runtime solves signature challenges
- •Challenge solver bypasses bot detection
- •Works even when YouTube blocks cloud IPs
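Phase 2 leaves a VTT file on disk, which still has to be reduced to plain text. The sketch below is one way to do that, assuming the usual shape of YouTube auto-captions: a short header, timing lines containing `-->`, inline timing tags such as `<00:00:01.000><c>`, and lines repeated from one cue to the next. The function name and the sample data are illustrative, not taken from the script.

```python
import re

# Inline markup such as <00:00:01.000>, <c> and </c>
TAG = re.compile(r"<[^>]+>")
HEADER_PREFIXES = ("WEBVTT", "Kind:", "Language:")


def vtt_to_text(vtt: str) -> str:
    """Strip VTT headers, timings, and tags; drop consecutive duplicate lines."""
    lines = []
    for raw in vtt.splitlines():
        line = raw.strip()
        if not line or line.startswith(HEADER_PREFIXES) or "-->" in line:
            continue
        line = TAG.sub("", line).strip()
        # Auto-captions repeat the previous line in the next cue; keep one copy
        if line and (not lines or lines[-1] != line):
            lines.append(line)
    return " ".join(lines)


SAMPLE = """WEBVTT
Kind: captions
Language: en

00:00:00.000 --> 00:00:02.000 align:start position:0%
We're<00:00:00.500><c> no</c><00:00:01.000><c> strangers</c>

00:00:02.000 --> 00:00:04.000 align:start position:0%
We're no strangers
to<00:00:02.500><c> love</c>
"""

if __name__ == "__main__":
    print(vtt_to_text(SAMPLE))  # We're no strangers to love
```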
Phase 3: Audio download (final fallback)
yt-dlp -f 'bestaudio[ext=m4a]/bestaudio' \
  --cookies ~/clawd/youtube_cookies.txt \
  --js-runtimes node:/usr/bin/node \
  --remote-components ejs:github \
  --throttled-rate 100K \
  -o audio.m4a \
  <youtube-url>
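When this download is driven from Python rather than a shell, passing the arguments as a list sidesteps quoting problems with `?` and `&` in URLs. The sketch below uses only the flags shown above. The function names are hypothetical, and `download_audio` assumes `yt-dlp` is on the PATH.

```python
import os
import subprocess
from typing import List


def audio_download_command(
    url: str,
    cookies: str = "~/clawd/youtube_cookies.txt",
    output: str = "audio.m4a",
) -> List[str]:
    """Build the yt-dlp argument list for an audio-only download."""
    return [
        "yt-dlp",
        "-f", "bestaudio[ext=m4a]/bestaudio",
        "--cookies", os.path.expanduser(cookies),  # the shell is not expanding ~ here
        "--js-runtimes", "node:/usr/bin/node",
        "--remote-components", "ejs:github",
        "--throttled-rate", "100K",
        "-o", output,
        url,
    ]


def download_audio(url: str) -> None:
    """Run the download; raises CalledProcessError when yt-dlp exits non-zero."""
    subprocess.run(audio_download_command(url), check=True)


if __name__ == "__main__":
    print(audio_download_command("https://youtube.com/watch?v=dQw4w9WgXcQ&t=42"))
```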
Phase 4: Whisper transcription (final fallback)
curl -X POST "https://api.groq.com/openai/v1/audio/transcriptions" \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -F "file=@audio.m4a" \
  -F "model=whisper-large-v3" \
  -F "response_format=text"
Advantages over youtube-transcribe skill
| Feature | youtube-transcript | youtube-transcribe |
|---|---|---|
| Uses existing subtitles | ✅ Yes (free, instant) | ❌ No |
| Whisper fallback | ✅ Yes | ✅ Yes |
| API cost | 💰 Only if no subs | 💰 Always |
| Speed | ⚡ Instant (subs) | 🐌 Always downloads |
| Language detection | ✅ Auto (Whisper) | ⚠️ Hardcoded ru |
Limitations
- •Whisper API limits: Free tier = ~25 hours/month
- •YouTube blocks: Cloud provider IPs may be blocked (use residential proxies if needed)
- •Long videos: >1 hour may hit API limits or timeout
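A rough size estimate shows why long videos are a problem for the 25 MB upload limit noted in the Strategy section. The sketch assumes an audio bitrate of 128 kbps and reads 25 MB as 25 × 1024 × 1024 bytes. Both are assumptions, and real files vary. Under them, the limit is reached after about 27 minutes of audio.

```python
import os

# Assumption: the 25 MB limit is counted in binary megabytes.
WHISPER_LIMIT_BYTES = 25 * 1024 * 1024


def estimated_audio_bytes(duration_seconds: float, bitrate_kbps: float = 128) -> int:
    """Approximate file size: bits per second times seconds, divided by 8."""
    return int(duration_seconds * bitrate_kbps * 1000 / 8)


def max_duration_seconds(bitrate_kbps: float = 128, limit_bytes: int = WHISPER_LIMIT_BYTES) -> float:
    """Longest audio that still fits under the limit at the given bitrate."""
    return limit_bytes * 8 / (bitrate_kbps * 1000)


def fits_whisper_limit(path: str) -> bool:
    """Check an already downloaded file against the limit before uploading."""
    return os.path.getsize(path) <= WHISPER_LIMIT_BYTES


if __name__ == "__main__":
    print(estimated_audio_bytes(3600))          # one hour at 128 kbps: 57600000 bytes
    print(round(max_duration_seconds() / 60))   # about 27 minutes
```

Checking the file size before the upload gives a clearer error than a rejected API request.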
Troubleshooting
ERROR: GROQ_API_KEY not set
export GROQ_API_KEY="gsk_..."
ERROR: Failed to download audio
- •Update yt-dlp: yt-dlp -U
- •Check URL is public
- •If "Sign in to confirm" error → YouTube blocking IP (try cookies or proxy)
Subtitles in wrong language
- •Script tries: English → Russian → any available
- •To customize: edit languages=['en', 'ru'] in script
Empty Whisper response
- •Check GROQ_API_KEY is valid
- •Check audio file is not corrupted
- •Try shorter video first
Integration with Agent
Direct usage:
yt-transcript "https://youtube.com/watch?v=VIDEO_ID"
In workflow:
# Download + transcribe + save
yt-transcript "https://youtube.com/watch?v=VIDEO_ID" > transcript.txt

# With language override
yt-transcript "https://youtube.com/watch?v=VIDEO_ID" en > transcript_en.txt
Agent commands:
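- •"Транскрибируй это видео: [URL]" ("Transcribe this video: [URL]")
- •"Что говорится в этом видео?" ("What is said in this video?")
- •"Извлеки текст из YouTube: [URL]" ("Extract the text from YouTube: [URL]")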
Next Steps
After getting transcript:
- •Save to file: > transcript.txt
- •Analyze content: sentiment, keywords, topics
- •Generate summary: via LLM
- •Extract structure: timestamps, speakers, chapters
- •Translate: via LLM or translation API
Files
youtube-transcript/
├── SKILL.md # This file
├── yt-transcript.sh # Main extraction script
├── yt-transcript-send.sh # Wrapper: extract + send to Telegram
└── venv/ # Python virtual environment
└── lib/python3.12/site-packages/
└── youtube_transcript_api/
Testing
# Test with short video (has subtitles)
yt-transcript "https://youtube.com/watch?v=dQw4w9WgXcQ"

# Test with video without subtitles (will use Whisper)
# (find a video without auto-generated captions)

# Test language override
yt-transcript "https://youtube.com/watch?v=dQw4w9WgXcQ" en

# Test send to Telegram
yt-transcript-send.sh "https://youtube.com/watch?v=dQw4w9WgXcQ"
Cookie Management
When to refresh cookies
YouTube cookies expire fast due to security rotation. Refresh when you see:
ERROR: [youtube] Sign in to confirm you're not a bot
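A wrapper can watch for this message in yt-dlp's error output and turn it into an explicit "refresh cookies" hint. The check below is a sketch. The helper name is hypothetical, and it normalises the apostrophe because the message may be printed with a typographic one.

```python
BOT_CHECK_MARKER = "sign in to confirm you're not a bot"


def needs_cookie_refresh(stderr: str) -> bool:
    """True when yt-dlp's error output contains the bot-check message."""
    # Accept both the straight and the typographic apostrophe
    normalized = stderr.replace("\u2019", "'").lower()
    return BOT_CHECK_MARKER in normalized


if __name__ == "__main__":
    sample = "ERROR: [youtube] dQw4w9WgXcQ: Sign in to confirm you\u2019re not a bot"
    print(needs_cookie_refresh(sample))  # True
```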
How to refresh (Chrome extension)
- •Open Chrome incognito (Ctrl+Shift+N)
- •Go to youtube.com and log in
- •Navigate to youtube.com/robots.txt (required!)
- •Install Get cookies.txt LOCALLY
- •Click extension → Export → Select youtube.com
- •Save as ~/clawd/youtube_cookies.txt
- •Close incognito window (important!)
How to refresh (DevTools)
- •Open Chrome incognito
- •Go to youtube.com/robots.txt and log in
- •Press F12 → Console
- •Run: copy(document.cookie)
- •Send to agent (will save to ~/clawd/youtube_cookies.txt)
- •Close incognito window
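The string copied from `document.cookie` is a single line of `name=value` pairs, while yt-dlp's `--cookies` option reads a Netscape-format cookie file. The sketch below shows how that conversion might look. It is not the agent's actual code. The domain, the secure flag, and the one-day expiry are assumed values, because `document.cookie` does not expose them. It also never includes HttpOnly cookies, so a file built this way may be missing cookies that the extension export contains.

```python
import time
from typing import Optional

HEADER = "# Netscape HTTP Cookie File"


def cookie_string_to_netscape(
    cookie_string: str,
    domain: str = ".youtube.com",   # assumption: all cookies belong to this domain
    ttl_seconds: int = 86400,       # assumption: the real expiry is unknown
    now: Optional[float] = None,
) -> str:
    """Convert 'a=1; b=2' into tab-separated Netscape cookie file content."""
    base = time.time() if now is None else now
    expiry = str(int(base + ttl_seconds))
    rows = [HEADER]
    for pair in cookie_string.split(";"):
        name, sep, value = pair.strip().partition("=")
        if not sep or not name:
            continue  # skip empty or malformed fragments
        # Fields: domain, include-subdomains, path, secure, expiry, name, value
        rows.append("\t".join([domain, "TRUE", "/", "TRUE", expiry, name, value]))
    return "\n".join(rows) + "\n"


if __name__ == "__main__":
    print(cookie_string_to_netscape("PREF=f1=50000000; EXAMPLE=abc"))
```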
Why cookies expire
YouTube detects "double usage":
- •You export cookies from browser
- •Continue using same browser/profile
- •YouTube invalidates cookies (~1-5 min)
Solution: Export from incognito/separate profile you won't reuse.
References
- •youtube-transcript-api GitHub
- •yt-dlp documentation
- •Groq Whisper API
- •Research: ~/clawd/research/youtube-download-2026.md