What it does
This MCP server retrieves transcripts and metadata from YouTube videos via four tools. get_transcript returns the transcript as plain text. get_timed_transcript adds a timestamp to each caption segment, which allows precise reference during editing or research. get_video_info fetches metadata such as title, duration, and channel. get_available_languages lists the caption languages available for a video. Transcripts longer than 50,000 characters are split across multiple responses, and a next_cursor value is used to request the next page. The server also supports proxy configuration for networks where access to YouTube is restricted.
Who it's for
Researchers analyzing video content for citations or evidence; content creators and analytics teams tracking video performance; developers building AI applications that need to extract video context; anyone accessing YouTube from networks with IP restrictions.
Common use cases
- Extract and analyze a full transcript from a YouTube video for research or fact-checking
- Retrieve timestamped transcripts for video editing, precise reference, or creating chapters
- Pull video metadata alongside transcript content for structured archival
- Access transcripts in languages other than English for multilingual research
- Fetch transcripts from geographically restricted networks using proxy servers
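As a rough illustration of the timestamped use case, the sketch below turns timed caption segments into reference lines suitable for notes or chapter drafts. The segment shape (a dict with a start time in seconds and a text field) is an assumption made for this example, not the server's documented output format, and both helper names are hypothetical.

```python
from typing import Dict, List, Union

# Assumed segment shape: {"start": <seconds>, "text": <caption text>}.
Segment = Dict[str, Union[float, str]]


def format_timestamp(seconds: float) -> str:
    """Render a second offset as M:SS, or H:MM:SS once it passes an hour."""
    total = int(seconds)  # drop fractional seconds
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def to_reference_lines(segments: List[Segment]) -> List[str]:
    """Prefix each caption segment with its formatted start time."""
    return [f"[{format_timestamp(s['start'])}] {s['text']}" for s in segments]
```

If get_timed_transcript returns its segments in a different structure, only the two key lookups in to_reference_lines would need to change.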
Setup pitfalls
- Long transcripts are paginated at 50,000 characters; always check for next_cursor and fetch continuation content to avoid incomplete results
- YouTube access may be blocked by IP; configure proxies via WEBSHARE_PROXY_USERNAME and WEBSHARE_PROXY_PASSWORD environment variables or HTTP_PROXY/HTTPS_PROXY
- Response size defaults to roughly 50k characters; use the --response-limit argument if your LLM has a smaller context window (e.g., --response-limit 15000 for 15k-char chunks)
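The pagination pitfall can be handled with a small loop that keeps requesting pages until no cursor comes back. This is a minimal sketch under stated assumptions: call_tool is a hypothetical stand-in for whatever function your MCP client uses to invoke a tool, and the argument and field names other than next_cursor (url, lang, transcript) are assumed for illustration rather than taken from the server's schema.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical client hook: takes a tool name and arguments, returns the
# tool result as a dict. Swap in your MCP client's real invocation.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def fetch_full_transcript(call_tool: ToolCaller, url: str, lang: str = "en") -> str:
    """Follow next_cursor until the whole transcript has been collected."""
    parts: List[str] = []
    cursor: Optional[str] = None
    while True:
        # Argument names "url" and "lang" are assumptions for this sketch.
        args: Dict[str, Any] = {"url": url, "lang": lang}
        if cursor is not None:
            args["next_cursor"] = cursor
        result = call_tool("get_transcript", args)
        # The "transcript" field name is assumed; adjust to the real response.
        parts.append(result.get("transcript", ""))
        cursor = result.get("next_cursor")
        if not cursor:  # no cursor means this was the last page
            break
    return "".join(parts)
```

Concatenating the pages before handing them to a model avoids silently analyzing only the first 50,000 characters. If the combined text is too large for the model, process the pages one at a time inside the loop instead.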