analyze_video

Ask any question about a video in natural language. Depending on the provider and input type, the server uploads the full video, passes a remote URL, or extracts keyframes before asking the model. For non-Gemini video providers, it can also transcribe audio and inject the transcript into the prompt for richer answers.
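
As a rough illustration of the "input type" distinction above, the sketch below classifies a `videoPath` value as a local path, a `file://` URI, or a remote URL. The function name `classify_video_input` and its return labels are hypothetical, chosen for this example; the server's actual detection logic is not documented here and may differ.

```python
from urllib.parse import urlparse


def classify_video_input(video_path: str) -> str:
    """Hypothetical helper: label the kind of input a videoPath value represents."""
    scheme = urlparse(video_path).scheme.lower()
    if scheme in ("http", "https"):
        return "remote_url"   # e.g. https://example.com/product-demo.mp4
    if scheme == "file":
        return "file_uri"     # e.g. file:///home/user/demo.mp4
    # Anything else (no scheme, or a Windows drive letter) is treated as a local path.
    return "local_path"
```

A server could branch on this label, together with the selected provider, to decide whether to upload the file, pass the URL through, or extract keyframes.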

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `videoPath` | string | ✓ | Path to the video, `file://` URI, or remote URL |
| `question` | string | ✓ | The question to ask about the video |
| `provider` | string | — | Override the AI provider: `gemini`, `m3`, `kimi`, `qwen`, `mimo`, `glm` |
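
For clients that call the tool programmatically rather than through a chat prompt, the arguments might be assembled as in the sketch below. The helper `build_analyze_video_args` is hypothetical and not part of the server; it only mirrors the parameter table above (two required strings, one optional provider override) and assumes the provider identifiers are exactly those listed there.

```python
from typing import Optional

# Provider identifiers as listed in the parameter table above.
KNOWN_PROVIDERS = {"gemini", "m3", "kimi", "qwen", "mimo", "glm"}


def build_analyze_video_args(
    video_path: str, question: str, provider: Optional[str] = None
) -> dict:
    """Hypothetical helper: assemble the arguments object for an analyze_video call."""
    if not video_path:
        raise ValueError("videoPath is required")
    if not question:
        raise ValueError("question is required")
    args = {"videoPath": video_path, "question": question}
    if provider is not None:
        if provider not in KNOWN_PROVIDERS:
            raise ValueError(f"unknown provider: {provider!r}")
        args["provider"] = provider  # optional override; omitted when not given
    return args
```

Calling `build_analyze_video_args("./video.mp4", "Describe each scene in detail", provider="kimi")` yields a dictionary with the three keys from the table; leaving out `provider` omits that key so the server can apply its default.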

Usage Examples

General question:

“Analyze the video at ./demo.mp4 and tell me what happens in it”

Specific detail:

“What programming language is being demonstrated in ./tutorial.mp4?”

Remote video:

“Analyze this video: https://example.com/product-demo.mp4”

YouTube video:

“What are the main topics covered in https://www.youtube.com/watch?v=abc123?”

With a specific provider:

“Analyze ./video.mp4 using Kimi — describe each scene in detail”

Audio-Enhanced Analysis

When the video contains an audio track, the server can automatically transcribe it and inject the transcript into the AI prompt for richer answers. This is controlled by the `AUDIO_ENHANCE_VIDEO_ANALYSIS` environment variable:

| Value | Behavior |
| --- | --- |
| `auto` (default) | Transcribe when an audio track is detected |
| `true` | Always attempt transcription |
| `false` | Disable audio enhancement |

Audio enhancement works with GLM, MiniMax-M3, Kimi, Qwen, and MiMo. Gemini uploads the full video natively (audio included) and does not need this enhancement.
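
The rules above can be summarized as a small decision function. This is a minimal sketch, not the server's implementation: the name `should_transcribe` is hypothetical, the provider identifiers are taken from the `provider` parameter (with `m3` assumed to denote MiniMax-M3), and treating unrecognized values as `auto` is an assumption made for this example.

```python
import os
from typing import Optional

# Providers that, per the text above, benefit from transcript injection.
# "m3" is assumed here to be the identifier for MiniMax-M3.
AUDIO_ENHANCED_PROVIDERS = {"glm", "m3", "kimi", "qwen", "mimo"}


def should_transcribe(
    provider: str, has_audio_track: bool, setting: Optional[str] = None
) -> bool:
    """Hypothetical sketch of the AUDIO_ENHANCE_VIDEO_ANALYSIS decision."""
    if setting is None:
        setting = os.environ.get("AUDIO_ENHANCE_VIDEO_ANALYSIS", "auto")
    setting = setting.strip().lower()

    # Gemini receives the full video (audio included), so no transcript is needed.
    if provider not in AUDIO_ENHANCED_PROVIDERS:
        return False
    if setting == "false":
        return False
    if setting == "true":
        return True  # always attempt, even if no audio track was detected
    # "auto" (and, by assumption, any unrecognized value): follow audio detection.
    return has_audio_track
```

Passing `setting` explicitly keeps the function easy to test; when it is omitted, the value is read from the environment and defaults to `auto`.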