Track entities across videos
Follow a single subject - a person, vehicle, or object - across multiple videos. Jockey correlates visual identity across your collection and reconstructs a chronological timeline of appearances.
Cross-video entity tracking is a core capability in private beta. Behavior and accuracy may evolve as the system improves.
What you’ll build
A timeline of a subject’s appearances across multiple videos, with timestamps, locations, and context for each sighting.
Prerequisites
- Complete the Quickstart to create a knowledge store with at least one item in ready status.
- Read Create a response to understand the request and response format.
When to use this
- Tracking a person across multiple camera angles or footage sources
- Building a chronological timeline of a subject’s appearances
- Identifying every moment where a specific entity appears in a collection
How it works
Describe the subject you want to track in plain language. Jockey correlates visual identity across all videos in your knowledge store and returns a timeline of appearances with timestamps and context. All tracking goes through the same POST /responses endpoint - no special configuration is needed.
Be specific about the subject. Physical descriptions, clothing, and distinguishing features improve accuracy. “The person in the blue hoodie” works better than “the suspect.”
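As a rough illustration of the request described above, the following Python sketch builds a tracking request body and shows one way it might be sent. Only the POST /responses path comes from this guide. The base URL, the x-api-key header, and the body field names (knowledge_store_id, instructions, input) are assumptions made for the example, so check Create a response for the actual format.

```python
import json
import urllib.request

# Assumptions for illustration: the base URL, the auth header, and every
# body field name below. Only the POST /responses path comes from this guide.
BASE_URL = "https://api.example.com/v1"  # placeholder, not a real host


def build_tracking_request(subject: str, knowledge_store_id: str) -> dict:
    """Build a request body that asks for a timeline of one subject."""
    subject = subject.strip()
    if not subject:
        raise ValueError("Describe the subject in plain language.")
    return {
        "knowledge_store_id": knowledge_store_id,  # assumed field name
        "instructions": "You are an investigator tracking one subject across footage.",
        "input": (
            f"Track {subject} across all videos. "
            "List every appearance in chronological order with timestamps and context."
        ),
    }


def post_response(body: dict, api_key: str) -> dict:
    """Send the body to POST /responses and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{BASE_URL}/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Build the body only; post_response is not called here because the host is a placeholder.
body = build_tracking_request("the person in the blue hoodie", "ks_example")
print(json.dumps(body, indent=2))
```

Keeping the body builder separate from the network call makes it easy to inspect or log the exact description being sent, which is where most of the accuracy gains come from.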
Track with structured output
Use a schema to get machine-readable tracking results. The schema captures each appearance with a timestamp, video reference, location, and description of what the subject is doing.
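One possible shape for such a schema is sketched below as a JSON Schema held in a Python dict, together with a small helper that checks a structured result and orders the sightings. The property names (appearances, video_id, start_seconds, location, description) are hypothetical, the sample result is made-up data, and how the schema is attached to the request is not shown because it depends on the format described in Create a response.

```python
import json

# Illustrative JSON Schema. Property names are assumptions, not a documented contract.
APPEARANCE_SCHEMA = {
    "type": "object",
    "properties": {
        "appearances": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "video_id": {"type": "string"},
                    "start_seconds": {"type": "number"},
                    "location": {"type": "string"},
                    "description": {"type": "string"},
                },
                "required": ["video_id", "start_seconds", "description"],
            },
        }
    },
    "required": ["appearances"],
}


def parse_appearances(raw_json: str) -> list[dict]:
    """Check required fields, then order sightings by video and start offset."""
    data = json.loads(raw_json)
    required = APPEARANCE_SCHEMA["properties"]["appearances"]["items"]["required"]
    appearances = data["appearances"]
    for index, item in enumerate(appearances):
        missing = [key for key in required if key not in item]
        if missing:
            raise ValueError(f"appearance {index} is missing {missing}")
    return sorted(appearances, key=lambda a: (a["video_id"], a["start_seconds"]))


# Made-up sample result, used only to exercise the helper.
sample = json.dumps({
    "appearances": [
        {"video_id": "cam_b", "start_seconds": 12.0, "description": "Enters the frame"},
        {"video_id": "cam_a", "start_seconds": 95.5, "location": "lobby",
         "description": "Walks toward the exit"},
        {"video_id": "cam_a", "start_seconds": 4.0, "description": "Stands near the door"},
    ]
})
timeline = parse_appearances(sample)
for sighting in timeline:
    print(sighting["video_id"], sighting["start_seconds"], sighting["description"])
```

The helper sorts within each video by start offset. Ordering sightings across videos by wall-clock time would need recording timestamps, which this sketch does not assume are available.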
Refine with follow-up turns
Use the session_id from the first response to drill into specific appearances without starting over.
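A follow-up turn might be assembled as in the sketch below. The session_id value is read from the first response, as described above. Passing it back as a top-level session_id field alongside input is an assumption for illustration, and the first_response dict is a made-up stand-in for a real reply.

```python
def build_follow_up(first_response: dict, question: str) -> dict:
    """Build a follow-up body that continues the session of an earlier response."""
    session_id = first_response.get("session_id")
    if not session_id:
        raise ValueError("The first response does not contain a session_id.")
    return {
        "session_id": session_id,  # assumed field name for continuing a session
        "input": question,
    }


# Made-up stand-in for the reply to the initial tracking request.
first_response = {"session_id": "sess_example"}

follow_up = build_follow_up(
    first_response,
    "Describe the second appearance in more detail.",
)
print(follow_up)
```

Because the session carries the earlier context, the follow-up question can refer to "the second appearance" without restating the subject description.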
Limitations
- Visual-only identification. Voice-based matching is not supported in this phase.
- Single entity per request. Tracking multiple subjects requires separate conversations.
- Accuracy depends on video quality. Camera angles, lighting, and how distinctive the subject is all affect results.
Variations
- Different domains: Change the role in instructions to “video editor” or “documentary researcher” to shift the emphasis of the results
- Highlight reel: “Create a highlight reel of this subject’s best moments”
- Multi-episode: Track a recurring character across an episode series
See also
- Extract entities - list all entities before deciding which to track
- Multi-turn sessions - more on continuing conversations