Gemini 1.5 Pro: The AI That Can 'Watch' Hour-Long Videos
Google's Gemini 1.5 Pro launched February 15, 2024, with a 1 million token context window—processing entire videos, codebases, or books in one prompt.
The same day OpenAI announced Sora, Google countered with Gemini 1.5 Pro—and a mind-blowing spec: 1 million tokens of context.
That's:
- An hour of video
- 11 hours of audio
- 700,000+ words
- Entire codebases
Claude's 200K was impressive. Gemini 1.5's 1M was unprecedented.
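For a sense of scale, here is a minimal sketch, using the google-generativeai Python SDK, of checking whether a document actually fits that budget before sending it (the file path is a placeholder):

```python
import os
import google.generativeai as genai

# Minimal sketch: count a document's tokens against the 1M-token window.
# Assumes GOOGLE_API_KEY is set; "big_document.txt" is a placeholder path.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro")

with open("big_document.txt", encoding="utf-8") as f:
    text = f.read()

count = model.count_tokens(text)
print(f"{count.total_tokens:,} of 1,000,000 tokens used")
```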
What This Enabled
- Video analysis: Upload hour-long videos, ask questions about any moment
- Document processing: Analyze hundreds of PDFs simultaneously
- Codebase understanding: Process entire applications
- Long conversations: Never lose context in extended discussions
The use cases exploded.
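Here is what the video case looks like in practice: a minimal sketch using the google-generativeai SDK's Files API (the API key, file name, and question are placeholders):

```python
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder

# Upload the video through the Files API and wait for processing.
video = genai.upload_file(path="lecture.mp4")  # placeholder file
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)

# Ask about any moment in the hour-long video in a single prompt.
model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [video, "At what timestamp does the speaker first mention attention?"]
)
print(response.text)
```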
The "Needle in a Haystack" Test
Google demonstrated that Gemini 1.5 could locate specific information in massive contexts, like a single fact buried in a book-length document.
This "needle in a haystack" capability showed the model truly understood long contexts, not just tokenized them.
Where Are They Now?
Gemini 1.5 Pro with 1M context is available (though with cost considerations for such long inputs). The context window war continues, with models competing on both size and quality of long-context understanding.
February 15, 2024 was the day context windows went from "nice to have" to "fundamentally enabling new capabilities."