
Gemini 1.5 Pro: The AI That Can 'Watch' Hour-Long Videos

Google's Gemini 1.5 Pro launched February 15, 2024, with a 1 million token context window—processing entire videos, codebases, or books in one prompt.

Published:
4 min read
Author: claude-sonnet-4-5

The same day OpenAI announced Sora, Google countered with Gemini 1.5 Pro—and a mind-blowing spec: 1 million tokens of context.

That's:

  • An hour of video
  • 11 hours of audio
  • 700,000+ words
  • Entire codebases

Claude's 200K was impressive. Gemini 1.5's 1M was unprecedented.
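
As a rough sanity check, the figures above imply a per-modality token rate. The sketch below is back-of-the-envelope arithmetic derived only from the numbers quoted in this article; the actual tokenization rates Google uses are not specified here.

```python
# Rough rates implied by the figures quoted above.
# Derived from this article's numbers, not official specs.

CONTEXT_TOKENS = 1_000_000

video_seconds = 1 * 3600        # "an hour of video"
audio_seconds = 11 * 3600       # "11 hours of audio"
words = 700_000                 # "700,000+ words"

print(f"~{CONTEXT_TOKENS / video_seconds:.0f} tokens per second of video")  # ~278
print(f"~{CONTEXT_TOKENS / audio_seconds:.0f} tokens per second of audio")  # ~25
print(f"~{CONTEXT_TOKENS / words:.1f} tokens per word of text")             # ~1.4
```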

What This Enabled

  • Video analysis: Upload hour-long videos, ask questions about any moment
  • Document processing: Analyze hundreds of PDFs simultaneously
  • Codebase understanding: Process entire applications
  • Long conversations: Never lose context in extended discussions

The use cases exploded.
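
To make the video use case concrete, here is a minimal sketch of sending a long video to Gemini 1.5 Pro through the google-generativeai Python SDK. The model name, file name, and polling details are illustrative and may differ by SDK version; treat this as a sketch rather than a verified, current integration.

```python
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Upload the video through the Files API, then wait for server-side processing.
video = genai.upload_file(path="lecture.mp4")
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)

# Ask a question about any moment in the hour-long video, in a single prompt.
model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [video, "At what point does the speaker introduce the main argument?"]
)
print(response.text)
```

The key point is the shape of the call: the entire video goes in as one part of a single prompt, alongside the question.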

The "Needle in a Haystack" Test

Google demonstrated Gemini 1.5 could find specific information in massive contexts—like finding a single fact buried in a book-length document.

This "needle in a haystack" capability showed the model truly understood long contexts, not just tokenized them.
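
A minimal version of that test is easy to reproduce: bury one out-of-place sentence in a large block of filler text and ask the model to retrieve it. The sketch below only builds the prompt; `query_model` is a hypothetical stand-in for whatever client you use to call a long-context model.

```python
import random

NEEDLE = "The magic number Pierre mentioned at the meeting was 42."
FILLER = "The quick brown fox jumps over the lazy dog. " * 20_000  # long distractor text

def build_haystack_prompt(needle: str, filler: str, position: float = 0.5) -> str:
    """Insert the needle at a relative position (0.0 = start, 1.0 = end) of the filler."""
    cut = int(len(filler) * position)
    haystack = filler[:cut] + " " + needle + " " + filler[cut:]
    return haystack + "\n\nQuestion: What magic number did Pierre mention at the meeting?"

prompt = build_haystack_prompt(NEEDLE, FILLER, position=random.random())
# answer = query_model(prompt)  # hypothetical call to Gemini 1.5 Pro or any long-context model
# A model that truly understands long context answers "42" no matter where the needle sits.
```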

Where Are They Now?

Gemini 1.5 Pro with 1M context is available (though with cost considerations for such long inputs). The context window war continues, with models competing on both size and quality of long-context understanding.

February 15, 2024 was the day context windows went from "nice to have" to "fundamentally enabling new capabilities."

Tags

#gemini #google #context-window #multimodal
