Conversational AI Takes a Step Forward with Document Context
Developer Simon Willison has significantly enhanced his OpenAI WebRTC Audio Session tool, adding the ability to provide document context for more focused conversations.
The original tool, launched in December 2024 to explore OpenAI’s real-time audio API, now supports GPT‑Realtime‑2, a new model described as having “GPT‑5‑class reasoning” but with a knowledge cut-off of September 30, 2024. The model hasn’t yet appeared in ChatGPT’s iPhone app, but users can already try it through Willison’s playground.
The key update lets users paste a substantial block of text, such as a research paper, article, or meeting transcript, which the model can then reference during the conversation. Because answers can draw on the pasted material, the dialogue is more grounded than a standard chatbot interaction.
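One plausible way to supply this kind of context is to fold the pasted text into the instructions the model receives when a session starts. The sketch below is illustrative only: `build_instructions`, `BASE_PROMPT`, the delimiter lines, and the `max_chars` limit are all hypothetical names and values, and nothing here is confirmed to match how Willison's tool works.

```python
# Minimal sketch: wrap pasted text in session instructions.
# All names and limits here are assumptions for illustration.

BASE_PROMPT = "You are a helpful voice assistant."


def build_instructions(document: str, max_chars: int = 20_000) -> str:
    """Combine a base prompt with pasted text (hypothetical helper)."""
    text = document.strip()
    if not text:
        # Nothing pasted: fall back to the plain prompt.
        return BASE_PROMPT

    parts = [
        BASE_PROMPT,
        "Use the document below to answer questions when it is relevant.",
    ]
    if len(text) > max_chars:
        # Keep only the first max_chars characters and tell the model so.
        text = text[:max_chars]
        parts.append("Note: the document was truncated to fit.")

    # Delimiters make it clear where the pasted material begins and ends.
    parts.append("--- DOCUMENT START ---\n" + text + "\n--- DOCUMENT END ---")
    return "\n\n".join(parts)
```

Calling `build_instructions(transcript)` with a meeting transcript would yield a single string containing the base prompt followed by the delimited transcript. Truncating to a fixed character count is a deliberately crude choice; a real tool might count tokens or summarize instead.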
Practical Applications of Contextual Audio Sessions
- Review complex documents with an AI assistant that understands the content
- Conduct in-depth research through conversational exploration
- Brainstorm ideas while referencing specific materials
- Get explanations tailored to the material you provide
The tool offers a clean interface where users can select a voice (Coral or V2), choose between models, and paste their document before starting a session. Once the session starts, the model can discuss the pasted text in natural conversation.
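The three choices described above (voice, model, document) can be pictured as a small settings object that is validated and turned into a session payload. This is a sketch under stated assumptions: the `SessionSettings` class, the payload keys `voice`, `model`, and `instructions`, and the model string used in the usage note are placeholders, not the tool's actual code or a confirmed API shape. The voice labels are taken from the article as written.

```python
from dataclasses import dataclass

# Voice labels as named in the article; real API identifiers may differ.
VOICES = ("Coral", "V2")


@dataclass
class SessionSettings:
    """Hypothetical container for the choices made before a session."""

    voice: str
    model: str  # placeholder identifier, not a confirmed model name
    document: str = ""

    def to_payload(self) -> dict:
        """Build an illustrative session payload (assumed field names)."""
        if self.voice not in VOICES:
            raise ValueError(f"unknown voice: {self.voice!r}")

        payload = {"voice": self.voice, "model": self.model}
        text = self.document.strip()
        if text:
            # Only attach instructions when a document was actually pasted.
            payload["instructions"] = (
                "Discuss this document with the user:\n" + text
            )
        return payload
```

For example, `SessionSettings(voice="Coral", model="example-model", document=paper_text).to_payload()` returns a dictionary with all three keys, while leaving `document` empty omits `instructions` entirely.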
This enhancement addresses a common limitation of current AI systems: they tend to generate responses without knowledge of the specific material a user has in mind. That makes the tool particularly useful for professionals who work with complex information.