TL;DR: AirCaps is bringing AI assistance to in-person conversations.
Our AI copilot provides live captions, translations, AI meeting notes, and insights for in-person conversations in real time. It's deployed as an app for lightweight AR glasses, so visual information is overlaid directly onto your field of vision.
You can think of it like a Zoom AI meeting assistant or Granola, but for in-person conversations, and with the ability to proactively help you in real time, not just after the conversation.
We’re building the capture and intelligence layer for the 216 billion daily conversations that happen face-to-face.
We’ve already transcribed 16,500 hours of in-person conversations and counting. Today, AirCaps assists with 11% of our users’ in-person conversations.
We did $93K in revenue in October, growing 6.5x from September while spending under $3K on marketing (and we hit $70K in revenue in the first two weeks of November). Our power users average 6+ hours of daily usage, and our day-30 retention is 91%.
We previously went viral (75M+ views on TikTok, 150K+ followers) and have been featured in The New Yorker, WIRED, and Forbes.
The Problem
The average person struggles to understand and retain 50% of in-person conversations, forgets 70% of them within 48 hours, and has no way to review any of them.
We each have 20-30 in-person conversations daily, meaning humanity generates ~216 billion in-person conversations (averaging 10 minutes each, or 36 billion hours of content) every single day.
While virtual meetings have captions, recordings, and hundreds of AI assistants, these in-person conversations remain completely unassisted by technology. Why?
Our Solution
We're finally bringing AI to real-world conversations. We’re deploying our software on a discreet, socially acceptable, always-on, and hands-free visual interface: AR glasses.
Our earliest customers fall into two groups: people who need help understanding conversations (captions for real life) or translating languages in in-person business meetings, and meeting-heavy professionals (healthcare workers, executives, salespeople) who need real-time AI assistance during high-stakes conversations but can't glance at a screen without breaking eye contact.
We’re building the killer app for the next platform: think Spotify or Instagram for smartphones, before everyone had one.
AR glasses are the only form factor that works for in-person conversations.
The Team
Madhav and Nirbhay have been obsessed with voice technology and AR for 11 years and met at an AR hackathon in the summer of 2024.
Madhav (CEO, Yale CS) built his first Google Glass apps at age 13 and his first Jarvis-style voice assistant for Raspberry Pis at age 14. He researched audio AI at the MIT Media Lab, where he was part of the team that built the world's first live AI-human musical co-performance. His Yale thesis focused on extracting clean transcripts from noisy multi-speaker audio.
Nirbhay (CTO, Cornell CS) started building voice AI on smart glasses in high school, developing an emotional support tool for people with autism. He previously built voice and conversational AI platforms for therapy as the first hire at several early-stage startups, including YC companies.
Why Now
There’s never been a better time for us: AI is becoming faster and more accurate at speech, and people are starting to default to AI assistance for conversations, with 3 out of every 4 professionals using AI notetakers for virtual meetings. Not to mention, Meta, Apple, and Google are investing $200B+ to ensure everyone wears AR glasses.
Ask
If you know doctors who use medical transcription services, or organizations that rely on field sales (real estate, home services, retail), please connect us!