{"id":92886,"title":"Liva AI – Real Voice \u0026 Video Data for AI","tagline":"We provide real voice and video data for developing realistic AI.","body":"**TL;DR:** We provide high-quality human voice and video datasets for companies building realistic AI models. All data is sourced in-house across diverse languages, emotions, and contexts.\n\n**Ask:** [Talk to us](https://calendly.com/ashleymo-theliva/30min) if you're a researcher or founder building voice/video generation models. We can help you.\n\nhttps://youtu.be/PwjTSZRC8XE\n\n**The Problem:**\\\nCompanies building voice and video AI models need high-quality training data for their models to reach human-like performance. Scraping has limits, and existing datasets lack diversity across accents, emotions, and languages. As AI moves toward human-like interactions such as customer conversations, therapy sessions, classroom teaching, and entertainment, models need authentic human expressions that the internet can't supply.\n\n**Our Solution:**\\\nLiva collects real, consented human voice and video entirely in-house, with no synthetic data or third-party purchases. All content is authentic and rights-cleared. We’re already delivering our voice dataset to a lab training expressive foundation models for voice.\n\nWe capture diverse accents, emotional range, and varied contexts (sales calls, multi-channel dialogues, expressive monologues, casual conversations, job interviews, and more) with high production quality through crowdsourcing and strategic partnerships.\n\n**The Team:**\n\nWe’ve worked on many research projects together since we met 3 years ago.\n\n* **Ashley:** Caltech CS dropout. Built an AI model to detect lung disease from cough recordings (@ MIT) and led large-scale data collection initiatives.\n* **Aoi:** Prev. Harvard Bio/CS. Did ML research in representation learning and image diffusion models. 
5+ publications in ICML, Nature, etc.\n\n**Our Ask:**\n\nIntroductions to:\n\n* AI labs building or fine-tuning voice, video, and multimodal generation models\n* Companies using or integrating voice/video models\n* Film, audio, and content production companies\n* Researchers working on audio, video, and multimodal generative models\n\nContact us: [founders@theliva.ai](mailto:founders@theliva.ai)","slug":"OAA-liva-ai-real-voice-video-data-for-ai","created_at":"2025-08-12T03:48:13.634Z","updated_at":"2026-05-25T01:58:32.930Z","total_vote_count":46,"url":"https://www.ycombinator.com/launches/OAA-liva-ai-real-voice-video-data-for-ai","share_image_url":"//bookface-static.ycombinator.com/assets/ycdc/yc-og-image-c440a0ad1dacfb86eeeb343717479cc54d256614449b4ef719977a0a451f8bc8.png","company":{"id":30622,"name":"Liva AI","slug":"liva-ai","url":"https://www.theliva.ai","logo":"https://bookface-images.s3.amazonaws.com/small_logos/6d577992606b8392c9619b8dffacec074b48668d.png","batch":"Summer 2025","industry":"B2B","tags":["Artificial Intelligence","Marketplace","B2B","Data Labeling","Big Data"],"search_path":"https://bookface.ycombinator.com/company/30622"}}