{"id":87549,"title":"AfterQuery - High-quality AI starts with high-quality human data","tagline":"Powering AI models with expert-level human data","body":"## **TL;DR**\n\nAI models need high-quality training data to effectively automate complex professional tasks. We provide that data. If you are building an AI model or need fine-tuning data for your agent, we’d love to help.\n\n\\------------\n\n## **Why We’re Building AfterQuery**\n\nThere is a dearth of high-quality training data.\n\nCurrent AI models are trained to be smart but are not trained on the work of actual professionals in many fields.\n\nFor example, current models are largely unusable for professionals working in finance (private equity, hedge funds, investment banks, etc.). See our [paper](https://arxiv.org/pdf/2501.18062), where LLMs fail 60% of realistic daily tasks but improve significantly with some fine-tuning on high-quality training data.\n\n![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeJndLBLBOQJVpf7yPJNA2WFddpffkxqdE1rFqmEfxo3FRmcSCVWqHnsOR793XuN1iCUm-ibeWfFj-FV_3FaswWhXetsEhembFLjln5Lwc9LA0tJV6uNq7MPWyjdDq7FTZSRdmekQ?key=_udDbu5NRw5fwLzMYyqM9i52)\n\n_Chart showing LLM performance against FinanceQA_\n\nAnd the lack of high-quality training data is everywhere:\n\n* Domain-specific AI (legal, consulting, healthcare, etc.)\n* Multimodal data (audio, multilingual, etc.)\n* Foundational models looking to push the boundary on reasoning and grading\n\n## **Solution**\n\nWe are building a platform to onboard experts to work on custom datasets. Think working professionals at top firms and the smartest students in various subjects (PhDs). Then, whenever you have a request for a certain type of data, we will construct and lead a team to produce:\n\n* Post-training / fine-tuning data\n* Reinforcement Learning from Human Feedback (RLHF) data\n* Custom benchmarks\n\nYou will know the people making your dataset and their qualifications while we coordinate with you to ensure quality standards. 
We save you time and energy, so you can focus on the model and not the data.\n\n## **Example Datasets We Provide**\n\n_Foundational Model Advancing Coding Capabilities:_\n\n* New data structures and algorithms problems with test cases and step-by-step answer reasoning\n* Computer architecture and system design reasoning\n* Performance optimization code with benchmarks and reasoning\n\n_Enterprise SaaS Company Building Internal AI Developer Tool:_\n\n* Codebase-wide code snippets with corresponding refactoring suggestions and explanations\n* System and runtime errors paired with step-by-step debugging solutions\n* Database query optimization solutions and reasoning steps\n\n_Startup Building an AI Agent for Law:_\n\n* Answered legal questions with expert reasoning\n* Computer use data for common tasks on legal project management software\n* Contract redlines with expert corrections\n\n**If AI is one day to replace jobs entirely, it can’t just be good; it needs to be near perfect. This transformation will require an immense amount of high-quality training data.**\n\n## **Our Ask**\n\n* Are you **working on a foundational model** advancing multimodal applications, nuanced reasoning, or domain-specific expertise?\n* Are you **implementing AI in a specialized domain** where current models lack expertise (e.g., medicine, law, finance, software, government, healthcare)?\n\nWe would love to meet you!\n\n## **Our Team**\n\nCarlos and Spencer first met in high school at a summer program at Google and then interned together at Meta. Danny and Spencer met in high school, where they built a startup. They then both got into Wharton, sold that startup, and became roommates.\n\n[Carlos](https://www.linkedin.com/in/carlossgg/) has built multiple software businesses on his own, exited an ed-tech startup, and worked as a software engineer at Citadel Securities. 
He’s won the largest possible academic scholarship in all of Canada, competed in state and national sailing competitions, and grown his poker bankroll 8x.\n\n[Spencer](https://www.linkedin.com/in/spencermateega) wrote an award-winning research paper and interned at a private equity firm in high school, before interning at Meta (SWE), Google (SWE), Morgan Stanley (IB), and Silver Lake (PE) over the next 3 years. He was the sole summer analyst at Silver Lake globally and has programmed numerous apps in his spare time (and sold some of them). In his free time, he trains for half marathons and lifts while completing a master’s in computer science concurrently with his Wharton undergraduate degree.\n\n[Danny](https://www.linkedin.com/in/danny1898/) competed in and won international public speaking competitions during high school while pursuing his interest in AI, winning an international Microsoft AI competition and conducting research at a machine learning neuroscience lab. He then learned how people build real estate, from grocery stores to billion-dollar luxury hotels, eventually working on the largest real estate IPO in history and the largest IPO of 2024. 
He also really likes movies and attended the world's largest film festival, catching premieres of films like Anora and The Substance.\n\n![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXfkN7pvcMyTHjf5fZe7qJrF9xiyAvI0yAJ0PHNDdxWcOcU0gw5cDFUzOq20x68YJeb40Pk8dPXsNGsPVSnhYNZu3AHEAoZHcSTQ3nYyLYvMNcthHoAkoERsmzARwkGFS-fRnslLNw?key=_udDbu5NRw5fwLzMYyqM9i52)\n\n","slug":"Mm5-afterquery-high-quality-ai-starts-with-high-quality-human-data","created_at":"2025-02-10T21:58:10.460Z","updated_at":"2026-05-25T02:54:17.663Z","total_vote_count":111,"url":"https://www.ycombinator.com/launches/Mm5-afterquery-high-quality-ai-starts-with-high-quality-human-data","share_image_url":"https://lh7-rt.googleusercontent.com/docsz/AD_4nXfkN7pvcMyTHjf5fZe7qJrF9xiyAvI0yAJ0PHNDdxWcOcU0gw5cDFUzOq20x68YJeb40Pk8dPXsNGsPVSnhYNZu3AHEAoZHcSTQ3nYyLYvMNcthHoAkoERsmzARwkGFS-fRnslLNw?key=_udDbu5NRw5fwLzMYyqM9i52","company":{"id":30303,"name":"AfterQuery","slug":"afterquery","url":"https://afterquery.com","logo":"https://bookface-images.s3.amazonaws.com/small_logos/b63e52a3ec831660a5917dbb85f52cfd61f714e9.png","batch":"Winter 2025","industry":"B2B","tags":["Artificial Intelligence","B2B","Data Labeling","Big Data","AI"],"search_path":"https://bookface.ycombinator.com/company/30303"}}