{"id":83077,"title":"Simplex: on-demand photorealistic vision datasets","tagline":"Receive millions of high quality images and labels for training your AI models.","body":"**TL;DR:** Simplex creates photorealistic vision datasets rendered from 3D scenes for AI model training. Submit a request [on our website](https://simplex.sh/#data-request-form) to receive high-quality data and labels.\n\n![uploaded image](/media/?type=post\u0026id=83077\u0026key=user_uploads/1588677/826c7bea-8936-451d-97ff-895bc8ca6c38)\n\n**Data request for the above sample:** _“Generate images and labels of a home kitchen with household objects on a center table. I need a variety of household objects in a variety of lighting conditions. Our desired labels are semantic segmentation and depth maps.”_\n\nHi everyone, we’re [**Shreya**](https://www.linkedin.com/in/shreya-karpoor/) and [**Marco**](https://www.linkedin.com/in/marco-nocito/), two MIT grads building Simplex.\n\nCollecting vision data for model training is time-consuming, costly, and often unsafe. Shreya spent over 200 hours physically operating a robot to collect image training data during her research at MIT. Marco worked on machine learning for synthetic data at Waymo to solve this exact problem.\n\nWe realized data scarcity wasn’t just an issue in robotics – it affects any company training vision models. When fine-tuning foundation models or building a new dataset from scratch, teams must curate existing data or collect and label data themselves.\n\nWe resolve the data scarcity problem by generating photorealistic datasets with ground-truth labels for **any scenario**. We can generate **millions of varied images** **from 3D scenes** using our physics engine pipeline.\n\nHere’s how you’d use Simplex:\n\n1. Fill out our data request form [here](https://simplex.sh/#data-request-form) – it takes less than a minute. \n2. Give us feedback on a few sample image/label pairs that we generate. Repeat if necessary.\n3. 
Once you’re satisfied, download your complete dataset.\n\nWe support semantic segmentation, captions, simulated LiDAR, depth maps, and bounding boxes. You can generate large volumes of randomized scenes or provide a CAD/phone scan model for more specific scenes.\n\n# **Our Ask**\n\n* If you or someone you know needs vision data, fill out our 30-second data request [form](https://www.simplex.sh/#data-request-form). We’re taking a limited number of early customers.\n* If you have a more complicated request or would otherwise like to contact us, email [shreya@simplex.sh](https://mailto:shreya@simplex.sh).\n\n# **The Team**\n\n![uploaded image](/media/?type=post\u0026id=83077\u0026key=user_uploads/1588677/d2e5778c-86c2-4c7b-bc75-3c73aefff883)\n\n[**Shreya**](https://www.linkedin.com/in/shreya-karpoor/): Computer science (BS and MEng) at MIT, software engineer at Tesla and Viam. Built simulation pipelines for locomotion and dexterous manipulation research at MIT. \\\n[**Marco**](https://www.linkedin.com/in/marco-nocito/): Computer science (BS and MEng) at MIT, software engineer at Waymo, Bloomberg, and Viam. Built machine learning models to generate synthetic data at Waymo.","slug":"Lbx-simplex-on-demand-photorealistic-vision-datasets","created_at":"2024-08-15T14:29:52.537Z","updated_at":"2026-07-22T04:51:22.251Z","total_vote_count":407,"url":"https://www.ycombinator.com/launches/Lbx-simplex-on-demand-photorealistic-vision-datasets","share_image_url":"https://www.ycombinator.com/media/?type=post\u0026id=83077\u0026key=user_uploads/1588677/d2e5778c-86c2-4c7b-bc75-3c73aefff883","company":{"id":29296,"name":"Simplex","slug":"simplex","url":"https://simplex.sh","logo":"https://bookface-images.s3.amazonaws.com/small_logos/556d5a66ef0386d859db3ad9f500ceeadf756032.png","batch":"Summer 2024","industry":"B2B","tags":["Machine Learning","Robotic Process Automation","B2B","AI"],"search_path":"https://bookface.ycombinator.com/company/29296"}}