
API for real-world training data.
Hub is a distributed AI data infrastructure company. We provide real-world training data to AI companies through a global network of 500,000+ contributors across 150+ countries and 100+ languages.
We own the full end-to-end pipeline: collection, automated processing, human-in-the-loop QA, and delivery. Our clients are leading AI labs and top-tier technology companies who need original, hard-to-access multimodal data — audio, image, and video — that doesn't exist on the public web.
The vision: make real-world data collection for AI as simple as an API call.
This role is not for everyone. We're a small team moving at YC batch pace. The work is real, the stakes are high, and there's no time to micromanage anyone.
You'll own both ends of the stack. You design and ship the backend APIs that process data from half a million contributors. You build the React interfaces that internal teams and clients actually use. You don't wait for someone to spec things out — you identify the problem, design the solution, and ship it.
Requirements:
1. Backend Development: Design and maintain FastAPI services powering Hub's data collection and delivery pipeline. Build APIs consumed by internal tools, contributor interfaces, and client-facing integrations. Own reliability, performance, and documentation.
2. Database Performance: Write, review, and optimize SQL queries for correctness and speed across high-volume datasets. Tune PostgreSQL performance: indexing, query plans, partitioning, connection pooling. Design schemas that scale with Hub's contributor network and project volume.
3. Frontend Development: Build and iterate on React interfaces for contributors, internal ops, and client-facing dashboards. Translate complex data workflows into clean, usable UI. Maintain performance and responsiveness across all surfaces.
4. Async Pipeline Architecture: Design and maintain background job systems that handle audio, image, and video file processing at scale. Own reliability and observability of async workflows.
5. Cross-Team Collaboration: Work closely with ML engineers, DevOps, and the data team to ship integrated features. Participate in architecture discussions and sprint planning. Contribute to internal tooling that accelerates team velocity.
To be considered for this role, you must complete our full application form here: Hub Full Stack AI Engineer Application
The application has five parts: your background, technical depth, working style, a mandatory Loom video (5 min, camera on), and availability. Estimated time: 20–30 minutes.
Important: At the end of the form, in the "How did you find out about this position?" field, please mention that you applied through YC / Work at a Startup.