Sylvian incentivizes high-quality expert data for LLMs through competition
We are William Huang and Niall Kehoe, and we’re launching Sylvian! We gather expert data for LLMs through competitions, starting with tool use (e.g. Excel, VSCode).
Problem
Scaling laws dictate that LLMs continue to need lots of expert data, despite advances in RL. Even RL environments require data—for example, an environment that allows LLMs to edit spreadsheets in Excel would still require sample spreadsheets to define the task at hand.
Unfortunately, the part-time pay offered by existing data vendors like Scale or Mercor does not motivate the best experts.
Our Solution
Sylvian hosts competitions where the best experts are motivated by the thrill of moving up leaderboards and the prestige of winning a large competition. We’re starting with tool use data because there are many expert communities surrounding tools like Excel and VSCode.
We already have 4,500+ experts, from IMO gold medalists to MIT/Stanford PhDs to full-time quantitative researchers (QRs) at Point72, producing data at 1B tokens/week!
Our data is consistently at the frontier. Below is the result of one of our latest benchmarks, built with VSCode data from data science experts. See sylvian.ai for more details.
Our Story
William won an IPhO Gold at 17, and Niall won international coding contests at 13. Since then, we’ve gone on to be part of Stanford CS, Harvard Medical School, Citadel Securities, Two Sigma, and Waymo.
Our Ask
If you need expert tool use data, we’d love to speak with you! You can reach us at founders@sylvian.ai or visit sylvian.ai.