{"id":82440,"title":"FiddleCube - Data platform for fine-tuning LLMs","tagline":"Synthetic data platform to streamline dataset generation for custom LLM training","body":"### TL;DR: Upload your files \u0026 generate a high-quality dataset in minutes. [Give it a go](https://dashboard.fiddlecube.ai/sign-up)!\n\nLlama 3.1 405B has just dropped, and it's already outperforming GPT-4o. As we assist our customers in fine-tuning domain-specific LLMs, we see firsthand that it's no small feat. It requires an extensive, diverse, and superior-quality dataset, and multiple iterations of training to get it right.\n\n## ❌ Creating high-quality datasets from raw data is messy!\n\n**Identifying the right data** in the knowledge base is a manual, challenging process.\n\n**Data cleaning and filtering** take significant effort and man-hours, and are error-prone.\n\n**Costs of training \u0026 evals skyrocket with bad datasets**, which force multiple iterations of training.\n\n## ✅ We're making it easy and efficient for businesses.\n\nFiddleCube’s data platform converts your data corpus into a high-quality fine-tuning dataset. Generate thousands of rows of multi-turn chat, function-calling, and Q\u0026A data. 
Additionally, augment your datasets synthetically from unstructured data to improve your model's performance.\n\n**Our users have used FiddleCube to:**\n\n* **Save more than 2 months in their data cleaning, preparation, generation, and quality check cycle.**\n* Generate a **high-quality training dataset** that accurately resembles their production data without PII.\n* Generate a **golden dataset** for testing \u0026 benchmarking their AI-powered apps.\n* Generate **gender-diversity, safety \u0026 guardrail** datasets.\n* **Customize the tone** of their responses rather than defaulting to a standard GPT-like tone.\n\n## 🚀 FiddleCube’s data platform empowers you with:\n\n* **Data generation** - A simple \u0026 clean UI to generate datasets from PDF, TXT, and other data sources to train your model.\n* **Dataset Management** - Editing, versioning, RBAC, and synthetic data augmentation to create self-correcting datasets.\n* **Diagnosing and improving underperforming queries** with regression testing \u0026 detailed data diagnostic tools.\n* Use **production logs and feedback** to auto-generate datasets.\n\n![uploaded image](/media/?type=post\u0026id=82440\u0026key=user_uploads/1103301/d3b373f7-800d-4539-8790-364859f60904)\n\n### 🙋‍♂️ Let's Take Your Data to Production and Get You Started\n\n[Sign up here](https://dashboard.fiddlecube.ai/sign-up) to generate your first dataset. 
Or [book a call with us](https://fiddlecube.ai/contact-us) for help in getting started.\n\n![uploaded image](/media/?type=post\u0026id=82440\u0026key=user_uploads/1103301/1b2e9f87-678a-4b9b-9f13-f1c4e4e3af3b)\n\n","slug":"LRg-fiddlecube-data-platform-for-fine-tuning-llms","created_at":"2024-07-25T16:24:53.504Z","updated_at":"2026-05-25T02:05:26.352Z","total_vote_count":37,"url":"https://www.ycombinator.com/launches/LRg-fiddlecube-data-platform-for-fine-tuning-llms","share_image_url":"https://www.ycombinator.com/media/?type=post\u0026id=82440\u0026key=user_uploads/1103301/1b2e9f87-678a-4b9b-9f13-f1c4e4e3af3b","company":{"id":27935,"name":"compliant-llm","slug":"compliant-llm","url":"https://www.compliantllm.com/","logo":"https://bookface-images.s3.amazonaws.com/small_logos/e836589affa3183241d8a3c3f2914bb8dc44d2a8.png","batch":"Winter 2023","industry":"B2B","tags":[],"search_path":"https://bookface.ycombinator.com/company/27935"}}