{"id":73434,"title":"🎲 FiddleCube - Automated dataset generation for fine-tuning LLMs","tagline":"Create high-quality datasets for fine-tuning and reinforcement learning.","body":"TL;DR: Fine-tuning LLMs requires high-quality datasets. FiddleCube automagically generates fine-tuning datasets from your data.\n\n**User Data Source \u003e Fine-tuning Datasets (FiddleCube) \u003e Fine-tuning**\n\nHead over to [fiddlecube.ai](http://fiddlecube.ai) to get started!\n\nHi everyone, we are Neha and Kaushik. We’re building FiddleCube to make high-quality datasets accessible to everyone.\n\n![uploaded image](/media/?type=post\u0026id=73434\u0026key=user_uploads/1103301/b1e5c3f8-b7af-42f7-b251-bbb11ad5d38f)\n\n🦸 Kaushik spent most of the last decade building tech at companies like Google, Uber, and LinkedIn.\n\n🧙🏻 Neha has spent a similar amount of time as a dev at multiple startups, most recently at Uber.\n\n👫🏻🫶🏻 We met at Uber, eventually got married, and decided to build a startup together, following our passion for AI.\n\n# 😤 The Problem\n\nIn the real world, LLMs need to be aligned to follow human instructions. They need to respond in a manner that is:\n\n* Positive, truthful \u0026 honest\n* In accordance with human beliefs and sensibilities\n\n![uploaded image](/media/?type=post\u0026id=73434\u0026key=user_uploads/1103301/d7f8e8a0-2522-4ecf-a80e-d591c6cacd9a)\n\nFine-tuning and reinforcement learning with high-quality datasets have produced remarkable results toward this end. 
However, creating these datasets takes significant time, manual effort, and money.\n\n# 💡 The Solution\n\nFiddleCube leverages a suite of AI models to create high-quality datasets for fine-tuning and reinforcement learning.\n\n* Generate annotated datasets from raw data.\n* Augment the datasets - create large datasets to significantly improve model performance.\n* Evaluate and improve the data quality of your training dataset.\n\nWe create a rich, diverse, high-quality dataset to produce better models with a smaller corpus of data.\n\n# ⚙️ Use Cases\n\n### 👩🏻‍🎤 Personalization\n\nGive the model a personality, voice, and tone. For example, you can create a safe Dora the Explorer / Peppa Pig model that speaks to children.\n\n### 👩🏻‍💻 API calling and coding\n\nFor specific use cases like making API calls or generating code, fine-tuning has demonstrated better results. You can fine-tune an LLM on a corpus of code or API data to significantly improve its ability at these tasks.\n\n### 🚄 Increase Throughput, Reduce Latency and Cost\n\nFine-tuned LLMs can be much smaller than the foundation models. You can use them to increase throughput and reduce latency and cost.\n\n### 🗺️ Low Resource Domains\n\nLLMs perform poorly in certain domains, such as vernacular languages, that lack a sufficient corpus of high-quality data. Fine-tuning using generated datasets has shown remarkable improvements over the state of the art in these cases.\n\n# 🙏🏻 Ask\n\nAre you fine-tuning any LLM, or looking to fine-tune Llama 2, MPT, or Falcon? We would love to know your use case. 
Drop a comment on what you are doing, or reach out to us privately!\n\n# 👋🏻 Need help with fine-tuning?\n\nBook a slot on our calendar 🗓️ or drop us a line using:\n\n\\- Email 📧 : [kaushik@fiddlecube.ai](mailto:kaushik@fiddlecube.ai)\n\n\\- [Typeform](https://m6qcwjky3c1.typeform.com/to/CyfMeAbT) 📝\n\nand we will get back to you!","slug":"J6Q-fiddlecube-automated-dataset-generation-for-fine-tuning-llms","created_at":"2023-07-26T05:46:22.692Z","updated_at":"2026-05-24T01:02:57.826Z","total_vote_count":87,"url":"https://www.ycombinator.com/launches/J6Q-fiddlecube-automated-dataset-generation-for-fine-tuning-llms","share_image_url":"//bookface-static.ycombinator.com/assets/ycdc/yc-og-image-c440a0ad1dacfb86eeeb343717479cc54d256614449b4ef719977a0a451f8bc8.png","company":{"id":27935,"name":"compliant-llm","slug":"compliant-llm","url":"https://www.compliantllm.com/","logo":"https://bookface-images.s3.amazonaws.com/small_logos/e836589affa3183241d8a3c3f2914bb8dc44d2a8.png","batch":"Winter 2023","industry":"B2B","tags":[],"search_path":"https://bookface.ycombinator.com/company/27935"}}