{"id":76501,"title":"Haven: Fine-tune and run open source LLMs super fast ⚡","tagline":"Haven lets you build open source LLMs that are specialized for specific tasks","body":"# TLDR:\n\nWe are excited to launch [Haven](https://haven.run/)’s fine-tuning platform for open source LLMs!\n\n* Haven lets you fine-tune open source LLMs such as Llama or Mistral without writing code or setting up infrastructure. We charge $0.004/1k training tokens, and you get $5 in free credits after signing up.\n* You can host models for testing with \u003c1s cold start times. We achieve this by running hundreds of model adapters on a single GPU and hot-swapping them based on user requests (we will write more about this in a blog post).\n\nYou can get started [**here**](https://app.haven.run/). Alternatively, reach out to [**hello@haven.run**](https://app.haven.run/) or watch our [demo](https://www.youtube.com/watch?v=XpyVKUyt7k8) if you want to learn more!\n\n### **The Problem**\n\nOver the last few months, we’ve identified two big pain points that make it hard to work with open source models:\n\n1. Open source models work best when they are trained for specific use cases, but the fine-tuning process with existing tools is super annoying. We have found that most of our time goes into setting up the infrastructure needed to get from a finished training run to actually testing our models, rather than into writing code and improving our models.\n2. Hosting custom models is expensive. Running a single Llama-7B model in float16 requires at minimum an A10 GPU, which costs $700+ per month. 
Running ten or a hundred specialized models for common tasks this way would mean a monthly AWS bill of $7,000 or $70,000, respectively.\n\n### **The Solution**\n\nHaven’s platform offers a super simple way to fine-tune models without managing infrastructure or writing code, and to test and run them at low cost and without any additional work.\n\nWe are able to provide a super short feedback loop from training to running a fine-tuned model by hosting multiple LoRA adapters in parallel. This makes it possible for us to host hundreds of fine-tuned models on demand on a single GPU. For our users, this reduces model cold start times to \u003c1s, and internally we are able to host a single fine-tuned model for a couple of dollars per month. We also enable our users to export their model weights to Hugging Face, so that they can run models entirely on their own terms.\n\n### **Our Ask**\n\nFeel free to check out our [**platform**](https://app.haven.run/) and give us feedback! After signing up, you’ll receive $5 in credits to train a couple of models :) We are also happy to answer questions at [**hello@haven.run**](https://app.haven.run/)","slug":"Jtt-haven-fine-tune-and-run-open-source-llms-super-fast","created_at":"2023-11-29T17:29:33.896Z","updated_at":"2026-07-22T12:45:07.953Z","total_vote_count":8,"url":"https://www.ycombinator.com/launches/Jtt-haven-fine-tune-and-run-open-source-llms-super-fast","share_image_url":"//bookface-static.ycombinator.com/assets/ycdc/yc-og-image-c440a0ad1dacfb86eeeb343717479cc54d256614449b4ef719977a0a451f8bc8.png","company":{"id":28930,"name":"Midrender","slug":"midrender","url":"https://midrender.com","logo":"https://bookface-images.s3.amazonaws.com/small_logos/ae8f20158b0233edf45f926bd2ba47a325fa7fd6.png","batch":"Summer 2023","industry":"B2B","tags":["Design","Design Tools","Video","Marketing","AI"],"search_path":"https://bookface.ycombinator.com/company/28930"}}