{"id":83260,"title":"Pipeshift AI - Fine-tuning and inference for open-source LLMs","tagline":"Replace GPT/Claude in production with specialized LLMs that are fine-tuned on your context, offering higher accuracy, lower latencies and model ownership.","body":"**_TL;DR:_** Pipeshift is the cloud platform for fine-tuning and inferencing open-source LLMs, helping teams get to production with their LLMs faster than ever. With Pipeshift, companies making \u003e1000 calls/day on frontier LLMs can use their data and logs to replace GPT/Claude with specialized LLMs that offer higher accuracy, lower latencies, and model ownership. [_Connect with us._](https://cal.com/arkoc/pipeshift)\n\n![uploaded image](/media/?type=post\u0026id=83260\u0026key=user_uploads/951072/55b3d2ab-6279-4002-9bc7-2e7527c8578b)\n\n# **🧨 The Problem: Building with Open-source LLMs is hard!**\n\nA production-ready open-source AI stack is missing, so most teams experiment by **duct-taping together tools like TGI/vLLM and end up with nothing ready for production.** As you scale, this approach requires expensive ML talent, long build cycles, and constant optimizations.\n\nThe **gap between open-source and closed-source models is shrinking** (Meta's Llama 3.1 405B is a testament to that)! 
And open-source LLMs offer multiple benefits over their closed-source counterparts:\n\n🔏 Model ownership and IP control\\\n🎯 Verticalization and customizability\\\n🏎️ Improved inference speeds and latency\\\n💰 Reduction of API costs at scale\n\n# **🎉 The Solution: Heroku/Vercel for Open-source LLMs**\n\nPipeshift is the **cloud platform for fine-tuning and inferencing open-source LLMs**, helping developers get to production with their LLMs faster than ever.\n\n**🎯 Fine-tune Specialized LLMs**\\\nRun multiple LoRA-based fine-tuning jobs to build specialized LLMs.\n\n**⚡️ Serverless APIs of Base and Fine-tuned LLMs**\\\nRun inference for your fine-tuned LLMs and pay based on your token usage.\n\n**🏎️ Dedicated Instances for High Speed and Low Latency**\\\nUse our optimised inference stack to get maximum throughput and utilisation on GPUs.\n\n[Product Demo: https://youtu.be/z8z5ILyXxCI](https://youtu.be/z8z5ILyXxCI)\n\n_Our inference stack is one of the best globally, hitting **150+ tokens/sec on 70B parameter LLMs** without any model quantization. 
And since we opened private beta access (\u003c2 weeks ago), we have already seen **25+ LLMs fine-tuned with over 1.8B tokens of training data** across 15+ companies._\n\n![uploaded image](/media/?type=post\u0026id=83260\u0026key=user_uploads/951072/b91ba4bb-dca1-4567-8a0c-cf637e03b561)\n\n# **👋 Ask: How you can help**\n\nIf you’re building an AI co-pilot/agent/SaaS product and **are looking to move to open-source LLMs**, or know someone who’s looking to do the same, then [book a call](https://cal.com/arkoc/pipeshift) or email us at [_founders@pipeshift.ai_](mailto:founders@pipeshift.ai) - whichever you’d like!","slug":"Leu-pipeshift-ai-fine-tuning-and-inference-for-open-source-llms","created_at":"2024-08-21T01:29:39.204Z","updated_at":"2026-05-24T21:48:20.781Z","total_vote_count":703,"url":"https://www.ycombinator.com/launches/Leu-pipeshift-ai-fine-tuning-and-inference-for-open-source-llms","share_image_url":"https://www.ycombinator.com/media/?type=post\u0026id=83260\u0026key=user_uploads/951072/55b3d2ab-6279-4002-9bc7-2e7527c8578b","company":{"id":29950,"name":"Pipeshift","slug":"pipeshift","url":"https://pipeshift.com","logo":"https://bookface-images.s3.amazonaws.com/small_logos/6dc8136429eadec250bda86f1102143f81a24beb.png","batch":"Summer 2024","industry":"B2B","tags":["AIOps","Artificial Intelligence","Infrastructure","AI","ML"],"search_path":"https://bookface.ycombinator.com/company/29950"}}