{"id":72536,"title":"Artie - real-time data streaming for databases ⚡","tagline":"We transfer data from databases to data warehouses in real-time with CDC streaming","body":"# _TL;DR_\n\n_Artie is an open-source, streaming version of Fivetran - we transfer data from databases to data warehouses in real time. Setting up a connector takes minutes, and Artie leverages change data capture (CDC) to help companies reduce their data warehouse costs by 50%! We enable organizations to unlock real-time insights for better decision-making._ [_Star us on GitHub_](https://github.com/artie-labs/transfer)_!_\n\n![uploaded image](/media/?type=post\u0026id=72536\u0026key=user_uploads/1276059/60958f68-f519-4d78-bb98-c2c33974cb39)\n\nHi everyone, we are Jacqueline and Robin, and we’re the team behind [Artie](https://www.artie.so/?utm_source=YC\u0026utm_medium=newsletter\u0026utm_campaign=launchYC\u0026utm_id=launchYC). We’re on a mission to eliminate data latency and make it easy for every company to enable real-time data streaming!\n\nData is typically synced from production databases to the data warehouse once every X hour(s)/day(s) - this is a constraint that companies have lived with for decades. Robin personally felt the pain of not having access to production data in real time, and there were no easy-to-use, out-of-the-box solutions, so we decided to build one! \n\n# **❌ The Problem**\n\nDoes your company sync data to the data warehouse every 6 hours, or worse, once a day? Are your analytics always lagging and full of stale insights? **Why settle for a data platform that’s barely good enough when you can have real-time data AND reduce your data warehouse costs?!** Not to mention you can have Artie set up in minutes!\n\nTraditional ETLs are based on **batched processes** that operate on a cron schedule (DAGs, Airflow) and **cannot achieve real-time data syncs**.\n\nBuilding and managing streaming data pipelines is hard. 
Most companies have a small team of data engineers who often spend all day maintaining data pipelines, which is not a productive use of their time.\n\nFactors companies should consider if they want to self-manage pipelines:\n\n* Can the solution scale to multiple different data sources?\n  * How easy is it to add new data sources?\n  * How easy is it to manage all the data sources?\n* Can the solution scale to handle 1M+ queries per second?\n  * Is the solution horizontally scalable?\n  * Do workers require coordination, or are they stateless and distributed?\n* How do you ensure there are no out-of-order or missing events (even when the system crashes)?\n* Can the solution handle schema evolution without creating breaking changes downstream?\n\n# **🎉 Solution**\n\nArtie leverages change data capture (CDC) and stream processing to achieve **sub-minute data latency (typically \\~10-20 seconds)**. Since we **only transfer changed data**, Artie is more efficient than traditional ETLs and can help you **cut your data warehouse costs by 50%!**\n\nSetting up a connector requires no programming. Just follow the setup guide and deploy in minutes! After the initial snapshot, any changes in your database will be reflected in your data warehouse in real time.\n\n![uploaded image](/media/?type=post\u0026id=72536\u0026key=user_uploads/1276059/879c552d-7c8a-402e-9378-56f55380308a)\n\n# **🎯 Who Needs Artie?**\n\n**Engineers** who are exhausted from stitching together Airflow + AWS Glue + Apache Spark + AWS Kinesis/Kafka + Apache Flink 😵‍💫\n\n**Companies that are using traditional ETLs or batched processes**. Once you enable real-time data, there is no going back (your data engineers/BI analysts won’t let you)! 
Think of all the previously unattainable use cases that you can now implement without data latency.\n\n**Companies that have a cost-cutting initiative.** Adopt Artie’s CDC streaming capabilities to reduce your data warehouse costs!\n\n# **🙏 Ask**\n\nEmail [hi@artie.so](mailto:hi@artie.so) or sign up [here](https://artie.so/contact?utm_source=YC\u0026utm_medium=newsletter\u0026utm_campaign=launchYC\u0026utm_id=launchYC) to try Artie today! \n\nWant to use open source? Install Artie at \u003chttps://github.com/artie-labs/transfer\u003e (give us a ⭐!) and ping us on [Slack](http://artie.so/slack?utm_source=YC\u0026utm_medium=newsletter\u0026utm_campaign=launchYC\u0026utm_id=launchYC). We’re happy to help you get started.","slug":"Irw-artie-real-time-data-streaming-for-databases","created_at":"2023-06-26T15:13:53.399Z","updated_at":"2026-05-25T01:51:07.265Z","total_vote_count":108,"url":"https://www.ycombinator.com/launches/Irw-artie-real-time-data-streaming-for-databases","share_image_url":"//bookface-static.ycombinator.com/assets/ycdc/yc-og-image-c440a0ad1dacfb86eeeb343717479cc54d256614449b4ef719977a0a451f8bc8.png","company":{"id":28650,"name":"Artie","slug":"artie","url":"https://www.artie.com/","logo":"https://bookface-images.s3.amazonaws.com/small_logos/973d2f0c6c4d07505113e974a08d18864eb95d54.png","batch":"Summer 2023","industry":"B2B","tags":["Developer Tools","SaaS","Data Engineering","Enterprise Software"],"search_path":"https://bookface.ycombinator.com/company/28650"}}