
Hey everyone,
TL;DR
I’m Aditya, co-founder of nCompass Technologies. We’re building a developer tool that unifies profiling, trace collaboration and trace analysis of AI systems. We automate performance optimization of AI systems across all levels of the infrastructure stack.
Using this tool, we implemented a Hopper GEMM kernel that outperformed NVIDIA's CUTLASS GEMMs by 3% within a day. The same work previously took us months.
Check out our product demo below. It’s free to use, and you can get started today in VS Code, Cursor or Claude Code: Quick Start
https://www.youtube.com/watch?v=Q3Pq-BPU2Ec
THE PROBLEM
Identifying the root cause of a performance bottleneck takes 4-8x longer than writing the code to fix it.
If you are optimizing a system like vLLM, you have to:
Running this loop until you have a performant system can take weeks, even months.
OUR SOLUTION
We’re automating the process of identifying performance bottlenecks by building an AI agent that can analyze profiling data and draw on a bank of deep technical knowledge and expertise.
Now, in a single VS Code interface, you can:
This applies to both system-level and GPU kernel-level analysis. Our agent integrates directly into Cursor / Claude Code, so you never have to leave your normal workflow!
Anyone can now write both correct and performant code with AI!
ASKS
Install our VS Code extension and start optimizing your system’s performance!
We also offer FDE services. If you would like us to step in, analyze your system’s performance and tell you how much we could improve it, reach out at hello@ncompass.tech