Insights & Resources

Thought leadership and practical resources on performance engineering, the PPI-F™ framework, cloud cost optimization, and scaling enterprise systems. Follow KPI99 on LinkedIn for updates.

Latest

Programming languages don’t cause performance regressions

Languages shape how regressions appear. Walk through Python (O(n²) list lookups), Java/JVM (allocation & GC pressure), Node.js (event-loop blocking), Go (goroutine growth), and C++ (cache locality)—and why recognizing these patterns speeds diagnosis.
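The Python pattern above is worth seeing concretely. A minimal sketch (the `dedupe_*` functions are illustrative, not from the article): membership tests on a list scan linearly, so a loop over n items becomes O(n²), while a set lookup is O(1) amortized and keeps the loop O(n).

```python
def dedupe_slow(items):
    """O(n^2): `x not in seen` scans the list on every iteration."""
    seen = []
    out = []
    for x in items:
        if x not in seen:      # linear scan of `seen` each time
            seen.append(x)
            out.append(x)
    return out

def dedupe_fast(items):
    """O(n): same logic, but `seen` is a set with O(1) membership tests."""
    seen = set()
    out = []
    for x in items:
        if x not in seen:      # hash lookup, constant time on average
            seen.add(x)
            out.append(x)
    return out
```

Both return identical results; only the data structure behind the `in` check differs, which is exactly why this class of regression hides until n grows.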

Read on LinkedIn

A decision-tree approach to diagnosing complex systems

Modern systems fail through pressure propagation, not at the point where the alert fires. In a ride-sharing example, one config change triggered a cascade; the fix took minutes and restored latency from 2.4s to 180ms. Plus: how this thinking underpins the PPI-F framework.

Read on LinkedIn

Kafka performance misdiagnosed: understanding workload shapes

Two clusters can move the same MB/s and behave completely differently. Kafka performance is driven by how work arrives—tiny high-QPS messages, large payloads, bursty traffic, hot partitions, replay-heavy consumers—each with different bottlenecks. This is the model we use in KPI99 diagnostics.
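A quick back-of-the-envelope illustration of the "same MB/s, different shape" point (the numbers are illustrative, not from the article): per-message overhead—batching, acks, index entries—scales with messages per second, not megabytes per second.

```python
def throughput_mb_s(messages_per_s, msg_bytes):
    """Byte throughput in MB/s for a given message rate and size."""
    return messages_per_s * msg_bytes / 1_000_000

# Two workloads with identical byte throughput:
tiny_msgs  = throughput_mb_s(100_000, 1_000)       # 100k msgs/s x 1 KB
large_msgs = throughput_mb_s(100, 1_000_000)       # 100 msgs/s x 1 MB

# Both equal 100 MB/s, but the first does 1000x the per-message work,
# so it stresses different parts of the broker entirely.
```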

Read on LinkedIn

AI accelerates software development, but complexity follows

As companies build internal AI-powered systems, complexity moves back in-house: unpredictable performance, rising infrastructure costs, scaling bottlenecks, and observability gaps. PPI-F helps identify architectural pressure across performance, scalability, infrastructure efficiency, cost-to-serve, and AI workload volatility before it becomes a reliability or cost crisis.

Read on LinkedIn

The hidden cost of “fast enough”

Most teams optimize for acceptable latency; almost no one optimizes for cost-to-serve at scale. A query at 400ms instead of 200ms may not matter—until 50M executions/month. Performance pressure = (Latency × Concurrency × Cost per Unit) / Headroom. If you don’t quantify pressure, finance will see the failure first.
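The formula and the 50M-executions point can be made concrete. A minimal sketch: the cost-per-unit and headroom values are assumed for illustration, but the structure follows the pressure equation above.

```python
def pressure(latency_ms, concurrency, cost_per_unit, headroom):
    """Performance pressure = (Latency x Concurrency x Cost per Unit) / Headroom."""
    return (latency_ms * concurrency * cost_per_unit) / headroom

# Illustrative inputs (assumed, not from the article):
p_slow = pressure(400, concurrency=500, cost_per_unit=0.002, headroom=0.25)
p_fast = pressure(200, concurrency=500, cost_per_unit=0.002, headroom=0.25)

# Halving latency halves pressure, all else equal -- but the scale effect
# is what finance sees: 200ms of extra latency at 50M executions/month is
extra_compute_s = (400 - 200) * 50_000_000 / 1000   # 10 million compute-seconds
```

Ten million extra compute-seconds a month is roughly 2,800 compute-hours of pure waste—invisible at the single-query level, unmissable on the bill.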

Read on LinkedIn

Find the inefficiency: A Trino regression hiding in plain sight

How a single type cast on a join key in Trino can silently disable pushdown, forcing full table scans and spiking query latency, cluster load, and infrastructure cost. Why these regressions occur, how to detect them, and how small query-design choices impact system-wide efficiency at scale.
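A hedged sketch of the shape of the problem (hypothetical table and column names, not from the article): wrapping a join key in a cast turns a plain column reference into an expression, which can prevent the connector from pushing the join or filter down to storage.

```sql
-- Join key wrapped in a cast: the connector may no longer push the
-- predicate down, so the engine falls back to scanning the full table.
SELECT o.order_id, c.name
FROM orders o
JOIN customers c
  ON CAST(o.customer_id AS varchar) = c.customer_id;

-- Preferable: align the column types in the table schemas so the join
-- compares plain columns and pushdown stays eligible.
SELECT o.order_id, c.name
FROM orders o
JOIN customers c
  ON o.customer_id = c.customer_id;
```

The query text barely changes, which is why this class of regression hides in plain sight until the cluster bill makes it visible.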

Read on LinkedIn