Apache Flink, Kafka & streaming data architecture

What I do

Design Apache Flink jobs (DataStream, Table API, SQL) for production workloads — windowing, watermarks, and state TTL chosen for the actual event shape.
Architect Kafka / MSK / Confluent / Redpanda topologies including partitioning, retention, and consumer-group strategy.
Stand up ClickHouse as the analytical layer behind streaming pipelines — schema design, materialized views, replication.
Diagnose backpressure, checkpointing failures, and state-store growth in existing Flink / Spark Streaming jobs.
Build CDC pipelines (Debezium → Kafka → Flink) without losing exactly-once semantics.

A real-time pipeline is no longer real-time and nobody can pinpoint where the lag comes from.
Flink checkpoints are failing or growing without bound and the job keeps falling over.
An analytics team needs sub-second queries on event data and Postgres has hit a wall.
A CDC pipeline is dropping or duplicating events and the team doesn't trust the warehouse anymore.
A new product needs a streaming architecture from scratch and the team has only batch experience.

Whiteboard session plus a working prototype built against a sample of your data. Deliverable: a written architecture document and infrastructure-as-code (IaC) for the prototype.

Ship the pipeline end-to-end with your team. Includes pairing, code review, on-call shadowing.

Get a stuck Flink / Kafka system back to a healthy state and document why it broke. Scoped tightly.