AI

4:30 PM - 5:10 PM PST, October 29

Who said data lakes need to be slow?

Originally developed as Flink Table Store, Apache Paimon is designed to bring the speed of stream processing directly to your data lake. This session will explore Paimon’s architecture, built for real-time analytics on large datasets without sacrificing performance.


Key Questions We’ll Address:
How do you manage change data?
With streaming data, you need fast, reliable Change Data Capture (CDC) support. Learn how Paimon uses row-level tracking (RowKind) and incremental snapshots to keep data fresh and queries quick.
Can you handle complex, large-scale data with real-time updates?
Paimon’s LSM trees, bucketed sharding, and smart compaction modes bring efficient upserts and change tracking to the data lake, which suits environments with high data turnover.
How do you handle what needs to be fast vs. what needs to be at scale?
Paimon lets you choose between “min-delta” and “full-delta” compactions for flexible storage and query optimization based on workload needs, offering near-real-time insights without the usual data lake slowdowns.
How do you balance your data lake and streaming requirements?
Seamlessly integrated with streaming engines like Apache Pulsar and Flink, Paimon transforms your data lake into a high-performance storage layer that’s both scalable and real-time ready.
Join us to explore how Paimon’s architecture helps solve the complexity of fast, scalable data management and provides a high-performance layer for modern, streaming-compatible data lakes.

Speaker

Ben Gamble

Field CTO, Ververica