AI

3:50 PM - 4:10 PM PST, October 29

Truly Scalable Operational Data Layers for Data Pipelines

As streaming systems scale to match the ever-increasing volumes of data in applications, how should data engineers think about the scale properties of the sources and destinations of streaming data?
In this session, we’ll discuss scaling from the perspective of an operational data layer (both a destination and a source), or – more tangibly – the global source of truth for data aggregated from all internal sources. Engineers use this layer to park data for additional processing or operational business intelligence. Almost every large business has one or is building one, and they may not even know it.
The purpose of this talk is to precisely define this layer and discuss how to think about its scalability as it serves workloads from so many different places. We’ll deconstruct the idea of “scalability” into fine-grained parts that apply in nearly all cases. By the end of this talk, you’ll know what it means to have a truly scalable operational data layer.

Speaker

Matthew Penaroza

Senior Solution Engineer, PingCAP