Architecting and Orchestrating Hybrid Cloud AI/ML Data Pipelines

Mike Oglesby

Technical Marketing Engineer, NetApp

Decentralized architectures pose a particular challenge for AI/ML applications because these applications rely on massive amounts of data. This data must be collected, aggregated, transformed, and then maintained for traceability and compliance purposes. Doing so requires an architecture that supports the seamless movement of data from edge devices in remote data centers to a core data center and/or to the cloud, between a core data center and the cloud, and even between cloud platforms. To implement such an architecture at scale, it must be possible to orchestrate this data movement using a variety of open interfaces and tools. Likewise, applications running in different locations need to be able to access this data using standard interfaces and protocols. Additionally, AI/ML models and the datasets used to train them must be versioned for compliance reasons. In this session, we will discuss the products and solutions that NetApp is developing to enable customers to build data pipelines within this complex paradigm.