COMPASS TALK by Patrick Stüdi (IBM Research): Data processing at the speed of 100 Gbps using Apache Crail (Incubating)

20.09.2018 10:00
Europe/Zurich

CAB E 72

Thursday, 20 September 2018, 10:00-11:00 in CAB E 72

Speaker: Patrick Stüdi (IBM Research)

Title: Data processing at the speed of 100 Gbps using Apache Crail (Incubating)

Abstract:

Once the staple of HPC clusters, high-performance network and storage devices are now everywhere. For a fraction of the cost, one can rent 40/100 Gbps RDMA networks and high-end NVMe flash devices supporting millions of IOPS, tens of GB/s of bandwidth and latencies below 100 microseconds. But how does one leverage the speed of high-throughput, low-latency I/O hardware in distributed data processing systems like Spark, Flink or TensorFlow?

In this talk, I will introduce Apache Crail (Incubating), a fast, distributed data store designed specifically for high-performance network and storage devices. Crail's focus is on ephemeral data, such as shuffle data or temporary data sets in complex job pipelines, with the goal of enabling data sharing at the speed of the hardware in an accessible way. From a user perspective, Crail offers a hierarchical storage namespace implemented over distributed or disaggregated DRAM and Flash. At its core, Crail supports multiple storage back ends (DRAM, NVMe Flash, and 3D XPoint) and networking protocols (RDMA and TCP/sockets). In the talk, I will discuss the design of Crail, its use cases and performance results on a 100 Gbps cluster.

Bio:

Patrick is a member of the research staff at IBM Research Zurich. His research interests are in distributed systems, networking and operating systems. Patrick graduated with a PhD from ETH Zurich in 2008 and spent two years (2008-2010) as a postdoc at Microsoft Research Silicon Valley. The general theme of his work is to explore how modern networking and storage hardware can be exploited in distributed systems. Patrick is the creator of several open source projects, such as DiSNI (RDMA for Java) and DaRPC (low-latency RPC), and a co-founder of Apache Crail (Incubating).
