The paper "PipeArch: Generic and Context-Switch Capable Data Processing on FPGAs" by Kaan Kara and Gustavo Alonso has been accepted for publication in the journal ACM Transactions on Reconfigurable Technology and Systems (TRETS).
The following paper has been accepted at the Fragile Earth Workshop at KDD 2020:
"One Forest: Towards a Global Species Dataset by Fusing Remote Sensing and Citizen Science Data with Graph Neural Networks" by Kenza Amara, David Dao, Björn Lütjens (MIT), Dava Newman (MIT), Tom Crowther (ETH), Ce Zhang.
The paper "Compressive Sensing Using Iterative Hard Thresholding with Low Precision Data Representation: Theory and Applications" by Nezihe Merve Gürel, Kaan Kara, Alen Stojanov, Tyler Smith, Thomas Lemmin, Dan Alistarh, Markus Püschel and Ce Zhang has been accepted for publication in IEEE Transactions on Signal Processing Journal.
A blog post entitled "RumbleML, a declarative machine learning framework" by Can Berker Cikis has been posted on our Systems Group Blog.
The following monograph will be published in Foundations and Trends® in Databases: Vol. 9: No. 1:
"Distributed Learning Systems with First-Order Methods" by Ji Liu (University of Rochester and Kuaishou Inc.) and Ce Zhang (ETH Zurich).
Abstract
Scalable and efficient distributed learning is one of the main driving forces behind the recent rapid advancement of machine learning and artificial intelligence. One prominent feature of this topic is that recent progress has been made by researchers in two communities: (1) the system community such as database, data management, and distributed systems, and (2) the machine learning and mathematical optimization community. The interaction and knowledge sharing between these two communities has led to the rapid development of new distributed learning systems and theory. In this monograph, we hope to provide a brief introduction of some distributed learning techniques that have recently been developed, namely lossy communication compression (e.g., quantization and sparsification), asynchronous communication, and decentralized communication. One special focus in this monograph is on making sure that it can be easily understood by researchers in both communities — on the system side, we rely on a simplified system model hiding many system details that are not necessary for the intuition behind the system speedups; while, on the theory side, we rely on minimal assumptions and significantly simplify the proof of some recent work to achieve comparable results.
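As a rough illustration of the lossy communication compression mentioned in the abstract, the following minimal Python sketch shows one common form of sparsification (keeping only the largest-magnitude gradient entries) and one common form of quantization (unbiased stochastic rounding onto a uniform grid). The function names, the choice of top-k selection, and the uniform grid are assumptions made for this example; they are not taken from the monograph.

```python
import random


def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries of grad and zero out the rest."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def stochastic_quantize(grad, levels, rng):
    """Round each entry onto a uniform grid with `levels` intervals spanning
    [-scale, scale], rounding up or down at random so that the expected
    value of each quantized entry equals the original entry."""
    scale = max(abs(g) for g in grad)
    if scale == 0.0:
        return list(grad)
    step = 2.0 * scale / levels
    out = []
    for g in grad:
        pos = (g + scale) / step           # position on the grid, in [0, levels]
        low = min(int(pos), levels - 1)    # index of the grid point below
        frac = pos - low                   # distance to that point, in [0, 1]
        idx = low + (1 if rng.random() < frac else 0)
        out.append(-scale + idx * step)
    return out


grad = [0.9, -0.05, 0.3, -1.2, 0.01]
sparse = top_k_sparsify(grad, 2)           # [0.9, 0.0, 0.0, -1.2, 0.0]
quantized = stochastic_quantize(grad, 4, random.Random(0))
```

In a distributed setting, a worker would send the sparse or quantized vector instead of the full-precision gradient; the rounding is randomized so that the compression error averages out over many steps.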
The following demo has been accepted at the 46th International Conference on Very Large Data Bases (VLDB 2020):
"Ease.ml/snoopy in Action: Towards Automatic Feasibility Analysis for Machine Learning Application Development" by Cedric Renggli (ETH Zurich), Luka Rimanic (ETH Zurich), Luka Kolar (ETH Zurich), Wentao Wu (Microsoft Research), Ce Zhang (ETH).
Abstract:
We demonstrate ease.ml/snoopy, a data analytics system that performs feasibility analysis for machine learning (ML) applications before they are developed. Given a performance target of an ML application (e.g., accuracy above 0.95), ease.ml/snoopy provides a decisive answer to ML developers regarding whether the target is achievable or not. We formulate the feasibility analysis problem as an instance of Bayes error estimation. That is, for a data distribution on which the ML application should be performed, ease.ml/snoopy provides an estimate of the Bayes error -- the minimum error rate that can be achieved by any classifier. Estimating the Bayes error is well known to be a notoriously hard task. In ease.ml/snoopy we explore and employ estimators based on the combination of (1) nearest neighbor (NN) classifiers and (2) pre-trained feature transformations. To the best of our knowledge, this is the first work on Bayes error estimation that combines (1) and (2). In today's cost-driven business world, feasibility of an ML project is an ideal piece of information for ML application developers -- ease.ml/snoopy plays the role of a reliable "consultant".
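To make the link between nearest neighbor classifiers and the Bayes error concrete, here is a minimal Python sketch based on the classical Cover-Hart result: asymptotically, the 1-NN error R_NN and the Bayes error R* for C classes satisfy R* <= R_NN <= R* (2 - C/(C-1) R*), which can be inverted into a lower bound on R*. This is not the estimator used in ease.ml/snoopy; the function names, the leave-one-out protocol, and the raw Euclidean distance (no pre-trained feature transformation) are assumptions made for illustration, and on a finite sample the result is only an estimate of the asymptotic bound.

```python
import math


def loo_1nn_error(points, labels):
    """Leave-one-out error rate of a 1-nearest-neighbor classifier
    using Euclidean distance on the raw feature vectors."""
    errors = 0
    for i, p in enumerate(points):
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        errors += labels[nearest] != labels[i]
    return errors / len(points)


def cover_hart_lower_bound(nn_error, num_classes):
    """Lower bound on the Bayes error implied by the (asymptotic) 1-NN
    error, obtained by inverting the Cover-Hart inequality."""
    c = num_classes
    inside = max(0.0, 1.0 - c / (c - 1) * nn_error)
    return (c - 1) / c * (1.0 - math.sqrt(inside))


def target_looks_feasible(target_accuracy, nn_error, num_classes):
    """Hypothetical feasibility check: a target is ruled out when the error
    it allows is below the estimated lower bound on the Bayes error."""
    return (1.0 - target_accuracy) >= cover_hart_lower_bound(nn_error, num_classes)


# A binary task with a 1-NN error of 0.18 gives a lower bound of about 0.1,
# so a target accuracy of 0.95 (allowed error 0.05) would be flagged.
bound = cover_hart_lower_bound(0.18, 2)
feasible = target_looks_feasible(0.95, 0.18, 2)   # False
```

The check is one-sided: a target that passes is not guaranteed to be achievable, it is merely not ruled out by this bound.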
The paper entitled "Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript" by Fangcheng Fu (Peking University), Yuzheng Hu (Peking University), Yihan He (Peking University), Jiawei Jiang (ETH Zurich), Yingxia Shao (BUPT), Ce Zhang (ETH), Bin Cui (Peking University) has been accepted at ICML 2020.
The following papers have been accepted for presentation at the International Conference on Field-Programmable Logic and Applications (FPL) which will take place in August/September 2020:
"HyperLogLog Sketch Acceleration on FPGA" by Amit Kulkarni, Monica Chiosa, Thomas Preusser, Kaan Kara, David Sidler, Gustavo Alonso
"High Bandwidth Memory on FPGAs: A Data Analytics Perspective" by Kaan Kara, Gustavo Alonso, Christoph Hagleitner, Dionysios Diamantopoulos, Dimitris Syrivelis
"Using DSP Slices as Content-Addressable Update Queues" by Thomas Preusser, Monica Chiosa, Alexander Weiss, Gustavo Alonso
The following manuscript has been accepted for publication in the Journal of Chemical Information and Modeling:
"RosENet: Improving Binding Affinity Prediction by Leveraging Molecular Mechanics Energies with an Ensemble of 3D Convolutional Neural Networks" by Hussein Hassan-Harrirou, Ce Zhang, and Thomas Lemmin.
The following paper has been accepted at SIGKDD 2020:
"Building Continuous Integration Services for Machine Learning" by Bojan Karlaš, Matteo Interlandi, Cedric Renggli, Wentao Wu, Ce Zhang, Deepak Mukunthu Iyappan Babu, Jordan Edwards, Chris Lauren, Andy Xu and Markus Weimer.
The paper "Neural dynamics of sentiment processing during naturalistic sentence reading" by Christian Pfeiffer (UZH), Nora Hollenstein (ETH), Ce Zhang (ETH), Nicolas Langer (UZH) has been accepted for publication in the Elsevier Journal - NeuroImage.
Gustavo Alonso will lead a Xilinx Adaptive Compute Cluster (XACC) at ETH Zurich. The cluster will support novel research in high performance computing. More in D-INFK Spotlight...
The paper “TrueBranch: Metric Learning-based Verification of Forest Conservation Projects” by Simona Santamaria (ETH Zurich); David Dao (ETH Zurich); Björn Lütjens (MIT); Ce Zhang (ETH Zurich) has won the Best Proposal Paper Award at the ICLR Workshop on Tackling Climate Change with ML.
The paper addresses the huge problem of deforestation, which accounts for 15% of all global greenhouse gas emissions. This problem is not new, and the UN and companies have started mitigation programs that pay landowners if they can prove that they are conserving forests. However, proving the conservation of forests is currently done by on-ground inspections, which are so expensive that many non-profits and indigenous groups are excluded from these projects. In our paper, we therefore propose drone-based and AI-verified inventories of forests to create a cheap and trustworthy monitoring system. If deployed, this system could incentivize many more landowners to conserve forests.
The following paper has been accepted at EuroSys 2020:
"StRoM: Smart Remote Memory" by David Sidler (Microsoft, USA & ETH Zurich), Zeke Wang (Zhejiang University, China & ETH Zurich), Monica Chiosa, Amit Kulkarni, Gustavo Alonso (ETH Zurich)
The following paper has been accepted at the 10th Workshop on Systems for Post-Moore Architectures (SPMA 2020):
"Serverless Clusters: The Missing Piece for Interactive Batch Applications?" by Ingo Müller (ETHZ), Rodrigo Bruno (ETHZ), Ana Klimovic (Google Inc.), John Wilkes (Google Inc.), Eric Sedlar (Oracle Labs), Gustavo Alonso (ETHZ).
The paper "Lambada: Interactive Data Analytics on Cold Data using Serverless Cloud Infrastructure" by Ingo Müller (ETHZ), Renato Marroquín (ETHZ), Gustavo Alonso (ETHZ) has been accepted at SIGMOD 2020.
The paper "Untangling Header Bidding Lore" co-authored by Debopam Bhattacherjee and Ankit Singla received the PAM 2020 Best Dataset Award:
"Untangling Header Bidding Lore" by Waqar Aqeel (Duke); Debopam Bhattacherjee (ETH Zürich); Balakrishnan Chandrasekaran (MPI); Brighten Godfrey (UIUC and Veriflow); Gregory Laughlin (Yale); Bruce Maggs (Duke); Ankit Singla (ETH Zürich).
Gustavo Alonso gave a talk on "How the cloud is driving software and hardware specialization - a use case from the airline industry" at the EE Distinguished Speakers Seminar Series of IEL-EPFL.
The following paper has been accepted for publication in IEEE Transactions on Computers in 2020:
"A Power- and Performance-Aware Software Framework for Control System Applications" by M. Giardino (ETH), E. Klawitter (Georgia Institute of Technology), B. Ferri (Georgia Institute of Technology), and A. Ferri (Georgia Institute of Technology).
Karolis Kusas joined the Systems Group as our new PhD Student. Karolis completed his MSc at The University of Oxford.
The article "Perfect Prediction in Normal Form: Superrational Thinking Extended to Non-Symmetric Games" by Ghislain Fourny has been accepted for publication in Journal of Mathematical Psychology.
The paper "Understanding video streaming algorithms in the wild" by Melissa Licciardello, Maximilian Grüner and Ankit Singla has been accepted at The Passive and Active Measurement (PAM) Conference in Oregon, USA, March 30-31, 2020.
The following two papers have been accepted at the Climate Change AI Workshop at the Eighth International Conference on Learning Representations (ICLR 2020) in Addis Ababa, Ethiopia:
TrueBranch: Robust Deep Learning-based Verification of Forest Conservation Projects by Simona Santamaria (ETH), David Dao (ETH), Björn Lütjens (MIT), Ce Zhang (ETH).
Xingu: Explaining Critical Geospatial Predictions in Weak Supervision for Climate Finance by David Dao (ETH), Johannes Rausch (ETH), Iveta Rott (ETH), Ce Zhang (ETH)
The paper "Making Search Engines Faster by Lowering the Cost of Querying Business Rules Through FPGAs" authored by Fabio Maschi, Muhsen Owaida, Gustavo Alonso, Matteo Casalino (Amadeus), and Anthony Hock-Koon (Amadeus) has been accepted at SIGMOD 2020.
Merve Gürel presented the following paper at the Deep Learning on Graphs Workshop at the AAAI Conference on Artificial Intelligence 2020:
"An Anatomy of Graph Neural Networks Going Deep via the Lens of Mutual Information: Exponential Decay vs. Full Preservation" by Nezihe Merve Gürel (ETH Zurich), Hansheng Ren (Microsoft Research), Yujing Wang (Microsoft Research), Hui Xue (Microsoft Research), Yaming Yang (Microsoft Research) and Ce Zhang (ETH Zurich).
The following paper has been accepted at the 12th International Conference on Language Resources and Evaluation (LREC):
"ZuCo 2.0: A Dataset of Physiological Recordings During Natural Reading and Annotation" by Nora Hollenstein (ETH Zürich), Marius Troendle (University of Zurich), Ce Zhang (ETH Zürich), Nicolas Langer (University of Zurich).
Zaheer Chothia defended his PhD Dissertation "Explaining, Measuring and Predicting Effects in Layered Systems".
Shuai Zhang joined the Systems Group as our new PostDoc. Shuai comes to us from the University of New South Wales, Sydney, Australia.
Reto Achermann defended his PhD Dissertation "On Memory Addressing".
Kaan Kara defended his PhD Dissertation "Specialized Hardware Solutions for In-Database Analytics and Machine Learning".
Anwar Hithnawi joined the Systems Group as our new PostDoc. Anwar comes to us from UC Berkeley, CA, USA.
David Dao gave the following talk at the Applied Machine Learning Days at EPFL:
Scaling Natural Climate Solutions with Machine Learning
Abstract:
Nature-based Solutions (NbS) such as forests have the potential to deliver up to a third of the emission reduction required to achieve the Paris Agreement and limit climate change to safe levels. Yet, deforestation rates in Brazil are reaching their highest in decades and it is estimated that humanity has already cut down half of the world's forests. To date, NbS receive less than 3% of available climate funding. In this talk we introduce Komorebi, a joint research project between ETH Zurich and the Government of Chile, that aims to protect, restore and fund NbS using machine learning systems. We leverage unsupervised learning and data programming on satellite and drone imagery to improve deforestation warning alerts and increase the efficiency of results-based payments for ecosystem services.
Thomas Preusser joined the Systems Group as our new PostDoc.
Vasia Kalavri joined Boston University as an Assistant Professor. John Liagouris joined Boston University as an Adjunct Assistant Professor. John also joined the Hariri Institute for Computing as a research scientist.
Zeke Wang took up an Assistant Professor position at Zhejiang University, Hangzhou, China.
Ingo Müller gave a talk entitled "Lambada: Interactive Data Analytics on Cold Data using Serverless Cloud Infrastructure" at Snowflake in Berlin. Technical Report
Renato Marroquin defended his PhD Dissertation "On the Impact of Separated Storage and Compute for Data Processing".
The paper “Network Topology Design at 27,000 km/hour” by Debopam Bhattacherjee and Ankit Singla (ACM CoNEXT 2019) has been awarded the Applied Networking Research Prize by IETF/IRTF for 2020.
The paper "Non-Invasive Silent Speech Recognition in Multiple Sclerosis with Dysphonia" by Arnav Kapur, Utkarsh Sarawgi, Eric Wadkins, Matthew Wu, Nora Hollenstein, Pattie Maes has been presented at the Machine Learning for Health Workshop at NeurIPS.
Ioana Giurgiu, Oriana Riva, Dejan Juric, Ivan Krivulev, and Gustavo Alonso have been awarded the Test-of-Time Award of the ACM/IFIP/Usenix International Middleware Conference 2019 for their paper “Calling the Cloud: Enabling mobile phones as interfaces to cloud applications”, published in Middleware 2009. Ioana Giurgiu, now a researcher at IBM Rüschlikon, gave a talk on the topic at Middleware 2019 (slides).
David Dao presented his research on how to predict deforestation using machine learning at the UN climate change conference in Madrid.
Swiss media featured the research in "St Gallen Tagblatt" (in German)
The following paper has been accepted at The First International Workshop on Deep Learning on Graphs: Methodologies and Applications (DLGMA’20), which will be held in conjunction with The Thirty-Fourth AAAI Conference on Artificial Intelligence 2020 in New York, NY, USA, February 7-12, 2020:
"An Anatomy of Graph Neural Networks Going Deep via the Lens of Mutual Information: Exponential Decay vs. Full Preservation" by Nezihe Merve Gürel (ETH Zurich), Hansheng Ren (Microsoft Research), Yujing Wang (Microsoft Research), Hui Xue (Microsoft Research), Yaming Yang (Microsoft Research) and Ce Zhang (ETH Zurich).
Ingo Müller gave an invited talk entitled "Lambada: Interactive Data Analytics on Cold Data using Serverless Cloud Infrastructure" at Oracle Labs, Zürich. Technical Report
Ghislain Fourny gave a talk on Non-Nashian Game Theory at the Chair of Learning Sciences and Higher Education (D-GESS).
The following two papers have been accepted at ACM FPGA'20, February 23-25, 2020, Embassy Suites by Hilton Monterey Bay Seaside, Seaside, California, USA:
"Enabling Any-Precision K-Means on FPGAs" by Zhenhao He, Zeke Wang and Gustavo Alonso
"Boyi: A Systematic Framework for Automatically Deciding the Best Execution Model for OpenCL Applications on FPGAs" by Jiantong Jiang, Xue Liu, Juan Gómez-Luna, Nan Guan, Qingxu Deng, Wei Zhang, Onur Mutlu, Zeke Wang
Fabio Maschi gave a talk entitled "An NFA-Based Approach for Accelerating Business Rule Inference on Reconfigurable Hardware" at Amadeus in Sophia Antipolis, France.
Gustavo Alonso gave a series of talks presenting different work from the group at several companies in the Bay Area:
Oracle: Exploring hardware acceleration and RDMA for data processing
Xilinx: Using FPGAs on cloud and datacenter search engines - a use case from the airline industry
EBay: Making data centers more efficient through hardware-software co-design
Timothy Roscoe gave an invited talk entitled "A fork() in the road" at Bell Labs at the celebration of the 50th Anniversary of Unix.
Read more in D-INFK spotlight story...
Nora Hossle joined the Systems Group as our new PhD student. Nora completed her MSc at ETH Zürich.
The following two papers have been accepted at the Thirty-Fourth AAAI Conference on Artificial Intelligence, February 7-12, 2020, New York, NY, USA:
"TextNAS: A Neural Architecture Search Space tailored for Text Representation" by Yujing Wang (MSRA), Yaming Yang (MSRA), Yiren Chen (Peking University), Jing Bai (Microsoft), Ce Zhang (ETH), Guinan Su (University of Science and Technology of China), Xiaoyu Kou (Peking University), Yunhai Tong (Peking University), Mao Yang (MSRA), Lidong Zhou (MSRA).
"Efficient Automatic CASH via Rising Bandits" by Yang Li (Peking University/ Systems Group visiting student), Jiawei Jiang (ETH Zurich), Jinyang Gao (Alibaba Group), Yingxia Shao (BUPT), Ce Zhang (ETH), Bin Cui (Peking University).
Dan-Ovidiu Graur joined the Systems Group as our new PhD student. Dan completed his MSc at TU Delft, The Netherlands.
Gustavo Alonso gave a talk on the ongoing research on Enzian and data processing on modern hardware at the 2019 High Performance Transaction Processing Workshop (HPTS) in Asilomar, California.
Kaan Kara gave an invited talk entitled "Specialized Hardware Solutions for in-Database Analytics Enhanced by OpenCAPI+HBM" at the OpenPOWER Summit Europe.
Abstract:
Driven by the increasing complexity of data processing tasks, specialized hardware solutions are emerging as a way to increase performance and efficiency for both relational data processing and advanced analytics such as machine learning. Recently increased usage of FPGAs as a target deployment platform in the datacenter opens up new opportunities in specializing hardware and rethinking the system architecture.
In the first part of this talk, we present an overview of our previous efforts in this domain, focusing on doppioDB: A branch of MonetDB with FPGA-based data processing capabilities. We introduce the hardware-software layer built to transparently use FPGA resources in a multi-tenant system, followed by an overview of already implemented FPGA-based operators, their advantages, and limitations. In the second part, we shift our focus to recently available OpenCAPI-attached FPGA platforms with high-bandwidth-memory (HBM). We discuss some preliminary results on how OpenCAPI and HBM can be utilized to improve FPGA-based processing within doppioDB.
The following paper co-authored by Shaoduo Gan has been accepted at the Workshop on Systems for ML at NeurIPS 2019:
"Distributed Asynchronous Domain Adaptation: Towards Making Domain Adaptation More Practical in Real-World Systems" by Shaoduo Gan, Akhil Mathur, Anton Isopoussu, Nadia Berthouze, Nicholas Lane, Fahim Kawsar.
Dimitrios Koutsoukos joined the Systems Group as our new PhD student. Dimitrios completed his MSc at ETH.
The paper "Causal limit on quantum communication" by Robert Pisarczyk, Zhikuan Zhao, Yingkai Ouyang, Vlatko Vedral, and Joseph F. Fitzsimons has been published in Physical Review Letters.
Popular summary by Zhikuan Zhao:
We exploit the causal structure of quantum mechanics to study the rates of communication over noisy quantum channels. As a technical framework, we introduce a measure for quantum causality that quantifies temporal correlations between ends of a communication channel. We then prove it to be a general upper bound on the capacity of the channel to transmit quantum information.
Our result illuminates an example of harnessing quantum causality in practice. Just as the discovery of quantum entanglement-based protocols over the decades has furnished our comprehension of locality and realism, this revelation in the operational meaning of causal structure on information processing tasks will inspire insight into the nature of time in quantum mechanics.
From an application point of view, the newly discovered general bound for quantum channel capacity has the advantage of being efficiently computable compared to many previously known approaches. As the need for efficient estimates of information transmission over more prevalent and complex networks of quantum channels is ever-growing, we believe our result will provide invaluable additional toolkits to study realistic noisy quantum channels and hence contribute to the design and future realization of a large-scale quantum internet.
The paper "GeoLabels: Towards Efficient Ecosystem Monitoring using Data Programming on Geospatial Information" by David Dao, Johannes Rausch, and Ce Zhang has been accepted at the climate change workshop at NeurIPS.
Luka Rimanic joined the Systems Group as our new PostDoc. Luka completed his PhD at University of Bristol, UK.
The following paper co-authored by Zhipeng Zhang (our former visiting student) and Ce Zhang has been accepted at ICDE 2020, April 20-24, 2020 in Dallas, Texas, USA.
ColumnSGD: A Column-oriented Framework for Distributed Stochastic Gradient Descent by Zhipeng Zhang (Peking University), Wentao Wu (Microsoft Research), Jiawei Jiang (Tencent), Lele Yu (Tencent Inc.), Bin Cui (Peking University), and Ce Zhang (ETH).
The following paper has been accepted at CoNEXT 2019 (15th International Conference on emerging Networking EXperiments and Technologies), Orlando, Florida, U.S., December 9-12, 2019:
"Network topology design at 27,000 km/hour" by Debopam Bhattacherjee (ETH Zürich) and Ankit Singla (ETH Zürich).
The following paper has been accepted at CoNLL 2019 (Conference on Computational Natural Language Learning), Hong Kong, November 3-4, 2019:
"CogniVal: A Framework for Cognitive Word Embedding Evaluation" by Nora Hollenstein, Antonio de la Torre, Nicolas Langer and Ce Zhang.
The following papers and two demos have been presented at VLDB 2019 in Los Angeles, California. In addition, a paper has been presented at the BIRTE workshop (co-located with VLDB).
Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning by Zeke Wang (ETH Zurich), Kaan Kara (ETH Zurich), Hantian Zhang (ETH Zurich), Gustavo Alonso (ETH Zurich), Onur Mutlu (ETH Zurich), and Ce Zhang (ETH Zurich)
ColumnML: Column-Store Machine Learning with On-The-Fly Data Transformation by Kaan Kara (ETH Zurich), Ken Eguro (Microsoft), Ce Zhang (ETH Zurich), and Gustavo Alonso (ETH Zurich)
Megaphone: Latency-conscious state migration for distributed streaming dataflows by Moritz Hoffmann (ETH Zurich), Andrea Lattuada (ETH Zurich), Frank McSherry (ETH Zurich), Vasiliki Kalavri (ETH Zurich), John Liagouris (ETH Zurich), and Timothy Roscoe (ETH Zurich)
Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms by Ruoxi Jia (UC Berkeley), David Dao (ETH Zurich), Boxin Wang (Zhejiang University), Frances Ann Hubis (ETH Zurich), Nezihe Merve Gürel (ETH Zurich), Bo Li (University of Illinois at Urbana–Champaign), Ce Zhang (ETH Zurich), Costas J. Spanos (UC Berkeley), and Dawn Song (UC Berkeley)
doppioDB 2.0: Hardware Techniques for Improved Integration of Machine Learning into Databases by Kaan Kara (ETH Zurich), Zeke Wang (ETH Zurich), Ce Zhang (ETH Zurich), and Gustavo Alonso (ETH Zurich)
Ease.ml/ci and Ease.ml/meter in Action: Towards Data Management for Statistical Generalization by Cedric Renggli (ETH Zurich), Frances Ann Hubis (ETH Zurich), Bojan Karlaš (ETH Zurich), Kevin Schawinski (Modulos AG), Wentao Wu (Microsoft Research, Redmond), and Ce Zhang (ETH Zurich)
Workshop paper (BIRTE):
FASTER State Management for Timely Dataflow by Matthew J Brookes, Vasiliki Kalavri, John Liagouris (ETH Zurich)
The article "Speeding up Percolator" by John T. Halloran, Hantian Zhang, Kaan Kara, Cedric Renggli, Matthew The, Ce Zhang, David M. Rocke, Lukas Käll and William Stafford Noble has been accepted for publication in the Journal of Proteome Research.
Abstract:
The processing of peptide tandem mass spectrometry data involves matching observed spectra against a sequence database. The ranking and calibration of these peptide-spectrum matches can be improved substantially by using a machine learning post-processor. Here, we describe our efforts to speed up one widely used post-processor, Percolator. The improved software is dramatically faster than the previous version of Percolator, even when using relatively few processors. We tested the new version of Percolator on a data set containing over 215 million spectra and recorded an overall reduction to 23% of the running-time as compared to the unoptimized code. We also show that the memory footprint required by these speedups is modest relative to that of the original version of Percolator.
Ghislain Fourny wrote a blog post entitled "Building an inverted index on a large text collection with JSONiq" on the Systems Group Blog.
Michael Giardino joined the Systems Group as our new PostDoc. Michael completed his PhD at Georgia Institute of Technology, Atlanta, GA, USA.
The paper "Strong consistency is not hard to get: Two-Phase Locking and Two-Phase Commit on Thousands of Cores" by Claude Barthels, Ingo Müller, Konstantin Taranov, Torsten Hoefler, and Gustavo Alonso exploring the implementation of 2PL and 2PC at large scales using MPI has been accepted to VLDB 2020.
The paper “Lowering the Latency of Data Processing Pipelines Through FPGA based Hardware Acceleration” authored by Muhsen Owaida, Gustavo Alonso, Laura Fogliarini, Anthony Hock-Koon, and Pierre-Etienne Melet, describing the joint project between the Systems Group and Amadeus to use FPGAs to accelerate inference over decision tree ensembles, has been accepted to VLDB 2020.
The following paper co-authored by Cédric Renggli has been accepted at the International Conference for High Performance Computing, Networking, Storage and Analysis (SC’19) this year in Denver.
"SparCML: High-Performance Sparse Communication for Machine Learning" by Cedric Renggli, Saleh Ashkboos, Mehdi Aghagolzadeh, Dan Alistarh and Torsten Hoefler
The following paper co-authored by John Liagouris has been accepted for publication at SOSP’19.
"Lineage Stash: Fault Tolerance Off the Critical Path" by Stephanie Wang (UC Berkeley), John Liagouris (ETH Zurich), Robert Nishihara (UC Berkeley), Philipp Moritz (UC Berkeley), Ujval Misra (UC Berkeley, Dropbox), Alexey Tumanov (UC Berkeley), Ion Stoica (UC Berkeley)
Ce Zhang and Gustavo Alonso have presented work done at the Systems Group on machine learning and hardware infrastructures for machine learning at the 20th Microsoft Research Faculty Summit in Redmond, WA, USA.
The following demo paper has been accepted for publication at BIRTE'19 (co-located with VLDB'19):
"FASTER State Management for Timely Dataflow", Matthew J. Brookes, Vasiliki Kalavri, John Liagouris.
Vasiliki Kalavri and her research highlights have been presented in the Department of Computer Science Spotlight article "Building a stream processor for the future".
Another paper co-authored by Zhikuan Zhao has been published in Physical Review A:
"Quantum algorithms for training Gaussian processes" by Zhikuan Zhao, Jack K. Fitzsimons, Michael A. Osborne, Stephen J. Roberts, and Joseph F. Fitzsimons.
The following paper co-authored by David Dao has been accepted at Poly'19 Workshop at VLDB 2019.
"Data Capsule: A New Paradigm for Automatic Compliance with Data Privacy Regulations" by Lun Wang (UC Berkeley), Joseph P. Near (University of Vermont), Neel Somani (UC Berkeley), Peng Gao (UC Berkeley), Andrew Low (UC Berkeley), David Dao (ETH Zurich), and Dawn Song (UC Berkeley).
Vojislav Dukic presented the following paper at HotCloud 2019 in Washington, US:
"Happiness index: Right-sizing the cloud’s tenant-provider interface" by Vojislav Dukic and Ankit Singla.
Abstract:
Cloud providers and their tenants have a mutual interest in identifying optimal configurations in which to run tenant jobs, i.e., ones that achieve tenants' performance goals at minimum cost; or ones that maximize performance within a specified budget. However, different tenants may have different performance goals that are opaque to the provider. A consequence of this opacity is that providers today typically offer fixed bundles of cloud resources, which tenants must themselves explore and choose from. This is burdensome for tenants and can lead to choices that are sub-optimal for both parties.
We thus explore a simple, minimal interface, which lets tenants communicate their happiness with cloud infrastructure to the provider, and enables the provider to explore resource configurations that maximize this happiness. Our early results indicate that this interface could strike a good balance between enabling efficient discovery of application resource needs and the complexity of communicating a full description of tenant utility from different configurations to the provider.
Frank McSherry has been awarded the Test-Of-Time Award at SIGMOD 2019 for his paper "Privacy integrated queries: an extensible platform for privacy-preserving data analysis" published 10 years earlier at SIGMOD 2009.
Catalina Alvarez joined the Systems Group as our new PhD student. Catalina completed her Master's Degree at the University of Chile.
The following paper co-authored by John Liagouris has been accepted for publication at the VLDB Journal:
"SRX: Efficient Management of Spatial RDF Data" by Konstantinos Theocharidis (University of Peloponnese/IMSI, RC 'Athena'), John Liagouris (ETH Zürich), Nikos Mamoulis (University of Ioannina), Panagiotis Bouros (Johannes Gutenberg University Mainz), Manolis Terrovitis (IMSI, RC 'Athena')
Abstract:
We present a general encoding scheme for the efficient management of spatial RDF data. The scheme approximates the geometries of the RDF entities inside their (integer) IDs and can be used, along with several operators and optimizations we introduce, to accelerate queries with spatial predicates and to re-encode entities dynamically in case of updates. We implement our ideas in SRX, a system built on top of the popular RDF-3X system. SRX extends RDF-3X with support for three types of spatial queries: range selections (e.g. find entities within a given polygon), spatial joins (e.g. find pairs of entities whose locations are close to each other), and spatial k nearest neighbors (e.g. find the three closest entities from a given location). We evaluate SRX on spatial queries and updates with real RDF data, and we also compare its performance with the latest versions of three popular RDF stores. The results show SRX’s superior performance over the competitors; compared to RDF-3X, SRX improves its performance for queries with spatial predicates while incurring little overhead during updates.
The following paper has been accepted at VLDB 2019 in Los Angeles, California, August 26th to August 30th, 2019.
"Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms" by Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nezihe Merve Gurel, Bo Li, Ce Zhang, Costas Spanos, Dawn Song.
Gustavo Alonso gave a keynote presentation "How Hardware Evolution is Driving Software Systems" at the 13th ACM International Conference on Distributed and Event-Based Systems, DEBS, in Darmstadt, Germany.
The paper "doppioDB 1.0: Machine Learning inside a Relational Engine" by Gustavo Alonso, Zsolt Istvan, Kaan Kara, Muhsen Owaida, and David Sidler has been accepted for publication at the IEEE Data Engineering Bulletin, June 2019, Vol. 42 No. 2.
The following paper has been accepted at ICML Climate Change Workshop at 36th International Conference on Machine Learning, Long Beach, California, 2019.
GainForest: Scaling Climate Finance for Forest Conservation using Interpretable Machine Learning on Satellite Imagery by David Dao, Catherine Cang, Clement Fung, Ming Zhang, Reuven Gonzales, Nick Beglinger, Ce Zhang.
The paper "Distributed Inference over Decision Tree Ensembles on Clusters of FPGAs" by Muhsen Owaida and Amit Kulkarni has been accepted for publication in the journal ACM Transactions on Reconfigurable Technology and Systems (TRETS). The paper is an extended version of a previous paper at FPL 2018, where it was selected as one of the best papers.
Kaan Kara obtained third place in the System Design Contest (FPGA category) of the 56th Design Automation Conference (DAC 2019). More than 100 international teams participated in the two categories (GPU and FPGA).
Ghislain Fourny wrote a blog post entiteld "Rumble, an engine to run JSONiq on top of Spark" on the Systems Group Blog.
The project "SnowBell: An FPGA Computing Platform for Machine Learning in Data Analytics" by Mohsen Ewaida has been awarded an ETH Pioneer Fellowship with financing from the ETH Foundation. The Fellowships are intended to help with the development of a highly innovative product or service to be exploited commercially.
The following paper co-authored by Zhikuan Zhao has been published in Physical Review A:
Quantum-assisted Gaussian process regression by Zhikuan Zhao, Jack K. Fitzsimons, and Joseph F. Fitzsimons.
The following paper has been published in Springer Quantum Machine Intelligence:
Bayesian deep learning on a quantum computer by Zhikuan Zhao, Alejandro Pozas-Kerstjens, Patrick Rebentrost, Peter Wittek.
David Sidler defended his PhD Dissertation "In-Network Data Processing using FPGAs".
Maximilian Grüner joined the Systems Group as our new PhD student. Maximilian completed his MSc at ETH.
John Liagouris gave the following talk at Microsoft Research Redmond, USA:
Reconfigurable Data Stream Processing
Abstract:
Next-generation streaming systems will not only be scalable and reliable, but also autonomous, flexible, and able to automatically re-configure running applications without downtime. Automatic re-configuration relies heavily on three aspects: (i) accurate profiling to identify computation bottlenecks at runtime, (ii) rigorous performance models to decide the new system configuration that meets the objectives, and (iii) efficient state migration mechanisms to move data around when necessary. After decades of systems research, the state-of-the-art solutions for all these three requirements are still problematic.
In this talk I will present our recent work at ETH Zurich on re-configurable stream processing. The first part of the talk will focus on SnailTrail (NSDI 18), a system for online critical path analysis of distributed streaming dataflows. SnailTrail uses the novel metric of critical participation to generate online performance summaries and provide immediate insights into specific parts of the dataflow computation. The second part of the talk will focus on DS2 (OSDI 18), an auto-scaling controller that leverages online performance metrics along with data flow dependencies to estimate the minimum amount of resources a dataflow needs in order to meet a target throughput. In the third and last part of the talk, I will briefly talk about Megaphone (VLDB 19), a latency-conscious state migration mechanism for streaming dataflows.
MIT Technology Review published an article entitled "How AI could save lives without spilling medical secrets" on David Dao's work on Kara, a collaboration with Berkeley and Stanford Medical School.
One more demo has been accepted at VLDB 2019, Los Angeles, CA, USA, August 26-30, 2019.
"Ease.ml/ci and Ease.ml/meter in Action: Towards Data Management for Statistical Generalization" by Cedric Renggli (ETH Zurich), Frances Ann Hubis (ETH Zurich), Bojan Karlaš (ETH Zurich), Kevin Schawinski (Modulos AG), Wentao Wu (Microsoft Research), Ce Zhang (ETH Zurich).
Jiawei Jiang joined the Systems Group as our new PostDoc. Jiawei completed his PhD at Peking University.
“doppioDB 2.0: Hardware Techniques for Improved Integration of Machine Learning into Databases” by Kaan Kara, Zeke Wang, Ce Zhang, and Gustavo Alonso has been accepted at VLDB 2019, Los Angeles, CA, USA, August 26-30, 2019.
Anastasiia Ruzhanskaia joined the Systems Group as our new PhD student. Anastasiia is coming to us from Moscow Institute of Physics and Technology.
The following paper has been accepted at the 11th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 19), in Renton, WA, USA, July 8, 2019.
"Happiness index: Right-sizing the Cloud’s Tenant-Provider Interface" by Vojislav Dukic and Ankit Singla.
Timothy Roscoe talks about a research computer, Enzian, and shares more about his research in the D-INFK spotlight article "Bridging the gap: a computer built for research".
Timothy Roscoe wrote a Systems Group blog post based on the HotOS paper "A fork() in the road" by Andrew Baumann, Jonathan Appavoo, Orran Krieger, Timothy Roscoe. The paper will be presented in May at HotOS-XVII in Bertinoro, Italy.
Moritz Hoffmann defended his PhD Dissertation “Managing and understanding distributed stream processing”.
Moritz will join Google Zürich in June 2019.
The following papers have been accepted at ICML 2019, Long Beach, CA, June 9 – 15, 2019.
"Distributed Learning over Unreliable Networks"
by Chen Yu, Hanlin Tang, Cedric Renggli, Simon Kassing, Ankit Singla, Dan Alistarh, Ce Zhang, Ji Liu.
"DL2: Training and Querying Neural Networks with Logic"
by Marc Fischer, Mislav Balunovic, Dana Drachsler-Cohen, Timon Gehr, Ce Zhang, Martin Vechev.
The following paper has been accepted at VLDB 2019, Los Angeles, CA, August 26-30, 2019.
"Megaphone: Latency-conscious state migration for distributed streaming dataflows" by Moritz Hoffmann, Andrea Lattuada, Frank McSherry, Vasiliki Kalavri, John Liagouris, Timothy Roscoe.
Zhikuan ZHAO joined the Systems Group as our new PostDoc. Zhikuan (AKA Jansen) completed his PhD at Singapore University.
Timothy Roscoe gave a keynote address entitled "Building Enzian, a research computer" at the 9th Workshop Systems for Future Multicore Architectures (SFMA) in Berlin.
Gustavo Alonso gave a number of talks and overviews of the research in the Systems Group at Google X, eBay, and HPE, including "Hardware Acceleration in the data center" and "Specialized hardware for better systems software".
Cedric Renggli presented the following paper at SysML'19, Stanford, CA, USA, March 31-April 2, 2019:
"Continuous Integration of Machine Learning Models: A Rigorous Yet Practical Treatment" by Cedric Renggli (ETH Zurich), Bojan Karlaš (ETH Zürich), Bolin Ding ("Data Analytics and Intelligence Lab, Alibaba Group"), Feng Liu (Huawei Technologies), Kevin Schawinski (Modulos AG), Wentao Wu (Microsoft Research), Ce Zhang (ETH Zurich).
The article entitled "Tracking Readers’ Eye Movements Can Help Computers Learn" reporting about Nora Hollenstein's work has been published in WIRED.
The article "How Artificial Intelligence Is Changing Science" about Ce Zhang's astronomy work with Kevin Schawinski has been published in Quanta Magazine.
Vojislav Ðukić presented the following paper at NSDI 2019:
Is advance knowledge of flow sizes a plausible assumption? by Vojislav Ðukić, ETH Zurich; Sangeetha Abdu Jyothi, University of Illinois at Urbana–Champaign; Bojan Karlaš, Muhsen Owaida, Ce Zhang, and Ankit Singla, ETH Zurich
Zhenhao He joined the Systems Group as our new PhD student. Zhenhao completed his Master Thesis in the Systems Group.
Timothy Roscoe gave a talk entitled "Writing down the hardware/software interface" at a Defense Advanced Research Projects Agency (DARPA) Information Science and Technology (ISAT) Meeting at MIT.
John Liagouris gave the following talk at eBay, San Jose, CA.
Strymon: Providing fast and meaningful insights into enterprise datacenters
Abstract:
In this talk I will present an overview of the Strymon project at ETH Zurich. The project focuses on the design and development of Strymon, a system that leverages existing datacenter logging pipelines to ingest and process logs of events in real time and provide datacenter administrators with timely insights into the functionality of the running systems. Strymon is written in Rust and builds on top of timely dataflow, a high-performance engine for parallel and distributed streaming computations. The talk will focus on two Strymon use cases: (i) online reconstruction of user sessions from individual logs, which is often the first step in many datacenter management tasks, and (ii) online critical path analysis of long-running applications, which can be used to identify performance bottlenecks at runtime.
The following paper co-authored by Frances Hubis, Merve Gurel and Ce Zhang has been accepted to AISTATS 2019.
Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nick Hynes, Nezihe Merve Gurel, Bo Li, Ce Zhang, Dawn Song, Costas Spanos: "Towards Efficient Data Valuation Based on the Shapley Value".
The paper "Entity Recognition at First Sight: Improving NER with Eye Movement Information" by Nora Hollenstein and Ce Zhang has been accepted at the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019), June 2–7, 2019 in Minneapolis, USA.
The following paper co-authored by Frances Hubis has been accepted to DAC 2019.
N. Gleinig, F. A. Hubis, T. Höfler: Embedding functions into reversible circuits: A probabilistic approach to the number of lines; Proceedings of the 56th Design Automation Conference (DAC) 2019.
Tom Anderson from the University of Washington joined the Systems Group as our Visiting Professor until June 2019.
The paper "Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning" by Zeke Wang, Kaan Kara, Hantian Zhang, Gustavo Alonso, Onur Mutlu, and Ce Zhang has been accepted to VLDB'19.
Michal Wawrzoniak and Daniel Schwyn joined the Systems Group as our new PhD students. Michal completed his MSc at Princeton and Daniel at ETH.
Vasiliki Kalavri gave a talk at the "Tech talks on Scientific Computing" event at Ghent University organized by the HPC Group.
Title: Towards self-managed, re-configurable streaming dataflow systems
Monica Chiosa joined the Systems Group as our new Scientific Assistant.
Timothy Roscoe gave an invited talk "Enzian: research hardware for systems support for AI" at the ACM/Microsoft India Academic Research Summit.
Gustavo Alonso gave an invited talk ("Hardware for Data Management") at the annual meeting of the Network of Excellence on Modeling and Management of Data, which took place on January 25th in Madrid, Spain.
Fabio Maschi joined the Systems Group as our new PhD student. Fabio completed his Master Thesis at Paris-Saclay University.
Ghislain Fourny gave a talk about Non-Nashian Game Theory at the Automatic Control Lab of D-ITET, ETHZ.
Gustavo Alonso gave a talk at Oracle on the use of RDMA for in-network data processing through FPGA based NICs.
The following paper has been accepted at SysML'19, Stanford, CA, USA, March 31-April 2, 2019.
"Continuous Integration of Machine Learning Models: A Rigorous Yet Practical Treatment" by Cedric Renggli (ETH Zurich), Bojan Karlaš (ETH Zürich), Bolin Ding ("Data Analytics and Intelligence Lab, Alibaba Group"), Feng Liu (Huawei Technologies), Kevin Schawinski (Modulos AG), Wentao Wu (Microsoft Research), Ce Zhang (ETH Zurich).
Claude Barthels defended his PhD Dissertation “Scalable Query and Transaction Processing over High-Performance Networks”.
The paper "ColumnML: Column Store Machine Learning with On The Fly Data Transformation" by Kaan Kara, Ken Eguro, Ce Zhang, and Gustavo Alonso has been accepted for publication at VLDB 2019, Los Angeles, California, USA.
Gustavo Alonso gave the talk "How Hardware Evolution is Driving Software Systems" at the Department of Computer Science of McGill University, Montreal, Canada, covering the latest results from the group on FPGA research and developments around the Enzian project.
The article "ZuCo, a simultaneous EEG and eye-tracking resource for natural sentence reading" by Nora Hollenstein, Jonathan Rotsztejn, Marius Troendle, Andreas Pedroni, Ce Zhang and Nicolas Langer has been published in Scientific Data, a Nature Research journal.
Zsolt István is one of the ETH Outstanding Doctoral Theses 2018 Award Winners for his PhD Thesis: "Building Distributed Storage with Specialized Hardware". (Department of Computer Science Award Winners).
Abstract:
In an effort to keep up with increasing data sizes, applications in the datacenter are scaled out to a large number of machines. Even though this allows them to tackle complex problems, data movement bottlenecks of various types appear, limiting overall performance and scalability. Pushing computation closer to the data reduces bottlenecks, and in this work we explore this idea in the context of distributed key-value stores.
Efficient compute is important in the datacenter, especially at scale. Driven by the stagnation of single-threaded CPU performance, specialized hardware and hybrid architectures are emerging and could hold the answer for more efficient compute and data management. In this work we use Field Programmable Gate Arrays (FPGA) to break traditional trade-offs and limitations, and to explore design scenarios that were previously infeasible.
This dissertation focuses on distributed storage, a building block in scale out applications. It explores how such a service can benefit from specialized hardware nodes. We focus in particular on providing complex near-data computation with the goal of reducing the data movement bottleneck between application layers. Furthermore, this work addresses a shortcoming in the design of most distributed storage nodes, namely the mismatch between computational power and network/storage media bandwidth.
The mismatch is present because, if regular server machines are used, there is plenty of processing power to implement various filtering and processing operations, but the overall architecture is over-provisioned compared to the network. In contrast, if specialized hardware nodes are used (e.g. network-attached flash) the internal and external bandwidths are better matched, but these nodes will not be able to carry out complex processing near the data without slowing data access down. Our solution, Caribou, proposes a balanced design point: small-footprint hardware nodes that, even though they offer high throughput and low latency, are also flexible enough to adapt to different workloads and processing types without being over-provisioned.
The work presented in this dissertation is not a one-off effort: it provides an extensible and modular architecture for storage nodes that can be used as a platform for implementing near-data processing ideas for various application domains. The lessons are applicable to different storage media and networking technologies as well.
Debopam Bhattacherjee has presented the following paper at ACM HotNets 2018:
"Gearing up for the 21st century space race" by Debopam Bhattacherjee (ETH Zürich), Waqar Aqeel (Duke University), Ilker Nadi Bozkurt (Duke University), Anthony Aguirre (University of California, Santa Cruz), Balakrishnan Chandrasekaran (Max-Planck-Institut für Informatik), P. Brighten Godfrey (University of Illinois at Urbana-Champaign), Gregory P. Laughlin (Yale University), Bruce M. Maggs (Duke University), Ankit Singla (ETH Zürich).
Abstract:
A new space race is imminent, with several industry players working towards satellite-based Internet connectivity. While satellite networks are not themselves new, these recent proposals are aimed at orders of magnitude higher bandwidth and much lower latency, with constellations planned to comprise thousands of satellites. These are not merely far future plans -- the first satellite launches have already commenced, and substantial planned capacity has already been sold. It is thus critical that networking researchers engage actively with this research space, instead of missing what may be one of the most significant modern developments in networking.
In our first steps in this direction, we find that this new breed of satellite networks could potentially compete with today's ISPs in many settings, and in fact offer lower latencies than present fiber infrastructure over long distances. We thus elucidate some of the unique challenges these networks present at virtually all layers, from topology design and ISP economics, to routing and congestion control.
The following papers have been presented at MICRO, IMC and OSDI in October 2018:
• Yaohua Wang, Arash Tavakkol, Lois Orosa, Saugata Ghose, Nika Mansouri Ghiasi, Minesh Patel, Jeremie S. Kim, Hasan Hassan, Mohammad Sadrosadati, and Onur Mutlu, "Reducing DRAM Latency via Charge-Level-Aware Look-Ahead Partial Restoration" Proceedings of the 51st International Symposium on Microarchitecture (MICRO), Fukuoka, Japan, October 2018. [Lightning Talk Video]
• Justin Meza, Tianyin Xu, Kaushik Veeraraghavan, and Onur Mutlu, "A Large Scale Study of Data Center Network Reliability" Proceedings of the 18th ACM Internet Measurement Conference (IMC), Boston, MA, USA, October/November 2018.
• Kevin Hsieh, Ganesh Ananthanarayanan, Peter Bodik, Shivaram Venkataraman, Paramvir Bahl, Matthai Philipose, Phillip B. Gibbons, and Onur Mutlu, "Focus: Querying Large Video Datasets with Low Latency and Low Cost" Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Carlsbad, CA, USA, October 2018.
Onur Mutlu has been elected as a member of the Academy of Europe (Academia Europaea).
Rodrigo Bruno joined the Systems Group as our new Postdoc. Rodrigo completed his PhD at Instituto Superior Técnico, Lisbon, Portugal.
Jeremie Kim presented his lead-author paper at ICCD 2018 in Orlando, FL, USA, October 2018.
Jeremie S. Kim, Minesh Patel, Hasan Hassan, and Onur Mutlu, "Solar-DRAM: Reducing DRAM Access Latency by Exploiting the Variation in Local Bitlines"
The paper 'Sequence Classification with Human Attention' by Maria Barrett, Joachim Bingel, Nora Hollenstein, Marek Rei and Anders Søgaard has won the 'Special award for the best paper on research inspired by human language learning and processing' at CoNLL 2018 (The SIGNLL Conference on Computational Natural Language Learning).
Ankit Singla and Debopam Bhattacherjee talk about their project "Internet at the Speed of Light" in a video recording for the D-INFK Research news.
The paper "DPI: The Data Processing Interface for Modern Networks" by Gustavo Alonso, Carsten Binnig, Ippokratis Pandis, Kenneth Salem, Jan Skrzypczak, Ryan Stutsman, Lasse Thostrup, Tianzheng Wang, Zeke Wang, and Tobias Ziegler has been accepted for publication at CIDR 2019 which will take place in Asilomar, California, USA, in January 2019.
Renato Marroquín presented the following paper at ACM Symposium on Cloud Computing 2018 (SoCC 2018) in Carlsbad, California.
"Pay One, Get Hundreds for Free: Reducing Cloud Costs through Shared Query Execution" by Renato Marroquín (ETH Zurich), Ingo Müller (ETH Zurich), Darko Makreshanski (Oracle Labs), Gustavo Alonso (ETH Zurich).
Vasiliki Kalavri presented the following paper at OSDI 2018 in Carlsbad, CA, USA:
"Three steps is all you need: fast, accurate, automatic scaling decisions for distributed streaming dataflows" by Vasiliki Kalavri, John Liagouris, Moritz Hoffmann, and Desislava Dimitrova, ETH Zurich; Matthew Forshaw, Newcastle University; Timothy Roscoe, ETH Zurich.
Dario Korolija joined the Systems Group as our new PhD student. Dario completed his Master Thesis at EPFL.
The paper "FPGA-based TCP/IP Checksum Offloading Engine for 100 Gbps Networks" by Mario Ruiz, Sergio Lopez, Gustavo Sutter, and Gustavo Alonso has been accepted at ReConFig 2018, and will be presented at the conference in Cancun, Mexico, in December 2018.
Vasiliki Kalavri gave the talk “Fast, accurate, automatic scaling decisions for distributed streaming dataflows”, at the Distributed Computing & Analytics Workshop, in Kista, Sweden.
Gustavo Alonso gave an invited talk at KTH, Stockholm, Sweden, on "Hardware Acceleration Close to the Network" on the 26th of September, 2018.
Vasiliki Kalavri gave the talk "Platforms for big data analytics and stream processing" at the 3rd Int’l Summer School on Data Science (SDDS 2018), in Split, Croatia.
The following paper has been accepted at the 17th ACM Workshop on Hot Topics in Networks (HotNets 2018).
"Gearing up for the 21st century space race" by Debopam Bhattacherjee (ETH Zürich), Waqar Aqeel (Duke University), Ilker Nadi Bozkurt (Duke University), Anthony Aguirre (University of California, Santa Cruz), Balakrishnan Chandrasekaran (Max-Planck-Institut für Informatik), P. Brighten Godfrey (University of Illinois at Urbana-Champaign), Gregory P. Laughlin (Yale University), Bruce M. Maggs (Duke University), Ankit Singla (ETH Zürich).
Merve Gürel presented the following paper at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop:
"Towards More Accurate Radio Telescope Images" by Nezihe Merve Gürel, Paul Hurley and Matthieu Simeoni.
Abstract:
Radio interferometry usually compensates for high levels of noise in sensor/antenna electronics by throwing data and energy at the problem: observe longer, then store and process it all. We propose instead a method to remove the noise explicitly before imaging. To this end, we developed an algorithm that first decomposes the instances of the antenna correlation matrix, the so-called visibility matrix, into additive components using Singular Spectrum Analysis and then clusters these components using the graph Laplacian matrix. We show through simulation the potential for radio astronomy, in particular illustrating the benefit for LOFAR, the low frequency array in the Netherlands. Least-squares images are estimated with far higher accuracy and at low computation cost, without the need for long observation times.
Timothy Roscoe gave the talk "How to keep academic systems research relevant in an age of custom hardware" at the University of California, Berkeley, on 29 August 2018.
Gustavo Alonso gave the talk "How Hardware Evolution is Driving Software Systems" at VMware, Palo Alto California on Sept. 6th during a visit to VMware Research.
Our alumnus, Zsolt István, joins IMDEA Software Institute in Madrid, Spain as an Assistant Research Professor.
Frances Ann Hubis joined the Systems Group as our new PhD student. Frances completed her Master Thesis at ETH.
Mohammed Alser and Shigang Li joined the Systems Group as our new Postdocs.
Mohammed comes to us from Bilkent University, Ankara, Turkey and Shigang from University of Science and Technology, Beijing, China.
Gustavo Alonso gave an invited talk at the 2nd Workshop on Reconfigurable Computing for Machine Learning (RCML'2018), August 30th, 2018, Dublin, Ireland: "What reconfigurable computing can do for machine learning".
Abstract:
Data Processing is undergoing a multitude of interesting changes: from the platforms (cloud, appliances) to the workloads, data types, and operations. Machine Learning has become the dominant workload in data processing, giving rise to many challenges. These challenges are being tackled through innovation in hardware even to the point of having fully specialized designs for particular applications. In this talk I will review some of the most important changes happening in hardware, with an emphasis on FPGAs, and discuss how they can be used in machine learning as well as the opportunities they create.
Gustavo Alonso gave a talk on "Database Acceleration on FPGAs" at Xilinx, Dublin, Ireland.
Several papers have been presented at FPL'18 on work done in the group around data processing on FPGAs:
Zsolt Istvan (Systems Group alumnus) presented the paper “Providing Multi-tenant Services with FPGAs: Case Study on a Key-Value Store” (by Zsolt Istvan, Gustavo Alonso and Ankit Singla).
Muhsen Owaida presented the paper “Application Partitioning on FPGA Clusters: Inference over Decision Tree Ensembles” (by Muhsen Owaida, Gustavo Alonso).
Zhenhao He presented the paper "A Flexible K-Means Operator for Hybrid Databases" (by Zhenhao He, David Sidler, Zsolt Istvan and Gustavo Alonso).
Vasiliki Kalavri gave the talk "Online Performance Analysis of Distributed Dataflows" at the EIT Big Data Analytics Summer School in Stockholm, August 2018.
The following paper has been accepted at the SIGNLL Conference on Computational Natural Language Learning (CoNLL 2018), in Brussels, Belgium, October 31 - November 1, 2018:
"Sequence classification with human attention" by Maria Barrett, Joachim Bingel*, Nora Hollenstein**, Marek Rei***, Anders Søgaard*
* University of Copenhagen
** ETH Zurich
*** University of Cambridge, United Kingdom
Onur Mutlu gave the "Memory Systems and Memory-Centric Computing Systems" course at the HiPEAC ACACES 2018 summer school. The course was attended by 160-180 students.
Course materials
Course videos
Course description
Onur Mutlu gave an invited talk entitled "RowHammer and Beyond" at Microsoft Research 2018 Faculty Summit (MSR FACSUMMIT), Redmond, WA, USA, August 2018.
The following paper has been accepted at The Ninth International Workshop on Health Text Mining and Information Analysis (LOUHI 2018):
"Patient Risk Assessment and Warning Symptom Detection Using Deep Attention-Based Neural Networks" by Ivan Girardi**, Pengfei Ji*, An-phi Nguyen**, Nora Hollenstein*, Adam Ivankay**, Lorenz Kuhn**, Chiara Marchiori** and Ce Zhang*.
* ETH Zürich
** IBM Research
Abishek Ramdas joined the Systems Group today as a PhD student. He comes to us from Qualcomm Technologies, San Diego, CA, USA.
Michel Müller joined the Systems Group on 1 August 2018 as a Postdoc. He comes to us from Tokyo Institute of Technology, Japan.
The following paper has been accepted for publication at the Symposium on Cloud Computing (SoCC'18), October 11–13, 2018, Carlsbad, CA, USA:
Pay One, Get Hundreds for Free: Reducing Cloud Costs through Shared Query Execution by Renato Marroquín, Ingo Müller, Darko Makreshanski (OracleLabs), Gustavo Alonso
Onur Mutlu gave the opening lecture at the IEEE CEDA Design Automation Summer School, San Francisco, CA, USA, June 2018.
Opening Lecture title: "Processing Data Where It Makes Sense in Modern Computing Systems: Enabling In-Memory Computation". Slides
Yishai Oltchik joined the Systems Group as a new PhD student. Yishai comes to us from Hebrew University of Jerusalem.
The following paper has been accepted at 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI'18), October 8–10, 2018, Carlsbad, CA, USA:
Three steps is all you need: fast, accurate, automatic scaling decisions for distributed streaming dataflows by V. Kalavri, J. Liagouris, M. Hoffmann, D. Dimitrova, M. Forshaw, T. Roscoe.
Frank McSherry gave a keynote talk "Differential Dataflow" at the Workshop on Incremental Recomputation: Provenance and Beyond (IRPb 2018) co-organized with Provenance Week 2018 in London, United Kingdom.
Onur Mutlu gave the keynote talk "Rethinking Memory System Design: Robustness, Energy, Performance" at the 24th IEEE International Symposium on On-Line Testing and Robust System Design (IOLTS) in July 2018. Slides
Lukas Humbel presented the following paper at the 9th International Conference on Interactive Theorem Proving (ITP 2018), July 9-12 2018, Oxford, United Kingdom.
Physical Addressing on Real Hardware in Isabelle/HOL by Reto Achermann, Lukas Humbel, David Cock and Timothy Roscoe.
Simon Gerber defended his PhD Dissertation “Authorization, Protection, and Allocation of Memory in a Large System”.
Tiziano De Matteis joined the Systems Group as a Postdoc. Tiziano comes to us from the University of Pisa, Italy.
Moritz Hoffmann presented the following paper at the Algorithms and Systems for MapReduce and Beyond Workshop (BeyondMR):
“Latency-conscious dataflow reconfiguration” by Moritz Hoffmann, Frank McSherry and Andrea Lattuada.
Claude Barthels, Timothy Roscoe, and Gustavo Alonso have given a number of talks at the Dagstuhl seminar on "Convergence in Networked Systems". The talks covered work done at the group on data processing using RDMA, the Enzian project, and data processing on FPGAs.
Gustavo Alonso received the Distinguished Alumnus Award from the Department of Computer Science at the University of California, Santa Barbara, where he obtained his PhD.
Kaan Kara obtained second place in the System Design Contest (FPGA category) of the 55th Design Automation Conference (DAC 2018). More than 100 international teams participated in the two categories (GPU and FPGA).
The following paper has been accepted at VLDB 2018 (Experiments and Analyses track):
Streaming Graph Partitioning: An Experimental Study by Z. Abbas, V. Kalavri, P. Carbone, V. Vlassov.
Onur Mutlu gave a keynote talk entitled "Accelerating Genome Analysis: A Primer on an Ongoing Journey" at HiCOMB 2018. (Slides)
Abstract:
Genome analysis is the foundation of many scientific and medical discoveries as well as a key pillar of personalized medicine. Any analysis of a genome fundamentally starts with the reconstruction of the genome from its sequenced fragments. This process is called read mapping. One key goal of read mapping is to find the variations that are present between the sequenced genome and reference genome(s) and to tolerate the errors introduced by the genome sequencing process. Read mapping is currently a major bottleneck in the entire genome analysis pipeline because state-of-the-art genome sequencing technologies are able to sequence a genome much faster than the computational techniques that are employed to reconstruct the genome. New sequencing technologies, like nanopore sequencing, greatly exacerbate this problem while at the same time making genome sequencing much less costly.
This talk describes our ongoing journey in greatly improving the performance of genome read mapping. We first provide a brief background on read mappers that can comprehensively find variations and tolerate sequencing errors. Then, we describe both algorithmic and hardware-based acceleration approaches. Algorithmic approaches exploit the structure of the genome as well as the structure of the underlying hardware. Hardware-based acceleration approaches exploit specialized microarchitectures or new execution paradigms, like processing in memory. We show that significant improvements are possible with both algorithmic and hardware-based approaches and their combination. We conclude with a foreshadowing of future challenges brought about by very low cost yet highly error prone new sequencing technologies.
Gustavo Alonso has authored a short survey of papers covering recent developments on FPGAs in data centers at ACM Queue, March/April 2018.
Ingo Müller gave the following talk at SAP Database Campus Research Seminar in Walldorf, Germany:
Multi-Query Execution through SQL Rewriting -- Pay One, Get Hundreds for Free
Abstract:
Optimizing and running multiple queries jointly instead of one by one has been a topic of research for several decades. It allows for more efficient execution because data access or computations needed by multiple queries can be shared and carried out only once. This is the case in applications such as parameter exploration, business reports, and even ad-hoc analytics, in all of which work sharing techniques have been used commercially.
In this talk, I will first review the prior work in this field. Then I will present a recent project from our group that extends existing work sharing techniques, but departs from them in a significant way: while prior approaches implemented dedicated operators or processing engines, we execute multiple queries together by rewriting them into a single query that is again expressed in SQL. This allows applying work sharing techniques on any system with sufficient coverage of the SQL standard, even if work sharing was not originally foreseen by the system designer. We apply our approach to two Query-as-a-Service systems, where we not only get significant throughput improvements, but can also execute an entire batch of queries for the price of just a single one. Finally, I will present some results on using the same approach on SAP HANA.
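To give a feel for the idea of answering a batch of queries with one SQL statement, here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table, the `params` table and the function name are hypothetical, and the rewrite shown (joining against a table of query parameters and grouping by query id) is only one simple possibility; it should not be read as the rewriting rules used in the project.

```python
import sqlite3


def run_batch_shared(conn: sqlite3.Connection, regions):
    """Answer one 'SELECT SUM(amount) FROM sales WHERE region = ?' query
    per entry of `regions` with a single SQL statement.

    The table and column names are illustrative assumptions.
    """
    # One row per original query: its id and its parameter value.
    conn.execute(
        "CREATE TEMP TABLE params (qid INTEGER PRIMARY KEY, region TEXT)"
    )
    conn.executemany(
        "INSERT INTO params VALUES (?, ?)", list(enumerate(regions))
    )
    # The rewritten query: the per-query filter becomes a join condition
    # and the per-query result becomes one group. The LEFT JOIN keeps
    # queries that match no rows, whose SUM is NULL as in the original.
    rows = conn.execute(
        "SELECT p.qid, SUM(s.amount) "
        "FROM params AS p LEFT JOIN sales AS s ON s.region = p.region "
        "GROUP BY p.qid ORDER BY p.qid"
    ).fetchall()
    conn.execute("DROP TABLE params")
    return [total for _, total in rows]
```

Because the combined statement is plain SQL, it can be submitted to any engine that supports joins and grouping, without changes to the engine itself.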
The following three papers have been accepted at the 28th International Conference on Field Programmable Logic & Applications (FPL 2018) in Dublin, Ireland, August 27 - 31, 2018:
“Application Partitioning on FPGA Clusters: Inference over Decision Tree Ensembles” by Muhsen Owaida, Gustavo Alonso.
"A Flexible K-Means Operator for Hybrid Databases" by Zhenhao He, David Sidler, Zsolt Istvan and Gustavo Alonso.
“Providing Multi-tenant Services with FPGAs: Case Study on a Key-Value Store” by Zsolt Istvan, Gustavo Alonso and Ankit Singla.
The paper "Using transfer learning to detect galaxy mergers" by Sandro Ackermann, Kevin Schawinski, Ce Zhang, Anna K. Weigel, and M. Dennis Turp has been accepted for publication in Monthly Notices of the Royal Astronomical Society Main Journal.
The following paper has been accepted at VLDB 2018 in Rio de Janeiro, Brazil, August 27-31, 2018:
"MLBench: Benchmarking Machine Learning Services Against Human Experts" by Yu Liu, Hantian Zhang, Luyuan Zeng, Wentao Wu (Microsoft Research, Redmond), Ce Zhang.
The following two papers have been accepted at the 35th International Conference on Machine Learning (ICML 2018) in Stockholm, Sweden, July 10-15, 2018.
Asynchronous Decentralized Parallel Stochastic Gradient Descent by Xiangru Lian (University of Rochester), Wei Zhang (IBM), Ce Zhang, Ji Liu (University of Rochester).
D2: Decentralized Training over Decentralized Data by Hanlin Tang (University of Rochester), Xiangru Lian (University of Rochester), Ming Yan (Michigan State University), Ce Zhang, Ji Liu (University of Rochester).
The following two papers have been accepted as demos at VLDB 2018 in Rio de Janeiro, Brazil, August 27-31, 2018:
"Ease.ml in Action: Towards Multi-tenant Declarative Learning Services" by Bojan Karlas, Ji Liu, Wentao Wu, Ce Zhang.
"A Demonstration of Sterling: A Privacy-Preserving Data Marketplace" by Nick Hynes, Raymond Cheng, Noah Johnson, David Dao, Dawn Song.
Amit Kulkarni joined the Systems Group as a Postdoc. Amit comes to us from Ghent University.
Gustavo Alonso gave the talk "The impact of modern hardware on system design" at the 8th Workshop on Systems for Multi-core and Heterogeneous Architectures (SFMA 2018) and the talk "20+ Years of data replication and consistency. Have we learned anything?" at the 5th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC 2018), both collocated with EuroSys 2018, in Porto, Portugal.
---
"The impact of modern hardware on system design"
Abstract:
Computing Systems are undergoing a multitude of interesting changes: from the platforms (cloud, appliances) to the workloads, data types, and operations (big data, machine learning). Many of these changes are driven or being tackled through innovation in hardware even to the point of having fully specialized designs for particular applications. In this talk I will review some of the most important changes happening in hardware and discuss how they affect system design as well as the opportunities they create. I will focus on data processing as an example but also discuss applications in other areas.
---
"20+ Years of data replication and consistency. Have we learned anything?"
Abstract:
Data consistency is a fundamental topic in computer science, cutting across many areas and applications. Surprisingly, a significant part of ongoing research revolves around ideas pursued decades ago. The fact that published papers systematically ignore all this previous work does not mean it does not exist. Maybe the context has changed (cloud, geo-replication, massive scale, etc.) but the principles behind consistency for replicated data remain the same and have been known for a long time. In this talk, I will argue that there is a corpus of basic principles governing data consistency and that the mapping of such principles to mechanisms and system implications is also well established after decades of research. By providing a historical perspective of how data replication, and the subsequent problem of consistency, has evolved over the years, I will enumerate these principles, discuss the performance aspects associated to them, and relate these ideas to existing systems and trends. The goal of the talk is to open up new perspectives in an area where there is still much to be done and that has become highly relevant in the era of large scale computing infrastructures.
Konstantin Taranov presented the following paper at EuroSys 2018 in Porto, Portugal:
"Fast and strongly-consistent per-item resilience in key-value stores" by K. Taranov, G. Alonso, T. Hoefler.
Abstract:
In-memory key-value stores (KVSs) provide different forms of resilience through basic r-way replication and complex erasure codes such as Reed-Solomon. Each storage scheme exhibits different trade-offs in terms of reliability and resources used (memory, network load, latency, storage required, etc.). Unfortunately, most KVSs support only a single such storage scheme, forcing designers to employ different KVSs for different applications. To address this problem, we have designed a strongly consistent in-memory KVS, Ring, that empowers its users to set the level of resilience on a KV pair basis while still maintaining overall consistency and without compromising efficiency. At the heart of Ring lies a novel encoding scheme, Stretched Reed-Solomon coding, that combines hash key distributions of heterogeneous replication and erasure coding schemes. Ring utilizes RDMA to ensure low latencies and offload communication tasks. Its latency, bandwidth, and throughput are comparable to state-of-the-art systems that do not support changing resilience and, thus, have much higher memory overheads. We show use cases that demonstrate significant memory savings and discuss trade-offs between reliability, performance, and cost. Our work demonstrates how future applications that consciously manage resilience of KV pairs can reduce the overall operational cost and significantly improve the performance of KVS deployments.
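The memory trade-off between replication and erasure coding mentioned in the abstract can be made concrete with a little arithmetic. The sketch below covers only the textbook schemes, r-way replication and a Reed-Solomon code with k data blocks and m parity blocks; it does not model Ring or its Stretched Reed-Solomon coding, and the function names are illustrative.

```python
def replication_overhead(r):
    """Bytes stored per byte of user data under r-way replication."""
    return float(r)


def replication_tolerated_failures(r):
    """r copies survive the loss of any r - 1 of them."""
    return r - 1


def reed_solomon_overhead(k, m):
    """Bytes stored per byte of user data for k data blocks plus m parity blocks."""
    return (k + m) / k


def reed_solomon_tolerated_failures(k, m):
    """An RS(k, m) stripe can be rebuilt from any k of its k + m blocks."""
    return m


def compare():
    """Two schemes that both tolerate two lost nodes, with different memory cost."""
    return {
        "3-way replication": (
            replication_overhead(3),
            replication_tolerated_failures(3),
        ),
        "RS(4, 2)": (
            reed_solomon_overhead(4, 2),
            reed_solomon_tolerated_failures(4, 2),
        ),
    }
```

In this example both schemes survive two failures, but replication stores three bytes per user byte while RS(4, 2) stores 1.5. The sketch ignores the other dimensions the abstract lists, such as latency and network load, where erasure coding is typically the more expensive choice.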
Ingo Müller presented the paper "Reproducible Floating-Point Aggregation in RDBMSs" (Ingo Müller, Andrea Arteaga, Torsten Hoefler, Gustavo Alonso) at the 34th IEEE International Conference on Data Engineering (ICDE2018) in Paris, April 2018.
The following paper has been accepted at the 55th Design Automation Conference (DAC 2018):
• Anup Das, Hasan Hassan, and Onur Mutlu, "VRL-DRAM: Improving DRAM Performance via Variable Refresh Latency" Proceedings of the 55th Design Automation Conference (DAC), San Francisco, CA, USA, June 2018.
The following three papers have been accepted at the 45th International Symposium on Computer Architecture (ISCA 2018):
• Arash Tavakkol, Mohammad Sadrosadati, Saugata Ghose, Jeremie Kim, Yixin Luo, Yaohua Wang, Nika Mansouri Ghiasi, Lois Orosa, Juan G. Luna and Onur Mutlu, "FLIN: Enabling Fairness and Enhancing Performance in Modern NVMe Solid State Drives" to appear in Proceedings of the 45th International Symposium on Computer Architecture (ISCA), Los Angeles, CA, USA, June 2018.
• Nandita Vijaykumar, Eiman Ebrahimi, Kevin Hsieh, Phillip B. Gibbons and Onur Mutlu, "The Locality Descriptor: A Holistic Abstraction to Exploit Data Locality in Graphics Processing Units" to appear in Proceedings of the 45th International Symposium on Computer Architecture (ISCA), Los Angeles, CA, USA, June 2018.
• Nandita Vijaykumar, Abhilasha Jain, Diptesh Majumdar, Kevin Hsieh, Gennady Pekhimenko, Eiman Ebrahimi, Nastaran Hajinazar, Phillip B. Gibbons and Onur Mutlu, "A Case for Richer Cross-layer Abstractions: Bridging the Semantic Gap to Enhance Memory Optimization" to appear in Proceedings of the 45th International Symposium on Computer Architecture
Marcin Copik joined the Systems Group as a new PhD student. Marcin comes to us from RWTH Aachen University, Germany.
The following paper has been accepted at the 9th International Conference on Interactive Theorem Proving (ITP'18) in Oxford, United Kingdom, July 9-12, 2018:
"Physical addressing on real hardware in Isabelle/HOL" by Reto Achermann, Lukas Humbel, David Cock and Timothy Roscoe
Cedric Renggli joined the Systems Group as a new PhD student. Cedric completed his Master Thesis in the Systems Group.
The following paper by Jonathan Rotsztejn, Nora Hollenstein and Ce Zhang has been accepted at the International Workshop on Semantic Evaluation (SemEval 2018) among 28 international teams, in three out of four subtasks for the relation extraction and classification tasks:
ETH-DS3Lab at SemEval-2018 Task 7: Effectively Combining Recurrent and Convolutional Neural Networks for Relation Classification and Extraction
The following paper coauthored by Ce Zhang and Hantian Zhang has been published in Monthly Notices of the Royal Astronomical Society:
"PSFGAN: a generative adversarial network system for separating quasar point sources and host galaxy light" by Dominic Stark, Barthelemy Launet, Kevin Schawinski, Ce Zhang, Michael Koss, M Dennis Turp, Lia F Sartori, Hantian Zhang, Yiru Chen, Anna K Weigel.
Abstract
The study of unobscured active galactic nuclei (AGN) and quasars depends on the reliable decomposition of the light from the AGN point source and the extended host galaxy light. The problem is typically approached using parametric fitting routines using separate models for the host galaxy and the point spread function (PSF). We present a new approach using a Generative Adversarial Network (GAN) trained on galaxy images. We test the method using Sloan Digital Sky Survey (SDSS) r-band images with artificial AGN point sources added which are then removed using the GAN and with parametric methods using GALFIT. When the AGN point source PS is more than twice as bright as the host galaxy, we find that our method, PSFGAN, can recover PS and host galaxy magnitudes with smaller systematic error and a lower average scatter (49%). PSFGAN is more tolerant to poor knowledge of the PSF than parametric methods. Our tests show that PSFGAN is robust against a broadening in the PSF width of ±50% if it is trained on multiple PSFs. We demonstrate that while a matched training set does improve performance, we can still subtract point sources using a PSFGAN trained on non-astronomical images. While initial training is computationally expensive, evaluating PSFGAN on data is more than 40 times faster than GALFIT fitting two components. Finally, PSFGAN is more robust and easier to use than parametric methods as it requires no input parameters.
The following paper co-authored by Ce Zhang has been accepted at SIGMOD 2018 in Houston, TX, USA, June 10-15, 2018:
"DimBoost: Boosting Gradient Boosting Tree to Higher Dimensions" by Jiawei Jiang (Peking University), Bin Cui (Peking University), Ce Zhang (ETH), Fangcheng Fu (Peking University).
Claude Barthels gave the talk "Distributed Join Algorithms on Thousands of Cores" at Imperial College London and Microsoft Research Cambridge.
Gustavo Alonso gave an invited talk: "What Modern hardware can do for Data Processing" at the Symposium on Modern Database Platforms of the Fachgruppe Datenbanksysteme der Gesellschaft für Informatik e.V. (GI) hosted by SAP in Rot-Malsch, Germany.
Johannes de Fine Licht and Torsten Hoefler presented the tutorial "Productive parallel programming on FPGA with high-level synthesis" at Principles and Practice of Parallel Programming 2018 (PPoPP'18), Vienna, Austria, February 2018.
Timothy Roscoe gave the talk "Enzian: a research computer" at IBM Research in Rüschlikon.
Giray Yaglikci joined the Systems Group as a new PhD student. Giray comes to us from TOBB University of Economics and Technology, Ankara, Turkey.
The following paper has been accepted at the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), Irvine, CA, USA, June 2018:
Saba Ahmadian, Onur Mutlu, and Hossein Asadi: "ECI-Cache: A High-Endurance and Cost-Efficient I/O Caching Scheme for Virtualized Platforms".
Kaveh Razavi joined the Systems Group as a Visiting Researcher for 7 months. Kaveh will start as an Assistant Professor at VU Amsterdam in September 2018.
The following five papers have been accepted at the 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 24th – March 28th, Williamsburg, VA, USA:
• Rachata Ausavarungnirun, Vance Miller, Joshua Landgraf, Saugata Ghose, Jayneel Gandhi, Adwait Jog, Christopher J. Rossbach, and Onur Mutlu, "MASK: Redesigning the GPU Memory Hierarchy to Support Multi-Application Concurrency"
• Maciej Besta, Syed Minhaj Hassan, Sudhakar Yalamanchili, Rachata Ausavarungnirun, Onur Mutlu, Torsten Hoefler, "Slim NoC: A Low-Diameter On-Chip Network Topology for High Energy Efficiency and Scalability".
• Amirali Boroumand, Saugata Ghose, Youngsok Kim, Rachata Ausavarungnirun, Eric Shiu, Rahul Thakur, Daehyun Kim, Aki Kuusela, Allan Knies, Parthasarathy Ranganathan, and Onur Mutlu, "Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks".
• Amir M. Rahmani, Bryan Donyanavard, Tiago Mück, Kasra Moazzemi, Axel Jantsch, Onur Mutlu, and Nikil Dutt, "SPECTR: Formal Supervisory Control and Coordination for Many-core Systems Resource Management".
• Mohammad Sadrosadati, Amirhossein Mirhosseini, Seyed Borna Ehsani, Hamid Sarbazi-Azad, Mario Drumond, Babak Falsafi, Rachata Ausavarungnirun, and Onur Mutlu, "LTRF: Enabling High-Capacity Register Files for GPUs via Hardware/Software Cooperative Register Prefetching".
Thomas Lemmin joined the Systems Group as a Postdoc. Thomas comes to us from University of California, San Francisco.
David Dao and his team won a Microsoft AI for Earth Award for their deforestation prediction project. The project is using satellite imagery to predict the spread of deforestation over the Amazon with deep learning and is supported by Microsoft and Esri. http://gainforest.org
Gustavo Sutter from Autonomous University of Madrid (UAM) and Can Alkan from Bilkent University, Ankara joined us as Guest Professors until Summer 2018.
Nora Hollenstein, our PhD student, and Jonathan Rotsztejn, our Master Thesis student, won the 2017 International Workshop on Semantic Evaluation (SemEval-2017) "relation classification" competition, placing first out of 28 teams from all over the world.
Sabir Akhadov and Bojan Karlas joined the Systems Group as new PhD students. Frank McSherry joined us as a new Senior Researcher.
Our alumnus, Tim Kraska, joins MIT (Computer Science & Artificial Intelligence Lab - CSAIL) as an Associate Professor. MIT News
The paper "Layerwise Systematic Scan: Deep Boltzmann Machines and Beyond" by Heng Guo (University of Edinburgh), Kaan Kara, and Ce Zhang has been accepted at the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018).
Zsolt István defended his PhD Dissertation "Building Distributed Storage with Specialized Hardware".
The following paper has been accepted at VLDB 2018 in Rio de Janeiro, Brazil, 27-31 August 2018:
Ease.ml: Towards Multi-tenant Resource Sharing for Machine Learning Workloads by Tian Li (ETH), Jie Zhong (University of Rochester), Ji Liu (University of Rochester), Wentao Wu (Microsoft Research, Redmond), Ce Zhang (ETH)
The paper "R-OSGi: Distributed Applications through Software Modularization", published at the ACM/IFIP/Usenix International Middleware Conference 2007, has received the Test-of-Time Award at the 2017 Middleware Conference. Jan Rellermeyer, now professor at TU Delft, received the award and gave a talk on the evolution of R-OSGi.
The following paper has been accepted at the 16th USENIX Conference on File and Storage Technologies (FAST '18) in Oakland, CA, USA, February 12–15, 2018:
"MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices" by Arash Tavakkol (ETH Zurich), Juan Gómez Luna (ETH Zurich), Mohammad Sadrosadati (ETH Zurich, Sharif University of Technology), Saugata Ghose (Carnegie Mellon University), Onur Mutlu (ETH Zurich, CMU).
The following paper has been accepted at the 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI '18) in Renton, WA, USA, April 9-11, 2018:
SnailTrail: Generalizing Critical Paths for Online Analysis of Distributed Dataflows by Moritz Hoffmann, Andrea Lattuada, John Liagouris, Vasiliki Kalavri, Desislava Dimitrova, Sebastian Wicki, Zaheer Chothia, Timothy Roscoe.
Every year, ACM recognizes and honors outstanding ACM members for their achievements in computer science. This year, Onur Mutlu has been elected to the grade of Fellow for contributions to computer architecture research, especially in memory systems.
Onur Mutlu gave the Distinguished Lecture "Rethinking Memory System Design (and the Computing Platforms We Design Around It)" at INESC-ID in Lisbon, Portugal. Slides
Abstract:
The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM and flash technologies are experiencing difficult technology scaling challenges that make the maintenance and enhancement of their capacity, energy efficiency, and reliability significantly more costly with conventional techniques. In fact, recent reliability issues with DRAM, such as the RowHammer problem, are already threatening system security and predictability.
In this talk, we first discuss major challenges facing modern memory systems in the presence of greatly increasing demand for data and its fast analysis. We then examine some promising research and design directions to overcome these challenges. We discuss three key solution directions: 1) enabling new memory architectures, functions, interfaces, via more memory-centric system design, 2) enabling emerging non-volatile memory (NVM) technologies via hybrid and persistent memory systems, 3) enabling predictable memory systems via QoS-aware memory system design. If time permits, we will also discuss research challenges and opportunities in NAND flash memories.
The paper "The DRAM Latency PUF: Quickly Evaluating Physical Unclonable Functions by Exploiting the Latency-Reliability Tradeoff in Modern DRAM Devices" by Jeremie S. Kim, Minesh Patel, Hasan Hassan, and Onur Mutlu has been accepted at the 24th International Symposium on High-Performance Computer Architecture (HPCA), Vienna, Austria, February 2018.
The XCell Daily Blog of Xilinx has highlighted the work done by Kaan Kara implementing the Zip-ML framework of Prof. Ce Zhang into an FPGA. More...
Gerd Zellweger defended his PhD Dissertation "On the Construction of Dynamic and Adaptive Operating Systems".
Timothy Roscoe gave a Colloquium entitled “The Trouble with Hardware” at Pierre-and-Marie-Curie University (UPMC) in Paris, France.
Abstract:
Computer hardware, from datacenters through rackscale computing down to mobile device systems-on-chip, is increasingly easy to design. A combination of advanced CAD systems, the rise of Moore's law, and now the fall of Moore's law, has resulted in a huge diversity of hardware platforms whose complexity is immense. One downside of this is the monumental software engineering challenge in building and maintaining correct, robust, and portable systems software. This is an open secret in many pockets of industry but receives little attention in research. We ran against this problem full square (and continue to do so) while developing the Barrelfish research OS. In part of my talk I'll discuss what can be done to address this in the design of systems software, by importing ideas from formal verification, knowledge representation, and program synthesis to the C-dominated world of low-level code. This, however, begs broader questions: given that custom hardware is getting easier to design, what should it look like to system software? How can systems researchers influence such hardware? And in an age where more corporations are building custom hardware but academia is mostly restricted to commodity systems, what can be done to conduct relevant, impactful research in this space outside of industry? I'll try and suggest some answers.
David Dao and his team won the first prize at the hack4climate hackathon sponsored by the UNFCCC (the climate change division of the UN) for their idea about a novel and scalable way to fight deforestation with blockchain and artificial intelligence (http://gainforest.org).
The following two papers have been accepted at the 21st International Conference on Extending Database Technology (EDBT 2018), March 26-29, 2018 in Vienna, Austria.
MTBase: Optimizing Cross-Tenant Database Queries by Lucas Braun, Renato Marroquín, Kai-En Tsay, Donald Kossmann.
Synchronous Multi-GPU Deep Learning with Low-Precision Communication: An Experimental Study by Demjan Grubic (our master's student), Leo Tam (NVIDIA), Dan Alistarh, Ce Zhang.
Jana Giceva is one of the ETH Outstanding Doctoral Theses 2017 Award Winners for her PhD Thesis: Database/Operating system co-design.
Lukas Humbel presented the following paper at the PLOS Workshop, co-located with SOSP in Shanghai, China:
"Towards Correct-by-Construction Interrupt Routing on Real Hardware" by Lukas Humbel, Reto Achermann, David Cock, and Timothy Roscoe.
Timothy Roscoe gave the talk "Intelligently Diagnosing Datacenters" at Huawei's Corporate Reliability Conference in Shenzhen, China.
Johannes M. Rausch, David Dao and Niels Gleinig joined the Systems Group as new PhD students.
Onur Mutlu gave the keynote talk entitled "Processing Data Where It Makes Sense: Enabling In-Memory Computation" at the Mobile System Technologies (MST) 2017 Workshop.
Slides: PDF
The paper entitled "Genome Read In-Memory (GRIM) Filter: Fast Location Filtering in DNA Read Mapping Using Emerging Memory Technologies" by Jeremie Kim, Damla Senol, Hongyi Xin, Donghyuk Lee, Mohammed Alser, Hasan Hassan, Oguz Ergin, Can Alkan and Onur Mutlu has been accepted at the Asia-Pacific Bioinformatics Conference (APBC) 2018. The paper will be published in the BMC Genomics Journal.
Melissa Licciardello joined the Systems Group as a PhD student. Melissa completed her Master Thesis at Bologna University, Italy.
Andrea Lattuada gave the following talk at the Rust Programming Language Conference (RustFest):
Title:
"A hammer you can only hold by the handle"
Abstract:
Rust’s type system provides tools to ensure safe memory management, and safe concurrent access to data. What if we used those same tools to encode and enforce other API constraints? We can leverage affine types (non-Clone structs) to enforce that a user performs a series of operations in a certain order; or we can use structs as tokens representing the user’s ability to perform certain actions. And everything’s checked at compile time. We’ll see how these techniques let us encode complex API constraints, and make them self-documenting by preventing disallowed behaviour at compile time.
The following four papers have been presented at the 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 50) in Boston, MA, USA.
• Vivek Seshadri, Donghyuk Lee, Thomas Mullins, Hasan Hassan, Amirali Boroumand, Jeremie Kim, Michael A. Kozuch, Onur Mutlu, Phillip B. Gibbons, and Todd C. Mowry, "Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology" Proceedings of the 50th International Symposium on Microarchitecture (MICRO), Boston, MA, USA, October 2017.
• Xiangyao Yu, Christopher J. Hughes, Nadathur Satish, Onur Mutlu, and Srinivas Devadas, "Banshee: Bandwidth-Efficient DRAM Caching via Software/Hardware Cooperation" Proceedings of the 50th International Symposium on Microarchitecture (MICRO), Boston, MA, USA, October 2017.
• Samira Khan, Chris Wilkerson, Zhe Wang, Alaa R. Alameldeen, Donghyuk Lee, and Onur Mutlu, "Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content" Proceedings of the 50th International Symposium on Microarchitecture (MICRO), Boston, MA, USA, October 2017.
• Rachata Ausavarungnirun, Joshua Landgraf, Vance Miller, Saugata Ghose, Jayneel Gandhi, Christopher J. Rossbach, and Onur Mutlu, "Mosaic: A GPU Memory Manager with Application-Transparent Support for Multiple Page Sizes" Proceedings of the 50th International Symposium on Microarchitecture (MICRO), Boston, MA, USA, October 2017.
Vasiliki Kalavri's PhD Thesis "Performance Optimization Techniques and Tools for Distributed Graph Processing" has been awarded the IBM Innovation Award 2017 by FNRS, the Belgian Fund for Scientific research.
The following paper has been accepted for publication by the IEEE Transactions on Knowledge and Data Engineering journal.
"High-Level Programming Abstractions for Distributed Graph Processing" by Vasiliki Kalavri, Vladimir Vlassov, and Seif Haridi.
Jana Giceva, our Dec. 2016 alumna, joined the Department of Computing at Imperial College London as an Assistant Professor.
The paper "Error Characterization, Mitigation, and Recovery in Flash-Memory-Based Solid-State Drives" by Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixin Luo and Onur Mutlu has been published in the Proceedings of the IEEE (Volume 105, Issue 9 | September 2017).
The paper entitled "Towards Correct-by-Construction Interrupt Routing on Real Hardware" by Lukas Humbel, Reto Achermann, David Cock and Timothy Roscoe, has been accepted at the 9th Workshop on Programming Languages and Operating Systems (PLOS 2017) in Shanghai, China, October 28, 2017.
Timothy Roscoe gave the talk entitled "Enzian: A Research Computer" at the ARM Research Summit in Cambridge, at MSR Cambridge, and at the University of Cambridge Computer Laboratory.
The following paper has been accepted as Oral Presentation (40 / 3240 submissions) at the Thirty-first Annual Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, Dec. 4-9, 2017.
Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent by Xiangru Lian (University of Rochester), Ce Zhang (ETH Zurich), Huan Zhang (University of California, Davis), Cho-Jui Hsieh (University of California, Davis), Wei Zhang (IBM T.J. Watson Research Center) and Ji Liu (University of Rochester).
Onur Mutlu gave an invited talk at the ARM Research Summit on 11 September 2017, entitled "Opportunities and Challenges of Emerging Memory Technologies".
Andrea Bignoli and Shaoduo Gan joined the Systems Group as PhD students.
Muhsen Owaida presented the paper "Scalable Inference of Decision Tree Ensembles: Flexible Design for CPU-FPGA Platforms" (Muhsen Owaida, Hantian Zhang, Ce Zhang and Gustavo Alonso) and David Sidler presented the demo "doppioDB: A Hardware Accelerated Database" (David Sidler, Muhsen Owaida, Zsolt Istvan, Kaan Kara and Gustavo Alonso) at the 27th International Conference on Field-Programmable Logic and Applications (FPL 2017), in Ghent, Belgium.
The following paper, which is the extended version of the work on multi-query joins by Darko Makreshanski, has been published in the VLDB Journal:
Many-query join: efficient shared execution of relational joins on modern hardware by D. Makreshanski, G. Giannikis, G. Alonso, D. Kossmann (The VLDB Journal, 1-24)
Merve Gürel joined the Systems Group as a PhD student. Merve completed her Master Thesis at EPFL.
Our alumnus, Jan Rellermeyer, joined Delft University of Technology (TU Delft) as an Assistant Professor.
The following five papers and a demo have been presented at the 43rd International Conference on Very Large Data Bases VLDB 2017 in Munich, Germany.
"Distributed Join Algorithms on Thousands of Cores" by Claude Barthels (ETH), Ingo Müller (ETH), Timo Schneider (ETH), Gustavo Alonso (ETH), Torsten Hoefler (ETH)
“Caribou: Intelligent Distributed Storage” by Zsolt Istvan (ETH), David Sidler (ETH), Gustavo Alonso (ETH)
“Fast Scans on Key-Value Stores” by Markus Pilman (ETH), Kevin Bocksrocker (Microsoft), Lucas Braun (ETH), Renato Marroquín (ETH), Donald Kossmann (ETH)
“LDA*: A Robust and Large-scale Topic Modeling System” by Lele Yu (Peking University), Bin Cui (Peking University), Ce Zhang (ETH), Yingxia Shao (Peking University)
“An Experimental Evaluation of SimRank-based Similarity Search Algorithms” by Zhipeng Zhang (Peking University), Yingxia Shao (PKU), Bin Cui (Peking University), Ce Zhang (ETH)
DEMO: “MLog: Towards Declarative In-Database Machine Learning” by Xupeng Li (Peking University), Bin Cui (Peking University), Yiru Chen (Peking University), Wentao Wu (Microsoft Research), Ce Zhang (ETH)
Two Systems Group professors gave talks at the ETH Industry Day.
Timothy Roscoe gave a talk on "Online modelling of enterprise datacenter behavior" and Onur Mutlu on "Future computing and genome analysis platform".
Gustavo Alonso gave a lecture on databases on emerging hardware at the Joint International BBDC and ScaDS Summer School on Big Data in Munich, Germany.
Simon Kassing presented the following paper at ACM SIGCOMM 2017 in Los Angeles, CA, USA:
Beyond fat-trees without antennae, mirrors, and disco-balls by Simon Kassing, Asaf Valadarsky*, Gal Shahaf*, Michael Schapira*, Ankit Singla.
*Hebrew University of Jerusalem
Vasiliki Kalavri gave a tutorial at the EIT Big Data Analytics Summer School in Stockholm, Sweden.
Onur Mutlu gave an invited talk at the Flash Memory Summit entitled “The Next Breakthroughs in Flash Memory – Changing Our Fixed Mindsets” in Santa Clara, CA, USA. He also gave an invited talk entitled “Key Design Challenges in Future Computing Platforms – Changing Our Fixed Mindsets” at VMware Research in Palo Alto.
Juan Gómez Luna joined the Systems Group as a Postdoc. Juan comes to us from the University of Córdoba.
Hantian Zhang presented the following paper at the 34th International Conference on Machine Learning (ICML 2017), Sydney, Australia, August 06-11, 2017:
The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning by Hantian Zhang, Jerry Li, Kaan Kara, Dan Alistarh, Ji Liu, Ce Zhang.
The following poster has been accepted at the ACM Symposium on Cloud Computing 2017 (SoCC '17), September 25-27, 2017 in Santa Clara, California:
How Good Are Machine Learning Clouds? by Hantian Zhang, Luyuan Zeng, Wentao Wu (MSR, Redmond), Ce Zhang.
The paper 'Fast Scans on Key-Value Stores' by Markus Pilman, Kevin Bocksrocker, Lucas Braun, Renato Marroquín and Donald Kossmann has been accepted at VLDB 2017, Munich, Germany, August 28 to September 1, 2017.
Lukas Arnold joined the Systems Group as a PhD Student. Lukas comes to us from the University of Bristol.
René Müller, our alumnus, joined Bern University of Applied Sciences as a Full Professor of Data Science and Business Intelligence.
Timothy Roscoe gave the talk "Enzian: a research computer" at Cavium, VMware and Apple in Silicon Valley.
Abstract:
Traditional systems software research is facing a new challenge to its relevance. Modern hardware CAD systems, the drive to lower power and "dark silicon", FPGAs, and other factors have made it easy, quick, and cheap for system vendors to build custom hardware platforms. Almost any function can now be put into silicon or reconfigurable logic: the choice of exactly what *should* be built is a short-term business decision based on markets and workloads.
Such hardware qualitatively changes how systems, including system software, should be conceived and designed. However, most published OS research in rack-scale, embedded, or datacenter computing only uses affordable commodity platforms for which documentation is available to researchers. Academia and industry practice are diverging.
Enzian is an attempt to build a computer at ETH Zurich (with help from Cavium and Xilinx) to bridge this gap in a way not possible with commodity hardware or simulation. Enzian nodes closely couple a server-class CPU SoC with a large FPGA in the same coherence domain, with abundant network bandwidth to both chips. It is designed for maximum research flexibility, and can be used in many ways: a high-end server with FPGA-based acceleration, a multiport 200Gb/s NIC supporting custom protocols and cache access, a platform for runtime verification and auditing of system code, to name but three.
Alexandros-Nikolaos Ziogas joined the Systems Group as a PhD Student. Alexandros-Nikolaos comes to us from the National Technical University of Athens.
Zeke Wang joined the Systems Group as a PostDoc. Zeke comes to us from the National University of Singapore.
The following paper has been published in the Bioinformatics journal:
Mohammed Alser, Hasan Hassan, Hongyi Xin, Oguz Ergin, Onur Mutlu, and Can Alkan "GateKeeper: A New Hardware Architecture for Accelerating Pre-Alignment in DNA Short Read Mapping".
The paper is collaborative with Professor Mutlu’s co-advised PhD student Mohammed Alser and his co-advisor Can Alkan at Bilkent University. The paper proposes a novel FPGA-based accelerator for the analysis of genomes sequenced by next-generation genome sequencing technologies, which achieves one to two orders of magnitude speedup over existing genome pre-alignment techniques.
Debopam Bhattacherjee presented the following paper at HotCloud 2017 in Santa Clara, CA, USA:
"A Cloud-based Content Gathering Network" by Debopam Bhattacherjee, ETH Zurich; Muhammad Tirmazi, LUMS; Ankit Singla, ETH Zurich.
Abstract:
Many popular Web services use CDNs to host their content closer to users and thus improve page load times. While this model’s success is beyond question, it has its limits: for users with poor last-mile latency even to a nearby CDN node, the many RTTs needed to fetch a Web page add up to large delays. Thus, in this work, we explore a complementary model of speeding up Web page delivery—a content gathering network (CGN), whereby users establish their own geo-distributed presence, and use these points of presence to proxy content for them. We show that deploying only 14 public cloud-based CGN nodes puts the closest node within a median RTT of merely 4.8 ms (7.2 ms) from servers hosting the top 10k (100k) most popular Web sites. The CGN node nearest to a server can thus obtain content from it rapidly, and then transmit it to the client over fewer (limited by available bandwidth) high-latency interactions using aggressive transport protocols. This simple approach reduces the median page load time across 100 popular Web sites by as much as 53%, and can be deployed immediately without depending on any changes to Web servers at an estimated cost of under $1 per month per user.
Minesh Patel presented his co-authored paper (with Jeremie Kim and Onur Mutlu) at ISCA 2017, the 44th International Symposium on Computer Architecture.
The work provides extensive characterization and understanding of the data retention behavior of modern LPDDR4 DRAM devices and proposes a novel methodology for retention time profiling of DRAM, one of the most difficult problems facing DRAM technology scaling.
Vasiliki Kalavri gave the invited tutorial "Programming Models and Tools for Distributed Graph Processing" at the 31st British International Conference on Databases (BICOD) in London.
Abstract:
Graphs capture relationships between data items, such as interactions or dependencies, and their analysis can reveal valuable insights for machine learning tasks, anomaly detection, clustering, recommendations, social influence analysis, bioinformatics, and other application domains. This tutorial reviews the state of the art in high-level abstractions for distributed graph processing. First, we present six models that were developed specifically for distributed graph processing, namely vertex-centric, scatter-gather, gather-sum-apply-scatter, subgraph-centric, filter-process, and graph traversals. Then, we consider general-purpose distributed programming models that have been used for graph analysis, such as MapReduce, dataflow, linear algebra primitives, datalog, and shared partitioned tables. The tutorial aims at making a qualitative comparison of popular graph programming abstractions. We further consider performance limitations of some graph programming models and we summarize proposed extensions and optimizations.
The article "AI is changing how we do science" about space.ml, Ce Zhang's collaboration project with ETH astrophysicist Kevin Schawinski, has been published in Science Magazine.
Reto Achermann gave the talk "Model based system configuration and tasteful hardware" at University of Cambridge.
The following paper co-authored by Ce Zhang has been accepted at VLDB 2017, Munich, Germany, August 28 to September 1, 2017.
LDA*: A Robust and Large-scale Topic Modeling System by Lele Yu*, Bin Cui*, Ce Zhang, Yingxia Shao*
*Peking University, China
Yu Liu joined the Systems Group as a 3rd year PhD student.
The following PhD students left for their internships this summer:
Roni Häcki - Oracle Labs, Cambridge, UK
Renato Marroquin - Oracle Labs, Belmont, CA, USA
Konstantin Taranov - Microsoft Research, Cambridge, UK
Lukas Humbel - HPE, Palo Alto, CA, USA
Kaan Kara - Xilinx, Dublin, Ireland
Simon Kassing - Mellanox Technologies, Israel (Internship starting 16. July '17)
Vasiliki Kalavri has been awarded the ETH Postdoctoral Fellowship for the project "Automatic scaling of distributed streaming computations using graph analytics on real-time monitoring data".
The paper "Scalable Inference of Decision Tree Ensembles: Flexible Design for CPU-FPGA Platforms" by Muhsen Owaida, Hantian Zhang, Ce Zhang and Gustavo Alonso and the demo "doppioDB: A Hardware Accelerated Database" by David Sidler, Muhsen Owaida, Zsolt Istvan, Kaan Kara and Gustavo Alonso have been accepted at the 27th International Conference on Field-Programmable Logic and Applications (FPL 2017), in Ghent, Belgium, September 4-8, 2017.
Torsten Hoefler gave talks, tutorials and participated in panels at ISC 2017 - The event for high performance computing, networking and storage - in Frankfurt, Germany. Details are available here.
Furthermore, David Sidler presented "doppioDB: A Hardware Accelerated Database" at the Intel Collaboration Hub at ISC 2017.
Onur Mutlu gave the opening lecture "Memory Reliability, Security and Beyond" at the DAC Design Automation Summer School in Austin, TX, USA.
The lecture slides are available here.
Darko Makreshanski defended his PhD Dissertation "Systems and Methods for Interactive Data Processing on Modern Hardware".
Carsten Binnig, our alumnus, will join TU Darmstadt as a Full Professor (Data Management, Mining and Retrieval) in August 2017.
The article entitled "Learning Starts with People", published in ETH Globe 2/2017, gives an insight into Ce Zhang's research and his research plans.
"Fast, intelligent data systems are Ce Zhang’s speciality. To make sure they function smoothly, the data scientist combines basic research with service and dialogue." More...
Torsten Hoefler gave a keynote talk entitled "Scientific Benchmarking of Parallel Computing Systems" at Evolvable Methods for Benchmarking Realism and Community Engagement - EMBRACE Workshop (IPDPS 2017).
Torsten also gave an invited talk entitled "Progress in automatic GPU compilation and why you want to run MPI on your GPU" at Second Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware - IPDRM Workshop (IPDPS 2017).
Vasiliki Kalavri gave the talk “Strymon: Queryable Online Simulation for Modern Datacenters” at the NII Shonan meeting on Language integrated Queries in Kanagawa, Japan.
Ce Zhang gave his inaugural lecture entitled "Data Sciences, Data Systems, and Data Services".
Timothy Roscoe gave the keynote talk "The Trouble with Hardware" at the 10th ACM International Systems and Storage Conference (Systor 2017) in Haifa, Israel.
The paper "Caribou: Intelligent Distributed Storage" by Zsolt Istvan, David Sidler and Gustavo Alonso has been accepted at VLDB 2017 in Munich, Germany, August 28 - September 1, 2017.
Ce Zhang gave the talk entitled "Pushing the Boundaries of Machine Learning" at ETH Meets New York.
Torsten Hoefler has been appointed as Associate Professor of Computer Science (Scalable Parallel Computing). Congratulations!
The demo "doppioDB: A Hardware Accelerated Database" by David Sidler, Zsolt Istvan, Muhsen Owaida, Kaan Kara, and Gustavo Alonso was selected as one of the Best Demos of SIGMOD 2017 and won an honorable mention.
Hantian Zhang presented the paper "Faster Machine Learning via Low-Precision Communication & Computation" with Dan Alistarh at Nvidia GPU Technology Conference (GTC) in Silicon Valley.
The following six papers and a demo have been presented at SIGMOD 2017 and collocated Workshops in Chicago, IL, May 14-19, 2017:
"BatchDB: Efficient Isolated Execution of Hybrid OLTP+OLAP Workloads" by Darko Makreshanski, Jana Giceva, Claude Barthels and Gustavo Alonso, SIGMOD 2017
"Accelerating Pattern Matching Queries in Hybrid CPU-FPGA Architectures" by David Sidler, Zsolt István, Muhsen Owaida, Gustavo Alonso, SIGMOD 2017
"Heterogeneity-aware Distributed Parameter Servers" by Jiawei Jiang*, Bin Cui*, Ce Zhang, Lele Yu* *(Peking University), SIGMOD 2017
"FPGA Based Data Partitioning" by Kaan Kara, Jana Giceva, Gustavo Alonso, SIGMOD 2017
"An Overreaction to the Broken Machine Learning Abstraction: The ease.ml Vision" by Ce Zhang, Wentao Wu (Microsoft Research) and Tian Li (Peking University), Workshop on Human-In-the-Loop Data Analytics (HILDA 2017) collocated with SIGMOD 2017
"Scaling Column Imprints using Advanced Vectorization" by Lefteris Sidirourgos (ETH) and Hannes Muhleisen (CWI), 13th International Workshop on Data Management on New Hardware (DaMoN 2017) collocated with SIGMOD 2017
DEMO: "doppioDB: A Hardware Accelerated Database" by David Sidler, Zsolt Istvan, Muhsen Owaida, Kaan Kara, and Gustavo Alonso, SIGMOD 2017
Onur Mutlu gave his inaugural lecture entitled "Future Computing Architectures".
Slides are available here.
The following paper has been accepted at the 34th International Conference on Machine Learning (ICML 2017), Sydney, Australia, August 6-11, 2017:
The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning by Hantian Zhang, Jerry Li, Kaan Kara, Dan Alistarh, Ji Liu, Ce Zhang.
The following paper has been accepted at ACM SIGCOMM, Los Angeles, CA, USA, August 21-25, 2017.
“Beyond fat-trees without antennae, mirrors, and disco-balls” by Simon Kassing, Asaf Valadarsky, Gal Shahaf, Michael Schapira, Ankit Singla.
The following two papers have been presented at the 25th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2017) in Napa, CA, US, April 30 - May 2, 2017:
"Centaur: A Framework for Hybrid CPU-FPGA Databases" by Muhsen Owaida, David Sidler, Kaan Kara and Gustavo Alonso
"FPGA accelerated Dense Linear Machine Learning: A Precision-Convergence Trade-off" by Kaan Kara, Dan Alistarh, Ce Zhang, Onur Mutlu, Gustavo Alonso
Timothy Roscoe took part in a panel discussion on the “Internet and Trust”, the first in a series of talks based on the “Science in Perspective” course programme. More…
Abstract:
A secure and reliable internet generates trust. Just how the internet creates trust is a matter for debate between ETH computer scientists and social scientists – for example, in a public session on 9 May 2017. Trust, it is sometimes said, is almost more important to the economy and society than money. Unlike money, trust is not a neutral unit of value and comparison, but rather a relationship quality. When you give someone your trust, you grant them a certain degree of latitude to act and in return expect certain benefits. In this respect, trust facilitates communication and the acquisition of information.
The following demo co-authored by Ce Zhang has been accepted at VLDB 2017 Demonstration Track, Munich, Germany, August 28 to September 1, 2017.
MLog: Towards Declarative In-Database Machine Learning by Xupeng Li*, Bin Cui*, Yiru Chen*, Wentao Wu**, Ce Zhang.
*Peking University, China
**MSR Redmond, USA
Reto Achermann presented the following paper at MARS 2017 (2nd Workshop on Models for Formal Analysis of Real Systems), Uppsala, Sweden:
"Formalizing Memory Accesses and Interrupts" by Reto Achermann, Lukas Humbel, David Cock and Timothy Roscoe.
John Liagouris gave the talk entitled "Understanding Distributed Dataflow Systems" at Google, Mountain View and at VMware, Palo Alto.
Gustavo Alonso gave several talks at Intel Labs and Intel product groups in Portland, Oregon on using hybrid FPGA-CPU architectures for data processing as well as giving a talk on hardware acceleration for database engines at Microsoft Research Redmond, Washington.
The paper "A Cloud-based Content Gathering Network" by Debopam Bhattacherjee (ETH Zurich), Muhammad Tirmazi (LUMS), Ankit Singla (ETH Zurich) has been accepted at HotCloud 2017, Santa Clara, CA, July 10-11, 2017.
Onur Mutlu gave the keynote talk entitled "The RowHammer Problem and Other Issues We May Face as Memory Becomes Denser" at the Center for Advancing Electronics Dresden (CFED) Workshop on Resilient Systems (WRS), Dresden, Germany.
Abstract:
We will discuss the RowHammer problem in DRAM and how it poses a new system-wide security vulnerability. RowHammer is the phenomenon that repeatedly accessing a row in a modern DRAM chip causes errors in physically-adjacent rows. It is caused by a hardware failure mechanism called read disturb errors. Google Project Zero recently demonstrated that this hardware phenomenon can be exploited by user-level programs to gain kernel privileges. Several other recent works demonstrated other attacks exploiting RowHammer, including remote takeover of a server vulnerable to RowHammer. We will analyze the root causes of the problem and examine solution directions. We will also discuss what other problems may be lurking in DRAM and other types of memories, e.g., NAND flash and Phase Change Memory, which can potentially threaten the foundations of reliable and secure systems, as the memory technologies scale to higher densities.
Zaheer Chothia presented the following paper at EuroSys 2017 in Belgrade, Serbia:
Online Reconstruction of Structural Information from Datacenter Logs by Zaheer Chothia, John Liagouris, Desislava Dimitrova, and Timothy Roscoe.
Abstract:
Well-run datacenter application architectures are heavily instrumented to provide detailed traces of messages and remote invocations. Reconstructing user sessions, call graphs, transaction trees, and other structural information from these messages, a process known as sessionization, is the foundation for a variety of diagnostic, profiling, and monitoring tasks essential to the operation of the datacenter.
We present the design and implementation of a system which processes log streams at gigabits per second and reconstructs user sessions comprising millions of transactions per second in real time with modest compute resources, while dealing with clock skew, message loss, and other real-world phenomena that make such a task challenging. Our system is based on the Timely Dataflow framework for low latency, data-parallel computation, and we demonstrate its utility with a number of use-cases and traces from a large, operational, mission-critical enterprise data center.
Simon Kassing joined the Systems Group as a PhD student. Simon completed his Master Thesis in the Systems Group.
The following paper has been accepted at HotOS XVI, Whistler, British Columbia, Canada, May 7-10, 2017.
Separating Translation from Protection in Address Spaces with Dynamic Remapping by Reto Achermann, Chris Dalton (HP Labs), Paolo Faraboschi (HP Labs), Moritz Hoffman, Dejan Milojicic (HP Labs), Geoffrey Ndu (HP Labs), Alexander Richardson (University of Cambridge), Timothy Roscoe, Adrian L. Shaw (HP Labs), Robert N. M. Watson (University of Cambridge).
Onur Mutlu gave the keynote talk entitled "Rethinking Memory System Design (and the Computing Platforms We Design Around It)" at 13th International Symposium on Applied Reconfigurable Computing (ARC 2017) in Delft, The Netherlands.
Abstract:
The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM and flash technologies are experiencing difficult technology scaling challenges that make the maintenance and enhancement of their capacity, energy efficiency, and reliability significantly more costly with conventional techniques. In fact, recent reliability issues with DRAM, such as the RowHammer problem, are already threatening system security and predictability.
In this talk, we first discuss major challenges facing modern memory systems in the presence of greatly increasing demand for data and its fast analysis. We then examine some promising research and design directions to overcome these challenges and thus enable scalable memory systems for the future. We discuss three key solution directions: 1) enabling new memory architectures, functions, interfaces, and better integration of memory and the rest of the system, 2) designing a memory system that intelligently employs emerging non-volatile memory (NVM) technologies and coordinates memory and storage management, 3) reducing memory interference and providing predictable performance to applications sharing the memory system. If time permits, we will also touch upon our ongoing related work in combating scaling challenges of NAND flash memory.
An accompanying paper, slightly outdated (circa 2015), can be found here.
The paper "An Overreaction to the Broken Machine Learning Abstraction: The ease.ml Vision" by Ce Zhang, Wentao Wu (Microsoft Research) and Tian Li (Peking University) has been accepted at the Workshop on Human-In-the-Loop Data Analytics (HILDA 2017) co-located with SIGMOD 2017 (14 May 2017, Chicago).
The article entitled "Designing Databases for Future High-Performance Networks" by Claude Barthels, Gustavo Alonso and Torsten Hoefler has been published in IEEE Data Engineering Bulletin, Vol. 40, No. 1. (March 2017).
The following paper has been accepted at the 13th International Workshop on Data Management on New Hardware (DaMoN 2017) collocated with the 2017 ACM SIGMOD/PODS Conference, Chicago, IL, USA, May 14-19, 2017:
Scaling Column Imprints using Advanced Vectorization by Lefteris Sidirourgos (ETH) and Hannes Muhleisen (CWI).
John Liagouris gave the talk entitled "Understanding Distributed Dataflow Systems" at Harvard and Boston University this week.
Timothy Roscoe gave the talk entitled "Diagnosing Datacenters" at Amadeus Architecture, Quality & Governance.
The paper "Centaur: A Framework for Hybrid CPU-FPGA Databases" by Muhsen Owaida, David Sidler, Kaan Kara and Gustavo Alonso has been accepted at the 25th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2017) in Napa, CA, US, April 30 - May 2, 2017.
The paper entitled "Improving the Reliability of Chip-off Forensic Analysis of NAND Flash Memory Devices" by Aya Fukami (Japan National Police Agency); Saugata Ghose, Yixin Luo, Yu Cai (Carnegie Mellon University); and Onur Mutlu (ETH Zurich) has been selected as the BEST PAPER at the 4th European Digital Forensics Research Workshop and Conference (DFRWS 2017) held in Überlingen, Lake Constance, Germany.
The paper entitled "The Reach Profiler (REAPER): Enabling the Mitigation of DRAM Retention Failures via Profiling at Aggressive Conditions" by Minesh Patel, Jeremie Kim and Onur Mutlu has been accepted at the 44th International Symposium on Computer Architecture (ISCA 2017) in Toronto, Ontario, Canada, June 24-28, 2017.
The following paper has been accepted at EuroSys 2017 in Belgrade, Serbia, April 23-26, 2017:
Online Reconstruction of Structural Information from Datacenter Logs by Zaheer Chothia, John Liagouris, Desislava Dimitrova, and Timothy Roscoe.
The paper "FPGA accelerated Dense Linear Machine Learning: A Precision-Convergence Trade-off" by Kaan Kara, Dan Alistarh, Ce Zhang, Onur Mutlu, Gustavo Alonso has been accepted at the 25th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2017) in Napa, CA, US, April 30 - May 2, 2017.
The following demo has been accepted at SIGMOD 2017, Chicago, IL, May 14-19, 2017: "doppioDB: A Hardware Accelerated Database" by David Sidler, Zsolt Istvan, Muhsen Owaida, Kaan Kara, and Gustavo Alonso.
Polly Labs, the ETH / INRIA / ARM polyhedral compilation research lab, was selected as Google Summer of Code mentoring organization. As a result, Polly Labs will offer international Summer of Code students summer fellowships to work on polyhedral compilation! More...
Ce Zhang's collaboration with ETH astrophysicist Kevin Schawinski has been followed by the media since the paper "Generative Adversarial Networks recover features in astrophysical images of galaxies beyond the deconvolution limit" by Kevin Schawinski, Ce Zhang, Hantian Zhang, Lucas Fowler, and Gokula Krishnan was recently published in Monthly Notices of the Royal Astronomical Society.
The article "Machine Learning Is Bringing the Cosmos Into Focus" has been published in "The Atlantic".
More press coverage:
Oxford Journals: https://oxfordjournals.altmetric.com/details/16663618/news
Phys.org: https://phys.org/news/2017-02-neural-networks-sharpest-images.html
Project page: space.ml
Eleftherios Sidirourgos joined the Systems Group on 1 March 2017. Eleftherios comes to us from CWI Amsterdam.
Markus Pilman defended his PhD Dissertation "Tell: An Elastic Database System for Mixed Workloads".
The paper "Generative Adversarial Networks recover features in astrophysical images of galaxies beyond the deconvolution limit" by Kevin Schawinski, Ce Zhang, Hantian Zhang, Lucas Fowler, and Gokula Krishnan has been published in Monthly Notices of the Royal Astronomical Society.
Project page: space.ml
Tal Ben Nun joined the Systems Group as a Postdoc on 20 February 2017. Tal comes to us from the Hebrew University of Jerusalem.
The paper "Formalizing Memory Accesses and Interrupts" by Reto Achermann, Lukas Humbel, David Cock and Timothy Roscoe has been accepted at MARS 2017 (2nd Workshop on Models for Formal Analysis of Real Systems), Uppsala, Sweden, April 29, 2017.
Andrea Lattuada joined the Systems Group as a PhD student. Andrea completed his Master Thesis in the Systems Group.
The following five papers have been accepted at the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS 2017) in Orlando, Florida, USA, May 29 - June 2, 2017:
SlimSell: A Vectorized Graph Representation for Breadth-First Search by M. Besta, F. Marending, E. Solomonik, T. Hoefler
Transparent Caching for RMA Systems by S. Di Girolamo, F. Vella and T. Hoefler
Corrected Gossip Algorithms for Fast Reliable Broadcast on Unreliable Systems by T. Hoefler, A. Barak, A. Shiloh and Z. Drezner
Communication-Avoiding Parallel Algorithms for Solving Triangular Systems of Linear Equations by T. Wicky, E. Solomonik and T. Hoefler
Capability Models for Manycore Memory Systems: A Case-Study with Xeon Phi KNL by S. Ramos and T. Hoefler
Lois Orosa joined the Systems Group as an Academic Guest. Lois comes to us from the University of Campinas in Brazil.
Sebastian Wicki joined the Systems Group as a Scientific Collaborator. Sebastian completed his Master Thesis in our group.
Minesh Patel and Jeremie Kim joined the Systems Group as PhD students. Minesh comes to us from the University of Texas, Austin and Jeremie from Carnegie Mellon University, Pittsburgh.
Sabela Ramos started her new position at Google Zurich this week.
The following paper has been accepted at SIGMOD 2017 in Raleigh, NC, USA; May 14-19, 2017.
BatchDB: Efficient Isolated Execution of Hybrid OLTP+OLAP Workloads by Darko Makreshanski, Jana Giceva, Claude Barthels and Gustavo Alonso.
Vasiliki Kalavri joined the Systems Group as a Postdoc on 3 January 2017.
Jana Giceva defended her PhD Dissertation "Database/Operating system co-design".
Lucas Braun defended his PhD Dissertation "Confidentiality and Performance for Cloud Databases".
Hasan Hassan joined the Systems Group as a new PhD student. He completed his master thesis at TOBB University of Economics and Technology, Ankara, Turkey.
Besmira Nushi defended her PhD Dissertation "Quality Control and Optimization for Hybrid Crowd-Machine Learning Systems".
Besmira Nushi presented the paper "A Human-in-the-loop Approach for Troubleshooting Machine Learning Systems" (Besmira Nushi, Ece Kamar, Donald Kossmann and Eric Horvitz) at the Future of Interactive Learning Machines Workshop (FILM) collocated with NIPS 2016 in Barcelona, Spain.
The following short paper has been presented at the 25th International Conference on Field-Programmable Technology (FPT'16) in Xi'an, China: "Debugging Framework for FPGA-based Soft Processors" by David Sidler and Ken Eguro, Microsoft Research.
Ce Zhang gave the talk "Machine learning on the edge and beyond" at Schindler in Luzern.
The Systems Group has been chosen to participate in the Intel Hardware Accelerator Research Program which grants access to an Intel Xeon+FPGA system (Broadwell + Arria10) platform for research purposes. The grant is a continuation of a previous one around the HARPv1 system.
The following four papers have been presented at The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16) in Salt Lake City:
Scheduling-Aware Routing for Supercomputers by J. Domke and T. Hoefler
dCUDA: Hardware Supported Overlap of Computation and Communication by T. Gysi, J. Baer, T. Hoefler
Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide by W. Tang, B. Wang, S. Ethier, G. Kwasniewski, T. Hoefler, K. Ibrahim, K. Madduri, S. Williams, L. Oliker, C. Rosales-Fernandez, T. Williams
A PCIe Congestion-Aware Performance Model for Densely Populated Accelerator Servers by M. Martinasso, G. Kwasniewski, S. Alam, T. Schulthess, T. Hoefler.
The following two papers have been accepted at the 23rd IEEE International Symposium on High Performance Computer Architecture (HPCA), which will be held in Austin, TX, USA, February 4-8, 2017.
“SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies,” by Hasan Hassan*, Nandita Vijaykumar, Samira Khan, Saugata Ghose, Kevin Chang, Gennady Pekhimenko, Donghyuk Lee, Oguz Ergin, Onur Mutlu.
“Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques,” by Yu Cai, Saugata Ghose, Yixin Luo, Ken Mai, Onur Mutlu, Erich F. Haratsch.
*Hasan Hassan, the first author of the first paper, will join the Systems Group in December 2016.
Gustavo Alonso gave the talks "Data Processing on the Fast Lane" at Intel Santa Clara and at the Workshop on Reconfigurable Hardware for HPC held jointly with Supercomputing. He also gave the talk "Accelerating Database Engines with FPGAs" at Xilinx.
The following two papers have been accepted for publication at VLDB 2017:
"Distributed Join Algorithms on Thousands of Cores" by Claude Barthels, Ingo Müller, Timo Schneider, Gustavo Alonso, and Torsten Hoefler
"An Experimental Evaluation of SimRank-based Similarity Search Algorithms" by Zhipeng Zhang*, Yingxia Shao*, Bin Cui* and Ce Zhang
*Peking University
The paper "On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems" by Besmira Nushi, Ece Kamar, Eric Horvitz, Donald Kossmann has been accepted at The Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), which will be held February 4-9, 2017 in San Francisco.
The following three papers have been accepted at SIGMOD 2017, May 14-19, Raleigh, NC, USA:
"Accelerating Pattern Matching Queries in Hybrid CPU-FPGA Architectures" by David Sidler, Zsolt István, Muhsen Owaida, Gustavo Alonso
"Heterogeneity-aware Distributed Parameter Servers" by Jiawei Jiang*, Bin Cui*, Ce Zhang, Lele Yu* *(Peking University)
"FPGA Based Data Partitioning" by Kaan Kara, Jana Giceva, Gustavo Alonso
Ce Zhang gave the talk entitled "Accessible Data Sciences with Efficient Data Systems" at IBM Zurich Research Lab.
Abstract:
One important problem for the current state of data science is that many of the techniques needed to unleash the next big thing are available but still far from accessible. Specifically, the current machine-learning ecosystems are difficult for non-computer-science users to use and they are still far from achieving the full potential that can be provided by modern hardware. With more than five ongoing data sciences applications here at ETH Zurich, ranging from genomics and social sciences to astronomy, our dream is to design the next generation of data science ecosystems that are fast, scalable, and easier to use. In this talk, I will first describe the abundant opportunities for data sciences at ETH Zurich, and then describe two enabling techniques that are being developed by my group. The general direction of these techniques is the co-design of machine learning (or artificial intelligence) with modern hardware and systems. I will talk about our recent work that introduced a data structure for dense linear regression. It can potentially reduce the memory bandwidth by 20x while training. Then I will introduce a novel database architecture, which makes the production system of a leading security company 100x faster. It contains an SMT solver to answer queries that it was not originally designed for.
Ankit Singla presented the paper "Fat-FREE topologies" at The Fifteenth ACM Workshop on Hot Topics in Networks (HotNets 2016) in Atlanta, Georgia, USA.
Abstract:
With the growing size of data center networks, full-bandwidth connectivity between all pairs of servers is becoming difficult and expensive to scale. Thus, numerous recent topology proposals incorporate reconfigurable wireless and optical connectivity, allowing the topology to adapt to the traffic demands: only servers that require bandwidth at any given time receive such dynamic connections. Implicitly, this work has suggested that statically wired topologies are fundamentally inflexible, and would need to be built at full capacity to handle unpredictably skewed traffic. This work shows the reports of inflexibility of statically wired networks to be greatly exaggerated: if traffic engineering were efficient, certain statically wired networks could achieve performance and cost comparable to topology-adaptive designs, even for skewed workloads. Thus, alongside the development of reconfigurable topologies, the community should also invest in developing superior traffic engineering over static networks other than fat-trees as an alternate path forward. These results also call for a rigorous quantification of the difference between the power of two techniques for handling dynamic, unpredictable traffic with limited network resources: traffic engineering over suitable static networks, and changing the topology itself dynamically.
Andrei Marian Dan, Patrick Lam, Torsten Hoefler, and Martin Vechev won the Distinguished Paper Award at The International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2016) for their paper “Modeling and Analysis of Remote Memory Access Programming”.
Jana Giceva gave the talk entitled "Customized OS support for data processing on modern hardware" at Massachusetts Institute of Technology (MIT).
The paper entitled "MQJoin: Efficient Shared Execution of Main-memory Joins" by Darko Makreshanski, Georgios Giannikis, Gustavo Alonso and Donald Kossmann has been selected as one of the Best Papers of VLDB 2016.
Abstract:
Database architectures typically process queries one-at-a-time, executing concurrent queries in independent execution contexts. Often, such a design leads to unpredictable performance and poor scalability. One approach to circumvent the problem is to take advantage of sharing opportunities across concurrently running queries. In this paper we propose Many-Query Join (MQJoin), a novel method for sharing the execution of a join that can efficiently deal with hundreds of concurrent queries.
This is achieved by minimizing redundant work and making efficient use of main-memory bandwidth and multi-core architectures. Compared to existing proposals, MQJoin is able to efficiently handle larger workloads regardless of the schema by exploiting more sharing opportunities. We also compared MQJoin to two commercial main-memory column-store databases. For a TPC-H based workload, we show that MQJoin provides 2-5x higher throughput with significantly more stable response times.
The paper entitled "Continuous Runahead: Transparent Hardware Acceleration for Memory Intensive Workloads" by Milad Hashemi, Onur Mutlu, and Yale N. Patt has been selected as one of the 6 best papers at MICRO 2016.
Onur Mutlu gave the talk entitled "Rethinking Memory System Design (and the Computing Platforms We Design Around It)" at IBM Research Zurich.
Abstract:
The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM and flash technologies are experiencing difficult technology scaling challenges that make the maintenance and enhancement of their capacity, energy efficiency, and reliability significantly more costly with conventional techniques. In fact, recent reliability issues with DRAM, such as the RowHammer problem, are already threatening system security and predictability.
In this talk, we first discuss major challenges facing modern memory systems in the presence of greatly increasing demand for data and its fast analysis. We then examine some promising research and design directions to overcome these challenges and thus enable scalable memory systems for the future. We discuss three key solution directions: 1) enabling new memory architectures, functions, interfaces, and better integration of memory and the rest of the system, 2) designing a memory system that intelligently employs emerging non-volatile memory (NVM) technologies and coordinates memory and storage management, 3) reducing memory interference and providing predictable performance to applications sharing the memory system. If time permits, we will also touch upon our ongoing related work in combating scaling challenges of NAND flash memory.
An accompanying paper, slightly outdated (circa 2015), can be found here.
Besmira Nushi presented the paper entitled "Learning and Feature Selection under Budget Constraints in Crowdsourcing" (Besmira Nushi, Adish Singla, Andreas Krause and Donald Kossmann) at HCOMP 2016, in Austin, TX, USA.
Yaohua Wang, Postdoc from the National Univ. of Defense Technology in Changsha, China, joined the Systems Group as Academic Guest. Yaohua will be staying with us until October 2017.
Reto Achermann presented the following paper at 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016) in Savannah, GA, USA.
"Machine-Aware Atomic Broadcast Trees for Multicores" by Stefan Kaestle, Reto Achermann, Roni Haecki, Moritz Hoffmann, Sabela Ramos, and Timothy Roscoe.
John Liagouris gave the talk “Explaining Outputs in Modern Data Analytics” at the Department of Computer Science, University of Hong Kong (HKU).
Torsten Hoefler gave the keynote talk "Theory and Practice in HPC: Modeling, Programming, and Networking" at HPC China 2016 in Xi'an, China.
Abstract:
We advocate the usage of mathematical models and abstractions in practical high-performance computing. For this, we show a series of examples and use-cases where the abstractions introduced by performance models can lead to clearer pictures of the core problems and often provide non-obvious insights. We start with models of parallel algorithms leading to close-to-optimal practical implementations. We continue our tour with distributed-memory programming models that provide various abstractions to application developers. A short digression on how to measure parallel systems shows common pitfalls of practical performance modeling. Application performance models based on such accurate measurements support insight into the resource consumption and scalability of parallel programs on particular architectures. We close with a demonstration of how mathematical models can be used to derive practical network topologies and routing algorithms. In each of these areas, we demonstrate newest developments but also point to open problems. All these examples testify to the value of modeling in practical high-performance computing. We assume that a broader use of these techniques and the development of a solid theory for parallel performance will lead to deep insights at many fronts.
Ce Zhang gave the talk entitled "Accessible Data Sciences with Efficient Data Systems" at the following universities in China:
Timothy Roscoe gave the talk "Barrelfish: An OS for Real, Modern Hardware" at Stanford Cloud Workshop 2016.
Jana Giceva gives the talk entitled "Customized OS support for data processing on modern hardware" at the SFB Colloquium (TU Dortmund).
Onur Mutlu gave the keynote talk “Rethinking Memory System Design for Data-Intensive Computing: Business as Usual in the Next Decade?” at the 2nd EAI International Conference on Future Access Enablers of Ubiquitous and Intelligent Infrastructures (FABULOUS 2016) in Belgrade, Serbia.
The paper entitled "READY: Completeness is in the Eye of the Beholder" by Badrish Chandramouli, Johannes Gehrke, Jonathan Goldstein, Donald Kossmann, Justin Levandoski, Renato Marroquin and Wenlei Xie has been accepted at CIDR 2017.
The paper entitled "Analytics on Fast Data: Main-Memory Database Systems versus Modern Streaming Systems" by Andreas Kipf, Varun Pandey, Jan Böttcher, Lucas Braun, Thomas Neumann and Alfons Kemper has been accepted at EDBT 2017.
Ce Zhang gave the talk entitled "Data Science Applications and Machine Learning/System Co-Design" at Huawei European Research Center, Munich, Germany.
Debopam Bhattacherjee joined the Systems Group as PhD Student. Debopam comes to us from KTH Royal Institute of Technology and Aalto University.
The paper entitled "A Survey of Microarchitectural Timing Attacks and Countermeasures on Contemporary Hardware" by Qian Ge, Yuval Yarom, David Cock and Gernot Heiser, will be published in the Journal of Cryptographic Engineering (JCEN).
Abstract: Microarchitectural timing channels expose hidden hardware state through timing. We survey recent attacks that exploit microarchitectural features in shared hardware, especially as they are relevant for cloud computing. We classify types of attacks according to a taxonomy of the shared resources leveraged for such attacks. Moreover, we take a detailed look at attacks used against shared caches. We survey existing countermeasures. We finally discuss trends in the attacks, challenges to combating them, and future directions, especially with respect to hardware support.
Ankit Singla gave his inaugural lecture entitled "Internet at the Speed of Light".
Onur Mutlu gave the keynote talk "Rethinking Memory System Design" at 27th International Symposium on Rapid System Prototyping (RSP 2016) in Pittsburgh, PA, USA.
Abstract:
The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM and flash technologies are experiencing difficult technology scaling challenges that make the maintenance and enhancement of their capacity, energy efficiency, and reliability significantly more costly with conventional techniques. In fact, recent reliability issues with DRAM, such as the RowHammer problem, are already threatening system security, predictability and robustness.
In this talk, we first discuss major challenges modern memory systems face in the presence of increasing demand for data and its fast analysis. We then examine some promising research and design directions to overcome these challenges and enable scalable memory systems for the future. We discuss three key solution directions: 1) enabling new memory architectures, functions, interfaces, and better integration of memory and the rest of the system, 2) designing a memory system that intelligently employs emerging non-volatile memory (NVM) technologies, 3) reducing memory interference and providing predictable performance to applications sharing the memory system. If time permits, we will also touch upon our ongoing related work in combating scaling challenges of NAND flash memory.
The paper entitled "Machine-Aware Atomic Broadcast Trees for Multicores" by Stefan Kaestle, Reto Achermann, Roni Haecki, Moritz Hoffmann, Sabela Ramos, and Timothy Roscoe has been accepted at the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016) in Savannah, GA, USA, November 2–4, 2016.
The Interview with Torsten Hoefler entitled "How to program large-scale heterogeneous parallel computers" has been published as Spotlight on the ETH Department of Computer Science Web page.
Stefan Kästle defended his PhD Dissertation entitled "Machine-aware memory allocation and synchronization".
Stefan took up a position with Oracle Labs, Zürich.
Tobias Grosser received an SNSF Ambizione Fellowship to work for the next three years on heterogeneous compute in Polly (http://polly.llvm.org/).
Onur Mutlu gave the keynote talk entitled "Rethinking Memory System Design" at the 2nd Workshop on Mobile System Technologies (MST) in Milan.
Abstract:
The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM and flash technologies are experiencing difficult technology scaling challenges that make the maintenance and enhancement of their capacity, energy efficiency, and reliability significantly more costly with conventional techniques. In fact, recent reliability issues with DRAM, such as the RowHammer problem, are already threatening system security and predictability.
In this talk, we first discuss major challenges facing modern memory systems in the presence of greatly increasing demand for data and its fast analysis. We then examine some promising research and design directions to overcome these challenges and thus enable scalable memory systems for the future. Specifically, we discuss three key solution directions: 1) enabling new memory architectures, functions, interfaces, and better integration of the memory and the rest of the system, 2) designing a memory system that intelligently employs multiple memory technologies and coordinates memory and storage management using non-volatile memory technologies, 3) providing predictable performance and QoS to applications sharing the memory/storage system. If time permits, we might also briefly touch upon our ongoing related work in combating scaling challenges of NAND flash memory.
An accompanying paper, slightly outdated, can be found here.
The paper "Fat-Free Topologies" by Ankit Singla has been accepted at HotNets 2016.
Two more Systems Group talks have been given at ARM Research Summit in Cambridge, UK:
Tobias Grosser gave the talk entitled "Polly-ACC: Transparent Compilation to Heterogeneous Hardware"
David Cock gave the poster presentation entitled "Barrelfish on ARM at ETH".
Anja Grünheid defended her PhD Dissertation entitled “Data Integration with Dynamic Data Sources”.
Anja took up a position with Google, New York.
Onur Mutlu gave the keynote talk entitled "Rethinking Memory System Design: Business as Usual in the Next Decade?" at the inaugural ARM Research Summit in Cambridge, UK.
Timothy Roscoe gave the talk entitled "Barrelfish: an OS for real, modern hardware" at Cavium in San Jose. At Google in Mountain View he gave the talk entitled "Online simulation as a diagnostic tool for datacenters".
Gustavo Alonso gave the keynote talk "Accelerating Data Science" at the Fourth International Workshop on In-Memory Data Management and Analytics (IMDM 2016) collocated with VLDB 2016 in New Delhi.
Abstract:
Data science or big data, whatever one wants to call it, raises important challenges in terms of efficient processing. On the one hand, the application demands are becoming more stringent (more data, more complex analysis, faster results, larger workloads, etc.). On the other hand, hardware and computing platforms are in a complex phase with little stability in terms of architectures and lacking an overall direction. In this talk I will discuss the problem, arguing that there is an opportunity for specialized designs departing from general purpose systems. I will illustrate the point with examples from our research and then show how we are exploiting reconfigurable hardware (FPGAs) to explore a wide range of architectural designs, new algorithms for data processing, and redesigning the entire system stack to better support data science. The talk will conclude with a number of ideas on how the database community can contribute to the development of new hardware and how to orchestrate a more coherent, collective research agenda.
The following two Systems Group papers have been presented at VLDB 2016 in New Delhi:
Darko Makreshanski presented: "MQJoin: Efficient Shared Execution of Main-Memory Joins" by Darko Makreshanski, Georgios Giannikis, Gustavo Alonso and Donald Kossmann.
John Liagouris presented: "Explaining Outputs in Modern Data Analytics" by Zaheer Chothia, John Liagouris, Frank McSherry and Timothy Roscoe.
Ce Zhang talks about his first impressions and his plans at ETH in the "Spotlight" interview published on the ETH Department of Computer Science Web page.
The following two short papers have been presented at 26th International Conference on Field-Programmable Logic and Applications (FPL'16) in Lausanne:
David Sidler presented: "Low-Latency TCP/IP Stack for Data Center Applications" by David Sidler, Zsolt István and Gustavo Alonso.
Kaan Kara presented: "Fast and Robust Hashing for Database Operators" by Kaan Kara and Gustavo Alonso.
Stefan Müller defended his PhD Dissertation entitled "Astronomy and Computing".
Stefan took up a position with Amazon, Vancouver.
Ce Zhang has joined the Systems Group as Assistant Professor.
Zhang's research interests are databases, data processing, machine learning and data science. His applications focus on systems to help scientists analyse and understand large quantities of data. He links conventional areas of databases and information retrieval with new methods of machine learning.
Gustavo Alonso gave a keynote talk entitled "Data Processing on the Fast Lane" at 26th International Conference on Field-Programmable Logic and Applications (FPL 2016) in Lausanne.
Abstract:
Data processing is changing in radical ways. On the one hand, data science and big data have brought an unprecedented growth and variety in data sizes, demanding workloads, data types, and applications. On the other hand, hardware is no longer a source of performance as it has been in the last decades. Instead, it has become a complex, fast evolving, highly specialized, and heterogeneous platform that requires considerable tuning and effort to use optimally. In this talk I will discuss the problem, arguing that there is an opportunity for specialized designs based on FPGAs and showing the challenges to data processing resulting from modern hardware. I will illustrate the points with examples from research and recent developments from industry to argue there is a significant opportunity for FPGAs in data centers if one focuses on the correct problems and finds the proper architecture for the complete system.
Timo Schneider received the Best Student Paper Award at Hot Interconnects 2016 for the paper "Ensuring Deadlock-Freedom in Low-Diameter InfiniBand Networks" by Timo Schneider, Otto Bibartiu, Torsten Hoefler.
The paper entitled "Learning and Feature Selection under Budget Constraints in Crowdsourcing" by Besmira Nushi, Adish Singla, Andreas Krause and Donald Kossmann has been accepted at HCOMP 2016, in Austin, TX, USA, October 30 - November 3, 2016.
Vojislav Dukic joined the Systems Group as PhD Student on 15 August 2016.
Oracle Labs Invited Talk by Claude Barthels
"Distributed Join Algorithms on a Thousand Cores"
2 August 2016, 6.00 - 7.00 pm
CAB E 72
Timothy Roscoe interviewed in "The Register" on 29 July 2016
"The bigger they get, the harder we fall: Thinking our way out of cloud crash"
Arash Tavakkol joined the Systems Group as Postdoc on 2 August 2016.
Ce Zhang, from Stanford University, will join the Computer Science Department and the Systems Group as assistant professor in September 2016.
The paper "So many performance events, so little time" by Gerd Zellweger, Denny Lin and Timothy Roscoe has been accepted at the 7th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys 2016) in Hong Kong, China, August 4-5, 2016.
The paper entitled "Measuring and Understanding Throughput of Network Topologies" by Sangeetha Abdu Jyothi, Ankit Singla, P. Brighten Godfrey, and Alexandra Kolla has been accepted at Super Computing 2016 in Salt Lake City, Utah, USA, 13-18 Nov. 2016.
Timothy Roscoe gives the following course at the Twelfth International Summer School on Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems (ACACES 2016) in Fiuggi, Italy:
What's happening to computer hardware, and what does it mean for the OS?
David Sidler and Kaan Kara are attending the eleventh Microsoft Research Summer School at Microsoft Research in Cambridge, UK.
Jana Giceva presented the paper entitled "Customized OS support for data processing" (Jana Giceva, Gerd Zellweger, Gustavo Alonso and Timothy Roscoe) at the Twelfth International Workshop on Data Management on New Hardware (DaMoN 2016) in San Francisco.
Gustavo Alonso talks about the collaboration with Microsoft Research - Swiss Joint Research Center in an interview published as a Departmental Spotlight.
The following two short papers have been accepted at 26th International Conference on Field-Programmable Logic and Applications (FPL'16):
"Low-Latency TCP/IP Stack for Data Center Applications" by David Sidler, Zsolt Istvan and Gustavo Alonso.
"Fast and Robust Hashing for Database Operators" by Kaan Kara and Gustavo Alonso.
The paper "Explaining Outputs in Modern Data Analytics" by Zaheer Chothia, John Liagouris, Frank McSherry and Timothy Roscoe has been accepted for presentation at VLDB 2016 in New Delhi, India, Sept. 5 - 9, 2016.
Jana Giceva is presenting her work on "Basslet: customized OS support for data processing on modern hardware" to our industry partners in the US.
Timothy Roscoe gives a talk entitled "Barrelfish: an operating system for modern hardware" at USENIX Vail Computer Elements Workshop (VCEW 2016) in Vail, CO, USA.
Gustavo Alonso gave the Colloquium talk entitled “Accelerating Data Science” at Technical University of Munich (TUM).
Abstract:
Data science or big data, whatever one wants to call it, raises important challenges in terms of efficient processing. On the one hand, the application demands are becoming more stringent (more data, more complex analysis, faster results, larger workloads, etc.). On the other hand, hardware and computing platforms are in a complex phase with little stability in terms of architectures and lacking an overall direction. In this talk I will discuss the problem, arguing that there is an opportunity for specialized designs departing from general purpose systems. I will illustrate the point with examples from our research and then show how we are exploiting reconfigurable hardware (FPGAs) to explore a wide range of architectural designs, new algorithms for data processing, and redesigning the entire system stack to better support data science. The talk will conclude with a number of ideas on how the database community can contribute to the development of new hardware and how to orchestrate a more coherent, collective research agenda.
Ankit Singla talks about his plans at ETH in the "Spotlight" interview published on the ETH Department of Computer Science Web page.
Patrick Schmid, Maciej Besta and Torsten Hoefler received the first Karsten Schwan Best Paper Award for their paper "High-Performance Distributed RMA Locks" at ACM HPDC'16 in Kyoto.
Gustavo Alonso gave a talk entitled "Accelerating Data Science" at the 2nd Portugal|UT Austin Summer School in Systems and Networking in Costa da Caparica, Lisbon, Portugal.
Timothy Roscoe moderates the "Future of the Internet" Seminar at Google UK in London. The Seminar is one of the Science & Technology events at "Zürich meets London".
Moritz Hoffmann joined HPE Labs in Palo Alto, CA, USA for an Internship until September 16th, 2016.
The paper entitled "Customized OS support for data processing" by Jana Giceva, Gerd Zellweger, Gustavo Alonso and Timothy Roscoe was accepted at the Twelfth International Workshop on Data Management on New Hardware (DaMoN 2016).
Onur Mutlu joined the Systems Group as a Full Professor.
Onur conducts research in computer architecture, systems, and bioinformatics. His work spans and stretches the boundaries between applications, systems, languages, system software, compilers, and hardware. His research tackles many issues in high performance, energy efficiency, hardware security, fault tolerance, predictable systems, dependable systems, and hardware/software cooperation. He is especially excited about novel computation, communication and memory/storage paradigms, applied to emerging systems, technologies, and bioinformatics/medical applications. He is also excited about system design for bioinformatics and biologically inspired computing paradigms.
Gustavo Alonso gave the following talk at École polytechnique fédérale de Lausanne (EPFL):
Abstract:
David Sidler presents the paper entitled "Runtime Parameterizable Regular Expression Operators for Databases" by Zsolt Istvan, David Sidler and Gustavo Alonso at FCCM 2016 in Washington DC on May 2, 2016.
Timothy Roscoe organizes "The Future of the Internet" seminar, one of the Science & Technology events at "Zürich meets London". The "Future of the Internet" workshop takes place on May 17 at Google UK with researchers from Zurich and London and focuses on challenges and opportunities in a constantly evolving internet world. More information here
Besmira Nushi gave a talk entitled “Crowdsourcing on a budget: Plan early, plan twice!” at Microsoft Research Redmond. She presented her work on learning and making predictions from crowdsourced data under budget constraints.
Pravin Shinde defended his PhD Dissertation entitled "Rethinking host network stack architecture using a dataflow modeling approach".
Pravin took up a position with Oracle Labs, Zürich.
Ankit Singla talks about the Internet at the speed of light (Ein Lichtgeschwindigkeits-Internet) in Electro Suisse Bulletin.
Markus Pilman accepted a Software Engineer position at Snowflake Computing in San Mateo, CA, USA, starting May 2, 2016.
The Systems Group is presenting at two Workshops at EuroSys 2016 in London, April 18-21, 2016:
Jana Giceva: Basslet: an OS Runtime for Parallel Data Processing (Jana Giceva, Gerd Zellweger, Gustavo Alonso and Timothy Roscoe) - Workshop on Multicore and Rack-scale Systems (MaRS'16)
David Cock: Litmus Testing at Rack Scale - Workshop on Multicore and Rack-scale Systems (MaRS'16)
Roni Häcki: Replication in Rack-scale Systems - 10th EuroSys Doctoral Workshop (EuroDW'16)
Gerd Zellweger was one of the presenters of the following paper at ASPLOS 2016 in Atlanta, Georgia, USA:
SpaceJMP: Programming with Multiple Virtual Address Spaces by Izzat El Hajj (University of Illinois at Champaign-Urbana), Alexander Merritt (Georgia Institute of Technology), Gerd Zellweger (ETH Zürich), Dejan Milojicic (HP Labs), Wen-Mei Hwu (University of Illinois at Champaign-Urbana), Karsten Schwan (Georgia Institute of Technology), Timothy Roscoe, Reto Achermann (ETH Zürich), and Paolo Faraboschi (HP Labs)
Gustavo Alonso gave a keynote talk entitled "Generalization versus Specialization in cloud computing infrastructures" at IEEE International Conference on Cloud Engineering (IC2E) 2016 in Berlin.
Abstract:
Cloud computing represents a fundamental change in the business model behind IT: a shift from manufacturing of software and hardware products towards packaging infrastructure, processing, and storage as services. Cloud data centers, given their intended use for general purpose computing, would seem to push towards homogeneity in architectures and platforms. Modern applications and use cases, from scientific computing to big data, push in exactly the opposite direction: an increase in specialization as a way to efficiently meet demanding requirements. In this talk I will illustrate both trends and argue that, contradictory as they seem to be, there are many opportunities in combining them. Doing so requires work in two areas. One is to find better ways to extend the performance and efficiency advantages of specialization to general purpose settings. The other is to develop the necessary software and hardware layers to allow generalized use of specialized systems. Taken together, these efforts outline an exciting research and development landscape that I will outline as a conclusion of the talk.
Anja Grünheid gave a talk entitled "Data Integration with Dynamic Data Sources" at IBM Research in Rüschlikon.
Zsolt István presented the following paper at NSDI 2016 in Santa Clara, CA:
Consensus in a Box: Inexpensive Coordination in Hardware by Zsolt István, David Sidler, and Gustavo Alonso, ETH Zürich; Marko Vukolić, IBM Research—Zürich.
Gustavo Alonso gave a keynote talk entitled "Data Processing in Modern Hardware" at EDBT/ICDT 2016 in Bordeaux, France.
Abstract:
Data processing is changing in radical ways. On the one hand, data science and big data have brought an unprecedented growth and variety in data sizes, demanding workloads, data types, and applications. On the other hand, hardware is no longer a source of performance as it has been in the last decades. Instead, it has become a complex, fast evolving, highly specialized, and heterogeneous platform that requires considerable tuning and effort to use optimally. In this talk I will discuss the issues in data processing that arise as a result of modern hardware: the need to deal with parallelism and distribution, the increasing importance of networking, the proliferation of accelerators, and the rise of heterogeneity in the machine. These issues are both a threat and a challenge, demanding a radical redesign of many aspects of data processing and database engines. Using examples from recent work, I will present several exciting and radically new directions that are opening up for database research.
David Sidler gave the following talk at Oracle Labs (Redwood City, CA) and at Intel Labs (Santa Clara, CA): "Accelerating String Matching Queries with Hybrid CPU-FPGA Multicores".
Zsolt István gives the following talk at IBM Research and at Xilinx (San Jose, CA):
"Consensus in a Box: Inexpensive Coordination in Hardware"
Timothy Roscoe gives the following talk at VMware in Palo Alto, CA, USA:
Title: "Physical Memory Management for Modern Hardware"
Abstract:
Classical VM is an opaque abstraction of RAM, backed by demand paging. However, most systems today (from phones to data-centers) do not page, and indeed may require the performance benefits of non-paged physical memory, precise NUMA allocation, and distinguish very different kinds of memory (such as NVRAM). Moreover, MMU hardware is now useful for other purposes, such as detecting page access or providing large page translation. Accordingly, the venerable VM abstraction in OSes like Windows and Linux has acquired a plethora of extra APIs to poke at the policy behind the illusion of a virtual address space.
In the Barrelfish OS, we rethink how an OS supports virtual memory. The Barrelfish memory system inverts the traditional model: applications explicitly manage their physical RAM of different types, and directly (though safely) program the translation hardware. The approach requires no virtualization support, and outperforms VMM-based approaches for all but the smallest working sets. I'll talk about use-cases for virtual memory not possible in Linux-like systems today, and show that other use-cases are simple to program and significantly faster.
The following two talks will be presented at the Workshop on Multicore and Rack-scale Systems (MaRS 2016):
"Basslet: an OS runtime for parallel data processing" by Jana Giceva, Gerd Zellweger, Gustavo Alonso, Timothy Roscoe
"Litmus Testing at Rack Scale" by David Cock
Lukas Mosimann joined the Systems Group as a new PhD student of Prof. Torsten Hoefler.
The paper entitled "Runtime Parameterizable Regular Expression Operators for Databases" by Zsolt Istvan, David Sidler and Gustavo Alonso has been accepted at FCCM 2016 in Washington DC, May 1-3.
Gustavo Alonso gave a talk entitled "Databases in the Cloud: they have to be different" at the "Oracle Labs Workshop on Databases in the Cloud" (TU Munich, Germany), Feb 10-11, 2016.
Torsten Hoefler and Zsolt István give talks at the Workshop of the Swiss Joint Research Center (JRC) MSR-ETHZ-EPFL at ETH Zürich, February 02-03, 2016:
"Availability and Reliability as a Resource for Large-Scale in Memory DBs on Datacenter Computers" by Torsten Hoefler
"Building Distributed Applications on Clusters of FPGAs" by Zsolt István
Lukas Humbel joined the Systems Group as a PhD student. Lukas completed his master's thesis in the Systems Group.
Timothy Roscoe participated as an invited panelist at the Microsoft Research Academic Research Summit in Pune, India.
The following demo has been accepted at ICDE 2016, Helsinki, Finland, May 16-20, 2016:
"graphVizdb: A Scalable Platform for Interactive Large Graph Visualization" by Nikos Bikakis (NTU Athens), John Liagouris (ETH Zürich), Maria Krommyda (NTU Athens), George Papastefanatos (IMIS, Research Center 'Athena'), Timos Sellis (RMIT University).
Frank McSherry is one of the TCC 2016 Test-of-Time Award recipients for the paper: Calibrating Noise to Sensitivity in Private Data Analysis, published in TCC 2006:
Calibrating Noise to Sensitivity in Private Data Analysis by Cynthia Dwork (Microsoft Research), Frank McSherry, Kobbi Nissim (Ben-Gurion University), and Adam Smith (Penn State).
Ankit Singla joined the Systems Group as an Assistant Professor. More information about Ankit Singla can be found here.
Gustavo Alonso gives talks on data processing and modern hardware at Hong Kong's Polytechnic University and the University of Science and Technology.
We have welcomed two new Systems Group members today:
Adam Turowski - Software Engineer
Ingo Müller - PostDoc
The paper entitled "MQJoin: Efficient Shared Execution of Main-Memory Joins" by Darko Makreshanski, Georgios Giannikis, Gustavo Alonso, Donald Kossmann has been accepted at VLDB 2016 in New Delhi, India, Sept. 5 - 9, 2016.
Torsten Hoefler's project about weather and climate models has been broadcast on BBC Arabic.
Pratanu Roy defended his PhD Dissertation entitled "Data Analytics in a Data- and Hardware-Conscious Way"
Pratanu will join Oracle Labs in February 2016.
The paper entitled "Zookeeper in a Box: Hardened Consensus for Inexpensive Coordination" by Zsolt István, David Sidler, Gustavo Alonso and Marko Vukolic (IBM Zurich) has been accepted at NSDI 2016, Santa Clara, CA, USA, March 16-18, 2016.
Three new members joined the Systems Group today:
Muhsen Owaida - new Postdoc
Kaan Kara - new PhD student
Roni Häcki - new PhD student
Our alumnus, Michael Duller, joined Datometry, a start-up company in San Francisco.
Devesh Agrawal joined the Systems Group as a new PhD student. He completed his master thesis at University of Massachusetts, Amherst, MA, USA.
The following paper has been accepted at ASPLOS 2016 in Atlanta, Georgia, USA, April 2-6, 2016:
"SpaceJMP: Programming with Multiple Virtual Address Spaces" by Izzat El Hajj (University of Illinois at Urbana-Champaign), Alexander Merritt (Georgia Institute of Technology), Gerd Zellweger (ETH Zürich), Dejan Milojicic (HP Labs), Wen-Mei Hwu (University of Illinois at Urbana-Champaign), Karsten Schwan (Georgia Institute of Technology), Timothy Roscoe (ETH Zürich), Reto Achermann (ETH Zürich), and Paolo Faraboschi (HP Labs).
The following paper has been accepted at SIGMOD 2016 in San Francisco, USA, June 26th - July 1st, 2016:
"Augmented Sketch: Faster and More Accurate Stream Processing" by Pratanu Roy, Arijit Khan, and Gustavo Alonso.
Besmira Nushi presented the paper "Crowd Access Path Optimization: Diversity Matters" (Besmira Nushi, Adish Singla, Anja Gruenheid, Erfan Zamanian, Andreas Krause, Donald Kossmann) at the Conference on Human Computation & Crowdsourcing (HCOMP 2015), in Kona Kai Resort and Marina, San Diego, USA.
Torsten Hoefler won the 2015 Latsis Prize for his contributions to performance modelling, simulation, and optimization of large-scale parallel applications; topologies, routing, and host interfaces of large-scale networks; and advanced parallel programming techniques and runtime environments.
The article entitled "Computing for Climate (part 1): Evolution of Models" by Torsten Hoefler, Christoph Schär (ETH Zurich) and Oliver Fuhrer (MeteoSchweiz) was published in ETH Zukunftsblog.
Jana Giceva, Torsten Hoefler, and Gustavo Alonso are participating and presenting this week at the Dagstuhl workshop on Rack-Scale Computing. Jana presented her research on workload scheduling and Gustavo discussed workloads for rack-scale computing.
Nadia Mouci joined the Systems Group today as our new administrative assistant.
Timothy Roscoe gave the talk entitled "Rethinking OS design for future hardware" at Apple, Cupertino, CA, USA and at Xilinx, San Jose, CA, USA.
The following Systems Group papers have been presented at VLDB 2015 in Kohala Coast, Hawai'i, August 31-Sept. 4, 2015:
Zsolt István and David Sidler presented their demo "Building a Distributed Key-value Store with FPGA-based Microservers" at FPL 2015 (International Conference on Field-programmable Logic and Applications) in London, Sept. 2-4, 2015.
Daniela Dorneanu took up a Solution Developer and Trainer position at Appway, Zürich.
Marcos Vaz Salles, former PhD student of the Systems Group, has been promoted to associate professor at DIKU, Denmark.
The following PhD students left for their internships this summer:
Gerd Zellweger – HP, Palo Alto, CA, USA
Anja Grünheid – Microsoft Research, Redmond, WA, USA
Renato Marroquin – NASA Jet Propulsion Laboratory, Pasadena, CA, USA
David Sidler – Microsoft Research, Redmond, WA, USA
Besmira Nushi – Microsoft Research, Redmond, WA, USA
Stefan Kaestle presented the following paper at USENIX ATC in Santa Clara, CA, USA:
"Shoal: Smart Allocation and Replication of Memory For Parallel Programs" by Stefan Kaestle, Reto Achermann, Timothy Roscoe, and Tim Harris (Oracle Labs, Cambridge).
The paper entitled "Top-k Reliable Edge Colors in Uncertain Graphs" by Arijit Khan, Francesco Gullo (Yahoo! Labs Barcelona), Thomas Wohler (ETH), and Francesco Bonchi (Yahoo! Labs Barcelona) has been accepted as a short paper at the CIKM 2015 in Melbourne, Australia, Oct 19-23, 2015.
Stefan Müller has been awarded computing resources at Amazon Web Services for his research on Pydron. The grant by the Square Kilometre Array and Amazon will be used to explore how Pydron can make it easier for astronomers to work with very large data sets.
Frank McSherry joined the Systems Group today as our new Senior Scientist. His research will be in the area of Big Data system design and implementation.
The paper entitled "Crowd Access Path Optimization: Diversity Matters" by Besmira Nushi, Adish Singla, Anja Gruenheid, Erfan Zamanian (Brown University), Andreas Krause, and Donald Kossmann has been accepted at the Conference on Human Computation & Crowdsourcing, November 8-11, 2015, Kona Kai Resort and Marina, San Diego, USA.
The following tutorial by Arijit Khan and Lei Chen (Hong Kong University of Science and Technology) has been accepted at VLDB 2015, August 31-Sept. 4, 2015, Kohala Coast, Hawai'i.
The paper "Callisto-RTS: Fine-Grain Parallel Loops" by Tim Harris (Oracle Labs) and Stefan Kaestle (ETH Zürich) has been accepted at USENIX 2015, which will be held in Washington DC, August 12-14, 2015.
Arijit Khan will join the School of Computer Engineering (SCE) at NTU Singapore as a tenure-track assistant professor starting December 22, 2015.
The paper entitled "CrowdSTAR: A Social Task Routing Framework for Online Communities" by Besmira Nushi, Omar Alonso (Microsoft), Martin Hentschel (SnowFlake Computing) and Vasileios Kandylas (Microsoft) has been accepted at ICWE 2015.
Zsolt István gives the talk entitled "Better Energy Efficiency and Performance in Datacenters through Specialization" at Microsoft Research (Redmond, USA) and at University of Washington.
The paper 'Indexing and Selecting Hierarchical Business Logic' by Alessandra Loro, Anja Gruenheid, Donald Kossmann, Damien Profeta (Amadeus) & Philippe Beaudequin (Amadeus) has been accepted at VLDB 2015 Industry Track.
Donald Kossmann gave a keynote talk entitled “Cipherbase: End-to-end Encryption for SQL” at WAIM 2015 in Qingdao, Shandong, China.
Abstract:
This talk presents the design of the Cipherbase system. Cipherbase was designed to protect "data-in-use". That is, data remains encrypted even when it is processed. To this end, Cipherbase features a novel fine-grained architecture that farms out all computations on encrypted data to an FPGA. The talk provides the motivation and design space for systems like Cipherbase and presents the results of recent performance experiments with the Cipherbase prototype developed at Microsoft Research.
The demo entitled "Building a Distributed Key-value Store with FPGA-based Microservers" by Zsolt István, David Sidler and Gustavo Alonso has been accepted at The International Conference on Field-programmable Logic and Applications (FPL) in London, Sept. 2-4, 2015.
The following four papers and a demo have been presented at SIGMOD 2015, in Melbourne, Australia.
Claude Barthels, Simon Loesing, Donald Kossmann, Gustavo Alonso. Rack-Scale In-Memory Join Processing using RDMA. SIGMOD 2015.
Lucas Braun, Thomas Etter, Georgios Gasparis, Martin Kaufmann, Donald Kossmann, Daniel Widmer. Analytics in Motion: High Performance Event-Processing AND Real-Time Analytics in the Same Database. SIGMOD 2015.
Phil Bernstein, Sudipto Das, Bailu Ding, Markus Pilman. Optimizing Optimistic Concurrency Control for Tree-Structured, Log-Structured Databases. SIGMOD 2015.
Simon Loesing, Markus Pilman, Thomas Etter, Donald Kossmann. On the Design and Scalability of Distributed Shared-Data Databases. SIGMOD 2015.
Anja Gruenheid, Theodoros Rekatsinas, Donald Kossmann, Divesh Srivastava. DEMO: StoryPivot: Comparing and Contrasting Story Evolution. SIGMOD 2015.
Kornilios Kourtis joined IBM Research in Rüschlikon as a visiting scientist on Monday, 1.6.2015.
The paper entitled "To Lock, CAS, or Elide: On the Interplay of Hardware Transactional Memory and Lock-Free Indexing" by Darko Makreshanski, Justin Levandoski (MSR), and Ryan Stutsman (MSR) has been accepted at VLDB 2015, August 31-Sept. 4, 2015, Kohala Coast, Hawai'i.
Last Friday, Timothy Roscoe gave a talk at HP Labs with the title "Physical Memory Management: Not Your Parents' Physical Address Space".
David Sidler presented his work on TCP/IP for FPGAs at FCCM 2015, in Vancouver, Canada.
"Scalable 10 Gbps TCP/IP Stack Architecture for Reconfigurable Hardware" by David Sidler, Michaela Blott, Kimon Karras, Raymond Carley, Gustavo Alonso and Kees Vissers.
Pravin Shinde, Kornilios Kourtis, Antoine Kaufmann and Timothy Roscoe win the Best Poster award for "Dragonet: a Host Network Stack for Harnessing NIC Hardware"
The following paper has been accepted at the 2015 USENIX Annual Technical Conference (USENIX ATC), Santa Clara, CA, USA; July 8-10, 2015.
Shoal: smart allocation and replication of memory for parallel programs by Stefan Kaestle, Reto Achermann, Timothy Roscoe (Systems Group, Dept. of Computer Science, ETH Zurich), Tim Harris (Oracle Labs, Cambridge, UK)
The following four posters have been presented at Eurosys and co-located workshops in Bordeaux, France:
Jana Giceva has presented the FruitBox project at the Rack Scale Computing Workshop at EuroSys 2015, Bordeaux, France.
Selected as the best paper of FCCM'13, an extended version of the work on implementing skyline queries on FPGAs has appeared in ACM TRETS:
Parallelizing Data Processing on FPGAs with Shifter Lists. Louis Woods, Gustavo Alonso, Jens Teubner. ACM Transactions on Reconfigurable Technology and Systems (TRETS), Volume 8, Issue 2, March 2015.
The following paper has been accepted for publication in the IEEE Transactions on Knowledge and Data Engineering (TKDE) Journal:
"Querying Knowledge Graphs by Example Entity Tuples". Nandish Jayaram (UT Arlington), Arijit Khan, Chengkai Li (UT Arlington), Xifeng Yan (UCSB), and Ramez Elmasri (UT Arlington)
Timothy Roscoe gave a distinguished lecture "What’s happening to computer hardware, and what does it mean for systems software?" at University of St Andrews, Scotland.
Selected as one of the top papers of FPL'13, an extended version of the work on hash table design on FPGAs by Zsolt Istvan has appeared in ACM TRETS:
A Hash Table for Line-Rate Data Processing. Z. István, G. Alonso, M. Blott, K. Vissers. ACM Transactions on Reconfigurable Technology and Systems (TRETS), Volume 8, Issue 2, March 2015.
A paper describing the Vela system developed by Tudor Salomie has been published in the IEEE Data Engineering Bulletin:
Tudor-Ioan Salomie, Gustavo Alonso: Scaling Off-the-Shelf Databases with Vela: An approach based on Virtualization and Replication. IEEE Data Eng. Bull. 38(1): 58-72 (2015)
This year four papers and one demo have been accepted at SIGMOD 2015 in Melbourne, Australia, May 31 - June 4, 2015:
Claude Barthels, Simon Loesing, Donald Kossmann, Gustavo Alonso. Rack-Scale In-Memory Join Processing using RDMA. SIGMOD 2015.
Lucas Braun, Thomas Etter, Georgios Gasparis, Martin Kaufmann, Donald Kossmann, Daniel Widmer. Analytics in Motion: High Performance Event-Processing AND Real-Time Analytics in the Same Database. SIGMOD 2015.
Phil Bernstein, Sudipto Das, Bailu Ding, Markus Pilman. Optimizing Optimistic Concurrency Control for Tree-Structured, Log-Structured Databases. SIGMOD 2015.
Simon Loesing, Markus Pilman, Thomas Etter, Donald Kossmann. On the Design and Scalability of Distributed Shared-Data Databases. SIGMOD 2015.
Anja Gruenheid, Theodoros Rekatsinas, Donald Kossmann, Divesh Srivastava. DEMO: StoryPivot: Comparing and Contrasting Story Evolution. SIGMOD 2015.
In addition to a workshop paper and three posters, the following poster was accepted at EUROSYS 2015:
Zsolt István, David Sidler, Gustavo Alonso: Specialized Microservers for the Data Center
The following Workshop paper and 3 posters have been accepted at EUROSYS 2015.
Jana Giceva, Darko Makreshanski, Claude Barthels, Alessandro Dovis, Gustavo Alonso, Donald Kossmann. Rack-scale Data Processing System. WRSC Workshop on Rack-Scale Computing.
Stefan Kaestle, Reto Achermann, Timothy Roscoe. Shoal: smart allocation and replication of memory for parallel programs. - Poster
Pravin Shinde, Kornilios Kourtis, Antoine Kaufmann (University of Washington) and Timothy Roscoe. Intelligent NIC Queue Management in the Dragonet Network Stack. - Poster
Moritz Hoffmann. Rack-aware operating systems. Doctoral Workshop. - Poster
The following three papers have been accepted at HotOS 2015, May 18-20, 2015; Kartause Ittingen, Switzerland.
John Liagouris joined the Systems Group as a new Postdoc. John is coming to us from the National Technical University of Athens (NTUA).
Jonas Pfefferle is presenting his work on I/O virtualization at the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE 2015).
Daniela Dorneanu gave a talk entitled "CAPI/Flash + PackedObjects, a novel off-heap solution for scaling-up complex Java applications" at IBM Research Rüschlikon. In this talk Daniela presented her work done at IBM Research (Austin, Texas) during her internship last year.
The paper entitled "Scalable 10 Gbps TCP/IP Stack Architecture for Reconfigurable Hardware" by David Sidler, Michaela Blott, Kimon Karras, Raymond Carley, Gustavo Alonso and Kees Vissers has been accepted at FCCM 2015, May 3-5, in Vancouver, Canada.
Jonas Pfefferle joins the Systems Group as new PhD student.
Timothy Roscoe and Robert Soulé (USI) received the Google Research Award for their "Online Data Center Modeling" project.
The workshop proposal "Big-O(Q): Big Graphs Online Querying", organized by Arijit Khan (Systems Group, ETH), Prasenjit Mitra (Penn State University), and Cong Yu (Google Research, New York), has been accepted at VLDB 2015.
Zsolt István gave a talk "Specialized Hardware in the Datacenter" at the 2nd Annual Swiss Joint Research Centre Workshop held at EPFL, 10-11 February, 2015.
Martin Kaufmann joins Deloitte in Zürich as a Consultant.
The article "Maximizing 21st century technologies for knowledge" about Pydron, Stefan Müller's parallelizing tool for Python code, has been published as a "Spotlight" article on the ETH Department of Computer Science Web page.
David Cock joined the Systems Group as a new Postdoc. David comes to us from NICTA (National ICT Australia).
Marko Vukolic spent three months in our group as a visiting professor. Starting January 5th, 2015 Marko is joining IBM Research in Rüschlikon.
Simon Loesing will join ELCA Zürich as a Software Architect in March 2015.
Tahmineh Sanamrad will join Google Zürich as a software engineer in January 2015.
Tahmineh Sanamrad defended her PhD Dissertation entitled: Encrypting Databases in the Cloud: Threats and Solutions
Committee:
Tahmineh will join Google Zürich in January 2015.
Simon Loesing defended his PhD Dissertation entitled: Architectures for Elastic Database Services
Committee:
Moritz Hoffmann joined the Systems Group as a new PhD student. Moritz completed his Master Thesis in our group.
Reto Achermann joined the Systems Group as a new PhD student. Reto completed his Master Thesis in our group.
Louis Woods has been awarded the 2015 ETH Medal for his dissertation: FPGA-Enhanced Data Processing Systems.
The paper entitled "Bi-temporal Timeline Index: A Data Structure for Processing Queries on Bi-temporal Data" by Martin Kaufmann, Peter M. Fischer, Norman May, Chang Ge, Anil K. Goel and Donald Kossmann was accepted at ICDE 2015 in Seoul, Korea, April 2015.
The paper entitled "Arrakis: The Operating System is the Control Plane" by Simon Peter (Systems Group alumnus), Jialin Li, Irene Zhang, Dan R. K. Ports, Doug Woos, Arvind Krishnamurthy, Thomas Anderson, and Timothy Roscoe won the "Jay Lepreau Best Paper" Award at OSDI 2014 in Broomfield, Colorado.
In addition, one of the other two best papers was the paper with our alumnus Andrew Baumann as the first author:
Shielding Applications from an Untrusted Cloud with Haven by Andrew Baumann, Marcus Peinado, and Galen Hunt (Microsoft Research)
This year two Systems Group papers and one paper with Timothy Roscoe as a co-author are presented at OSDI 2014 in Broomfield, Colorado.
Marko Vukolic, faculty at EURECOM in France, has joined our group as a visiting professor. Marko works in distributed computing and will stay with us for the next few months.
The paper entitled "Deployment of query plans on multicores" by Jana Giceva, Gustavo Alonso, Timothy Roscoe (ETH Zurich) and Tim Harris (Oracle Labs Cambridge) has been accepted to VLDB 2015, August 31 - Sept. 4, 2015, Hilton Waikoloa Village Kohala Coast, Hawaii.
Cagri Balkesen has obtained an Excellent Presentation Award at VLDB 2014 for his talk on implementing hash and sort-merge joins.
The following papers are being presented at VLDB 2014 in Hangzhou, China, September 1-5, 2014.
• Pratanu Roy, Jens Teubner, Rainer Gemulla. Low Latency Handshake Join.
• Cagri Balkesen, Gustavo Alonso, Jens Teubner, Tamer Özsu. Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited.
• Georgios Giannikis, Darko Makreshanski, Gustavo Alonso, Donald Kossmann. Shared Workload Optimization.
• Anja Grünheid, Xin Luna Dong, Divesh Srivastava: Incremental Record Linkage.
• Arijit Khan and Sameh Elnikety (MSR, Redmond): Tutorial - Systems for Big-Graphs.
• Louis Woods, Zsolt Istvan, and Gustavo Alonso: Ibex - An Intelligent Storage Engine with Support for Advanced SQL Off-loading.
Daniela will be doing her Internship with IBM Research in Austin, Texas from September to December 2014.
Gustavo Alonso gives a talk entitled "The Case for Custom System Development" and participates in the "Modern Data Management Systems Summit" organized by Tsinghua University and IBM Research in Beijing, China.
In addition to the paper on Pydron, the two following papers were accepted at OSDI 2014:
"Decoupling Cores, Kernels, and Operating Systems" by Gerd Zellweger, Simon Gerber, Kornilios Kourtis, and Timothy Roscoe
"Arrakis: The Operating System is the Control Plane" by Simon Peter, Jialin Li, Irene Zhang, Dan R. K. Ports, Doug Woos, Arvind Krishnamurthy, Thomas Anderson, and Timothy Roscoe
Desislava Dimitrova joined the Systems Group as a Post Doc on August 4th, 2014. Desislava comes to our group from University of Bern.
The following paper was presented by Tahmineh Sanamrad at the Conference on Data and Application Security and Privacy (DBSEC 2014) in Vienna, July 14-16, 2014:
“Randomly Partitioned Encryption on Cloud Databases” by Tahmineh Sanamrad, Lucas Braun, Donald Kossmann and Ramarathnam Venkatesan (Microsoft Research)
The paper "Pydron: semi-automatic parallelization for multi-core and the cloud" by Stefan Müller, Gustavo Alonso, Adam Amara, and Andre Csillaghy has been accepted at OSDI 2014 (October 6-8, Broomfield, Colorado, USA)
Qin Yin joined Google Zürich today as a software engineer.
In her role of ETH Leader of the Forum for Women, Tahmineh was invited as a panelist guest to the Credit Suisse IT Women’s Council Switzerland and donna informatica seminar.
The topic of the seminar was: Conflict Management Techniques and how to deal with resistance at your workplace, especially as a woman in a male-dominated area.
Systems Group alumna Angela Nicoara has joined Intel Labs in Portland, Oregon.
Timothy Roscoe gave a talk entitled "An overview of the Barrelfish research operating system" at VMware.
Abstract:
Some years ago, a group of people at ETH Zurich and Microsoft Research decided to write a new operating system from scratch, both as a research project in itself, and to act as a vehicle for systems research unencumbered by the built-in assumptions of existing OSes like Windows, Linux, Xen, etc. I'll review some of the hardware and software trends that motivated our decision to do this, and then describe many of the key architectural features of the new OS, which we named Barrelfish. This talk will present a survey of past, present, and future work at ETH in, or using, Barrelfish. Barrelfish is a "multikernel": the OS is structured as a distributed system where cores share no state, and instead communicate using messages (even on a cache-coherent multiprocessor). It also manages memory entirely using capabilities, a decision which was motivated by reasons that turned out to be entirely misguided, but which has recently allowed us to implement a novel approach to dynamic and reconfigurable processors, and in our current work, begin to understand the challenge of large, multiple, intersecting physical address spaces on modern machines.
Yanlei Diao from University of Massachusetts, Amherst has joined the Systems Group as a visiting professor. She will stay at ETHZ until the end of December 2014.
The following paper was presented by Besmira Nushi at the ICML 2014 Crowdsourcing and Human Computing Workshop in Beijing:
Besmira Nushi, Adish Singla, Anja Grünheid, Andreas Krause, Donald Kossmann. Quality Assurance and Crowd Access Optimization: Why does diversity matter?
The following Systems Group papers are being presented at SIGMOD 2014 in Snowbird, Utah, USA, June 22-27, 2014:
At the Big Uncertain Data (BUDA) Workshop, affiliated with SIGMOD 2014, Anja Grünheid is presenting the following paper:
Georgios Giannikis defended his PhD Dissertation entitled: Work Sharing Data Processing Systems
Committee:
Georgios will join Oracle Zürich in August 2014.
Timothy Roscoe, Tom Anderson, Mic Bowman, David Culler and Larry Peterson received the 2014 Software Tools User Group (STUG) Award for their work on the Planet Lab Software System.
The impact of the Planet Lab Software System has been recently recognized through the following two "Test of Time" awards:
The paper "Operating system support for planetary-scale network services" co-authored by Timothy Roscoe received the first NSDI 10-year "Test of Time" award.
Timothy Roscoe was one of the recipients of the 2013 ACM SIGCOMM "Test of Time" award for the paper "PlanetLab: An Overlay Testbed for Broad-Coverage Services"
Jana Giceva is one of the recipients of this year's Google PhD Fellowship. The Fellowship recognizes the most outstanding graduate researchers and awards them with a Doctoral Fellowship for up to three years.
Gustavo gave the talk "The Case for Customized System Development" as one of the keynotes in the 7th ACM International Systems and Storage Conference (Systor) in Haifa, Israel.
Abstract:
The IT industry is undergoing a number of substantial changes that go beyond the "technology revolutions" that marketing strategies periodically generate. The shift in computer systems' scale, the move towards a service model, the total cost of ownership for IT systems, and the added stringent requirements on these systems have changed the predominant balance of the last decades. Instead of general purpose solutions, the trend in system development these days is toward new degrees of customization of both hardware and software. Such customization is predominantly guided by hardware developments, which are introducing significant constraints on how programs run but also opening many new possibilities for those who can exploit them. In this talk, I discuss the trend towards custom design, explain why it is happening, present several examples, and give an overview of the opportunities and challenges it creates for systems research.
Gustavo presented the talk "Performance in the multicore era" at the Technion Computer Engineering Systems Day in Haifa, Israel.
Title: Performance in the multicore era
Abstract:
The pace and nature of the changes taking place at the processor and computer architecture level are a formidable challenge to system designers. Very few of the established assumptions about bottlenecks, optimizations, implementation techniques, and algorithm behavior hold when modern multicore/manycore machines are involved.
In this talk I will briefly overview some of our recent work in this area related to the design and implementation of a database appliance (SwissBox) we are building at ETH Zurich. Then I will go in depth on an analysis of the behavior of relational join operators on multicore machines, covering everything from hardware-oblivious designs to highly customized approaches. The results of this work illustrate the difficulties of achieving predictable performance on modern hardware, a problem that cuts across all software layers, from the OS to the application. The results also provide valuable insights into the problems of exploiting the parallelism inherent in multicore and inform some of the design decisions we are making in SwissBox. I will conclude the talk by outlining some general goals for this research: from influencing hardware designers to developing customized architectures for data appliances.
This summer again several PhD students will be widening their knowledge and experience through their internships. Below is the list of students and companies they are visiting:
Jana Giceva – Microsoft Research, Silicon Valley
Simon Gerber – Hewlett Packard, Silicon Valley
Zsolt Istvan – Microsoft Research, Redmond
Lucas Braun – Microsoft Research, Redmond
Darko Makreshanski – Microsoft Research, Redmond
Gerd Zellweger – Microsoft Research, Redmond (starting mid June)
Stefan Müller – Amazon (starting July)
The Tutorial entitled "Systems for Big-Graphs" by Arijit Khan and Sameh Elnikety (MSR, Redmond) has been accepted at VLDB 2014, Hangzhou, China, September 2014.
Abstract:
Graphs have become increasingly important to represent complicated structures and schema-less data including the World Wide Web, social networks, knowledge graphs, genome and scientific databases, e-commerce, medical and government records. Today’s big-graphs consist of millions of vertices and billions of edges. In the presence of data objects associated with vertices and edges, graph data can reach petabytes in size. In this tutorial, we discuss the design of the emerging systems for processing of big-graphs, key features of distributed graph algorithms, as well as graph partitioning and workload balancing techniques. We emphasize the current challenges and highlight some future research directions.
Tahmineh Sanamrad is one of the prestigious Google Anita Borg Scholarship Award recipients. The award supports women in computing and technology. More about Anita Borg Scholarship Award here.
Donald Kossmann moderated the "Big Data Seminar", one of the science and technology events at "Zürich Meets New York". The "Big Data Seminar" took place at the New York Academy of Sciences and focused on the ways in which big data is revolutionizing our world. More information here.
The paper entitled "Grok the Data Center" by Zaheer Chothia, Qin Yin and Timothy Roscoe has been accepted to the 5th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys 2014), June 25-26th, 2014 in Beijing, China.
The paper entitled "Cost-Efficient Querying Strategies for the Crowd" by Anja Grünheid, Besmira Nushi, and Donald Kossmann has been accepted to the Big Uncertain Data Workshop (BUDA 2014) in Snowbird, Utah, USA, June 22nd, 2014, in conjunction with SIGMOD/PODS 2014.
Martin Kaufmann defended his PhD Dissertation entitled: Storing and Processing Temporal Data in Main Memory Column Stores
Committee: Prof. Dr. Donald Kossmann, ETH; Prof. Dr. Gustavo Alonso, ETH; Prof. Dr. Christian Jensen, Aarhus University, Denmark; Dr. Norman May, SAP, Germany.
The paper "Ibex - An Intelligent Storage Engine with Support for Advanced SQL Off-loading" by Louis Woods, Zsolt Istvan, and Gustavo Alonso has been accepted at VLDB 2014, Hangzhou, China, September 2014.
Louis Woods defended his PhD Dissertation entitled: FPGA-Enhanced Data Processing Systems
Committee: Prof. Dr. Gustavo Alonso, ETH; Prof. Dr. Donald Kossmann, ETH; Dr. Ken Eguro, MSR; Prof. Dr. Jens Teubner, TU Dortmund.
Louis will join Oracle Research, Redwood Shores (USA) this summer.
Thomas Heinis, who completed his PhD at the Systems Group and is currently a senior researcher at EPFL, has accepted a position as a Lecturer (assistant professor) at Imperial College in London, UK.
Congratulations, Thomas!
Spyros Voulgaris has received tenure at the Computer Science Department of VU Amsterdam, The Netherlands. Spyros was a post-doc in the Systems Group from 2006 to 2008 before he joined VU Amsterdam. Congratulations, Spyros!
The paper entitled "Quality Assurance and Crowd Access Optimization: Why does diversity matter?" by Besmira Nushi, Adish Singla, Anja Gruenheid, Andreas Krause and Donald Kossmann has been accepted to the Crowdsourcing and Human Computing workshop at the 31st International Conference on Machine Learning (ICML 2014) in Beijing, June 21-26, 2014.
Gustavo Alonso gives an overview of new trends and job profiles in the IT profession in the Netzwoche article entitled: "Berufswandel in der IT: Düstere Zukunft für Quereinsteiger".
The paper "Parallelizing Data Processing on FPGAs with Shifter Lists", by Louis Woods, Jens Teubner, and Gustavo Alonso, has been accepted for publication by the journal ACM Transactions on Reconfigurable Technology and Systems (ACM TRETS).
Cagri Balkesen defended his PhD Dissertation entitled: In-Memory Parallel Join Processing on Multi-Core Processors
Committee: Prof. Dr. Gustavo Alonso, Prof. Dr. Donald Kossmann, Prof. Dr. Tamer Ozsu (U. of Waterloo), Dr. Stefan Manegold (CWI, the Netherlands).
Cagri will join Oracle Labs (Zürich) in July.
Timothy Roscoe has been giving the following talk at three institutions in the US and Canada (Cornell University, University of Toronto and University of Waterloo).
Title: Treating cores as devices
Abstract:
Power management, dark silicon, and partial failures mean that, in the future, computer hardware will most likely consist of a dynamically changing set of heterogeneous processor cores. Contemporary operating system structures were not designed with this hardware model in mind, and have difficulty adapting to relatively simple concepts such as processor hotplug. Our work on meeting this challenge in the Barrelfish research OS has led us to treat cores as much as possible (but not entirely) like any other devices in the system. Several novel ideas make this possible: aside from the multikernel architecture itself, we leverage the externalization of kernel state through capabilities, and the concept of a "boot driver", which is the equivalent of a device driver for a processor core.
In this talk I will present our framework for managing a changing set of cores in a multikernel OS, and some of the surprising consequences: individual kernels can be rebooted, replaced, or upgraded on the fly, cores and hardware threads can be temporarily turned into coprocessors and back again, and per-core OS state can be quickly moved around the hardware to minimize energy usage or enforce performance guarantees.
The paper "A Hash Table for Line Rate Data Processing" by Zsolt Istvan, Gustavo Alonso, Michaela Blott (Xilinx), and Kees Vissers (Xilinx) has been accepted for publication by the journal ACM Transactions on Reconfigurable Technology and Systems (ACM TRETS).
The paper "Histograms as a Side Effect of Data Movement for Big Data" by Zsolt Istvan, Louis Woods, Gustavo Alonso has been accepted to SIGMOD 2014, Snowbird, Utah, USA, June 22-27, 2014.
The following paper has been accepted at SIGMOD 2014, Snowbird, Utah, USA, June 22-27, 2014:
“Towards Indexing Functions: Answering Scalar Product Queries”. Arijit Khan, Pouya Yanki, Bojana Dimcheva, and Donald Kossmann.
Abstract
“In this study, we consider a broad category of OLAP queries which can be expressed as the scalar product between a known expression (function) over multiple database attributes and an unknown set of parameters. Scalar product queries naturally arise in a wide range of applications including moving-objects intersection finding, time-series prediction, scientific simulation, and in active learning. We design a lightweight, yet scalable, dynamic, and generalized indexing scheme, called the Planar index, for answering such scalar product queries in an accurate manner.”
The paper “Randomly Partitioned Encryption on Cloud Databases” by Tahmineh Sanamrad, Lucas Braun, Donald Kossmann and Ramarathnam Venkatesan (Microsoft Research) has been accepted at the Conference on Data and Application Security and Privacy (DBSEC 2014) in Vienna, July 14-16, 2014.
Timothy Roscoe is delivering the opening keynote talk entitled “Minding the Gap” at the Conférence d’informatique en Parallélisme, Architecture et Système (ComPAS) in Neuchâtel, Switzerland.
Abstract:
There is a (by now) well-established gap between the functionality provided by hardware architects, and the issues and requirements that system software (and particularly operating system) designers worry about. Recently, with the accelerating pace of hardware innovation, this has become a serious and pressing issue, and I will argue that the way we have traditionally built low-level system software is simply not up to the challenge of supporting modern, diverse, highly parallel, rapidly evolving, and quirky hardware.
The Barrelfish research operating system was conceived as a response to trends in hardware design, and continues to serve as a testbed for new ideas in OS construction which apply modern techniques from programming languages, knowledge representation, and networking to the problem of managing the machine. I will describe a selection of these in the talk, together with our experiences implementing them. Our ideas have proven useful and effective at dealing with the mismatch between hardware architecture and software needs, but ultimately we hope they will foster a more fruitful conversation between the hardware and software sides of "the gap", and lead to better principles and practices on both sides.
Claude started his 6-month internship with Oracle as part of the ongoing collaboration between the Systems Group and Oracle on the Rapid project.
A paper describing Pydron, Stefan Müller's parallelizing tool for Python code, has been accepted for publication in IEEE Computer in a special issue about computing in astronomy.
Scaling Astroinformatics with Pydron
Astronomers like working directly with their data using scripting languages such as Python. However, huge data volumes now force scientists to run their code on clusters and clouds. Pydron helps to bridge the gap between interactive data analysis and scalable computing resources.
This year the Systems Group was represented at EuroSys 2014 and co-located workshops through the following talks/presentations:
• Gustavo Alonso gave an invited talk entitled "Rackscale - the things that matter" at the 1st Workshop on Rackscale Computing, April, 2014 - EuroSys 2014, Amsterdam, the Netherlands.
Abstract:
Rackscale computing has become the standard for many applications running on a data center. For a variety of reasons, today it is possible to develop fully customized solutions that achieve impressive performance numbers. In this talk I will argue that customization is important but needs to be sustained by general purpose techniques and components. The research agenda in the next years should focus on the latter, rather than on producing an infinite variety of high performance systems tailored for narrow use cases. Otherwise, the inevitable problems with total cost of ownership during the life cycle of a real system (maintenance, further development, software evolution, additional functionality) will soon catch up with many existing proposals.
• Jana Giceva presented her poster “Rethinking the Scheduling unit for Parallel DB Operators”.
• Zaheer Chothia presented his short paper “Grok the Datacenter” at the 8th EuroSys Doctoral Workshop (EuroDW 2014)
The paper "Operating system support for planetary-scale network services" co-authored by Timothy Roscoe received the first NSDI 10-year "Test of Time" award. The paper was published in 2004 at the first USENIX Symposium on Networked Systems Design and Implementation (NSDI 2004) held in San Francisco.
This is the second "Test of Time" award for the work on the same system. Last year Timothy Roscoe was one of the recipients of the 2013 ACM SIGCOMM "Test of Time" award for the paper "PlanetLab: An Overlay Testbed for Broad-Coverage Services".
Timothy Roscoe visited USI (Università della Svizzera italiana) and gave a talk entitled: "Hardware Complexity as a First-class OS Problem".
David Sidler joined the Systems Group as a new PhD student. David completed his Master Thesis in our group.
Two Systems Group papers are being presented at EDBT in Athens, Greece, this week.
On March 25th there was a mini-workshop covering the ongoing collaborations between the Systems Group and Oracle Research together with Thomas Würthinger, of the Oracle Labs in Linz, Austria.
Gustavo is giving a talk at HTDC 2014, the 7th School on Hot Topics in Distributed Computing.
Title: "Crazy little thing called Hardware"
Abstract:
Modern hardware is changing faster than software can adapt to it. The architecture of today's predominant computing platforms is radically different from that taken for granted in the past decades. In this talk I will describe in detail such changes and point out even bigger challenges yet to come. I will then go over some of the many new research questions that arise as a result of the current hardware evolution.
Alessandra Loro joins the Systems Group as a new Post Diplomand. Alessandra completed her Master Thesis in our group in February.
Timothy Roscoe has been elected to the grade of ACM Fellow for contributions to operating systems and networking research. An interview with Timothy Roscoe has been published as a "Spotlight" item on the ETH Department of Computer Science Web page.
The poster entitled "Rethinking the Unit of Scheduling of Parallel DB Operators" by Jana Giceva, Gustavo Alonso and Timothy Roscoe has been accepted at EuroSys 2014, which will be held in Amsterdam, April 13-16, 2014.
Abstract:
Scheduling of parallelized DB operators is both a challenging and an important problem for modern databases. The penalties for resource sharing are aggravated with the new operator algorithms tuned for efficiently exploiting the new hardware features. Hence, they must be considered more seriously now and in the future. By rethinking the unit of scheduling from CPU cores to, for instance, NUMA regions, we can achieve less performance degradation due to sharing, and more efficient leverage of machine parallelism that will in turn maximize the overall workload throughput. Even though we are still at an early stage of evaluation, the initial results are encouraging and we believe that the approach is applicable to a wider range of database operators as well as other applications that have similar properties and requirements.
Anja Grünheid's paper "Incremental Record Linkage" is accepted to VLDB 2014 in Hangzhou, China, September 1-5, 2014.
This paper presents an end-to-end framework that can incrementally and efficiently update linkage results when data updates arrive... The paper will be published in September 2014.
Pratanu Roy's paper entitled "Low Latency Handshake Join" is accepted at VLDB 2014 in Hangzhou, China, September 1-5, 2014.
Pratanu's work is based on the recently proposed handshake join algorithm, which is a mechanism to parallelize the processing of stream joins in a NUMA-aware and hardware-friendly manner. Handshake join achieves high throughput and scalability, but it suffers from a high latency penalty and a non-deterministic ordering of the tuples in the physical result stream... The paper will be published after VLDB 2014.
Accenture is offering Internship positions in India for the period between July and September 2014. More info is available here.
Timothy Roscoe visited Google and HP Labs in Silicon Valley and gave the following talk at both companies: Towards an intra-machine network architecture.
Stefan started his internship with Oracle Cambridge. He will be working on the design of a runtime system that uses a manifest to express application characteristics, which enables automatic tuning of applications to the machines they are running on.
Selected as one of the best papers of ICDE 2013, Cagri Balkesen's work on main memory hash joins has just been accepted to the IEEE journal Transactions on Knowledge and Data Engineering.
Ercan Ucan accepted a software engineering position at Avaloq Evolution AG in Zürich. Ercan will take up his position in March 2014.
Renato Marroquin joined the Systems Group as a new PhD student. Renato completed his Master Thesis at the Pontifical Catholic University of Rio de Janeiro, Brazil.
Emirates Airline has drafted a challenge for ETH students to create a vision of the travel landscape in ten years’ time. The winning idea will be rewarded with US$ 3’000.00.
Team up and write an essay on the “Future of Travel” – for details please contact Prof. Donald Kossmann (kossmann@inf.ethz.ch).
The deadline for submissions is March 31, 2014.
More Information available here
Gustavo Alonso gives a talk on "Heterogeneous Architectures in Data Centers" at the Xilinx Emerging Technologies Symposium in San Jose, California.
Gustavo Alonso presented the project around data processing with FPGAs at the Microsoft Research EPFL-ETHZ Joint Research Center (JRC) workshop, which took place at Microsoft Research Labs in Cambridge, UK, on 5th and 6th February 2014.
Ercan Ucan defended his PhD Dissertation entitled: Data Storage, Transfers and Communication in Personal Clouds.
Committee: Prof. Dr. Timothy Roscoe, ETH; Prof. Dr. Gustavo Alonso, ETH; Dr. Anne-Marie Kermarrec, Senior researcher at INRIA and "part time invited Prof." at EPFL
Tudor Salomie accepted a software engineering position at Google Zürich. He will take up his position in May 2014.
Marica Stojanov will join Credit Suisse on February 3rd, 2014.
Tim Harris from Oracle (Cambridge, UK) visited the Systems Group on January 27 and 28, 2014.
Tudor Salomie defended his PhD Dissertation entitled: Cloud-ready scalable and elastic data processing using off-the-shelf databases, replication and virtualization.
Committee: Prof. Dr. Gustavo Alonso, Prof. Dr. Timothy Roscoe, Dr. Tim Harris (Oracle, Cambridge, UK)
One more paper has been accepted at EDBT 2014, which will be held in Athens, Greece, March 24-28, 2014.
Title: Benchmarking Bitemporal Database Systems: Ready for the Future or Stuck in the Past
Authors: Martin Kaufmann (ETH), Peter M. Fischer (University of Freiburg), Norman May (SAP), Donald Kossmann (ETH)
Abstract:
After more than a decade of a virtual standstill, the adoption of temporal data management features has recently picked up speed, driven by customer demand and the inclusion of temporal expressions into SQL:2011. Most of the big commercial DBMS now include support for bitemporal data and operators. In this paper, we perform a thorough analysis of these commercial temporal DBMS: We investigate their architecture, determine their performance and study the impact of performance tuning. This analysis utilizes our recent (TPCTC 2013) benchmark proposal, which includes a comprehensive temporal workload definition. The results of our analysis show that the support for temporal data is still in its infancy: All systems store their data in regular, statically partitioned tables and rely on standard indexes as well as query rewrites for their operations. As shown by our measurements, this causes considerable performance variations on slight workload variations and a significant effort for performance tuning. In some cases, there is considerable overhead for temporal operations even after extensive tuning.
The following paper has been accepted at EDBT 2014, to be held March 24-28, 2014 in Athens, Greece.
Title: Fast Reliability Search in Uncertain Graphs
Authors: Arijit Khan (ETH), Francesco Bonchi (Yahoo!), Aristides Gionis (Aalto University), and Francesco Gullo (Yahoo!)
Abstract: Given an uncertain graph with probabilities on edges, we propose efficient indexing methods to find all nodes reachable from a query node with probability no less than an input probability threshold. Such reachability queries can be useful in influence maximization in social networks, predicting co-complex memberships in protein-interaction networks, and computing packet delivery probability between a pair of nodes in mobile ad-hoc networks.
Piscataway, New Jersey, USA, January 2014: Prof. Gustavo Alonso from the Computer Science Department of ETH Zurich has been named an IEEE Fellow. He is being recognized for contributions to data management and distributed systems. In particular, his work in understanding data replication and how to combine efficiency with high levels of consistency are nowadays the basis for many distributed database products and cloud based solutions.
“The election to IEEE Fellow is a great honor and I am very grateful for the recognition it makes of the work done at my group over many years”, said Alonso. “I am proud of this achievement but it is also humbling to see my name added to such an illustrious group of people”.
Gustavo Alonso received his PhD from UC Santa Barbara and worked at the IBM Almaden Research Center before joining ETH Zurich. He has received numerous awards for his contributions to several areas of computer science, including several most influential paper awards: the AOSD 2012 Most Influential Paper Award for work on dynamic aspect oriented programming, and the VLDB 2010 Ten Year Best Paper Award for work on consistent database replication. Gustavo Alonso is an ACM Fellow.
The IEEE Grade of Fellow is conferred by the IEEE Board of Directors upon a person with an outstanding record of accomplishments in any of the IEEE fields of interest. The total number selected in any one year cannot exceed one-tenth of one percent of the total voting membership. IEEE Fellow is the highest grade of membership and is recognized by the technical community as a prestigious honor and an important career achievement.
The IEEE is the world’s leading professional association for advancing technology for humanity. If you would like to learn more about IEEE or the IEEE Fellow Program, please visit www.ieee.org.
The paper "Shared Workload Optimization" by Georgios Giannikis, Darko Makreshanski, Gustavo Alonso, and Donald Kossmann has been accepted for presentation at VLDB 2014. The paper describes a query optimizer for the SharedDB database engine developed in the group.
Every year, ACM recognizes and honors outstanding ACM members for their achievements in computer science. In 2013, Timothy Roscoe has been elected to the grade of Fellow for contributions to operating systems and networking research. ACM Press Release
Eric Sedlar from Oracle Labs visited the Systems Group. A small workshop took place on 3rd December 2013 to discuss ongoing research around data processing on modern hardware.
The research proposal “Efficient data processing through massive parallelism and FPGA based acceleration” by Gustavo Alonso has been accepted by the Microsoft-EPFL-ETHZ Joint Research Center. The project will be done in collaboration with Ken Eguro from Microsoft Research in Redmond.
Tudor Salomie presented his work on virtualized and replicated databases at the IMDEA Networks Institute in Madrid, Spain.
Adrian Schüpbach has joined Oracle Labs in Zürich, starting December 2nd.
Thomas Etter took up the position of Software Engineer at Google, Zürich. Thomas will start at Google on Dec. 2.
Gustavo Alonso gave the following talk at the MAKI Distinguished Lecture Series, TU Darmstadt, on Nov. 7, 2013.
"Need for Speed: Building Big Data Platforms"
Abstract:
Big data and the NoSQL movement are just manifestations of a number of trends in the way data is processed and the type of services built on top of data. On the one hand, Internet, mobile telephones, social networks, on-line advertisement, and the proliferation of digital services have resulted in unseen demands on existing data platforms. On the other hand, the complexity of the queries, the need to support strict SLAs, and the wide variety of data types call for new approaches in the design of data processing engines. At the same time, the underlying hardware platforms are evolving at an amazing speed, changing many of the established assumptions about how to build software systems.
In this talk I will cover SwissBox, a long term, umbrella project at the Systems Group at ETH Zurich, where we are redesigning data processing engines from the ground up to adapt them to modern workloads. The project covers topics ranging from hardware acceleration and operating system – database codesign to radically new ways of organizing the engine's architecture and novel strategies to exploit multi-core hardware. Part of the system is already used commercially and the results obtained so far indicate that the performance and the guarantees that can be provided with the new architecture exceed by orders of magnitude what can be achieved today.
Kornilios Kourtis presented the following paper at PLOS 2013 (7th Workshop on Programming Languages and Operating Systems), Nemacolin Woodlands Resort, Pennsylvania, USA, Nov. 3, 2013:
"Modeling NICs with Unicorn" by Pravin Shinde, Antoine Kaufmann, Kornilios Kourtis and Timothy Roscoe.
The following paper was accepted for publication in VLDB Journal:
High Availability, Elasticity, and Strong Consistency for Massively Parallel Scans over Relational Data by Philipp Unterbrunner, Gustavo Alonso, Donald Kossmann.
The following paper will be published in ACM Transactions on Database Systems:
"XLynx—An FPGA-based XML Filter for Hybrid XQuery Processing" by Jens Teubner, Louis Woods, and Chongling Nie.
Zaheer Chothia joined the Systems Group as a new PhD student. Zaheer completed his Master Thesis in our group.
Claude Barthels joins the Systems Group tomorrow (Oct. 1st) as a new PhD student. Claude completed his Master Thesis in our group.
Daniela Dorneanu will join the Systems Group on October 1. Daniela is coming to us from Politecnico di Milano.
Arijit Khan joined the Systems Group today (9.11.2013) as a postdoctoral researcher. Arijit is coming to our Group from University of California, Santa Barbara, CA, USA.
The following papers and demo done at ETH and with ETH co-authors were presented at the 23rd International Conference on Field Programmable Logic and Applications (FPL 2013) in Porto, Portugal, September 2-4, 2013.
The Systems Group is presenting the following papers and demos at VLDB and co-located workshops in Riva del Garda, Italy, August 26-30, 2013.
Nihal Dindar took up the position of Associate at McKinsey, Istanbul. Nihal will start at McKinsey in September.
The paper "Modeling the execution semantics of stream processing engines with SECRET" by Nihal Dindar, Nesime Tatbul, Renée J. Miller, Laura M. Haas, Irina Botan was published in the VLDB Journal, Volume 22, Number 4, August 2013.
Donald Kossmann will be on sabbatical at Microsoft Research in Redmond through November 2013.
Marica Stojanov joins the Systems Group as a new PhD Student. Marica comes to us from EPFL Lausanne, where she completed her Master Thesis.
The following paper was accepted at TPCTC 2013, Riva del Garda, Italy, August 2013:
TPC-BiH: A Benchmark for Bi-Temporal Databases by Martin Kaufmann, Peter M. Fischer, Norman May, Andreas Tonder, Donald Kossmann.
Pravin Shinde started his 3-month internship in the Systems research group at HP Labs in Palo Alto, CA, USA. Pravin will be working on an exploratory project investigating how smart NICs can be used to benefit applications.
The paper, "PlanetLab: An Overlay Testbed for Broad-Coverage Services" by Brent Chun, David Culler, Timothy Roscoe, Andy Bavier, Larry Peterson, Mike Wawrzoniak, and Mic Bowman, is one of the two papers honoured with the ACM SIGCOMM Test of Time award this year.
The ACM SIGCOMM Test of Time Award recognizes a paper published 10 to 12 years in the past in Computer Communication Review or any SIGCOMM sponsored or co-sponsored conference that is deemed to be an outstanding paper whose contents are still a vibrant and useful contribution today. Read more.
The following demo was accepted to FPL 2013 (23rd International Conference on Field Programmable Logic and Applications), Porto, Portugal, 2-4 September, 2013:
Hybrid FPGA-accelerated SQL Query Processing by Louis Woods, Zsolt István, and Gustavo Alonso.
Andreas Marfurt joins the Systems Group as a new Post Diplomand. Andreas completed his Master Thesis in our Group this spring.
The following paper was accepted at VLDB 2014, Hangzhou, China, September 1-5, 2014.
Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited by C. Balkesen, G. Alonso, J. Teubner (TU Dortmund University), T. Özsu (University of Waterloo).
A formal cooperation agreement has been established between Xilinx and the Systems Group to explore the use of FPGAs in data processing and the data center.
Gustavo Alonso has been recognized for his research in distributed systems, middleware, and data management at the ACM Fellow Ceremony in San Francisco, CA. Interview with Gustavo Alonso
Markus Gross, Vinton Cerf and Gustavo Alonso at the ACM annual Awards Banquet
The following paper was accepted at FPL'13 (23rd International Conference on Field Programmable Logic and Applications): "A Flexible Hash Table Design For 10Gbps Key-value Stores on FPGAs" by Zsolt István, Gustavo Alonso, Michaela Blott (Xilinx) and Kees Vissers (Xilinx). The work is part of the collaboration between Xilinx and the Systems Group.
The book "Data Processing on FPGAs" by Jens Teubner and Louis Woods gives an introduction to field-programmable gate arrays targeted at a database audience. Besides explaining core FPGA technology and programming models, the book also discusses several database use cases that benefit from FPGAs, including stream processing applications, core database operators, and security sensitive applications.
The following two papers were presented by Cagri Balkesen at DEBS 2013 in Arlington, TX, USA, June 29 - July 3, 2013:
Adaptive Input Admission and Management for Parallel Stream Processing, Cagri Balkesen, M. Tamer Ozsu, Nesime Tatbul.
RIP: Run-based Intra-query Parallelism for Scalable Complex Event Processing, Cagri Balkesen, Nihal Dindar, Matthias Wetter, Nesime Tatbul.
An interview with Donald Kossmann entitled "Big Data, big risks?" was published in ETH Life. The interview is available here (in German).
Besmira Nushi started her 3-month internship in the Social Bing team at Microsoft, Silicon Valley.
The Systems Group is presenting the following papers and demos at SIGMOD in New York, USA, June 22-27, 2013.
Jana and Pratanu started their internship at Oracle Research as part of the ongoing collaboration between the Systems Group and Oracle on the Rapid project.
Karolina Alexiou joins the Systems Group as a new Post Diplomand. Karolina completed her Master Thesis in our Group this spring.
Darko and Markus join Microsoft Research in Redmond, USA for a 3-month internship. Darko will be working in the Cipher team whereas Markus will be part of the Data Management, Exploration and Mining group.
A paper describing an implementation of Memcached on FPGAs, co-authored by Zsolt Istvan and Jeremia Baer, has been accepted at HotCloud 2013. The work is part of the ECC collaboration between Xilinx and the Systems Group.
Timothy Roscoe and Pravin Shinde presented the following paper at the 14th Workshop on Hot Topics in Operating Systems (HotOS 2013) in Santa Ana Pueblo, NM, USA, May 13-15, 2013:
Pravin Shinde, Antoine Kaufmann, Timothy Roscoe, Stefan Kästle. We need to talk about NICs. HotOS 2013, Santa Ana Pueblo, NM, USA, May 13-15, 2013.
Ercan Ucan presented the following paper at NETYS in Marrakesh, Morocco, May 2-4, 2013.
Establishing Efficient Routes between Personal Clouds; Ercan Ucan, Timothy Roscoe.
The following three papers have been accepted to the research track of ACM DEBS'13 Conference:
"Adaptive Input Admission and Management for Parallel Stream Processing", Cagri Balkesen, M. Tamer Ozsu, and Nesime Tatbul
"RIP: Run-based Intra-query Parallelism for Scalable Complex Event Processing", Cagri Balkesen, Nihal Dindar, Matthias Wetter, and Nesime Tatbul
"Ariadne: Managing Fine-Grained Provenance on Data Streams", Boris Glavic, Kyumars Sheykh Esmaili, Peter M. Fischer, and Nesime Tatbul
The conference will take place in Arlington, Texas, USA on June 29-July 3, 2013.
The paper "Parallel Computation of Skyline Queries" by Louis Woods, Jens Teubner, and Gustavo Alonso got the best paper award at FCCM'13, Seattle, USA.
Gustavo Alonso will be on sabbatical at Yale University through August 2013.
Zsolt István joined the Systems Group today as a new PhD student. Zsolt completed his Master Thesis in our group in March 2013.
The Systems Group presented the following papers and demos at EuroSys in Prague, Czech Republic, April 14-17, 2013:
RapiLog: Reducing System Complexity Through Verification by Gernot Heiser, Etienne Le Sueur, Adrian Danis, and Aleksander Budzynowski (NICTA and UNSW), Tudor-Ioan Salomie and Gustavo Alonso
Application Level Ballooning for Efficient Server Consolidation by Tudor-Ioan Salomie, Gustavo Alonso, Timothy Roscoe, and Kevin Elphinstone (UNSW and NICTA)
Why Execute One Query, when you can Execute Thousands by Georgios Giannikis and Darko Makreshanski (Poster & demo)
Doctoral Workshop:
What kind of distributed system is a multicore by Stefan Kaestle (Poster and 5 min. presentation)
Memory Management for Heterogeneous Multicores by Simon Gerber (Poster and 5 min. presentation)
Nesime Tatbul has joined Intel Labs as a senior research scientist to work as part of the Intel Science and Technology Center for Big Data based at MIT CSAIL.
Georgios Gasparis joins the Systems Group as a new Post Diplomand. Georgios completed his Master Thesis in our Group this spring.
Gustavo Alonso gives a keynote talk, entitled “Hardware Killed the Software Star”, at ICDE 2013, in Brisbane, Australia.
The paper "Main-Memory Hash Joins on Multi-Core CPUs: Tuning to the Underlying Hardware" by Cagri Balkesen, Jens Teubner, Gustavo Alonso, and Tamer Ozsu has been selected as one of the top 3 papers of the ICDE 2013 conference and received the Best Paper Honorable Mention.
The "Time Travel in Column Stores" paper by Martin Kaufmann, Amin Amiri Manjili, Stefan Hildenbrand, Donald Kossmann, Andreas Tonder (SAP) was presented by Martin Kaufmann at ICDE 2013 in Brisbane, Australia.
Martin also presented the following demo: "A Generic Database Benchmarking Service" by Martin Kaufmann, Peter M. Fischer (Univ. of Freiburg), Donald Kossmann, and Norman May (SAP).
The paper "Establishing Efficient Routes between Personal Clouds" by Ercan Ucan and Timothy Roscoe was accepted at the International Conference on Networked Systems (NETYS) 2013, which will be held May 2-4 in Marrakech, Morocco.
Gustavo Alonso has presented several of the projects around SwissBox at VU Amsterdam and at IBM Rüschlikon.
Starting in April, Jens Teubner will become a Professor at TU Dortmund.
Timothy Roscoe will be on sabbatical at the University of Washington through September 2013.
As part of the ongoing collaboration with Xilinx in ECC, Zsolt Istvan has completed a master thesis implementing a key value store in an FPGA.
Cagri Balkesen has given a talk at Oracle Research presenting his work on implementing main memory relational joins on multicore architectures.
Jens Teubner and Louis Woods gave a tutorial on data management on new hardware at BTW in Magdeburg on 12.03.2013.
Alexandru Moga took up the position of R&D Computer Scientist with ABB Corporate Research in Baden-Daettwil. He will start at ABB in April 2013.
The paper "Parallel Computation of Skyline Queries" by Louis Woods, Jens Teubner, and Gustavo Alonso has been accepted at the IEEE International Symposium on Field-Programmable Custom Computing Machines that will take place on April 28-30, 2013, in Seattle, Washington.