Evaluation of Power CPU architecture for deep learning
Project goal
We are investigating the performance of distributed training and inference of different deep-learning models on a cluster of IBM Power8 CPUs (with NVIDIA V100 GPUs) installed at CERN. A series of deep neural networks is being developed to reproduce the initial steps in the data-processing chain of the DUNE experiment. More specifically, a combination of convolutional neural networks and graph neural networks is being designed to reduce noise and to select specific portions of the data on which to focus during the reconstruction step (region selector).
Collaborators
Project background
Neutrinos are elusive particles: they have a very low probability of interacting with other matter. In order to maximise the likelihood of detection, neutrino detectors are built as large, sensitive volumes. Such detectors produce very large data sets. Although large, these data sets are usually very sparse, so dedicated techniques are needed to process them efficiently. Deep-learning methods are being investigated by the community with great success.
Recent progress
We have developed a deep neural network architecture based on a combination of two-dimensional convolutional layers and graph neural network layers. These networks can analyse both real and simulated data from protoDUNE and perform the region-selection and de-noising tasks, which are usually applied to the raw detector data before any other processing is run.
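To illustrate the convolutional de-noising idea, the sketch below applies a fixed 3x3 smoothing kernel to a toy 2D detector map. This is not the project's network (which learns its filters and combines convolutional with graph layers); it is a minimal, stdlib-only example of how a 2D convolution suppresses uncorrelated noise in a sparse image.

```python
# Minimal sketch: 2D convolution over a toy "wire x time" detector map.
# A fixed averaging kernel stands in for the learned filters of the real
# de-noising network; it damps uncorrelated pixel-level fluctuations.

def conv2d_same(img, kernel):
    """2D convolution with zero padding ('same' output size)."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    ii, jj = i + di - ph, j + dj - pw
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += img[ii][jj] * kernel[di][dj]
            out[i][j] = acc
    return out

# Fixed 3x3 smoothing kernel; a trained network replaces this with many
# learned filters per layer.
kernel = [[1 / 9] * 3 for _ in range(3)]

# Sparse signal (one bright pixel) on top of a low noise floor.
noisy = [[0.1] * 5 for _ in range(5)]
noisy[2][2] = 1.0

denoised = conv2d_same(noisy, kernel)
```

A learned network goes further than this fixed filter: it keeps the sharp signal while removing the floor, which is exactly what makes the deep-learning approach preferable to classical smoothing.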
Both of these methods improve on the classical approaches currently integrated into the experiment's software stack. To reduce training time and enable hyper-parameter scans, the training process for the networks has been parallelised and benchmarked on the IBM Minsky cluster.
Following the data-parallel distributed-learning approach, we trained our models on a total of twelve GPUs, distributed across the three nodes that comprise the test Power cluster. Each GPU ingests a unique shard of the physics dataset used to train the model.
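The data-parallel scheme described above can be sketched as follows. This is a stdlib-only toy (a one-parameter model and an averaging function standing in for the all-reduce collective), not the actual GPU training code: each of the twelve "workers" holds an identical model replica and a disjoint data shard, computes a local gradient, and all replicas then apply the same averaged update.

```python
# Toy sketch of data-parallel training: 12 workers (one per GPU, four per
# node on the three-node test cluster) share identical model weights and
# each owns a disjoint shard of the data. Per step, local gradients are
# averaged (all-reduce), so every replica applies the same update.

NUM_WORKERS = 12  # 3 nodes x 4 GPUs

def local_gradient(w, shard):
    """Gradient of the mean squared error for the toy model y = w * x."""
    g = 0.0
    for x, y in shard:
        g += 2 * (w * x - y) * x
    return g / len(shard)

def all_reduce_mean(grads):
    """Stand-in for the collective that averages gradients across workers."""
    return sum(grads) / len(grads)

# Each worker sees a disjoint shard; the underlying relation is y = 3x.
data = [(float(i), 3.0 * i) for i in range(1, NUM_WORKERS * 4 + 1)]
shards = [data[k::NUM_WORKERS] for k in range(NUM_WORKERS)]

w, lr = 0.0, 0.0005
for step in range(200):
    grads = [local_gradient(w, s) for s in shards]
    w -= lr * all_reduce_mean(grads)  # identical update on every replica
```

In a real deployment the averaging is performed by a communication library over the cluster interconnect, and the per-worker batches are processed concurrently on the GPUs rather than in a loop; the update rule, however, is exactly the one shown.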
Next steps
Publications
- M. Rossi, S. Vallecorsa, Deep Learning Strategies for ProtoDUNE Raw Data Denoising. Published by Springer Nature, 2022. cern.ch/go/kzj6
Presentations
- A. Hesam, Evaluating IBM POWER Architecture for Deep Learning in High-Energy Physics (23 January). Presented at CERN openlab Technical Workshop, Geneva, 2018. cern.ch/go/7BsK
- D. H. Cámpora Pérez, ML based RICH reconstruction (8 May). Presented at Computing Challenges meeting, Geneva, 2018. cern.ch/go/xwr7
- D. H. Cámpora Pérez, Millions of circles per second. RICH at LHCb at CERN (7 June). Presented as a seminar in the University of Seville, Seville, 2018.
- M. Rossi, Deep Learning strategies for ProtoDUNE raw data denoising (18 May). Presented at 25th International Conference on Computing in High-Energy and Nuclear Physics, vCHEP2021, Geneva, 2021. cern.ch/go/VK7P
- M. Rossi, Slicing with DL at ProtoDUNE-SP (29 November). Presented at 20th International Workshop on Advanced Computing and Analysis Techniques in Physics Research, ACAT2021, Daejeon, 2021. cern.ch/go/Z6jT