DOME project

DOME is a Dutch government-funded project between IBM and ASTRON in the form of a public-private partnership focusing on the Square Kilometre Array (SKA), the world's largest planned radio telescope. SKA will be built in Australia and South Africa. The objective of the DOME project is to develop a technology roadmap that applies to both SKA and IBM. The 5-year project started in 2012 and was co-funded by the Dutch government, IBM Research in Zürich, Switzerland, and ASTRON in the Netherlands. The project officially ended on 30 September 2017.

The DOME project focuses on three areas of computing: green computing, data and streaming, and nano-photonics. It is partitioned into seven research projects.
 * P1 Algorithms & Machines – As traditional computing scaling has essentially hit a wall, a new set of methodologies and principles is needed for the design of future large-scale computers. This will be an umbrella project for the other six.
 * P2 Access Patterns – When faced with storing petabytes of data per day, new thinking about data storage tiering and storage media must be developed.
 * P3 Nano Photonics – Fiber optic communication over long distances and between systems is nothing new, but there is a lot to do for optic communications within computer systems and within the telescopes themselves.
 * P4 Microservers – New demands for higher computing density, higher performance per watt, and reduced system complexity suggest a new kind of custom-designed server.
 * P5 Accelerators – With the flattening of general computing performance, special architectures for reaching the next level of performance will be investigated for specialized tasks like signal processing and analysis.
 * P6 Compressive Sampling – Fundamental research into tailored signal processing and machine learning algorithms for the capture, processing, and analysis of the radio astronomy data. Compressive sensing, algebraic systems, machine learning and pattern recognition are focus areas.
 * P7 Real-Time Communication – Reduce the latency caused by redundant network operations in very large-scale systems and optimize the use of the communications bandwidth so that the correct data gets to the correct processing unit in real time.

P1 Algorithms & Machines
The design of computers has changed dramatically in the last decades, but the old paradigms still reign. Current designs stem from single computers working on small data sets in one location. SKA will face a completely different landscape, working on an extremely large data set, collected at a myriad of geographically separated locations, using tens of thousands of separate computers in real time. The fundamental principles for designing such a machine will have to be reexamined. Parameters concerning power envelope, accelerator technologies, workload distribution, memory size, CPU architecture and node intercommunication must be investigated to draw a new baseline to design from. The tools that result from this project are being open-sourced in early 2018.

This fundamental research will work as the umbrella for the other six focus areas, helping to make sound decisions regarding architectural directions.

A first step will be a retrospective analysis of the design of the LOFAR and MeerKAT telescopes and development of a design tool to use when designing very large and distributed computers.

P2 Access Patterns
This project will focus on the very large amount of data that DOME must handle. SKA will generate petabytes of data daily, and this data must be handled differently according to urgency and geographical location, whether it is near the telescope arrays or in the datacenters. A complex tiered solution must be devised using many technologies that are currently beyond the state of the art. Driving forces behind the designs will be lowest possible cost, accessibility and energy efficiency.

This multi-tier approach will combine several different kinds of software technologies to analyze, sift, distribute, store and retrieve data on hardware ranging from traditional storage media like magnetic tape and hard drives to newly developed technologies like phase-change memory. The suitability of different storage media heavily depends on the usage patterns when writing and reading data, and these patterns will change over time, so there must also be room for changes to the designs.
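As a rough illustration of how usage patterns could drive the choice of storage tier, the following Python sketch maps an access pattern to one of the media named above. It is a minimal sketch, not the project's design: the `AccessPattern` type, the `choose_tier` function and every threshold are hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class AccessPattern:
    """Hypothetical summary of how a data set is used."""
    reads_per_day: float   # how often the data is read back
    max_latency_s: float   # longest acceptable wait for the first byte


def choose_tier(pattern: AccessPattern) -> str:
    """Pick a storage medium for a data set.

    All thresholds below are illustrative assumptions, not DOME figures.
    """
    # Very hot or latency-critical data goes to the fastest medium.
    if pattern.max_latency_s < 0.001 or pattern.reads_per_day >= 1000:
        return "phase-change memory"
    # Data read at least daily, or needed within a minute, stays on disk.
    if pattern.max_latency_s < 60 or pattern.reads_per_day >= 1:
        return "hard drive"
    # Everything else is archived on the cheapest medium.
    return "magnetic tape"


print(choose_tier(AccessPattern(reads_per_day=5000, max_latency_s=0.0005)))
print(choose_tier(AccessPattern(reads_per_day=10, max_latency_s=5.0)))
print(choose_tier(AccessPattern(reads_per_day=0.01, max_latency_s=3600)))
```

Because the patterns change over time, a policy like this would be re-evaluated periodically, and data whose tier changes would be migrated.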

P3 Nano Photonics
Data transport is a major factor influencing design in DOME, from the largest scales to the smallest. The cost of communicating electrically over copper wires will drive the application of low-power photonic interconnects, from connections between collecting antennas and datacenters to connections between devices inside the computers. Both IBM and ASTRON have advanced research programs in nano-photonics, beamforming and optical links, and they will combine their efforts for the new designs.

This research project is divided into four R&D sections, investigating digital optical interconnects, analog optical interconnects and analog optical signal processing.
 * 1) Digital optical interconnect technology for astronomy signal processing boards.
 * 2) Analog optical interconnection technology for focal-plane array front-ends.
 * 3) Analog optical interconnection technology for photonic phased array receiver tiles.
 * 4) Analog optical interconnection and signal processing technology for photonic focal plane arrays.

In February 2013 at the International Solid-State Circuits Conference (ISSCC), IBM and École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland showed a 100 Gbit/s analog-to-digital converter (ADC). In February 2014 at ISSCC, IBM and ASTRON demonstrated a 400 Gbit/s ADC.

P4 Microservers
In 2012 a team at IBM led by Ronald P. Luijten started pursuing a computationally dense and energy-efficient 64-bit compute server design based on commodity components and running Linux. A system-on-chip (SoC) design, where most necessary components fit on a single chip, would meet these goals best, and a definition of "microserver" emerged in which essentially a complete motherboard (except RAM and boot flash) fits on one chip. ARM, x86 and Power ISA-based solutions were investigated, and a solution based on Freescale's Power ISA-based dual-core P5020 / quad-core P5040 processor came out on top.

Design
The resulting microserver fits in the same form factor as a standard FB-DIMM socket. The SoC chip, about 20 GB of DRAM and a few control chips (such as the PSoC 3 from Cypress, used for monitoring, debugging and booting) make up a complete compute node with physical dimensions of 133×55 mm. The card's pins are used for one SATA port, five Gbit and two 10 Gbit Ethernet ports, one SD card interface, one USB 2 interface, and power.

The compute card operates within a 35 W power envelope with headroom up to 70 W. The idea is to fit about a hundred of these compute cards within a 19" rack 2U drawer together with network switchboards for external storage and communication. Cooling will be provided via the Aquasar hot water cooling solution pioneered by the SuperMUC supercomputer in Germany.
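The figures above imply a rough power budget per drawer, which the following Python sketch works out. It is a back-of-the-envelope estimate only: "about a hundred" cards is rounded to exactly 100, and the power drawn by the network switchboards and the cooling system is not included because the text gives no figures for them.

```python
# Back-of-the-envelope power budget for one 2U drawer of compute cards.
CARDS_PER_DRAWER = 100         # "about a hundred" in the text, rounded (assumption)
NOMINAL_WATTS_PER_CARD = 35    # stated power envelope per card
PEAK_WATTS_PER_CARD = 70       # stated headroom per card


def drawer_power_kw(cards: int, watts_per_card: float) -> float:
    """Total power of the compute cards in one drawer, in kilowatts."""
    return cards * watts_per_card / 1000.0


nominal_kw = drawer_power_kw(CARDS_PER_DRAWER, NOMINAL_WATTS_PER_CARD)
peak_kw = drawer_power_kw(CARDS_PER_DRAWER, PEAK_WATTS_PER_CARD)

print(f"nominal: {nominal_kw} kW, peak: {peak_kw} kW")  # nominal: 3.5 kW, peak: 7.0 kW
```

A load of several kilowatts in a 2U drawer helps explain why the design relies on water cooling.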

Future
In late 2013 a new SoC was chosen. Freescale's newer 12-core T4240 is significantly more powerful and operates within the same power envelope as the P5020. A new prototype microserver card was built and validated for larger-scale deployment in the full 2U drawer in early 2014. Later, an 8-core ARMv8 board was developed using the LS2088A part from NXP (formerly Freescale). At the end of 2017, IBM is licensing the technology to a startup that plans to take it to market by mid-2018.

P5 Accelerators
Traditional high-performance processors hit a performance wall during the late 2000s, when clock speeds could no longer be increased due to rising power requirements. One solution is to offload the most common and/or compute-intensive tasks to specialized hardware called accelerators. This research area will try to identify these tasks and design algorithms and hardware to overcome the bottlenecks. There will probably be accelerators for pattern detection, parsing, data lookup and signal processing. The hardware will be of two classes: fixed accelerators for static tasks, and programmable accelerators for a family of tasks with similar characteristics. The project will also look at massively parallel computing using commodity graphics processors.

P6 Compressive Sampling
The compressive sampling project is fundamental research into signal processing in collaboration with Delft University of Technology. In radio astronomy, the capture, analysis and processing of signals is extremely compute-intensive and operates on enormous datasets. The goal is to do sampling and compression simultaneously and to use machine learning to decide what to keep and what to throw away, preferably as close to the data collectors as possible. The project aims to develop compressive sampling algorithms for capturing the signal and to calibrate the patterns to keep, in an ever-increasing number of pattern clusters. The research will also tackle the problems of degraded pattern quality, outlier detection, object classification and image formation.
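The core idea of sampling and compressing in one step can be shown with a toy example. The Python sketch below is not one of the project's algorithms: it assumes a signal with a single non-zero entry, takes a few random linear measurements of it, and recovers the entry with one step of matching pursuit. The sizes, the seed and the function names `measure` and `recover_one_sparse` are all illustrative assumptions.

```python
import random


def measure(phi, x):
    """Compressed measurement y = phi @ x, written with plain lists."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in phi]


def recover_one_sparse(phi, y):
    """Recover a signal with a single non-zero entry from y = phi @ x.

    One step of matching pursuit: pick the column of phi most correlated
    with y, then estimate the amplitude by projecting y onto that column.
    """
    n = len(phi[0])
    best_j, best_score = 0, -1.0
    for j in range(n):
        col = [row[j] for row in phi]
        norm = sum(c * c for c in col) ** 0.5
        score = abs(sum(c * v for c, v in zip(col, y))) / norm
        if score > best_score:
            best_j, best_score = j, score
    col = [row[best_j] for row in phi]
    amplitude = sum(c * v for c, v in zip(col, y)) / sum(c * c for c in col)
    return best_j, amplitude


rng = random.Random(0)       # fixed seed so the run is repeatable
N, M = 64, 8                 # signal length and number of measurements (assumed)
phi = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(M)]

x = [0.0] * N
x[17] = 2.5                  # the only non-zero sample

y = measure(phi, x)          # 8 numbers stand in for 64 samples
index, amplitude = recover_one_sparse(phi, y)
print(index, round(amplitude, 6))
```

Here 8 measurements are enough to locate and size the one non-zero sample out of 64. Signals with more non-zero entries need more measurements and an iterative solver.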

P7 Real-Time Communication
Moving data from the collectors to the processing facilities is traditionally bogged down by high-latency I/O and low-bandwidth connections, and data is often duplicated along the way because the communication network was not designed for the purpose. This research project will try to reduce latency to a minimum and design the I/O systems so that data is written directly into the processing engines of an exascale computer design. The first phase will identify system bottlenecks and investigate remote direct memory access (RDMA). The second phase will investigate applying standard RDMA technology to interconnect networks. Phase three includes development of functional prototypes.