-
Janus II: a new generation application-driven computer for spin-system simulations
Authors:
Janus Collaboration,
M. Baity-Jesi,
R. A. Baños,
A. Cruz,
L. A. Fernandez,
J. M. Gil-Narvion,
A. Gordillo-Guerrero,
D. Iñiguez,
A. Maiorano,
F. Mantovani,
E. Marinari,
V. Martin-Mayor,
J. Monforte-Garcia,
A. Muñoz Sudupe,
D. Navarro,
G. Parisi,
S. Perez-Gaviro,
M. Pivanti,
F. Ricci-Tersenghi,
J. J. Ruiz-Lorenzo,
S. F. Schifano,
B. Seoane,
A. Tarancon,
R. Tripiccione,
D. Yllanes
Abstract:
This paper describes the architecture, the development and the implementation of Janus II, a new generation application-driven number cruncher optimized for Monte Carlo simulations of spin systems (mainly spin glasses). This domain of computational physics is a recognized grand challenge of high-performance computing: the resources necessary to study in detail theoretical models that can make contact with experimental data are far beyond those available using commodity computer systems. On the other hand, several specific features of the associated algorithms suggest that unconventional computer architectures, which can be implemented with available electronics technologies, may lead to order-of-magnitude increases in performance, reducing to acceptable values on human scales the time needed to carry out simulation campaigns that would take centuries on commercially available machines. Janus II is one such machine, recently developed and commissioned, that builds upon and improves on the successful JANUS machine, which has been used for physics since 2008 and is still in operation today. This paper describes in detail the motivations behind the project, the computational requirements, the architecture and the implementation of this new machine and compares its expected performance with that of currently available commercial systems.
Submitted 3 October, 2013;
originally announced October 2013.
-
Reconfigurable computing for Monte Carlo simulations: results and prospects of the Janus project
Authors:
Janus Collaboration,
M. Baity-Jesi,
R. A. Banos,
A. Cruz,
L. A. Fernandez,
J. M. Gil-Narvion,
A. Gordillo-Guerrero,
M. Guidetti,
D. Iniguez,
A. Maiorano,
F. Mantovani,
E. Marinari,
V. Martin-Mayor,
J. Monforte-Garcia,
A. Munoz Sudupe,
D. Navarro,
G. Parisi,
M. Pivanti,
S. Perez-Gaviro,
F. Ricci-Tersenghi,
J. J. Ruiz-Lorenzo,
S. F. Schifano,
B. Seoane,
A. Tarancon,
P. Tellez,
et al. (2 additional authors not shown)
Abstract:
We describe Janus, a massively parallel FPGA-based computer optimized for the simulation of spin glasses, theoretical models for the behavior of glassy materials. FPGAs (as compared to GPUs or many-core processors) provide a complementary approach to massively parallel computing. In particular, our model problem is formulated in terms of binary variables, and floating-point operations can be (almost) completely avoided. The FPGA architecture allows us to run many independent threads with almost no latencies in memory access, thus updating up to 1024 spins per cycle. We describe Janus in detail and we summarize the physics results obtained in four years of operation of this machine; we discuss two types of physics applications: long simulations on very large systems (which try to mimic and provide understanding about the experimental non-equilibrium dynamics), and low-temperature equilibrium simulations using an artificial parallel tempering dynamics. The time scale of our non-equilibrium simulations spans eleven orders of magnitude (from picoseconds to a tenth of a second). On the other hand, our equilibrium simulations are unprecedented both because of the low temperatures reached and for the large systems that we have brought to equilibrium. A finite-time scaling ansatz emerges from the detailed comparison of the two sets of simulations. Janus has made it possible to perform spin-glass simulations that would take several decades on more conventional architectures. The paper ends with an assessment of the potential of possible future versions of the Janus architecture, based on state-of-the-art technology.
Submitted 18 April, 2012;
originally announced April 2012.
-
An FPGA-based Torus Communication Network
Authors:
Marcello Pivanti,
Sebastiano Fabio Schifano,
Hubert Simma
Abstract:
We describe the design and FPGA implementation of a 3D torus network (TNW) to provide nearest-neighbor communications between commodity multi-core processors. The aim of this project is to build tightly interconnected and scalable parallel systems for scientific computing. The design includes the VHDL code to implement, on the latest FPGA devices, a network processor which can be accessed by the CPU through a PCIe interface and which controls the external PHYs of the physical links. Moreover, a Linux driver and a library implementing custom communication APIs are provided. The TNW has been successfully integrated in two recent parallel machine projects, QPACE and AuroraScience. We describe some details of the porting of the TNW for the AuroraScience system and report performance results.
Submitted 11 February, 2011;
originally announced February 2011.
-
QPACE -- a QCD parallel computer based on Cell processors
Authors:
H. Baier,
H. Boettiger,
M. Drochner,
N. Eicker,
U. Fischer,
Z. Fodor,
A. Frommer,
C. Gomez,
G. Goldrian,
S. Heybrock,
D. Hierl,
M. Hüsken,
T. Huth,
B. Krill,
J. Lauritsen,
T. Lippert,
T. Maurer,
B. Mendl,
N. Meyer,
A. Nobile,
I. Ouda,
M. Pivanti,
D. Pleiter,
M. Ries,
A. Schäfer,
et al. (10 additional authors not shown)
Abstract:
QPACE is a novel parallel computer which has been developed to be primarily used for lattice QCD simulations. The compute power is provided by the IBM PowerXCell 8i processor, an enhanced version of the Cell processor that is used in the PlayStation 3. The QPACE nodes are interconnected by a custom, application-optimized 3-dimensional torus network implemented on an FPGA. To achieve the very high packaging density of 26 TFlops per rack, a new water cooling concept has been developed and successfully realized. In this paper we give an overview of the architecture and highlight some important technical details of the system. Furthermore, we provide initial performance results and report on the installation of 8 QPACE racks providing an aggregate peak performance of 200 TFlops.
Submitted 23 December, 2009; v1 submitted 11 November, 2009;
originally announced November 2009.