-
A Novel Ranking Scheme for the Performance Analysis of Stochastic Optimization Algorithms using the Principles of Severity
Authors:
Sowmya Chandrasekaran,
Thomas Bartz-Beielstein
Abstract:
Stochastic optimization algorithms have been successfully applied in several domains to find optimal solutions. Because of the ever-growing complexity of the integrated systems, novel stochastic algorithms are being proposed, which makes the task of the performance analysis of the algorithms extremely important. In this paper, we provide a novel ranking scheme to rank the algorithms over multiple single-objective optimization problems. The results of the algorithms are compared using a robust bootstrapping-based hypothesis testing procedure that is based on the principles of severity. Analogous to the football league scoring scheme, we propose pairwise comparison of algorithms as in league competition. Each algorithm accumulates points and a performance metric of how well or badly it performed against other algorithms, analogous to the goal-difference metric in a football league scoring system. The goal-difference performance metric can be used not only as a tie-breaker but also to obtain a quantitative performance measure for each algorithm. The key novelty of the proposed ranking scheme is that it takes into account the performance of each algorithm, considering the magnitude of the achieved performance improvement along with its practical relevance, and does not make any distributional assumptions. The proposed ranking scheme is compared to classical hypothesis testing, and the analysis shows that the results are comparable and that our proposed ranking offers many additional benefits.
Submitted 31 May, 2024;
originally announced June 2024.
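The league-style ranking described in the abstract above can be illustrated with a minimal sketch. This is not the authors' procedure: the bootstrapping-based severity test is replaced here by a plain comparison of mean results against a practical-relevance threshold `delta`, and the 3/1/0 points for win/draw/loss, the function name `league_table`, and the input layout are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def league_table(results, delta=0.0):
    """Rank algorithms by pairwise 'matches' (minimization assumed).

    results: dict mapping algorithm name -> list of final objective values.
    delta:   hypothetical practical-relevance threshold; mean differences
             within +/- delta count as a draw.
    """
    table = {name: {"points": 0, "goal_diff": 0.0} for name in results}
    for a, b in combinations(results, 2):
        # Positive diff means algorithm a reached lower (better) values.
        diff = mean(results[b]) - mean(results[a])
        if diff > delta:
            table[a]["points"] += 3      # a wins (assumed 3 points)
        elif diff < -delta:
            table[b]["points"] += 3      # b wins
        else:
            table[a]["points"] += 1      # draw (assumed 1 point each)
            table[b]["points"] += 1
        # Accumulate the margin, analogous to goal difference.
        table[a]["goal_diff"] += diff
        table[b]["goal_diff"] -= diff
    # Sort by points, using the accumulated margin as tie-breaker.
    return sorted(
        table.items(),
        key=lambda kv: (kv[1]["points"], kv[1]["goal_diff"]),
        reverse=True,
    )


if __name__ == "__main__":
    runs = {"A": [1, 3], "B": [5, 7], "C": [2, 4]}
    for name, row in league_table(runs, delta=0.5):
        print(name, row["points"], row["goal_diff"])
```

In a faithful implementation, the mean comparison would presumably be replaced by the paper's severity-based bootstrap test; the accumulation of points and margins would stay the same.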
-
Tree semi-separable matrices: a simultaneous generalization of sequentially and hierarchically semi-separable representations
Authors:
Nithin Govindarajan,
Shivkumar Chandrasekaran
Abstract:
We present a unification and generalization of sequentially and hierarchically semi-separable (SSS and HSS) matrices called tree semi-separable (TSS) matrices. Our main result is to show that any dense matrix can be expressed in a TSS format. Here, the dimensions of the generators are specified by the ranks of the Hankel blocks of the matrix. TSS matrices satisfy a graph-induced rank structure (GIRS) property. It is shown that TSS matrices generalize the algebraic properties of SSS and HSS matrices under addition, products, and inversion. Subsequently, TSS matrices admit linear time matrix-vector multiply, matrix-matrix multiply, matrix-matrix addition, inversion, and solvers.
Submitted 20 February, 2024;
originally announced February 2024.
-
Latest Development of Electropolishing Optimization for 650 MHz Niobium Cavity
Authors:
V. Chouhan,
D. Bice,
D. Burk,
S. Chandrasekaran,
A. Cravatta,
P. Dubiel,
G. V. Eremeev,
F. Furuta,
O. Melnychuk,
A. Netepenko,
M. K. Ng,
J. Ozelis,
H. Park,
T. Ring,
G. Wu,
B. Guilfoyle,
M. P. Kelly,
T. Reid
Abstract:
Electropolishing (EP) of 1.3 GHz niobium superconducting RF cavities is conducted to achieve a desired smooth and contaminant-free surface that yields good RF performance. Achieving a smooth surface of a large-sized elliptical cavity with the standard EP conditions was found to be challenging. This work aimed to conduct a systematic parametric EP study to understand the effects of various EP parameters on the surface of 650 MHz niobium cavities used in the Proton Improvement Plan-II (PIP-II) linear accelerator. Parameters optimized in this study provided a smooth surface of the cavities. The electropolished cavity showed a significantly higher accelerating gradient, meeting the baseline requirement, and qualified for further surface treatment to improve the cavity quality factor.
Submitted 26 January, 2024;
originally announced January 2024.
-
Impact of Solenoid Induced Residual Magnetic Fields on The Prototype SSR1 CM Performance
Authors:
D. Passarelli,
J. Bernardini,
C. Boffo,
S. Chandrasekaran,
A. Hogberg,
T. Khabiboulline,
J. Ozelis,
M. Parise,
V. Roger,
G. Romanov,
A. Sukhanov,
G. Wu,
V. Yakovlev,
Y. Xie
Abstract:
A prototype cryomodule containing eight Single Spoke Resonators type-1 (SSR1) operating at 325 MHz and four superconducting focusing lenses was successfully assembled, cold tested, and accelerated beam in the framework of the PIP-II project at Fermilab. The impact of induced residual magnetic fields from the solenoids on performance of cavities is presented in this contribution. In addition, design optimizations for the production cryomodules as a result of this impact are highlighted.
Submitted 26 January, 2024;
originally announced January 2024.
-
CIMGEN: Controlled Image Manipulation by Finetuning Pretrained Generative Models on Limited Data
Authors:
Chandrakanth Gudavalli,
Erik Rosten,
Lakshmanan Nataraj,
Shivkumar Chandrasekaran,
B. S. Manjunath
Abstract:
Content creation and image editing can benefit from flexible user controls. A common intermediate representation for conditional image generation is a semantic map, which holds information about the objects present in the image. Compared to raw RGB pixels, a semantic map is much easier to modify. One can take a semantic map and easily modify it to selectively insert, remove, or replace objects in the map. The method proposed in this paper takes in the modified semantic map and alters the original image in accordance with the modified map. The method leverages traditional pre-trained image-to-image translation GANs, such as CycleGAN or Pix2Pix GAN, that are fine-tuned on a limited dataset of reference images associated with the semantic maps. We discuss the qualitative and quantitative performance of our technique to illustrate its capacity and possible applications in the fields of image forgery and image editing. We also demonstrate the effectiveness of the proposed image forgery technique in thwarting numerous deep learning-based image forensic techniques, highlighting the urgent need to develop robust and generalizable image forensic tools in the fight against the spread of fake media.
Submitted 23 January, 2024;
originally announced January 2024.
-
Reproducible image-based profiling with Pycytominer
Authors:
Erik Serrano,
Srinivas Niranj Chandrasekaran,
Dave Bunten,
Kenneth I. Brewer,
Jenna Tomkinson,
Roshan Kern,
Michael Bornholdt,
Stephen Fleming,
Ruifan Pei,
John Arevalo,
Hillary Tsang,
Vincent Rubinetti,
Callum Tromans-Coia,
Tim Becker,
Erin Weisbart,
Charlotte Bunne,
Alexandr A. Kalinin,
Rebecca Senft,
Stephen J. Taylor,
Nasim Jamali,
Adeniyi Adeboye,
Hamdah Shafqat Abbasi,
Allen Goodman,
Juan C. Caicedo,
Anne E. Carpenter
, et al. (3 additional authors not shown)
Abstract:
Advances in high-throughput microscopy have enabled the rapid acquisition of large numbers of high-content microscopy images. Whether by deep learning or classical algorithms, image analysis pipelines then produce single-cell features. To process these single cells for downstream applications, we present Pycytominer, a user-friendly, open-source Python package that implements the bioinformatics steps known as image-based profiling. We demonstrate Pycytominer's usefulness in a machine learning project to predict nuisance compounds that cause undesirable cell injuries.
Submitted 2 July, 2024; v1 submitted 22 November, 2023;
originally announced November 2023.
-
PIP-II Project Overview and Status
Authors:
R. Stanek,
C. Boffo,
S. Chandrasekaran,
S. Dixon,
E. Harms,
L. Kokoska,
I. Kourbanis,
J. Leibfritz,
O. Napoly,
D. Passarelli,
E. Pozdeyev,
A. Rowe
Abstract:
The Proton Improvement Plan II (PIP-II) project is an essential upgrade to Fermilab's particle accelerator complex to enable the world's most intense neutrino beam for LBNF/DUNE and a broad particle physics program for many decades to come. PIP-II will deliver 1.2 MW of proton beam power from the Main Injector, upgradeable to multi-MW capability. The central element of PIP-II is an 800 MeV superconducting radio frequency (SRF) linac, which comprises a room temperature front end followed by an SRF section. The SRF section consists of five different flavors of cavities/cryomodules, including Half Wave Resonators (HWR), Single Spoke and elliptical resonators operating at, or above, state-of-the-art parameters. The first two PIP-II cryomodules, Half Wave Resonator (HWR) and Single Spoke Resonator 1 (SSR1) were installed in the PIP-II Injector Test facility (PIP2IT) and have accelerated beam to above 17 MeV. PIP-II is the first U.S. accelerator project that will be constructed with significant contributions from international partners, including India, Italy, France, United Kingdom and Poland. The project was baselined in April 2022, and the construction phase is underway.
Submitted 9 November, 2023;
originally announced November 2023.
-
LLM4VV: Developing LLM-Driven Testsuite for Compiler Validation
Authors:
Christian Munley,
Aaron Jarmusch,
Sunita Chandrasekaran
Abstract:
Large language models (LLMs) are a new and powerful tool for a wide span of applications involving natural language and demonstrate impressive code generation abilities. The goal of this work is to automatically generate tests and use these tests to validate and verify compiler implementations of a directive-based parallel programming paradigm, OpenACC. To do so, in this paper, we explore the capabilities of state-of-the-art LLMs, including open-source LLMs -- Meta Codellama, Phind fine-tuned version of Codellama, Deepseek Deepseek Coder and closed-source LLMs -- OpenAI GPT-3.5-Turbo and GPT-4-Turbo. We further fine-tuned the open-source LLMs and GPT-3.5-Turbo using our own testsuite dataset along with the OpenACC specification. We also explored these LLMs using various prompt engineering techniques that include code template, template with retrieval-augmented generation (RAG), one-shot example, one-shot with RAG, and expressive prompt with code template and RAG. This paper highlights our findings from over 5000 tests generated via all the above-mentioned methods. Our contributions include: (a) exploring the capabilities of the latest and relevant LLMs for code generation, (b) investigating fine-tuning and prompt methods, and (c) analyzing the outcome of LLM-generated tests, including manual analysis of a representative set of tests. We found that the LLM Deepseek-Coder-33b-Instruct produced the most passing tests, followed by GPT-4-Turbo.
Submitted 10 March, 2024; v1 submitted 7 October, 2023;
originally announced October 2023.
-
EZ: An Efficient, Charge Conserving Current Deposition Algorithm for Electromagnetic Particle-In-Cell Simulations
Authors:
Klaus Steiniger,
Rene Widera,
Sergei Bastrakov,
Michael Bussmann,
Sunita Chandrasekaran,
Benjamin Hernandez,
Kristina Holsapple,
Axel Huebl,
Guido Juckeland,
Jeffrey Kelling,
Matt Leinhauser,
Richard Pausch,
David Rogers,
Ulrich Schramm,
Jeff Young,
Alexander Debus
Abstract:
We present EZ, a novel current deposition algorithm for particle-in-cell (PIC) simulations. EZ calculates the current density on the electromagnetic grid due to macro-particle motion within a time step by solving the continuity equation of electrodynamics. Being a charge conserving hybridization of Esirkepov's method and ZigZag, we refer to it as "EZ" as shorthand for "Esirkepov meets ZigZag". Simulations of a warm, relativistic plasma with PIConGPU show that EZ achieves the same level of charge conservation as the commonly used method by Esirkepov, yet reaches higher performance for macro-particle assignment-functions up to third-order. In addition to a detailed description of the functioning of EZ, reasons for the expected and observed performance increase are given, and guidelines for its implementation aiming at highest performance on GPUs are provided.
Submitted 18 September, 2023;
originally announced September 2023.
-
Installation, commissioning, and testing of the HB650 CM at PIP2IT
Authors:
M White,
J Makara,
S Ranpariya,
L Pei,
M Barba,
J Subedi,
J Dong,
B Hansen,
A E T Akintola,
J Holzbauer,
J Ozelis,
S Chandrasekaran,
V Roger
Abstract:
The Proton Improvement Plan-II (PIP-II) is a major upgrade to the Fermilab accelerator complex, featuring a new 800-MeV Superconducting Radio-Frequency (SRF) linear accelerator (LINAC) powering the accelerator complex to provide the world's most intense high-energy neutrino beam. This paper describes the conversion of the PIP-II Injector Test Facility (PIP2IT) cryogenic system into a test stand for PIP-II High-Beta 650 MHz (HB650) cryomodules at Fermilab's Cryomodule Test Facility (CMTF). A description of the associated mechanical, electrical, and controls modifications necessary for testing HB650 cryomodules is provided. The cooldown and warmup requirements, procedures, and associated controls logic are described.
Submitted 21 August, 2023;
originally announced August 2023.
-
The Evaluation of Mechanical Properties of LB650 Cavities
Authors:
J. Holzbauer,
G. Wu,
H. Park,
K. McGee,
A. Wixson,
T. Khabiboulline,
G. Romanov,
S. Adams,
D. Bice,
S. K. Chandrasekaran,
J. Ozelis,
I. Gonin,
C. Narug,
R. Thiede,
R. Treece,
C. Grimm
Abstract:
The PIP-II project's LB650 cavities could potentially be vulnerable to mechanical deformation because of the geometric shape of the cavity due to reduced beta. The mechanical property of the niobium half-cell was measured following various heat treatments. The 5-cell cavities were tested in a controlled drop test fashion and the real-world road test. The result showed that the 900 °C heat treatment was compatible with cavity handling and transportation during production. The test provides the basis of the transportation specification and shipping container design guidelines.
Submitted 18 July, 2023;
originally announced July 2023.
-
Transportation Fatigue Testing of the pHB650 Power Coupler Antenna for the PIP-II Project at Fermilab
Authors:
J. Helsper,
S. Chandrasekaran,
J. Holzbauer,
N. Solyak
Abstract:
The PIP-II Project will see international shipment of cryomodules from Europe to the United States, and as such, the shocks which can occur during shipment pose a risk to the internal components. Of particular concern is the coupler ceramic window and surrounding brazes, which will see stresses during an excitation event. Since the antenna design is new, and because of the setback failure would create, a cyclic stress test was devised for the antenna. This paper presents the experimental methods, setup, and results of the test.
Submitted 11 July, 2023;
originally announced July 2023.
-
Impact of Electron-Withdrawing Groups on Ion Transport and Structure in Lithium Borate Ionic Liquids
Authors:
Volodymyr Koverga,
Selvaraj S. Chandrasekaran,
Anh T. Ngo
Abstract:
Among the distinctive structural features of lithium ionic liquids (LILs), a novel class of single-component electrolytes, the variation of the electron-withdrawing group stands out as a key factor in determining their dynamics. To understand this phenomenon, we conducted molecular dynamics (MD) simulations for LILs based on hexafluoro-2-propanoxy (LIL2), hexafluoro-2-methyl-2-propanoxy (LIL4), and trifluoro-2-propanoxy (LIL6) derivatives. Results revealed that correlated ion dynamics govern the general transport characteristics in LILs, while the electron-withdrawing group regulates the Li transport mechanism. Upon saturation by fluorine atoms, LILs exhibit higher inhomogeneity in their transport and structure properties. Strong coordination along the ethoxide group promotes jumps of Li across positive domains, while in fluorine-poor LILs, stronger coordination in proximity to boron atoms carries the anion along Li transport. Understanding the results of MD simulation will aid the further design and widespread use of this class of electrolytes in the production of energy storage and conversion devices.
Submitted 26 February, 2024; v1 submitted 22 May, 2023;
originally announced May 2023.
-
Distributed State Estimation for Linear Time-Varying Systems with Sensor Network Delays
Authors:
Sanjay Chandrasekaran,
Vishnu Varadan,
Siva Vignesh Krishnan,
Florian Dörfler,
Mohammad H. Mamduhi
Abstract:
Distributed sensor networks often include a multitude of sensors, each measuring parts of a process state space or observing the operations of a system. Communication of measurements between the sensor nodes and estimator(s) cannot realistically be considered delay-free due to communication errors and transmission latency in the channels. We propose a novel stability-based method that mitigates the influence of sensor network delays in distributed state estimation for linear time-varying systems. Our proposed algorithm efficiently selects a subset of sensors from the entire sensor nodes in the network based on the desired stability margins of the distributed Kalman filter estimates, after which, the state estimates are computed only using the measurements of the selected sensors. We provide comparisons between the estimation performance of our proposed algorithm and a greedy algorithm that exhaustively selects an optimal subset of nodes. We then apply our method to a simulative scenario for estimating the states of a linear time-varying system using a sensor network including 2000 sensor nodes. Simulation results demonstrate the performance efficiency of our algorithm and show that it closely follows the achieved performance by the optimal greedy search algorithm.
Submitted 29 April, 2023;
originally announced May 2023.
-
LINGO: Visually Debiasing Natural Language Instructions to Support Task Diversity
Authors:
Anjana Arunkumar,
Shubham Sharma,
Rakhi Agrawal,
Sriram Chandrasekaran,
Chris Bryan
Abstract:
Cross-task generalization is a significant outcome that defines mastery in natural language understanding. Humans show a remarkable aptitude for this, and can solve many different types of tasks, given definitions in the form of textual instructions and a small set of examples. Recent work with pre-trained language models mimics this learning style: users can define and exemplify a task for the model to attempt as a series of natural language prompts or instructions. While prompting approaches have led to higher cross-task generalization compared to traditional supervised learning, analyzing 'bias' in the task instructions given to the model is a difficult problem, and has thus been relatively unexplored. For instance, are we truly modeling a task, or are we modeling a user's instructions? To help investigate this, we develop LINGO, a novel visual analytics interface that supports an effective, task-driven workflow to (1) help identify bias in natural language task instructions, (2) alter (or create) task instructions to reduce bias, and (3) evaluate pre-trained model performance on debiased task instructions. To robustly evaluate LINGO, we conduct a user study with both novice and expert instruction creators, over a dataset of 1,616 linguistic tasks and their natural language instructions, spanning 55 different languages. For both user groups, LINGO promotes the creation of more difficult tasks for pre-trained models, that contain higher linguistic diversity and lower instruction bias. We additionally discuss how the insights learned in developing and evaluating LINGO can aid in the design of future dashboards that aim to minimize the effort involved in prompt creation across multiple domains.
Submitted 12 April, 2023;
originally announced April 2023.
-
MalGrid: Visualization Of Binary Features In Large Malware Corpora
Authors:
Tajuddin Manhar Mohammed,
Lakshmanan Nataraj,
Satish Chikkagoudar,
Shivkumar Chandrasekaran,
B. S. Manjunath
Abstract:
The number of malware is constantly on the rise. Though most new malware are modifications of existing ones, their sheer number is quite overwhelming. In this paper, we present a novel system to visualize and map millions of malware to points in a 2-dimensional (2D) spatial grid. This enables visualizing relationships within large malware datasets that can be used to develop triage solutions to screen different malware rapidly and provide situational awareness. Our approach links two visualizations within an interactive display. Our first view is a spatial point-based visualization of similarity among the samples based on a reduced dimensional projection of binary feature representations of malware. Our second spatial grid-based view provides a better insight into similarities and differences between selected malware samples in terms of the binary-based visual representations they share. We also provide a case study where the effect of packing on the malware data is correlated with the complexity of the packing algorithm.
Submitted 4 November, 2022;
originally announced November 2022.
-
Application Experiences on a GPU-Accelerated Arm-based HPC Testbed
Authors:
Wael Elwasif,
William Godoy,
Nick Hagerty,
J. Austin Harris,
Oscar Hernandez,
Balint Joo,
Paul Kent,
Damien Lebrun-Grandie,
Elijah Maccarthy,
Veronica G. Melesse Vergara,
Bronson Messer,
Ross Miller,
Sarp Opal,
Sergei Bastrakov,
Michael Bussmann,
Alexander Debus,
Klaus Steinger,
Jan Stephan,
Rene Widera,
Spencer H. Bryngelson,
Henry Le Berre,
Anand Radhakrishnan,
Jefferey Young,
Sunita Chandrasekaran,
Florina Ciorba
, et al. (6 additional authors not shown)
Abstract:
This paper assesses and reports the experience of ten teams working to port, validate, and benchmark several High Performance Computing applications on a novel GPU-accelerated Arm testbed system. The testbed consists of eight NVIDIA Arm HPC Developer Kit systems built by GIGABYTE, each one equipped with a server-class Arm CPU from Ampere Computing and A100 data center GPU from NVIDIA Corp. The systems are connected together using Infiniband high-bandwidth low-latency interconnect. The selected applications and mini-apps are written using several programming languages and use multiple accelerator-based programming models for GPUs such as CUDA, OpenACC, and OpenMP offloading. Working on application porting requires a robust and easy-to-access programming environment, including a variety of compilers and optimized scientific libraries. The goal of this work is to evaluate platform readiness and assess the effort required from developers to deploy well-established scientific workloads on current and future generation Arm-based GPU-accelerated HPC systems. The reported case studies demonstrate that the current level of maturity and diversity of software and tools is already adequate for large-scale production deployments.
Submitted 19 December, 2022; v1 submitted 20 September, 2022;
originally announced September 2022.
-
Design, Manufacturing, Assembly, Testing, And Lessons Learned Of The Prototype 650 MHz Couplers
Authors:
J. Helsper,
S. Chandrasekaran,
N. Solyak,
S. Kazakov,
K. Premo,
G. Wu,
F. Furuta,
J. Ozelis,
B. Hanna
Abstract:
Six 650 MHz high-power couplers will be integrated into the prototype High Beta 650 MHz (HB650) cryomodule for the PIP-II project at Fermilab. The design of the coupler is described, including design optimizations from the previous generation. This paper then describes the coupler life-cycle, including manufacturing, assembly, testing, conditioning, and the lessons learned at each stage.
Submitted 2 September, 2022;
originally announced September 2022.
-
ECP SOLLVE: Validation and Verification Testsuite Status Update and Compiler Insight for OpenMP
Authors:
Thomas Huber,
Swaroop Pophale,
Nolan Baker,
Michael Carr,
Nikhil Rao,
Jaydon Reap,
Kristina Holsapple,
Joshua Hoke Davis,
Tobias Burnus,
Seyong Lee,
David E. Bernholdt,
Sunita Chandrasekaran
Abstract:
The OpenMP language continues to evolve with every new specification release, as does the need to validate and verify the new features that have been introduced. With the release of OpenMP 5.0 and OpenMP 5.1, plenty of new target offload and host-based features have been introduced to the programming model. While OpenMP continues to grow in maturity, there is an observable growth in the number of compiler and hardware vendors that support OpenMP. In this manuscript, we focus on evaluating the conformity and implementation progress of various compiler vendors such as Cray, IBM, GNU, Clang/LLVM, NVIDIA, Intel and AMD. We specifically address the 4.5, 5.0, and 5.1 versions of the specification.
Submitted 14 November, 2022; v1 submitted 28 August, 2022;
originally announced August 2022.
-
Analysis of Validating and Verifying OpenACC Compilers 3.0 and Above
Authors:
A. M. Jarmusch,
A. Liu,
C. Munley,
D. Horta,
V. Ravichandran,
J. Denny,
S. Chandrasekaran
Abstract:
OpenACC is a high-level directive-based parallel programming model that can manage the sophistication of heterogeneity in architectures and abstract it from the users. The portability of the model across CPUs and accelerators has gained the model a wide variety of users. This means it is also crucial to analyze the reliability of the compilers' implementations. To address this challenge, the OpenACC Validation and Verification team has proposed a validation testsuite to verify the OpenACC implementations across various compilers with an infrastructure for a more streamlined execution. This paper covers the following aspects: (a) new developments since the last publication on the testsuite, (b) the use of the infrastructure, (c) tests that highlight our workflow process, (d) results from executing the testsuite on various systems, and (e) future developments.
Submitted 27 August, 2022;
originally announced August 2022.
-
Validation of the 650 MHz SRF tuner on the low and high beta cavities for PIP-II at 2 K*
Authors:
C. Contreras-Martinez,
S. Chandrasekaran,
S. Cheban,
G. Eremeev,
I. Gonin,
T. Khabiboulline,
Y. Pischalnikov,
O. Prokofiev,
A. Sukhanov,
JC. Yun
Abstract:
The PIP-II linac will include thirty-six beta=0.61 and twenty-four beta=0.92 650 MHz 5 cell elliptical SRF cavities. Each cavity will be equipped with a tuning system consisting of a double lever slow tuner for coarse frequency tuning and a piezoelectric actuator for fine frequency tuning. The same tuner will be used for both the beta=0.61 and beta=0.92 cavities. Results of testing the cavity-tuner system for the beta=0.61 will be presented for the first time.
Submitted 12 August, 2022;
originally announced August 2022.
-
CNNs Avoid Curse of Dimensionality by Learning on Patches
Authors:
Vamshi C. Madala,
Shivkumar Chandrasekaran,
Jason Bunk
Abstract:
Despite the success of convolutional neural networks (CNNs) in numerous computer vision tasks and their extraordinary generalization performances, several attempts to predict the generalization errors of CNNs have only been limited to a posteriori analyses thus far. A priori theories explaining the generalization performances of deep neural networks have mostly ignored the convolutionality aspect and do not specify why CNNs are able to seemingly overcome the curse of dimensionality on computer vision tasks like image classification where the image dimensions are in thousands. Our work attempts to explain the generalization performance of CNNs on image classification under the hypothesis that CNNs operate on the domain of image patches. Ours is the first work we are aware of to derive an a priori error bound for the generalization error of CNNs and we present both quantitative and qualitative evidence in support of our theory. Our patch-based theory also offers an explanation for why data augmentation techniques like Cutout, CutMix and random cropping are effective in improving the generalization error of CNNs.
Submitted 12 April, 2023; v1 submitted 22 May, 2022;
originally announced May 2022.
-
NLP Based Anomaly Detection for Categorical Time Series
Authors:
Matthew Horak,
Sowmya Chandrasekaran,
Giovanni Tobar
Abstract:
Identifying anomalies in large multi-dimensional time series is a crucial and difficult task across multiple domains. Few methods exist in the literature that address this task when some of the variables are categorical in nature. We formalize an analogy between categorical time series and classical Natural Language Processing and demonstrate the strength of this analogy for anomaly detection and root cause investigation by implementing and testing three different machine learning anomaly detection and root cause investigation models based upon it.
Submitted 21 April, 2022;
originally announced April 2022.
-
First Experiences in Performance Benchmarking with the New SPEChpc 2021 Suites
Authors:
Holger Brunst,
Sunita Chandrasekaran,
Florina Ciorba,
Nick Hagerty,
Robert Henschel,
Guido Juckeland,
Junjie Li,
Veronica G. Melesse Vergara,
Sandra Wienke,
Miguel Zavala
Abstract:
Modern HPC systems are built with innovative system architectures and novel programming models to further push the speed limit of computing. The increased complexity poses challenges for performance portability and performance evaluation. The Standard Performance Evaluation Corporation (SPEC) has a long history of producing industry standard benchmarks for modern computer systems. SPEC's newly released SPEChpc 2021 benchmark suites, developed by the High Performance Group, are a bold attempt to provide a fair and objective benchmarking tool designed for state of the art HPC systems. With the support of multiple host and accelerator programming models, the suites are portable across both homogeneous and heterogeneous architectures. Different workloads are developed to fit system sizes ranging from a few compute nodes to a few hundred compute nodes. In this manuscript, we take a first glance at these benchmark suites and evaluate their portability and basic performance characteristics on various popular and emerging HPC architectures, including x86 CPU, NVIDIA GPU, and AMD GPU. This study provides a first-hand experience of executing the SPEChpc 2021 suites at scale on production HPC systems, discusses real-world use cases, and serves as an initial guideline for using the benchmark suites.
Submitted 28 March, 2022; v1 submitted 13 March, 2022;
originally announced March 2022.
-
Computer Vision Based Parking Optimization System
Authors:
Siddharth Chandrasekaran,
Jeffrey Matthew Reginald,
Wei Wang,
Ting Zhu
Abstract:
An improvement in technology is linearly related to time and time-relevant problems. It has been seen that as time progresses, the number of problems humans face also increases. However, technology to resolve these problems tends to improve as well. One of the earliest existing problems which started with the invention of vehicles was parking. The ease of resolving this problem using technology has evolved over the years but the problem of parking still remains unsolved. The main reason behind this is that parking does not only involve one problem but it consists of a set of problems within itself. One of these problems is the occupancy detection of the parking slots in a distributed parking ecosystem. In a distributed system, users would find preferable parking spaces as opposed to random parking spaces. In this paper, we propose a web-based application as a solution for parking space detection in different parking spaces. The solution is based on Computer Vision (CV) and is built using the Django framework written in Python 3.0. The solution works to resolve the occupancy detection problem along with providing the user the option to determine the block based on availability and his preference. The evaluation results for our proposed system are promising and efficient. The proposed system can also be integrated with different systems and be used for solving other relevant parking problems.
Submitted 31 December, 2021;
originally announced January 2022.
-
OMD: Orthogonal Malware Detection Using Audio, Image, and Static Features
Authors:
Lakshmanan Nataraj,
Tajuddin Manhar Mohammed,
Tejaswi Nanjundaswamy,
Satish Chikkagoudar,
Shivkumar Chandrasekaran,
B. S. Manjunath
Abstract:
With the growing number of malware and cyber attacks, there is a need for "orthogonal" cyber defense approaches, which are complementary to existing methods by detecting unique malware samples that are not predicted by other methods. In this paper, we propose a novel and orthogonal malware detection (OMD) approach to identify malware using a combination of audio descriptors, image similarity descriptors and other static/statistical features. First, we show how audio descriptors are effective in classifying malware families when the malware binaries are represented as audio signals. Then, we show that the predictions made on the audio descriptors are orthogonal to the predictions made on image similarity descriptors and other static features. Further, we develop a framework for error analysis and a metric to quantify how orthogonal a new feature set (or type) is with respect to other feature sets. This allows us to add new features and detection methods to our overall framework. Experimental results on malware datasets show that our approach provides a robust framework for orthogonal malware detection.
Submitted 8 November, 2021;
originally announced November 2021.
-
HAPSSA: Holistic Approach to PDF Malware Detection Using Signal and Statistical Analysis
Authors:
Tajuddin Manhar Mohammed,
Lakshmanan Nataraj,
Satish Chikkagoudar,
Shivkumar Chandrasekaran,
B. S. Manjunath
Abstract:
Malicious PDF documents present a serious threat to various security organizations that require modern threat intelligence platforms to effectively analyze and characterize the identity and behavior of PDF malware. State-of-the-art approaches use machine learning (ML) to learn features that characterize PDF malware. However, ML models are often susceptible to evasion attacks, in which an adversary obfuscates the malware code to avoid being detected by an Antivirus. In this paper, we derive a simple yet effective holistic approach to PDF malware detection that leverages signal and statistical analysis of malware binaries. This includes combining orthogonal feature space models from various static and dynamic malware detection methods to enable generalized robustness when faced with code obfuscations. Using a dataset of nearly 30,000 PDF files containing both malware and benign samples, we show that our holistic approach maintains a high detection rate (99.92%) of PDF malware and even detects new malicious files created by simple methods that remove the obfuscation conducted by malware authors to hide their malware, which are undetected by most antiviruses.
Submitted 8 November, 2021;
originally announced November 2021.
-
OpenACC Acceleration of an Agent-Based Biological Simulation Framework
Authors:
Matt Stack,
Paul Macklin,
Robert Searles,
Sunita Chandrasekaran
Abstract:
Computational biology has increasingly turned to agent-based modeling to explore complex biological systems. Biological diffusion (diffusion, decay, secretion, and uptake) is a key driver of biological tissues. GPU computing can vastly accelerate the diffusion and decay operators in the partial differential equations used to represent biological transport in an agent-based biological modeling system. In this paper, we utilize OpenACC to accelerate the diffusion portion of PhysiCell, a cross-platform agent-based biosimulation framework. We demonstrate an almost 40x speedup on the state-of-the-art NVIDIA A100 GPU compared to a serial run on AMD's EPYC 7742. We also demonstrate 9x speedup on the 64 core AMD EPYC 7742 multicore platform. By using OpenACC for both the CPUs and the GPUs, we maintain a single source code base, thus creating a portable yet performant solution. With the simulator's most significant computational bottleneck significantly reduced, we can continue cancer simulations over much longer times.
Submitted 25 October, 2021;
originally announced October 2021.
-
Challenges Porting a C++ Template-Metaprogramming Abstraction Layer to Directive-based Offloading
Authors:
Jeffrey Kelling,
Sergei Bastrakov,
Alexander Debus,
Thomas Kluge,
Matt Leinhauser,
Richard Pausch,
Klaus Steiniger,
Jan Stephan,
René Widera,
Jeff Young,
Michael Bussmann,
Sunita Chandrasekaran,
Guido Juckeland
Abstract:
HPC systems employ a growing variety of compute accelerators with different architectures and from different vendors. Large scientific applications are required to run efficiently across these systems but need to retain a single code-base in order to not stifle development. Directive-based offloading programming models set out to provide the required portability, but, to existing codes, they themselves represent yet another API to port to. Here, we present our approach of porting the GPU-accelerated particle-in-cell code PIConGPU to OpenACC and OpenMP target by adding two new backends to its existing C++-template metaprogramming-based offloading abstraction layer alpaka and avoiding other modifications to the application code. We introduce our approach in the face of conflicts between requirements and available features in the standards as well as practical hurdles posed by immature compiler support.
Submitted 24 January, 2022; v1 submitted 16 October, 2021;
originally announced October 2021.
-
Metrics and Design of an Instruction Roofline Model for AMD GPUs
Authors:
Matthew Leinhauser,
René Widera,
Sergei Bastrakov,
Alexander Debus,
Michael Bussmann,
Sunita Chandrasekaran
Abstract:
Due to the recent announcement of the Frontier supercomputer, many scientific application developers are working to make their applications compatible with AMD architectures (CPU-GPU), which means moving away from the traditional CPU and NVIDIA-GPU systems. Due to the current limitations of profiling tools for AMD GPUs, this shift leaves a void in how to measure application performance on AMD GPUs. In this paper, we design an instruction roofline model for AMD GPUs using AMD's ROCProfiler and a benchmarking tool, BabelStream (the HIP implementation), as a way to measure an application's performance in instructions and memory transactions on new AMD hardware. Specifically, we create instruction roofline models for a case study scientific application, PIConGPU, an open source particle-in-cell (PIC) simulations application used for plasma and laser-plasma physics on the NVIDIA V100, AMD Radeon Instinct MI60, and AMD Instinct MI100 GPUs. When looking at the performance of multiple kernels of interest in PIConGPU we find that although the AMD MI100 GPU achieves a similar, or better, execution time compared to the NVIDIA V100 GPU, profiling tool differences make comparing performance of these two architectures hard. When looking at execution time, GIPS, and instruction intensity, the AMD MI60 achieves the worst performance out of the three GPUs used in this work.
Submitted 10 November, 2021; v1 submitted 15 October, 2021;
originally announced October 2021.
-
Seam Carving Detection and Localization using Two-Stage Deep Neural Networks
Authors:
Lakshmanan Nataraj,
Chandrakanth Gudavalli,
Tajuddin Manhar Mohammed,
Shivkumar Chandrasekaran,
B. S. Manjunath
Abstract:
Seam carving is a method to resize an image in a content aware fashion. However, this method can also be used to carve out objects from images. In this paper, we propose a two-step method to detect and localize seam carved images. First, we build a detector to detect small patches in an image that has been seam carved. Next, we compute a heatmap on an image based on the patch detector's output. Using these heatmaps, we build another detector to detect if a whole image is seam carved or not. Our experimental results show that our approach is effective in detecting and localizing seam carved images.
Submitted 3 September, 2021;
originally announced September 2021.
-
SeeTheSeams: Localized Detection of Seam Carving based Image Forgery in Satellite Imagery
Authors:
Chandrakanth Gudavalli,
Erik Rosten,
Lakshmanan Nataraj,
Shivkumar Chandrasekaran,
B. S. Manjunath
Abstract:
Seam carving is a popular technique for content aware image retargeting. It can be used to deliberately manipulate images, for example, change the GPS locations of a building or insert/remove roads in a satellite image. This paper proposes a novel approach for detecting and localizing seams in such images. While there are methods to detect seam carving based manipulations, this is the first time that robust localization and detection of seam carving forgery is made possible. We also propose a seam localization score (SLS) metric to evaluate the effectiveness of localization. The proposed method is evaluated extensively on a large collection of images from different sources, demonstrating a high level of detection and localization performance across these datasets. The datasets curated during this work will be released to the public.
Submitted 27 August, 2021;
originally announced August 2021.
-
Refactoring the MPS/University of Chicago Radiative MHD (MURaM) Model for GPU/CPU Performance Portability Using OpenACC Directives
Authors:
Eric Wright,
Damien Przybylski,
Matthias Rempel,
Cena Miller,
Supreeth Suresh,
Shiquan Su,
Richard Loft,
Sunita Chandrasekaran
Abstract:
The MURaM (Max Planck University of Chicago Radiative MHD) code is a solar atmosphere radiative MHD model that has been broadly applied to solar phenomena ranging from quiet to active sun, including eruptive events such as flares and coronal mass ejections. The treatment of physics is sufficiently realistic to allow for the synthesis of emission from visible light to extreme UV and X-rays, which is critical for a detailed comparison with available and future multi-wavelength observations. This component relies critically on the radiation transport solver (RTS) of MURaM, the most computationally intensive component of the code. The benefits of accelerating RTS are manifold: a faster RTS allows for the regular use of the more expensive multi-band radiation transport needed for comparison with observations, and this will pave the way for the acceleration of ongoing improvements in RTS that are critical for simulations of the solar chromosphere. We present challenges and strategies to accelerate a multi-physics, multi-band MURaM using a directive-based programming model, OpenACC, in order to maintain a single source code across CPUs and GPUs. Results for a $288^3$ test problem show that MURaM with the optimized RTS routine achieves a 1.73x speedup using a single NVIDIA V100 GPU over a fully subscribed 40-core Intel Skylake CPU node and, with respect to the number of simulation points (in millions) per second, a single NVIDIA V100 GPU is equivalent to 69 Skylake cores. We also measure parallel performance on up to 96 GPUs and present weak and strong scaling results.
Submitted 16 July, 2021;
originally announced July 2021.
-
Minimal Rank Completions for Overlapping Blocks
Authors:
Ethan N. Epperly,
Nithin Govindarajan,
Shivkumar Chandrasekaran
Abstract:
We consider the multi-objective optimization problem of choosing the bottom left block-entry of a block lower triangular matrix to minimize the ranks of all block sub-matrices. We provide a proof that there exists a simultaneous rank-minimizer by constructing the complete set of all minimizers.
Submitted 21 June, 2021;
originally announced June 2021.
-
In situ ultrasound imaging of shear shock waves in the porcine brain
Authors:
Sandhya Chandrasekaran,
Francisco Santibanez,
Bharat B. Tripathi,
Ryan DeRuiter,
Gianmarco F. Pinton
Abstract:
Using high frame-rate ultrasound and high sensitivity motion tracking, we recently showed that shear waves sent to the ex vivo porcine brain develop into shear shock waves with destructive local accelerations inside the brain which may be a key mechanism behind deep traumatic brain injuries. Direct measurement of brain motion at an adequate frame-rate during impacts has been a persistent challenge. Here we present the ultrasound observation of shear shock waves in the acoustically challenging environment of the in situ porcine brain during a single-shot impact. The brain was attached to a plate source which was vibrated at a moderate amplitude of 25g, to propagate a 40 Hz shear wave into the brain. Simultaneously, images of the moving brain were acquired at 2193 images/s, using a custom imaging sequence with 8 interleaved ultrasound transmit-receive events, designed to accurately track shear shocks. To achieve a long field-of-view, wide-beam emissions were designed using time-reversal ultrasound simulations and no compounding was used to avoid motion blurring. A peak acceleration of 102g was measured at the shock-front, 7.1 mm deep inside the brain. It is also shown that experimental shear velocity, acceleration, and strain-rate waveforms in brain are in excellent agreement with theoretical predictions from a custom higher-order finite volume method hence demonstrating the capabilities to measure rapid brain motion even in the presence of strong acoustical reverberations from the porcine skull.
Submitted 24 April, 2021;
originally announced April 2021.
-
Assessing Validity of Static Analysis Warnings using Ensemble Learning
Authors:
Anshul Tanwar,
Hariharan Manikandan,
Krishna Sundaresan,
Prasanna Ganesan,
Sathish Kumar Chandrasekaran,
Sriram Ravi
Abstract:
Static Analysis (SA) tools are used to identify potential weaknesses in code and fix them in advance, while the code is being developed. In legacy codebases with high complexity, these rules-based static analysis tools generally report a lot of false warnings along with the actual ones. Though the SA tools uncover many hidden bugs, they are lost in the volume of false warnings reported. The developers spend considerable time and effort in identifying the true warnings. Besides hurting developer productivity, this challenge also causes true bugs to be missed. To address this problem, we propose a Machine Learning (ML)-based learning process that uses source code, historic commit data, and classifier-ensembles to prioritize the true warnings from the given list of warnings. This tool is integrated into the development workflow to filter out the false warnings and prioritize actual bugs. We evaluated our approach on networking C code, from a large data pool of static analysis warnings reported by the tools. From time to time, developers address these warnings, labelling them as authentic bugs or false alerts. The ML model is trained with full supervision over the code features. Our results confirm that applying deep learning over the traditional static analysis reports is a promising approach for drastically reducing the false positive rates.
Submitted 21 April, 2021;
originally announced April 2021.
-
Multi-context Attention Fusion Neural Network for Software Vulnerability Identification
Authors:
Anshul Tanwar,
Hariharan Manikandan,
Krishna Sundaresan,
Prasanna Ganesan,
Sathish Kumar Chandrasekaran,
Sriram Ravi
Abstract:
Security issues in shipped code can lead to unforeseen device malfunction, system crashes or malicious exploitation by crackers, post-deployment. These vulnerabilities incur a cost of repair and, above all, risk the credibility of the company. It is rewarding when these issues are detected and fixed well ahead of time, before release. Common Weakness Enumeration (CWE) is a nomenclature describing general vulnerability patterns observed in C code. In this work, we propose a deep learning model that learns to detect some of the common categories of security vulnerabilities in source code efficiently. The AI architecture is an Attention Fusion model that combines the effectiveness of recurrent, convolutional and self-attention networks towards decoding the vulnerability hotspots in code. Utilizing the code AST structure, our model builds an accurate understanding of code semantics with far fewer learnable parameters. Besides a novel way of efficiently detecting code vulnerability, an additional novelty in this model is that it points exactly to the code sections that were deemed vulnerable by the model. This helps a developer quickly focus on the vulnerable code sections and becomes the "explainable" part of the vulnerability detection. The proposed AI achieves 98.40% F1-score on specific CWEs from the benchmarked NIST SARD dataset and compares well with state of the art.
Submitted 19 April, 2021;
originally announced April 2021.
-
Holistic Image Manipulation Detection using Pixel Co-occurrence Matrices
Authors:
Lakshmanan Nataraj,
Michael Goebel,
Tajuddin Manhar Mohammed,
Shivkumar Chandrasekaran,
B. S. Manjunath
Abstract:
Digital image forensics aims to detect images that have been digitally manipulated. Realistic image forgeries involve a combination of splicing, resampling, region removal, smoothing and other manipulation methods. While most detection methods in literature focus on detecting a particular type of manipulation, it is challenging to identify doctored images that involve a host of manipulations. In this paper, we propose a novel approach to holistically detect tampered images using a combination of pixel co-occurrence matrices and deep learning. We extract horizontal and vertical co-occurrence matrices on three color channels in the pixel domain and train a model using a deep convolutional neural network (CNN) framework. Our method is agnostic to the type of manipulation and classifies an image as tampered or untampered. We train and validate our model on a dataset of more than 86,000 images. Experimental results show that our approach is promising and achieves more than 0.99 area under the curve (AUC) evaluation metric on the training and validation subsets. Further, our approach also generalizes well and achieves around 0.81 AUC on an unseen test dataset comprising more than 19,740 images released as part of the Media Forensics Challenge (MFC) 2020. Our score was highest among all other teams that participated in the challenge, at the time of announcement of the challenge results.
Submitted 12 April, 2021;
originally announced April 2021.
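The abstract above builds its features from horizontal and vertical pixel co-occurrence matrices. As a rough illustration of what such a matrix counts, here is a minimal pure-Python sketch for a single channel. The function name `cooccurrence`, the toy image and the `levels` parameter are assumptions made for this example and are not taken from the paper; the CNN stage is not shown.

```python
def cooccurrence(channel, dy, dx, levels=256):
    """Count how often value a has value b as its neighbour at offset (dy, dx)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(channel), len(channel[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            # Skip pairs whose neighbour falls outside the image.
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[channel[y][x]][channel[ny][nx]] += 1
    return counts


# Hypothetical 3x3 single-channel image with 4 intensity levels.
toy = [[0, 1, 1],
       [2, 1, 0],
       [3, 3, 1]]

horizontal = cooccurrence(toy, 0, 1, levels=4)  # right-hand neighbour
vertical = cooccurrence(toy, 1, 0, levels=4)    # neighbour below
```

For an 8-bit RGB image, one such 256 x 256 matrix per channel and direction would be computed and stacked as the network input, under the assumptions stated above.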
-
Adversarially Optimized Mixup for Robust Classification
Authors:
Jason Bunk,
Srinjoy Chattopadhyay,
B. S. Manjunath,
Shivkumar Chandrasekaran
Abstract:
Mixup is a procedure for data augmentation that trains networks to make smoothly interpolated predictions between datapoints. Adversarial training is a strong form of data augmentation that optimizes for worst-case predictions in a compact space around each datapoint, resulting in neural networks that make much more robust predictions. In this paper, we bring these ideas together by adversarially probing the space between datapoints, using projected gradient descent (PGD). The fundamental approach in this work is to leverage backpropagation through the mixup interpolation during training to optimize for places where the network makes unsmooth and incongruous predictions. Additionally, we explore several modifications and nuances, like optimization of the mixup ratio and geometrical label assignment, and discuss their impact on enhancing network robustness. Through these ideas, we have been able to train networks that generalize more robustly; experiments on CIFAR-10 and CIFAR-100 demonstrate consistent improvements in accuracy against strong adversaries, including the recent strong ensemble attack AutoAttack. Our source code will be released for reproducibility.
Submitted 22 March, 2021;
originally announced March 2021.
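For readers unfamiliar with the interpolation that the abstract above builds on, the following is a minimal sketch of plain mixup only: a convex combination of two inputs and their one-hot labels. It does not implement the paper's adversarial PGD optimization; the function name `mixup`, the example vectors and the Beta parameter 0.4 are assumptions chosen for illustration.

```python
import random


def mixup(x1, y1, x2, y2, lam):
    """Blend two samples and their one-hot labels with mixing ratio lam."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y


# Fixed ratio so the result is easy to check by hand.
mixed_x, mixed_y = mixup([0.0, 2.0], [1.0, 0.0],
                         [4.0, 6.0], [0.0, 1.0], lam=0.25)

# In training, the ratio is commonly sampled from a Beta distribution;
# alpha = 0.4 here is an assumed value, not one reported in the abstract.
sampled_lam = random.betavariate(0.4, 0.4)
```

An adversarial variant would treat the mixing ratio (and possibly the inputs) as quantities to optimize rather than sample, which is the direction the abstract describes.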
-
Attribution of Gradient Based Adversarial Attacks for Reverse Engineering of Deceptions
Authors:
Michael Goebel,
Jason Bunk,
Srinjoy Chattopadhyay,
Lakshmanan Nataraj,
Shivkumar Chandrasekaran,
B. S. Manjunath
Abstract:
Machine Learning (ML) algorithms are susceptible to adversarial attacks and deception both during training and deployment. Automatic reverse engineering of the toolchains behind these adversarial machine learning attacks will aid in recovering the tools and processes used in these attacks. In this paper, we present two techniques that support automated identification and attribution of adversarial ML attack toolchains using Co-occurrence Pixel statistics and Laplacian Residuals. Our experiments show that the proposed techniques can identify parameters used to generate adversarial samples. To the best of our knowledge, this is the first approach to attribute gradient-based adversarial attacks and estimate their parameters. Source code and data are available at: https://github.com/michael-goebel/ei_red
Submitted 19 March, 2021;
originally announced March 2021.
-
Malware Detection Using Frequency Domain-Based Image Visualization and Deep Learning
Authors:
Tajuddin Manhar Mohammed,
Lakshmanan Nataraj,
Satish Chikkagoudar,
Shivkumar Chandrasekaran,
B. S. Manjunath
Abstract:
We propose a novel method to detect and visualize malware through image classification. The executable binaries are represented as grayscale images obtained from the count of N-grams (N=2) of bytes in the Discrete Cosine Transform (DCT) domain and a neural network is trained for malware detection. A shallow neural network is trained for classification, and its accuracy is compared with deep-network architectures such as ResNet that are trained using transfer learning. Neither dis-assembly nor behavioral analysis of malware is required for these methods. Motivated by the visual similarity of these images for different malware families, we compare our deep neural network models with standard image features like GIST descriptors to evaluate the performance. A joint feature measure is proposed to combine different features using error analysis to get an accurate ensemble model for improved classification performance. A new dataset called MaleX which contains around 1 million malware and benign Windows executable samples is created for large-scale malware detection and classification experiments. Experimental results are quite promising with 96% binary classification accuracy on MaleX. The proposed model is also able to generalize well on larger unseen malware samples and the results compare favorably with state-of-the-art static analysis-based malware detection algorithms.
Submitted 26 January, 2021;
originally announced January 2021.
-
Technical Report: Flushing Strategies in Drinking Water Systems
Authors:
Margarita Rebolledo,
Sowmya Chandrasekaran,
Thomas Bartz-Beielstein
Abstract:
Drinking water supply and distribution systems are critical infrastructure that has to be well maintained for the safety of the public. One important tool in the maintenance of water distribution systems (WDS) is flushing. Flushing is a process carried out periodically to clean sediments and other contaminants from the water pipes. Given the differences in topography, water composition and supply demand between WDS, no single flushing strategy is suitable for all of them. In this report, a non-exhaustive overview of optimization methods for flushing in WDS is given. Implementations of optimization methods for the flushing procedure and for flushing planning are presented. Suggestions are given as possible options to optimise existing flushing planning frameworks.
Submitted 25 December, 2020;
originally announced December 2020.
-
Predicting Generalization in Deep Learning via Local Measures of Distortion
Authors:
Abhejit Rajagopal,
Vamshi C. Madala,
Shivkumar Chandrasekaran,
Peder E. Z. Larson
Abstract:
We study generalization in deep learning by appealing to complexity measures originally developed in approximation and information theory. While these concepts are challenged by the high-dimensional and data-defined nature of deep learning, we show that simple vector quantization approaches such as PCA, GMMs, and SVMs capture their spirit when applied layer-wise to deep extracted features, giving rise to relatively inexpensive complexity measures that correlate well with generalization performance. We discuss our results in the 2020 NeurIPS PGDL challenge.
Submitted 15 December, 2020; v1 submitted 13 December, 2020;
originally announced December 2020.
-
EventDetectR -- An Open-Source Event Detection System
Authors:
Sowmya Chandrasekaran,
Margarita Rebolledo,
Thomas Bartz-Beielstein
Abstract:
EventDetectR: An efficient Event Detection System (EDS) capable of detecting unexpected water quality conditions. This approach uses multiple algorithms to model the relationship between various multivariate water quality signals. Then the residuals of the models were utilized in constructing the event detection algorithm, which provides a continuous measure of the probability of an event at every time step. The proposed framework was tested for water contamination events with industrial data from automated water quality sensors. The results showed that the framework is reliable with better performance and is highly suitable for event detection.
Submitted 16 November, 2020;
originally announced November 2020.
-
Sensor Placement for Contamination Detection in Water Distribution Systems
Authors:
Margarita Rebolledo,
Sowmya Chandrasekaran,
Thomas Bartz-Beielstein
Abstract:
Sensor placement for contaminant detection in water distribution systems (WDS) has become a topic of great interest aiming to secure a population's water supply. Several approaches can be found in the literature with differences ranging from the objective selected to optimize to the methods implemented to solve the optimization problem. In this work we aim to give an overview of the current work in sensor placement with focus on contaminant detection for WDS. We present some of the objectives for which the sensor placement problem is defined along with common optimization algorithms and Toolkits available to help with algorithm testing and comparison.
Submitted 11 November, 2020;
originally announced November 2020.
-
Future Directions of the Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Program
Authors:
Ritu Arora,
Xiaosong Li,
Bonnie Hurwitz,
Daniel Fay,
Dhabaleswar K. Panda,
Edward Valeev,
Shaowen Wang,
Shirley Moore,
Sunita Chandrasekaran,
Ting Cao,
Holly Bik,
Matthew Curry,
Tanzima Islam
Abstract:
The CSSI 2019 workshop was held on October 28-29, 2019, in Austin, Texas. The main objectives of this workshop were to (1) understand the impact of the CSSI program on the community over the last 9 years, (2) engage workshop participants in identifying gaps and opportunities in the current CSSI landscape, (3) gather ideas on the cyberinfrastructure needs and expectations of the community with respect to the CSSI program, and (4) prepare a report summarizing the feedback gathered from the community that can inform the future solicitations of the CSSI program. The workshop brought together different stakeholders interested in provisioning sustainable cyberinfrastructure that can power discoveries impacting the various fields of science and technology and maintaining the nation's competitiveness in the areas such as scientific software, HPC, networking, cybersecurity, and data/information science. The workshop served as a venue for gathering the community-feedback on the current state of the CSSI program and its future directions.
Submitted 15 October, 2020;
originally announced October 2020.
-
Performance Assessment of OpenMP Compilers Targeting NVIDIA V100 GPUs
Authors:
Joshua Hoke Davis,
Christopher Daley,
Swaroop Pophale,
Thomas Huber,
Sunita Chandrasekaran,
Nicholas J. Wright
Abstract:
Heterogeneous systems are becoming increasingly prevalent. In order to exploit the rich compute resources of such systems, robust programming models are needed for application developers to seamlessly migrate legacy code from today's systems to tomorrow's. Over the past decade and more, directives have been established as one of the promising paths to tackle programmatic challenges on emerging systems. This work focuses on applying and demonstrating OpenMP offloading directives on five proxy applications. We observe that the performance varies widely from one compiler to the other; a crucial aspect of our work is reporting best practices to application developers who use OpenMP offloading compilers. While some issues can be worked around by the developer, there are other issues that must be reported to the compiler vendors. By restructuring OpenMP offloading directives, we gain an 18x speedup for the su3 proxy application on NERSC's Cori system when using the Clang compiler, and a 15.7x speedup by switching max reductions to add reductions in the laplace mini-app when using the Cray-llvm compiler on Cori.
Submitted 2 December, 2020; v1 submitted 19 October, 2020;
originally announced October 2020.
-
Exploiting Context for Robustness to Label Noise in Active Learning
Authors:
Sudipta Paul,
Shivkumar Chandrasekaran,
B. S. Manjunath,
Amit K. Roy-Chowdhury
Abstract:
Several works in computer vision have demonstrated the effectiveness of active learning for adapting the recognition model when new unlabeled data becomes available. Most of these works consider that labels obtained from the annotator are correct. However, in a practical scenario, as the quality of the labels depends on the annotator, some of the labels might be wrong, which results in degraded recognition performance. In this paper, we address the problems of i) how a system can identify which of the queried labels are wrong and ii) how a multi-class active learning system can be adapted to minimize the negative impact of label noise. Towards solving the problems, we propose a noisy label filtering based learning approach where the inter-relationship (context) that is quite common in natural data is utilized to detect the wrong labels. We construct a graphical representation of the unlabeled data to encode these relationships and obtain new beliefs on the graph when noisy labels are available. Comparing the new beliefs with the prior relational information, we generate a dissimilarity score to detect the incorrect labels and update the recognition model with correct labels which result in better recognition performance. This is demonstrated in three different applications: scene classification, activity classification, and document classification.
Submitted 18 October, 2020;
originally announced October 2020.
-
Super-resolved shear shock focusing in the human head
Authors:
Bharat B. Tripathi,
Sandhya Chandrasekaran,
Gianmarco Pinton
Abstract:
Shear shocks, which exist in a completely different regime from compressional shocks, were recently observed in the brain. These low phase speed ($\approx$ 2 m/s), high Mach number ($\approx$ 1) waves could be the primary mechanism behind diffuse axonal injury due to a very high local acceleration at the shock front. The extreme nonlinearity of these waves results in unique behaviors that are different from more commonly studied nonlinear compressional waves. Here we show the first observation of super-resolved shear shock wave focusing. Shear shock wave imaging and numerical simulations in a human head phantom over a range of frequencies and amplitudes show the super-resolution of shock waves in the low strain and high strain-rate regime. These results suggest that, even for mild accelerations, injuries as small as a grain of rice, on the scale of mm$^2$, can be easily created deep inside the brain.
Submitted 25 April, 2021; v1 submitted 7 October, 2020;
originally announced October 2020.
-
Detection, Attribution and Localization of GAN Generated Images
Authors:
Michael Goebel,
Lakshmanan Nataraj,
Tejaswi Nanjundaswamy,
Tajuddin Manhar Mohammed,
Shivkumar Chandrasekaran,
B. S. Manjunath
Abstract:
Recent advances in Generative Adversarial Networks (GANs) have led to the creation of realistic-looking digital images that pose a major challenge to their detection by humans or computers. GANs are used in a wide range of tasks, from modifying small attributes of an image (StarGAN [14]) and transferring attributes between image pairs (CycleGAN [91]) to generating entirely new images (ProGAN [36], StyleGAN [37], SPADE/GauGAN [64]). In this paper, we propose a novel approach to detect, attribute and localize GAN generated images that combines image features with deep learning methods. For every image, co-occurrence matrices are computed on neighborhood pixels of RGB channels in different directions (horizontal, vertical and diagonal). A deep learning network is then trained on these features to detect, attribute and localize these GAN generated/manipulated images. A large-scale evaluation of our approach on 5 GAN datasets comprising over 2.76 million images (ProGAN, StarGAN, CycleGAN, StyleGAN and SPADE/GauGAN) shows promising results in detecting GAN generated images.
Submitted 20 July, 2020;
originally announced July 2020.