Search | arXiv e-print repository

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Authors: Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, et al. (67 additional authors not shown)

Abstract: In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present the DeepSpeed4Science initiative (deepspeed4science.ai), which aims to build unique capabilities through AI system technology innovations to help domain experts unlock today's biggest science mysteries. By leveraging DeepSpeed's current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). In this paper, we showcase the early progress we made with DeepSpeed4Science in addressing two of the critical system challenges in structural biology research.

Submitted 11 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

arXiv:2308.04760 [pdf, ps, other]

Automated Driving Without Ethics: Meaning, Design and Real-World Implementation

Authors: Katherine Evans, Nelson de Moura, Raja Chatila, Stéphane Chauvier

Abstract: The ethics of automated vehicles (AV) has received a great amount of attention in recent years, specifically in regard to their decisional policies in accident situations in which human harm is a likely consequence. After a discussion about the pertinence and cogency of the term 'artificial moral agent' to describe AVs that would accomplish these sorts of decisions, and starting from the assumption that human harm is unavoidable in some situations, a strategy for AV decision making is proposed using only pre-defined parameters to characterize the risk of possible accidents and also integrating the Ethical Valence Theory, which paints AV decision-making as a type of claim mitigation, into multiple possible decision rules to determine the most suitable action given the specific environment and decision context. The goal of this approach is not to define how moral theory requires vehicles to behave, but rather to provide a computational approach that is flexible enough to accommodate a number of human 'moral positions' concerning what morality demands and what road users may expect, offering an evaluation tool for the social acceptability of an automated vehicle's decision making.

Submitted 9 August, 2023; originally announced August 2023.

Comments: Chapter 7 of the book Connected and Automated Vehicles: Integrating Engineering and Ethics (https://link.springer.com/book/9783031399909)

arXiv:2209.00076 [pdf, other]

Connecticut Redistricting Analysis

Authors: Kyle Evans, Katherine T. Chang

Abstract: Connecticut passed their new state House of Representatives district plan on November 18, 2021 and passed their new state Senate district plan on November 23, 2021. Each passed unanimously in their 9-person bipartisan Reapportionment Commission; however, the process has been criticized for legislators controlling the process and for the negotiations that serve to protect incumbents. This paper investigates the extent of incumbent protection in the new Assembly maps while also providing summary data on the new districts. The impact of new districts on incumbents is analyzed through the location of district borders, an ensemble analysis (using MCMC methods) to determine if the protection of incumbents constitutes a statistical outlier, and by investigating changes to competitive districts.

Submitted 31 August, 2022; originally announced September 2022.

Comments: 13 pages, 3 tables

arXiv:2105.07582 [pdf, other]

doi 10.1007/978-3-031-23020-2_2

RAIDER: Reinforcement-aided Spear Phishing Detector

Authors: Keelan Evans, Alsharif Abuadbba, Tingmin Wu, Kristen Moore, Mohiuddin Ahmed, Ganna Pogrebna, Surya Nepal, Mike Johnstone

Abstract: Spear Phishing is a harmful cyber-attack facing businesses and individuals worldwide. Considerable research has been conducted recently into the use of Machine Learning (ML) techniques to detect spear-phishing emails. ML-based solutions may suffer from zero-day attacks: unseen attacks unaccounted for in the training data. As new attacks emerge, classifiers trained on older data are unable to detect these new varieties of attacks, resulting in increasingly inaccurate predictions. Spear Phishing detection also faces scalability challenges due to the growth of the required features, which is proportional to the number of senders within a receiver mailbox. This differs from traditional phishing attacks, which typically perform only a binary classification between phishing and benign emails. Therefore, we devise a possible solution to these problems, named RAIDER: Reinforcement AIded Spear Phishing DEtectoR, a reinforcement-learning-based feature evaluation system that can automatically find the optimum features for detecting different types of attacks. By leveraging a reward and penalty system, RAIDER allows for autonomous feature selection. RAIDER also keeps the number of features to a minimum by selecting only the significant features to represent phishing emails and detect spear-phishing attacks.
After extensive evaluation of RAIDER over 11,000 emails and across 3 attack scenarios, our results suggest that using reinforcement learning to automatically identify the significant features could reduce the dimensions of the required features by 55% in comparison to existing ML-based systems. It also improves the accuracy of detecting spoofing attacks by 4%, from 90% to 94%. In addition, RAIDER demonstrates reasonable detection accuracy even against a sophisticated attack named Known Sender, in which spear-phishing emails greatly resemble those of the impersonated sender.

Submitted 3 January, 2023; v1 submitted 16 May, 2021; originally announced May 2021.

Comments: 16 pages

Journal ref: International Conference on Network and System Security, 2022

arXiv:2008.03582 [pdf, other]

Error Autocorrelation Objective Function for Improved System Modeling

Authors: Anand Ramakrishnan, Warren B. Jackson, Kent Evans

Abstract: Deep learning models are trained to minimize the error between the model's output and the actual values. The typical cost function, the Mean Squared Error (MSE), arises from maximizing the log-likelihood of additive independent, identically distributed Gaussian noise. However, minimizing MSE fails to minimize the residuals' cross-correlations, leading to over-fitting and poor extrapolation of the model outside the training set (generalization). In this paper, we introduce a "whitening" cost function, the Ljung-Box statistic, which not only minimizes the error but also minimizes the correlations between errors, ensuring that the fits enforce compatibility with an independent and identically distributed (i.i.d.) Gaussian noise model. The results show significant improvement in generalization for recurrent neural networks (RNNs) (1d) and image autoencoders (2d). Specifically, we look at both temporal correlations for system-id in simulated and actual mechanical systems. We also look at spatial correlation in vision autoencoders to demonstrate that the whitening objective functions lead to much better extrapolation, a property very desirable for reliable control systems.

Submitted 11 May, 2021; v1 submitted 8 August, 2020; originally announced August 2020.

Comments: 7 pages, 3 Figures, 8 Tables

arXiv:2006.01962 [pdf, other]

Characterizing an Analogical Concept Memory for Architectures Implementing the Common Model of Cognition

Authors: Shiwali Mohan, Matt Klenk, Matthew Shreve, Kent Evans, Aaron Ang, John Maxwell

Abstract: Architectures that implement the Common Model of Cognition - Soar, ACT-R, and Sigma - have a prominent place in research on cognitive modeling as well as on designing complex intelligent agents. In this paper, we explore how computational models of analogical processing can be brought into these architectures to enable concept acquisition from examples obtained interactively. We propose a new analogical concept memory for Soar that augments its current system of declarative long-term memories. We frame the problem of concept learning as embedded within the larger context of interactive task learning (ITL) and embodied language processing (ELP). We demonstrate that the analogical learning methods implemented in the proposed memory can quickly learn diverse types of novel concepts that are useful not only in recognition of a concept in the environment but also in action selection. Our approach has been instantiated in an implemented cognitive system, Aileen, and evaluated on a simulated robotic domain.

Submitted 29 July, 2020; v1 submitted 2 June, 2020; originally announced June 2020.

Comments: To be presented at the Eighth Annual Conference on Advances in Cognitive Systems (ACS 2020) (https://advancesincognitivesystems.github.io/acs/)

arXiv:2005.02969 [pdf, other]

Generating Memorable Images Based on Human Visual Memory Schemas

Authors: Cameron Kyle-Davidson, Adrian G. Bors, Karla K. Evans

Abstract: This research study proposes using Generative Adversarial Networks (GAN) that incorporate a two-dimensional measure of human memorability to generate memorable or non-memorable images of scenes. The memorability of the generated images is evaluated by modelling Visual Memory Schemas (VMS), which correspond to mental representations that human observers use to encode an image into memory. The VMS model is based upon the results of memory experiments conducted on human observers, and provides a 2D map of memorability. We impose a memorability constraint upon the latent space of a GAN by employing a VMS map prediction model as an auxiliary loss. We assess the difference in memorability between images generated to be memorable or non-memorable through an independent computational measure of memorability, and additionally assess the effect of memorability on the realness of the generated images.

Submitted 6 May, 2020; originally announced May 2020.

arXiv:2004.13956 [pdf, other]

Zero-shot topic generation

Authors: Oleg Vasilyev, Kathryn Evans, Anna Venancio-Marques, John Bohannon

Abstract: We present an approach to generating topics using a model trained only for document title generation, with zero examples of topics given during training. We leverage features that capture the relevance of a candidate span in a document for the generation of a title for that document. The output is a weighted collection of the phrases that are most relevant for describing the document and distinguishing it within a corpus, without requiring access to the rest of the corpus. We conducted a double-blind trial in which human annotators scored the quality of our machine-generated topics along with original human-written topics associated with news articles from The Guardian and The Huffington Post. The results show that our zero-shot model generates topic labels for news documents that are on average equal to or higher quality than those written by humans, as judged by humans.

Submitted 29 April, 2020; originally announced April 2020.

Comments: 12 pages, 9 figures, 3 tables

arXiv:1912.05470 [pdf, other]

Human Gist Processing Augments Deep Learning Breast Cancer Risk Assessment

Authors: Skylar W. Wurster, Arkadiusz Sitek, Jian Chen, Karla Evans, Gaeun Kim, Jeremy M. Wolfe

Abstract: Radiologists can classify a mammogram as normal or abnormal at better than chance levels after less than a second's exposure to the images. In this work, we combine these radiologists' gist inputs into pre-trained machine learning models to validate that integrating gist with a CNN model can achieve an AUC (area under the curve) statistically significantly higher than either the gist perception of radiologists or the model without gist input.

Submitted 27 November, 2019; originally announced December 2019.

arXiv:1907.08514 [pdf, other]

Predicting Visual Memory Schemas with Variational Autoencoders

Authors: Cameron Kyle-Davidson, Adrian Bors, Karla Evans

Abstract: Visual memory schema (VMS) maps show which regions of an image cause that image to be remembered or falsely remembered. Previous work has succeeded in generating low resolution VMS maps using convolutional neural networks. We instead approach this problem as an image-to-image translation task making use of a variational autoencoder. This approach allows us to generate higher resolution dual channel images that represent visual memory schemas, allowing us to evaluate predicted true memorability and false memorability separately. We also evaluate the relationship between VMS maps, predicted VMS maps, ground truth memorability scores, and predicted memorability scores.

Submitted 19 July, 2019; originally announced July 2019.

Comments: Accepted to BMVC2019

arXiv:1904.02546 [pdf, other]

doi 10.1109/MCSE.2019.2924204

Automated Fortran–C++ Bindings for Large-Scale Scientific Applications

Authors: Seth R. Johnson, Andrey Prokopenko, Katherine J. Evans

Abstract: Although many active scientific codes use modern Fortran, most contemporary scientific software "libraries" are implemented in C and C++. Providing their numerical, algorithmic, or data management features to Fortran codes requires writing and maintaining substantial amounts of glue code. This article introduces a tool that automatically generates native Fortran 2003 interfaces to C and C++ libraries. The tool supports C++ features that have no direct Fortran analog, such as templated functions and exceptions. A set of simple examples demonstrate the utility and scope of the tool, and timing measurements with a mock numerical library illustrate the minimal performance impact of the generated wrapper code.

Submitted 24 May, 2019; v1 submitted 4 April, 2019; originally announced April 2019.

arXiv:1903.02056 [pdf, other]

Defining Image Memorability using the Visual Memory Schema

Authors: Erdem Akagunduz, Adrian G. Bors, Karla K. Evans

Abstract: Memorability of an image is a characteristic determined by the human observers' ability to remember images they have seen. Yet recent work on image memorability defines it as an intrinsic property that can be obtained independent of the observer. The current study aims to enhance our understanding and prediction of image memorability, improving upon existing approaches by incorporating the properties of cumulative human annotations. We propose a new concept called the Visual Memory Schema (VMS) referring to an organisation of image components human observers share when encoding and recognising images. The concept of VMS is operationalised by asking human observers to define memorable regions of images they were asked to remember during an episodic memory test. We then statistically assess the consistency of VMSs across observers for either correctly or incorrectly recognised images. The associations of the VMSs with eye fixations and saliency are analysed separately as well. Lastly, we adapt various deep learning architectures for the reconstruction and prediction of memorable regions in images and analyse the results when using transfer learning at the outputs of different convolutional network layers.

Submitted 5 March, 2019; originally announced March 2019.

Comments: Submitted to TPAMI on Aug 4, 2017

Showing 1–12 of 12 results for author: Evans, K