
PathoLM: Identifying pathogenicity from the DNA sequence through the Genome Foundation Model

Authors: Sajib Acharjee Dip, Uddip Acharjee Shuvo, Tran Chau, Haoqiu Song, Petra Choi, Xuan Wang, Liqing Zhang

Abstract: Pathogen identification is pivotal in diagnosing, treating, and preventing diseases, crucial for controlling infections and safeguarding public health. Traditional alignment-based methods, though widely used, are computationally intense and reliant on extensive reference databases, often failing to detect novel pathogens due to their low sensitivity and specificity. Similarly, conventional machine learning techniques, while promising, require large annotated datasets and extensive feature engineering and are prone to overfitting. Addressing these challenges, we introduce PathoLM, a cutting-edge pathogen language model optimized for the identification of pathogenicity in bacterial and viral sequences. Leveraging the strengths of pre-trained DNA models such as the Nucleotide Transformer, PathoLM requires minimal data for fine-tuning, thereby enhancing pathogen detection capabilities. It effectively captures a broader genomic context, significantly improving the identification of novel and divergent pathogens. We developed a comprehensive data set comprising approximately 30 species of viruses and bacteria, including ESKAPEE pathogens, seven notably virulent bacterial strains resistant to antibiotics. Additionally, we curated a species classification dataset centered specifically on the ESKAPEE group. In comparative assessments, PathoLM dramatically outperforms existing models like DciPatho, demonstrating robust zero-shot and few-shot capabilities. Furthermore, we expanded PathoLM-Sp for ESKAPEE species classification, where it showed superior performance compared to other advanced deep learning methods, despite the complexities of the task.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 9 pages, 3 figures

arXiv:2406.11912 [pdf, other]

AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology

Authors: Minh Huynh Nguyen, Thang Phan Chau, Phong X. Nguyen, Nghi D. Q. Bui

Abstract: Software agents have emerged as promising tools for addressing complex software engineering tasks. However, existing works oversimplify software development workflows by following the waterfall model. Thus, we propose AgileCoder, a multi-agent system that integrates Agile Methodology (AM) into the framework. This system assigns specific AM roles such as Product Manager, Developer, and Tester to different agents, who then collaboratively develop software based on user inputs. AgileCoder enhances development efficiency by organizing work into sprints, focusing on incrementally developing software through sprints. Additionally, we introduce Dynamic Code Graph Generator, a module that creates a Code Dependency Graph dynamically as updates are made to the codebase. This allows agents to better comprehend the codebase, leading to more precise code generation and modifications throughout the software development process. AgileCoder surpasses existing benchmarks, like ChatDev and MetaGPT, establishing a new standard and showcasing the capabilities of multi-agent systems in advanced software engineering environments. Our source code can be found at https://github.com/FSoft-AI4Code/AgileCoder.

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2306.13116 [pdf, other]

A Machine Learning Pressure Emulator for Hydrogen Embrittlement

Authors: Minh Triet Chau, João Lucas de Sousa Almeida, Elie Alhajjar, Alberto Costa Nogueira Junior

Abstract: A recent alternative for hydrogen transportation as a mixture with natural gas is blending it into natural gas pipelines. However, hydrogen embrittlement of material is a major concern for scientists and gas installation designers to avoid process failures. In this paper, we propose a physics-informed machine learning model to predict the gas pressure on the pipes' inner wall. Despite their high-fidelity results, the current PDE-based simulators are time- and computationally-demanding. Using simulation data, we train an ML model to predict the pressure on the pipelines' inner walls, which is a first step for pipeline system surveillance. We found that the physics-based method outperformed the purely data-driven method and satisfied the physical constraints of the gas flow system.

Submitted 22 June, 2023; originally announced June 2023.

arXiv:2301.10966 [pdf]

Design of Mobile Manipulator for Fire Extinguisher Testing. Part II: Design and Simulation

Authors: Thai Nguyen Chau, Xuan Quang Ngo, Van Tu Duong, Trong Trung Nguyen, Huy Hung Nguyen, Tan Tien Nguyen

Abstract: All flames are extinguished as early as possible, or fire services have to deal with major conflagrations. This leads to the fact that the quality of fire extinguishers has become a very sensitive and important issue in firefighting. Inspired by the development of automatic fire fighting systems, this paper presents a mobile manipulator to evaluate the power of fire extinguishers, which is designed according to the standard of fire extinguishers named as ISO 7165:2009 and ISO 11601:2008. A detailed discussion on key specifications solutions and mechanical design of the chassis of the mobile manipulator has been presented in Part I: Key Specifications and Conceptual Design. The focus of this part is on the rest of the mechanical design and controller design of the mobile manipulator.

Submitted 26 January, 2023; originally announced January 2023.

Comments: 10 pages, 15 figures, the 7th International Conference on Advanced Engineering, Theory and Applications

arXiv:2301.10965 [pdf]

Design of Mobile Manipulator for Fire Extinguisher Testing. Part I Key Specifications and Conceptual Design

Authors: Xuan Quang Ngo, Thai Nguyen Chau, Cong Thang Doan, Van Tu Duong, Duy Vo Hoang, Tan Tien Nguyen

Abstract: All flames are extinguished as early as possible, or fire services have to deal with major conflagrations. This leads to the fact that the quality of fire extinguishers has become a very sensitive and important issue in firefighting. Inspired by the development of automatic fire fighting systems, this paper proposes key specifications based on the standard of fire extinguishers that is ISO 7165:2009 and ISO 11601:2008, and feasible solutions to design a mobile manipulator for automatically evaluating the quality or, more specifically, power of fire extinguishers. In addition, a part of the mechanical design is also discussed.

Submitted 26 January, 2023; originally announced January 2023.

Comments: 10 pages, 8 figures, the 7th International Conference on Advanced Engineering, Theory and Applications

arXiv:2210.07271 [pdf, other]

BLOX: Macro Neural Architecture Search Benchmark and Algorithms

Authors: Thomas Chun Pong Chau, Łukasz Dudziak, Hongkai Wen, Nicholas Donald Lane, Mohamed S Abdelfattah

Abstract: Neural architecture search (NAS) has been successfully used to design numerous high-performance neural networks. However, NAS is typically compute-intensive, so most existing approaches restrict the search to decide the operations and topological structure of a single block only, then the same block is stacked repeatedly to form an end-to-end model. Although such an approach reduces the size of search space, recent studies show that a macro search space, which allows blocks in a model to be different, can lead to better performance. To provide a systematic study of the performance of NAS algorithms on a macro search space, we release Blox - a benchmark that consists of 91k unique models trained on the CIFAR-100 dataset. The dataset also includes runtime measurements of all the models on a diverse set of hardware platforms. We perform extensive experiments to compare existing algorithms that are well studied on cell-based search spaces, with the emerging blockwise approaches that aim to make NAS scalable to much larger macro search spaces. The benchmark and code are available at https://github.com/SamsungLabs/blox.

Submitted 13 October, 2022; originally announced October 2022.

Comments: Published in the Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks

arXiv:2209.09570 [pdf, other]

Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design

Authors: Hongxiang Fan, Thomas Chau, Stylianos I. Venieris, Royson Lee, Alexandros Kouris, Wayne Luk, Nicholas D. Lane, Mohamed S. Abdelfattah

Abstract: Attention-based neural networks have become pervasive in many AI tasks. Despite their excellent algorithmic performance, the use of the attention mechanism and feed-forward network (FFN) demands excessive computational and memory resources, which often compromises their hardware performance. Although various sparse variants have been introduced, most approaches only focus on mitigating the quadratic scaling of attention on the algorithm level, without explicitly considering the efficiency of mapping their methods on real hardware designs. Furthermore, most efforts only focus on either the attention mechanism or the FFNs but without jointly optimizing both parts, causing most of the current designs to lack scalability when dealing with different input lengths. This paper systematically considers the sparsity patterns in different variants from a hardware perspective. On the algorithmic level, we propose FABNet, a hardware-friendly variant that adopts a unified butterfly sparsity pattern to approximate both the attention mechanism and the FFNs. On the hardware level, a novel adaptable butterfly accelerator is proposed that can be configured at runtime via dedicated hardware control to accelerate different butterfly layers using a single unified hardware engine. On the Long-Range-Arena dataset, FABNet achieves the same accuracy as the vanilla Transformer while reducing the amount of computation by 10 to 66 times and the number of parameters 2 to 22 times. By jointly optimizing the algorithm and hardware, our FPGA-based butterfly accelerator achieves 14.2 to 23.2 times speedup over state-of-the-art accelerators normalized to the same computational budget. Compared with optimized CPU and GPU designs on Raspberry Pi 4 and Jetson Nano, our system is up to 273.8 and 15.1 times faster under the same power budget.

Submitted 20 September, 2022; originally announced September 2022.

Comments: Paper accepted by MICRO'22

arXiv:2106.06799 [pdf, other]

Zero-Cost Operation Scoring in Differentiable Architecture Search

Authors: Lichuan Xiang, Łukasz Dudziak, Mohamed S. Abdelfattah, Thomas Chau, Nicholas D. Lane, Hongkai Wen

Abstract: We formalize and analyze a fundamental component of differentiable neural architecture search (NAS): local "operation scoring" at each operation choice. We view existing operation scoring functions as inexact proxies for accuracy, and we find that they perform poorly when analyzed empirically on NAS benchmarks. From this perspective, we introduce a novel \textit{perturbation-based zero-cost operation scoring} (Zero-Cost-PT) approach, which utilizes zero-cost proxies that were recently studied in multi-trial NAS but degrade significantly on larger search spaces, typical for differentiable NAS. We conduct a thorough empirical evaluation on a number of NAS benchmarks and large search spaces, from NAS-Bench-201, NAS-Bench-1Shot1, NAS-Bench-Macro, to DARTS-like and MobileNet-like spaces, showing significant improvements in both search time and accuracy. On the ImageNet classification task on the DARTS search space, our approach improved accuracy compared to the best current training-free methods (TE-NAS) while being over 10$\times$ faster (total searching time 25 minutes on a single GPU), and observed significantly better transferability on architectures searched on the CIFAR-10 dataset with an accuracy increase of 1.8 pp. Our code is available at: https://github.com/zerocostptnas/zerocost_operation_score.

Submitted 8 February, 2023; v1 submitted 12 June, 2021; originally announced June 2021.

Comments: Accepted at AAAI 2023

arXiv:2101.02768 [pdf, other]

EmoconLite: Bridging the Gap Between Emotiv and Play for Children With Severe Disabilities

Authors: Javad Rahimipour Anaraki, Chelsea Anne Rauh, Jason Leung, Tom Chau

Abstract: Brain-computer interfaces (BCIs) allow users to control computer applications by modulating their brain activity. Since BCIs rely solely on brain activity, they have enormous potential as an alternative access method for engaging children with severe disabilities and/or medical complexities in therapeutic recreation and leisure. In particular, one commercially available BCI platform is the Emotiv EPOC headset, which is a portable and affordable electroencephalography (EEG) device. Combined with the EmotivBCI software, the Emotiv system can generate a model to discern between different mental tasks based on the user's EEG signals in real-time. While the Emotiv system shows promise for use by the pediatric population in the setting of a BCI clinic, it lacks integrated support that allows users to directly control computer applications using the generated classification output. To achieve this, users would have to create their own program, which can be challenging for those who may not be technologically inclined. To address this gap, we developed a freely available and user-friendly BCI software application called EmoconLite. Using the classification output from EmotivBCI, EmoconLite allows users to play YouTube video clips and a variety of video games from multiple platforms, ultimately creating an end-to-end solution for users. Through its deployment in the Holland Bloorview Kids Rehabilitation Hospital's BCI clinic, EmoconLite is bridging the gap between research and clinical practice, providing children with access to BCI technology and supporting BCI-enabled play.

Submitted 25 May, 2021; v1 submitted 7 January, 2021; originally announced January 2021.

Comments: 12 pages, 3 figures

MSC Class: 92C55; 68-04 ACM Class: H.5.2; I.5.4

arXiv:2010.15250 [pdf, other]

Semantic video segmentation for autonomous driving

Authors: Minh Triet Chau

Abstract: We aim to solve semantic video segmentation in autonomous driving, namely road detection in real time video, using techniques discussed in (Shelhamer et al., 2016a). While fully convolutional network gives good result, we show that the speed can be halved while preserving the accuracy. The test dataset being used is KITTI, which consists of real footage from Germany's streets.

Submitted 28 October, 2020; originally announced October 2020.

Comments: This work was done around 2017. Some minor changes were added

arXiv:2009.02397 [pdf, ps, other]

A Deep Learning Approach to Tongue Detection for Pediatric Population

Authors: Javad Rahimipour Anaraki, Silvia Orlandi, Tom Chau

Abstract: Children with severe disabilities and complex communication needs face limitations in the usage of access technology (AT) devices. Conventional ATs (e.g., mechanical switches) can be insufficient for nonverbal children and those with limited voluntary motion control. Automatic techniques for the detection of tongue gestures represent a promising pathway. Previous studies have shown the robustness of tongue detection algorithms on adult participants, but further research is needed to use these methods with children. In this study, a network architecture for tongue-out gesture recognition was implemented and evaluated on videos recorded in a naturalistic setting when children were playing a video-game. A cascade object detector algorithm was used to detect the participants' faces, and an automated classification scheme for tongue gesture detection was developed using a convolutional neural network (CNN). In evaluation experiments conducted, the network was trained using adults and children's images. The network classification accuracy was evaluated using leave-one-subject-out cross-validation. Preliminary classification results obtained from the analysis of videos of five typically developing children showed an accuracy of up to 99% in predicting tongue-out gestures. Moreover, we demonstrated that using only children data for training the classifier yielded better performance than adult's one supporting the need for pediatric tongue gesture datasets.

Submitted 28 September, 2020; v1 submitted 4 September, 2020; originally announced September 2020.

Comments: 7 pages, 1 figure

MSC Class: 68T07 ACM Class: I.4.9; I.5.1

arXiv:2008.07660 [pdf, ps, other]

Revisiting the Application of Feature Selection Methods to Speech Imagery BCI Datasets

Authors: Javad Rahimipour Anaraki, Jae Moon, Tom Chau

Abstract: Brain-computer interface (BCI) aims to establish and improve human and computer interactions. There has been an increasing interest in designing new hardware devices to facilitate the collection of brain signals through various technologies, such as wet and dry electroencephalogram (EEG) and functional near-infrared spectroscopy (fNIRS) devices. The promising results of machine learning methods have attracted researchers to apply these methods to their data. However, some methods can be overlooked simply due to their inferior performance against a particular dataset. This paper shows how relatively simple yet powerful feature selection/ranking methods can be applied to speech imagery datasets and generate significant results. To do so, we introduce two approaches, horizontal and vertical settings, to use any feature selection and ranking methods to speech imagery BCI datasets. Our primary goal is to improve the resulting classification accuracies from support vector machines, $k$-nearest neighbour, decision tree, linear discriminant analysis and long short-term memory recurrent neural network classifiers. Our experimental results show that using a small subset of channels, we can retain and, in most cases, improve the resulting classification accuracies regardless of the classifier.

Submitted 17 August, 2020; originally announced August 2020.

Comments: 5 pages, 2 figures

ACM Class: I.2.8

arXiv:2007.08668 [pdf, other]

BRP-NAS: Prediction-based NAS using GCNs

Authors: Łukasz Dudziak, Thomas Chau, Mohamed S. Abdelfattah, Royson Lee, Hyeji Kim, Nicholas D. Lane

Abstract: Neural architecture search (NAS) enables researchers to automatically explore broad design spaces in order to improve efficiency of neural networks. This efficiency is especially important in the case of on-device deployment, where improvements in accuracy should be balanced out with computational demands of a model. In practice, performance metrics of model are computationally expensive to obtain. Previous work uses a proxy (e.g., number of operations) or a layer-wise measurement of neural network layers to estimate end-to-end hardware performance but the imprecise prediction diminishes the quality of NAS. To address this problem, we propose BRP-NAS, an efficient hardware-aware NAS enabled by an accurate performance predictor-based on graph convolutional network (GCN). What is more, we investigate prediction quality on different metrics and show that sample efficiency of the predictor-based NAS can be improved by considering binary relations of models and an iterative data selection strategy. We show that our proposed method outperforms all prior methods on NAS-Bench-101 and NAS-Bench-201, and that our predictor can consistently learn to extract useful features from the DARTS search space, improving upon the second-order baseline. Finally, to raise awareness of the fact that accurate latency estimation is not a trivial task, we release LatBench -- a latency dataset of NAS-Bench-201 models running on a broad range of devices.

Submitted 19 January, 2021; v1 submitted 16 July, 2020; originally announced July 2020.

Comments: Published at NeurIPS 2020

arXiv:2002.05022 [pdf, other]

Best of Both Worlds: AutoML Codesign of a CNN and its Hardware Accelerator

Authors: Mohamed S. Abdelfattah, Łukasz Dudziak, Thomas Chau, Royson Lee, Hyeji Kim, Nicholas D. Lane

Abstract: Neural architecture search (NAS) has been very successful at outperforming human-designed convolutional neural networks (CNN) in accuracy, and when hardware information is present, latency as well. However, NAS-designed CNNs typically have a complicated topology, therefore, it may be difficult to design a custom hardware (HW) accelerator for such CNNs. We automate HW-CNN codesign using NAS by including parameters from both the CNN model and the HW accelerator, and we jointly search for the best model-accelerator pair that boosts accuracy and efficiency. We call this Codesign-NAS. In this paper we focus on defining the Codesign-NAS multiobjective optimization problem, demonstrating its effectiveness, and exploring different ways of navigating the codesign search space. For CIFAR-10 image classification, we enumerate close to 4 billion model-accelerator pairs, and find the Pareto frontier within that large search space. This allows us to evaluate three different reinforcement-learning-based search strategies. Finally, compared to ResNet on its most optimal HW accelerator from within our HW design space, we improve on CIFAR-100 classification accuracy by 1.3% while simultaneously increasing performance/area by 41% in just 1000 GPU-hours of running Codesign-NAS.

Submitted 6 March, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

Comments: accepted at DAC 2020

arXiv:2001.04134 [pdf, other]

NimbRo Logistics -- Project KittingBot

Authors: Rosche Robin, Minh Triet Chau

Abstract: Recovering the pose of an object from mere point clouds is often hindered by the lack of the information that they provide. In this lab, we address this problem by proposing a method that exploits the symmetry of objects as well as using pictures taken from a static camera of the same scene. We apply this approach to detect nuts in a table top scene that includes screws, nuts, washers and several placeholders for grasp planning.

Submitted 13 January, 2020; originally announced January 2020.

arXiv:1912.11334 [pdf, other]

Open-domain Event Extraction and Embedding for Natural Gas Market Prediction

Authors: Minh Triet Chau, Diego Esteves, Jens Lehmann

Abstract: We propose an approach to predict the natural gas price in several days using historical price data and events extracted from news headlines. Most previous methods treat price as an extrapolatable time series; those that analyze the relation between prices and news either trim their price data correspondingly to a public news dataset, manually annotate headlines or use off-the-shelf tools. In comparison to off-the-shelf tools, our event extraction method detects not only the occurrence of phenomena but also the changes in attribution and characteristics from public sources. Instead of using sentence embedding as a feature, we use every word of the extracted events, encode and organize them before feeding to the learning models. Empirical evaluation shows favorable results in terms of prediction performance, money saved and scalability.

Submitted 8 December, 2019; originally announced December 2019.

Report number: urn:nbn:de:0074-2611-9

Journal ref: CLEOPATRA 2020

arXiv:1910.13583 [pdf, ps, other]

All 4-variable functions can be perfectly quadratized with only 1 auxiliary variable

Authors: Nike Dattani, Hou Tin Chau

Abstract: We prove that any function with real-valued coefficients, whose input is 4 binary variables and whose output is a real number, is perfectly equivalent to a quadratic function whose input is 5 binary variables and is minimized over the new variable. Our proof is constructive: we provide quadratizations for all possible 4-variable functions. There exist 4 different classes of 4-variable functions that each have their own 5-variable quadratization formula. Since we provide 'perfect' quadratizations, we can apply these formulas to any 4-variable subset of an n-variable function even if n >> 4. We provide 5 examples of functions that can be quadratized using the result of this work. For each of the 5 examples we compare the best possible quadratization we could construct using previously known methods, to a quadratization that we construct using our new result. In the most extreme example, the quadratization using our new result needs only N auxiliary variables for a 4N-variable degree-4 function, whereas the previous state-of-the-art quadratization requires 2N (twice as many) auxiliary variables and therefore we can reduce the cost of optimizing such a function by a factor of 2^1000 if it were to have 4000 variables before quadratization. In all 5 of our examples, the range of coefficient sizes in our quadratic function is smaller than in the previous state-of-the-art one, and our coefficient range is a factor of 7 times smaller in our 15-term, 5-variable example of a degree-4 function.
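
Worked arithmetic for the factor quoted in this abstract, as a sketch only. It assumes the cost of minimizing over $m$ binary variables by exhaustive search scales as $2^{m}$; that cost model is an assumption made here for illustration and is not stated in the abstract. With $4N = 4000$ original variables, $N = 1000$, so the new quadratization has $4N + N$ variables and the previous one has $4N + 2N$:

$$\frac{2^{\,4N + 2N}}{2^{\,4N + N}} = 2^{N} = 2^{1000} \quad \text{for } N = 1000.$$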

Submitted 29 October, 2019; originally announced October 2019.

Comments: We thank Elisabeth Rodríguez-Heck and Endre Boros for helpful comments on an early version of our manuscript

MSC Class: 05C50; 11A41; 11A51; 11N35; 11N36; 11N80; 11Y05; 65K10; 65P10; 65Y20; 68Q12; 81P68; 81P94; 94A60; 81-08 ACM Class: B.2.4; B.8.2; C.1.3; C.1.m; F.2.1; F.2.3; F.4.1; G.1.0; G.1.3; G.1.5; G.1.6; G.2.0; G.2.1; I.1.2; I.6.4; C.4; E.3; G.0; J.2; K.2

arXiv:1810.06118 [pdf, other]

doi 10.1016/j.commatsci.2019.02.046

Learning to fail: Predicting fracture evolution in brittle material models using recurrent graph convolutional neural networks

Authors: Max Schwarzer, Bryce Rogan, Yadong Ruan, Zhengming Song, Diana Y. Lee, Allon G. Percus, Viet T. Chau, Bryan A. Moore, Esteban Rougier, Hari S. Viswanathan, Gowri Srinivasan

Abstract: We propose a machine learning approach to address a key challenge in materials science: predicting how fractures propagate in brittle materials under stress, and how these materials ultimately fail. Our methods use deep learning and train on simulation data from high-fidelity models, emulating the results of these models while avoiding the overwhelming computational demands associated with running a statistically significant sample of simulations. We employ a graph convolutional network that recognizes features of the fracturing material and a recurrent neural network that models the evolution of these features, along with a novel form of data augmentation that compensates for the modest size of our training data. We simultaneously generate predictions for qualitatively distinct material properties. Results on fracture damage and length are within 3% of their simulated values, and results on time to material failure, which is notoriously difficult to predict even with high-fidelity models, are within approximately 15% of simulated values. Once trained, our neural networks generate predictions within seconds, rather than the hours needed to run a single simulation.

Submitted 15 March, 2019; v1 submitted 14 October, 2018; originally announced October 2018.

Report number: LA-UR-18-29693

Journal ref: Computational Materials Science 162, 322-332 (2019)

arXiv:1809.00395 [pdf]

doi 10.1088/1741-2552/aae4b9

Online classification of imagined speech using functional near-infrared spectroscopy signals

Authors: Alborz Rezazadeh Sereshkeh, Rozhin Yousefi, Andrew T Wong, Tom Chau

Abstract: Most brain-computer interfaces (BCIs) based on functional near-infrared spectroscopy (fNIRS) require that users perform mental tasks such as motor imagery, mental arithmetic, or music imagery to convey a message or to answer simple yes or no questions. These cognitive tasks usually have no direct association with the communicative intent, which makes them difficult for users to perform. In this paper, a 3-class intuitive BCI is presented which enables users to directly answer yes or no questions by covertly rehearsing the word 'yes' or 'no' for 15 s. The BCI also admits an equivalent duration of unconstrained rest which constitutes the third discernable task. Twelve participants each completed one offline block and six online blocks over the course of 2 sessions. The mean value of the change in oxygenated hemoglobin concentration during a trial was calculated for each channel and used to train a regularized linear discriminant analysis (RLDA) classifier. By the final online block, 9 out of 12 participants were performing above chance (p<0.001), with a 3-class accuracy of 83.8+9.4%. Even when considering all participants, the average online 3-class accuracy over the last 3 blocks was 64.1+20.6%, with only 3 participants scoring below chance (p<0.001). For most participants, channels in the left temporal and temporoparietal cortex provided the most discriminative information. To our knowledge, this is the first report of an online fNIRS 3-class imagined speech BCI. Our findings suggest that imagined speech can be used as a reliable activation task for selected users for the development of more intuitive BCIs for communication.

Submitted 2 September, 2018; originally announced September 2018.

arXiv:1807.11537 [pdf, other]

Estimating Failure in Brittle Materials using Graph Theory

Authors: M. K. Mudunuru, N. Panda, S. Karra, G. Srinivasan, V. T. Chau, E. Rougier, A. Hunter, H. S. Viswanathan

Abstract: In brittle fracture applications, failure paths, regions where the failure occurs, and damage statistics are some of the key quantities of interest (QoI). High-fidelity models for brittle failure that accurately predict these QoI exist but are highly computationally intensive, making them infeasible to incorporate in upscaling and uncertainty quantification frameworks. The goal of this paper is to provide a fast heuristic to reasonably estimate quantities such as failure path and damage in the process of brittle failure. Towards this goal, we first present a method to predict failure paths under tensile loading conditions and low-strain rates. The method uses a $k$-nearest neighbors algorithm built on fracture process zone theory, and identifies the set of all possible pre-existing cracks that are likely to join early to form a large crack. The method then identifies the zone of failure and failure paths using weighted graph algorithms. We compare these failure paths to those computed with a high-fidelity model called the Hybrid Optimization Software Simulation Suite (HOSS). A probabilistic evolution model for average damage in a system is also developed that is trained using 150 HOSS simulations and tested on 40 simulations. A non-parametric approach based on confidence intervals is used to determine the damage evolution over time along the dominant failure path. For upscaling, damage is the key QoI needed as an input by the continuum models. This needs to be informed accurately by the surrogate models for calculating effective moduli at continuum-scale. We show that for the proposed average damage evolution model, the prediction accuracy on the test data is more than 90\%. In terms of the computational time, the proposed models are $\approx \mathcal{O}(10^6)$ times faster compared to high-fidelity HOSS.

Submitted 30 July, 2018; originally announced July 2018.

Comments: 20 pages, 10 figures

arXiv:1806.01949 [pdf, ps, other]

Reduced-Order Modeling through Machine Learning Approaches for Brittle Fracture Applications

Authors: A. Hunter, B. A. Moore, M. K. Mudunuru, V. T. Chau, R. L. Miller, R. B. Tchoua, C. Nyshadham, S. Karra, D. O. Malley, E. Rougier, H. S. Viswanathan, G. Srinivasan

Abstract: In this paper, five different approaches for reduced-order modeling of brittle fracture in geomaterials, specifically concrete, are presented and compared. Four of the five methods rely on machine learning (ML) algorithms to approximate important aspects of the brittle fracture problem. In addition to the ML algorithms, each method incorporates different physics-based assumptions in order to reduce the computational complexity while maintaining the physics as much as possible. This work specifically focuses on using the ML approaches to model a 2D concrete sample under low strain rate pure tensile loading conditions with 20 preexisting cracks present. A high-fidelity finite element-discrete element model is used to both produce a training dataset of 150 simulations and an additional 35 simulations for validation. Results from the ML approaches are directly compared against the results from the high-fidelity model. Strengths and weaknesses of each approach are discussed and the most important conclusion is that a combination of physics-informed and data-driven features is necessary for emulating the physics of crack propagation, interaction and coalescence. All of the models presented here have runtimes that are orders of magnitude faster than the original high-fidelity model and pave the path for developing accurate reduced order models that could be used to inform larger length-scale models with important sub-scale physics that often cannot be accounted for due to computational cost.

Submitted 5 June, 2018; originally announced June 2018.

Comments: 25 pages, 8 figures

Showing 1–21 of 21 results for author: Chau, T