Search | arXiv e-print repository

Towards shutdownable agents via stochastic choice

Authors: Elliott Thornley, Alexander Roman, Christos Ziakas, Leyton Ho, Louis Thomson

Abstract: Some worry that advanced artificial agents may resist being shut down. The Incomplete Preferences Proposal (IPP) is an idea for ensuring that doesn't happen. A key part of the IPP is using a novel 'Discounted REward for Same-Length Trajectories (DREST)' reward function to train agents to (1) pursue goals effectively conditional on each trajectory-length (be 'USEFUL'), and (2) choose stochastically… ▽ More Some worry that advanced artificial agents may resist being shut down. The Incomplete Preferences Proposal (IPP) is an idea for ensuring that doesn't happen. A key part of the IPP is using a novel 'Discounted REward for Same-Length Trajectories (DREST)' reward function to train agents to (1) pursue goals effectively conditional on each trajectory-length (be 'USEFUL'), and (2) choose stochastically between different trajectory-lengths (be 'NEUTRAL' about trajectory-lengths). In this paper, we propose evaluation metrics for USEFULNESS and NEUTRALITY. We use a DREST reward function to train simple agents to navigate gridworlds, and we find that these agents learn to be USEFUL and NEUTRAL. Our results thus suggest that DREST reward functions could also train advanced agents to be USEFUL and NEUTRAL, and thereby make these advanced agents useful and shutdownable. △ Less

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2404.16710 [pdf, other]

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Authors: Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu

Abstract: We present LayerSkip, an end-to-end solution to speed-up inference of large language models (LLMs). First, during training we apply layer dropout, with low dropout rates for earlier layers and higher dropout rates for later layers, and an early exit loss where all transformer layers share the same exit. Second, during inference, we show that this training recipe increases the accuracy of early exi… ▽ More We present LayerSkip, an end-to-end solution to speed-up inference of large language models (LLMs). First, during training we apply layer dropout, with low dropout rates for earlier layers and higher dropout rates for later layers, and an early exit loss where all transformer layers share the same exit. Second, during inference, we show that this training recipe increases the accuracy of early exit at earlier layers, without adding any auxiliary layers or modules to the model. Third, we present a novel self-speculative decoding solution where we exit at early layers and verify and correct with remaining layers of the model. Our proposed self-speculative decoding approach has less memory footprint than other speculative decoding approaches and benefits from shared compute and activations of the draft and verification stages. We run experiments on different Llama model sizes on different types of training: pretraining from scratch, continual pretraining, finetuning on specific data domain, and finetuning on specific task. We implement our inference solution and show speedups of up to 2.16x on summarization for CNN/DM documents, 1.82x on coding, and 2.0x on TOPv2 semantic parsing task. △ Less

Submitted 29 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

Comments: Code open sourcing is in progress

arXiv:2402.02633 [pdf, other]

Predicting Machine Translation Performance on Low-Resource Languages: The Role of Domain Similarity

Authors: Eric Khiu, Hasti Toossi, David Anugraha, **yu Liu, Jiaxu Li, Juan Armando Parra Flores, Leandro Acros Roman, A. Seza Doğruöz, En-Shiun Annie Lee

Abstract: Fine-tuning and testing a multilingual large language model is expensive and challenging for low-resource languages (LRLs). While previous studies have predicted the performance of natural language processing (NLP) tasks using machine learning methods, they primarily focus on high-resource languages, overlooking LRLs and shifts across domains. Focusing on LRLs, we investigate three factors: the si… ▽ More Fine-tuning and testing a multilingual large language model is expensive and challenging for low-resource languages (LRLs). While previous studies have predicted the performance of natural language processing (NLP) tasks using machine learning methods, they primarily focus on high-resource languages, overlooking LRLs and shifts across domains. Focusing on LRLs, we investigate three factors: the size of the fine-tuning corpus, the domain similarity between fine-tuning and testing corpora, and the language similarity between source and target languages. We employ classical regression models to assess how these factors impact the model's performance. Our results indicate that domain similarity has the most critical impact on predicting the performance of Machine Translation models. △ Less

Submitted 4 February, 2024; originally announced February 2024.

Comments: 13 pages, 5 figures, accepted to EACL 2024, findings

arXiv:2401.17129 [pdf, other]

Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes

Authors: Adrian S. Roman, Baladithya Balamurugan, Rithik Pothuganti

Abstract: This technical report details our work towards building an enhanced audio-visual sound event localization and detection (SELD) network. We build on top of the audio-only SELDnet23 model and adapt it to be audio-visual by merging both audio and video information prior to the gated recurrent unit (GRU) of the audio-only network. Our model leverages YOLO and DETIC object detectors. We also build a fr… ▽ More This technical report details our work towards building an enhanced audio-visual sound event localization and detection (SELD) network. We build on top of the audio-only SELDnet23 model and adapt it to be audio-visual by merging both audio and video information prior to the gated recurrent unit (GRU) of the audio-only network. Our model leverages YOLO and DETIC object detectors. We also build a framework that implements audio-visual data augmentation and audio-visual synthetic data generation. We deliver an audio-visual SELDnet system that outperforms the existing audio-visual SELD baseline. △ Less

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.12238 [pdf, other]

Spatial Scaper: A Library to Simulate and Augment Soundscapes for Sound Event Localization and Detection in Realistic Rooms

Authors: Iran R. Roman, Christopher Ick, Sivan Ding, Adrian S. Roman, Brian McFee, Juan P. Bello

Abstract: Sound event localization and detection (SELD) is an important task in machine listening. Major advancements rely on simulated data with sound events in specific rooms and strong spatio-temporal labels. SELD data is simulated by convolving spatialy-localized room impulse responses (RIRs) with sound waveforms to place sound events in a soundscape. However, RIRs require manual collection in specific… ▽ More Sound event localization and detection (SELD) is an important task in machine listening. Major advancements rely on simulated data with sound events in specific rooms and strong spatio-temporal labels. SELD data is simulated by convolving spatialy-localized room impulse responses (RIRs) with sound waveforms to place sound events in a soundscape. However, RIRs require manual collection in specific rooms. We present SpatialScaper, a library for SELD data simulation and augmentation. Compared to existing tools, SpatialScaper emulates virtual rooms via parameters such as size and wall absorption. This allows for parameterized placement (including movement) of foreground and background sound sources. SpatialScaper also includes data augmentation pipelines that can be applied to existing SELD data. As a case study, we use SpatialScaper to add rooms to the DCASE SELD data. Training a model with our data led to progressive performance improves as a direct function of acoustic diversity. These results show that SpatialScaper is valuable to train robust SELD models. △ Less

Submitted 19 January, 2024; originally announced January 2024.

Comments: 5 pages, 4 figures, 1 table, to be presented at ICASSP 2024 in Seoul, South Korea

arXiv:2401.08717 [pdf, other]

Robust DOA estimation using deep acoustic imaging

Authors: Adrian S. Roman, Iran R. Roman, Juan P. Bello

Abstract: Direction of arrival estimation (DoAE) aims at tracking a sound in azimuth and elevation. Recent advancements include data-driven models with inputs derived from ambisonics intensity vectors or correlations between channels in a microphone array. A spherical intensity map (SIM), or acoustic image, is an alternative input representation that remains underexplored. SIMs benefit from high-resolution… ▽ More Direction of arrival estimation (DoAE) aims at tracking a sound in azimuth and elevation. Recent advancements include data-driven models with inputs derived from ambisonics intensity vectors or correlations between channels in a microphone array. A spherical intensity map (SIM), or acoustic image, is an alternative input representation that remains underexplored. SIMs benefit from high-resolution microphone arrays, yet most DoAE datasets use low-resolution ones. Therefore, we first propose a super-resolution method to upsample low-resolution microphones. Next, we benchmark DoAE models that use SIMs as input. We arrive to a model that uses SIMs for DoAE estimation and outperforms a baseline and a state-of-the-art model. Our study highlights the relevance of acoustic imaging for DoAE tasks. △ Less

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2312.06153 [pdf, other]

Open Datasheets: Machine-readable Documentation for Open Datasets and Responsible AI Assessments

Authors: Anthony Cintron Roman, Jennifer Wortman Vaughan, Valerie See, Steph Ballard, Jehu Torres, Caleb Robinson, Juan M. Lavista Ferres

Abstract: This paper introduces a no-code, machine-readable documentation framework for open datasets, with a focus on responsible AI (RAI) considerations. The framework aims to improve comprehensibility, and usability of open datasets, facilitating easier discovery and use, better understanding of content and context, and evaluation of dataset quality and accuracy. The proposed framework is designed to str… ▽ More This paper introduces a no-code, machine-readable documentation framework for open datasets, with a focus on responsible AI (RAI) considerations. The framework aims to improve comprehensibility, and usability of open datasets, facilitating easier discovery and use, better understanding of content and context, and evaluation of dataset quality and accuracy. The proposed framework is designed to streamline the evaluation of datasets, hel** researchers, data scientists, and other open data users quickly identify datasets that meet their needs and organizational policies or regulations. The paper also discusses the implementation of the framework and provides recommendations to maximize its potential. The framework is expected to enhance the quality and reliability of data used in research and decision-making, fostering the development of more responsible and trustworthy AI systems. △ Less

Submitted 27 March, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

arXiv:2309.07860 [pdf, other]

Identifying the Group-Theoretic Structure of Machine-Learned Symmetries

Authors: Roy T. Forestano, Konstantin T. Matchev, Katia Matcheva, Alexander Roman, Eyup B. Unlu, Sarunas Verner

Abstract: Deep learning was recently successfully used in deriving symmetry transformations that preserve important physics quantities. Being completely agnostic, these techniques postpone the identification of the discovered symmetries to a later stage. In this letter we propose methods for examining and identifying the group-theoretic structure of such machine-learned symmetries. We design loss functions… ▽ More Deep learning was recently successfully used in deriving symmetry transformations that preserve important physics quantities. Being completely agnostic, these techniques postpone the identification of the discovered symmetries to a later stage. In this letter we propose methods for examining and identifying the group-theoretic structure of such machine-learned symmetries. We design loss functions which probe the subalgebra structure either during the deep learning stage of symmetry discovery or in a subsequent post-processing stage. We illustrate the new methods with examples from the U(n) Lie group family, obtaining the respective subalgebra decompositions. As an application to particle physics, we demonstrate the identification of the residual symmetries after the spontaneous breaking of non-Abelian gauge symmetries like SU(3) and SU(5) which are commonly used in model building. △ Less

Submitted 14 September, 2023; originally announced September 2023.

Comments: 10 pages, 8 figures, 2 tables

arXiv:2309.00329 [pdf, other]

Mi-Go: Test Framework which uses YouTube as Data Source for Evaluating Speech Recognition Models like OpenAI's Whisper

Authors: Tomasz Wojnar, Jaroslaw Hryszko, Adam Roman

Abstract: This article introduces Mi-Go, a novel testing framework aimed at evaluating the performance and adaptability of general-purpose speech recognition machine learning models across diverse real-world scenarios. The framework leverages YouTube as a rich and continuously updated data source, accounting for multiple languages, accents, dialects, speaking styles, and audio quality levels. To demonstrate… ▽ More This article introduces Mi-Go, a novel testing framework aimed at evaluating the performance and adaptability of general-purpose speech recognition machine learning models across diverse real-world scenarios. The framework leverages YouTube as a rich and continuously updated data source, accounting for multiple languages, accents, dialects, speaking styles, and audio quality levels. To demonstrate the effectiveness of the framework, the Whisper model, developed by OpenAI, was employed as a test object. The tests involve using a total of 124 YouTube videos to test all Whisper model versions. The results underscore the utility of YouTube as a valuable testing platform for speech recognition models, ensuring their robustness, accuracy, and adaptability to diverse languages and acoustic conditions. Additionally, by contrasting the machine-generated transcriptions against human-made subtitles, the Mi-Go framework can help pinpoint potential misuse of YouTube subtitles, like Search Engine Optimization. △ Less

Submitted 1 September, 2023; originally announced September 2023.

Comments: 25 pages, 9 tables, 3 figures

arXiv:2307.04891 [pdf, other]

Accelerated Discovery of Machine-Learned Symmetries: Deriving the Exceptional Lie Groups G2, F4 and E6

Authors: Roy T. Forestano, Konstantin T. Matchev, Katia Matcheva, Alexander Roman, Eyup B. Unlu, Sarunas Verner

Abstract: Recent work has applied supervised deep learning to derive continuous symmetry transformations that preserve the data labels and to obtain the corresponding algebras of symmetry generators. This letter introduces two improved algorithms that significantly speed up the discovery of these symmetry transformations. The new methods are demonstrated by deriving the complete set of generators for the un… ▽ More Recent work has applied supervised deep learning to derive continuous symmetry transformations that preserve the data labels and to obtain the corresponding algebras of symmetry generators. This letter introduces two improved algorithms that significantly speed up the discovery of these symmetry transformations. The new methods are demonstrated by deriving the complete set of generators for the unitary groups U(n) and the exceptional Lie groups $G_2$, $F_4$, and $E_6$. A third post-processing algorithm renders the found generators in sparse form. We benchmark the performance improvement of the new algorithms relative to the standard approach. Given the significant complexity of the exceptional Lie groups, our results demonstrate that this machine-learning method for discovering symmetries is completely general and can be applied to a wide variety of labeled datasets. △ Less

Submitted 10 July, 2023; originally announced July 2023.

Comments: 11 pages, 7 figures

arXiv:2306.06191 [pdf, other]

Open Data on GitHub: Unlocking the Potential of AI

Authors: Anthony Cintron Roman, Kevin Xu, Arfon Smith, Jehu Torres Vega, Caleb Robinson, Juan M Lavista Ferres

Abstract: GitHub is the world's largest platform for collaborative software development, with over 100 million users. GitHub is also used extensively for open data collaboration, hosting more than 800 million open data files, totaling 142 terabytes of data. This study highlights the potential of open data on GitHub and demonstrates how it can accelerate AI research. We analyze the existing landscape of open… ▽ More GitHub is the world's largest platform for collaborative software development, with over 100 million users. GitHub is also used extensively for open data collaboration, hosting more than 800 million open data files, totaling 142 terabytes of data. This study highlights the potential of open data on GitHub and demonstrates how it can accelerate AI research. We analyze the existing landscape of open data on GitHub and the patterns of how users share datasets. Our findings show that GitHub is one of the largest hosts of open data in the world and has experienced an accelerated growth of open data assets over the past four years. By examining the open data landscape on GitHub, we aim to empower users and organizations to leverage existing open datasets and improve their discoverability -- ultimately contributing to the ongoing AI revolution to help address complex societal issues. We release the three datasets that we have collected to support this analysis as open datasets at https://github.com/github/open-data-on-github. △ Less

Submitted 9 June, 2023; originally announced June 2023.

Comments: In submission to NeurIPS 2023 Track Datasets and Benchmarks

arXiv:2302.05383 [pdf, other]

doi 10.1016/j.physletb.2023.138086

Discovering Sparse Representations of Lie Groups with Machine Learning

Authors: Roy T. Forestano, Konstantin T. Matchev, Katia Matcheva, Alexander Roman, Eyup B. Unlu, Sarunas Verner

Abstract: Recent work has used deep learning to derive symmetry transformations, which preserve conserved quantities, and to obtain the corresponding algebras of generators. In this letter, we extend this technique to derive sparse representations of arbitrary Lie algebras. We show that our method reproduces the canonical (sparse) representations of the generators of the Lorentz group, as well as the… ▽ More Recent work has used deep learning to derive symmetry transformations, which preserve conserved quantities, and to obtain the corresponding algebras of generators. In this letter, we extend this technique to derive sparse representations of arbitrary Lie algebras. We show that our method reproduces the canonical (sparse) representations of the generators of the Lorentz group, as well as the $U(n)$ and $SU(n)$ families of Lie groups. This approach is completely general and can be used to find the infinitesimal generators for any Lie group. △ Less

Submitted 10 February, 2023; originally announced February 2023.

Comments: 14 pages, 6 figures

arXiv:2302.00806 [pdf, other]

Oracle-Preserving Latent Flows

Authors: Alexander Roman, Roy T. Forestano, Konstantin T. Matchev, Katia Matcheva, Eyup B. Unlu

Abstract: We develop a deep learning methodology for the simultaneous discovery of multiple nontrivial continuous symmetries across an entire labelled dataset. The symmetry transformations and the corresponding generators are modeled with fully connected neural networks trained with a specially constructed loss function ensuring the desired symmetry properties. The two new elements in this work are the use… ▽ More We develop a deep learning methodology for the simultaneous discovery of multiple nontrivial continuous symmetries across an entire labelled dataset. The symmetry transformations and the corresponding generators are modeled with fully connected neural networks trained with a specially constructed loss function ensuring the desired symmetry properties. The two new elements in this work are the use of a reduced-dimensionality latent space and the generalization to transformations invariant with respect to high-dimensional oracles. The method is demonstrated with several examples on the MNIST digit dataset. △ Less

Submitted 1 February, 2023; originally announced February 2023.

Comments: 9 pages, 8 figures

arXiv:2301.05638 [pdf, other]

Deep Learning Symmetries and Their Lie Groups, Algebras, and Subalgebras from First Principles

Authors: Roy T. Forestano, Konstantin T. Matchev, Katia Matcheva, Alexander Roman, Eyup Unlu, Sarunas Verner

Abstract: We design a deep-learning algorithm for the discovery and identification of the continuous group of symmetries present in a labeled dataset. We use fully connected neural networks to model the symmetry transformations and the corresponding generators. We construct loss functions that ensure that the applied transformations are symmetries and that the corresponding set of generators forms a closed… ▽ More We design a deep-learning algorithm for the discovery and identification of the continuous group of symmetries present in a labeled dataset. We use fully connected neural networks to model the symmetry transformations and the corresponding generators. We construct loss functions that ensure that the applied transformations are symmetries and that the corresponding set of generators forms a closed (sub)algebra. Our procedure is validated with several examples illustrating different types of conserved quantities preserved by symmetry. In the process of deriving the full set of symmetries, we analyze the complete subgroup structure of the rotation groups $SO(2)$, $SO(3)$, and $SO(4)$, and of the Lorentz group $SO(1,3)$. Other examples include squeeze map**, piecewise discontinuous labels, and $SO(10)$, demonstrating that our method is completely general, with many possible applications in physics and data science. Our study also opens the door for using a machine learning approach in the mathematical study of Lie groups and their properties. △ Less

Submitted 13 January, 2023; originally announced January 2023.

Comments: 21 pages, 26 figures with 47 panels

arXiv:2207.00962 [pdf, other]

doi 10.1103/PhysRevE.108.014101

Low probability states, data statistics, and entropy estimation

Authors: Damián G. Hernández, Ahmed Roman, Ilya Nemenman

Abstract: A fundamental problem in analysis of complex systems is getting a reliable estimate of entropy of their probability distributions over the state space. This is difficult because unsampled states can contribute substantially to the entropy, while they do not contribute to the Maximum Likelihood estimator of entropy, which replaces probabilities by the observed frequencies. Bayesian estimators overc… ▽ More A fundamental problem in analysis of complex systems is getting a reliable estimate of entropy of their probability distributions over the state space. This is difficult because unsampled states can contribute substantially to the entropy, while they do not contribute to the Maximum Likelihood estimator of entropy, which replaces probabilities by the observed frequencies. Bayesian estimators overcome this obstacle by introducing a model of the low-probability tail of the probability distribution. Which statistical features of the observed data determine the model of the tail, and hence the output of such estimators, remains unclear. Here we show that well-known entropy estimators for probability distributions on discrete state spaces model the structure of the low probability tail based largely on few statistics of the data: the sample size, the Maximum Likelihood estimate, the number of coincidences among the samples, the dispersion of the coincidences. We derive approximate analytical entropy estimators for undersampled distributions based on these statistics, and we use the results to propose an intuitive understanding of how the Bayesian entropy estimators work. △ Less

Submitted 3 July, 2022; originally announced July 2022.

arXiv:2201.02696 [pdf, other]

Unsupervised Machine Learning for Exploratory Data Analysis of Exoplanet Transmission Spectra

Authors: Konstantin T. Matchev, Katia Matcheva, Alexander Roman

Abstract: Transit spectroscopy is a powerful tool to decode the chemical composition of the atmospheres of extrasolar planets. In this paper we focus on unsupervised techniques for analyzing spectral data from transiting exoplanets. We demonstrate methods for i) cleaning and validating the data, ii) initial exploratory data analysis based on summary statistics (estimates of location and variability), iii) e… ▽ More Transit spectroscopy is a powerful tool to decode the chemical composition of the atmospheres of extrasolar planets. In this paper we focus on unsupervised techniques for analyzing spectral data from transiting exoplanets. We demonstrate methods for i) cleaning and validating the data, ii) initial exploratory data analysis based on summary statistics (estimates of location and variability), iii) exploring and quantifying the existing correlations in the data, iv) pre-processing and linearly transforming the data to its principal components, v) dimensionality reduction and manifold learning, vi) clustering and anomaly detection, vii) visualization and interpretation of the data. To illustrate the proposed unsupervised methodology, we use a well-known public benchmark data set of synthetic transit spectra. We show that there is a high degree of correlation in the spectral data, which calls for appropriate low-dimensional representations. We explore a number of different techniques for such dimensionality reduction and identify several suitable options in terms of summary statistics, principal components, etc. We uncover interesting structures in the principal component basis, namely, well-defined branches corresponding to different chemical regimes of the underlying atmospheres. We demonstrate that those branches can be successfully recovered with a K-means clustering algorithm in fully unsupervised fashion. We advocate for a three-dimensional representation of the spectroscopic data in terms of the first three principal components, in order to reveal the existing structure in the data and quickly characterize the chemical class of a planet. △ Less

Submitted 7 January, 2022; originally announced January 2022.

Comments: 10 pages, 11 figures, submitted to MNRAS

arXiv:2112.11600 [pdf, other]

Analytical Modelling of Exoplanet Transit Specroscopy with Dimensional Analysis and Symbolic Regression

Authors: Konstantin T. Matchev, Katia Matcheva, Alexander Roman

Abstract: The physical characteristics and atmospheric chemical composition of newly discovered exoplanets are often inferred from their transit spectra which are obtained from complex numerical models of radiative transfer. Alternatively, simple analytical expressions provide insightful physical intuition into the relevant atmospheric processes. The deep learning revolution has opened the door for deriving… ▽ More The physical characteristics and atmospheric chemical composition of newly discovered exoplanets are often inferred from their transit spectra which are obtained from complex numerical models of radiative transfer. Alternatively, simple analytical expressions provide insightful physical intuition into the relevant atmospheric processes. The deep learning revolution has opened the door for deriving such analytical results directly with a computer algorithm fitting to the data. As a proof of concept, we successfully demonstrate the use of symbolic regression on synthetic data for the transit radii of generic hot Jupiter exoplanets to derive a corresponding analytical formula. As a preprocessing step, we use dimensional analysis to identify the relevant dimensionless combinations of variables and reduce the number of independent inputs, which improves the performance of the symbolic regression. The dimensional analysis also allowed us to mathematically derive and properly parametrize the most general family of degeneracies among the input atmospheric parameters which affect the characterization of an exoplanet atmosphere through transit spectroscopy. △ Less

Submitted 21 December, 2021; originally announced December 2021.

Comments: Submitted to AAS Journals, 24 pages, 7 figures

arXiv:2106.05215 [pdf, other]

A machine learning pipeline for aiding school identification from child trafficking images

Authors: Sumit Mukherjee, Tina Sederholm, Anthony C. Roman, Ria Sankar, Sherrie Caltagirone, Juan Lavista Ferres

Abstract: Child trafficking in a serious problem around the world. Every year there are more than 4 million victims of child trafficking around the world, many of them for the purposes of child sexual exploitation. In collaboration with UK Police and a non-profit focused on child abuse prevention, Global Emancipation Network, we developed a proof-of-concept machine learning pipeline to aid the identificatio… ▽ More Child trafficking in a serious problem around the world. Every year there are more than 4 million victims of child trafficking around the world, many of them for the purposes of child sexual exploitation. In collaboration with UK Police and a non-profit focused on child abuse prevention, Global Emancipation Network, we developed a proof-of-concept machine learning pipeline to aid the identification of children from intercepted images. In this work, we focus on images that contain children wearing school uniforms to identify the school of origin. In the absence of a machine learning pipeline, this hugely time consuming and labor intensive task is manually conducted by law enforcement personnel. Thus, by automating aspects of the school identification process, we hope to significantly impact the speed of this portion of child identification. Our proposed pipeline consists of two machine learning models: i) to identify whether an image of a child contains a school uniform in it, and ii) identification of attributes of different school uniform items (such as color/texture of shirts, sweaters, blazers etc.). We describe the data collection, labeling, model development and validation process, along with strategies for efficient searching of schools using the model predictions. △ Less

Submitted 9 June, 2021; originally announced June 2021.

arXiv:2104.08072 [pdf]

doi 10.1007/978-3-030-66843-3_21

Machine Learning and Glioblastoma: Treatment Response Monitoring Biomarkers in 2021

Authors: Thomas Booth, Bernice Akpinar, Andrei Roman, Haris Shuaib, Aysha Luis, Alysha Chelliah, Ayisha Al Busaidi, Ayesha Mirchandani, Burcu Alparslan, Nina Mansoor, Keyoumars Ashkan, Sebastien Ourselin, Marc Modat

Abstract: The aim of the systematic review was to assess recently published studies on diagnostic test accuracy of glioblastoma treatment response monitoring biomarkers in adults, developed through machine learning (ML). Articles were searched for using MEDLINE, EMBASE, and the Cochrane Register. Included study participants were adult patients with high grade glioma who had undergone standard treatment (max… ▽ More The aim of the systematic review was to assess recently published studies on diagnostic test accuracy of glioblastoma treatment response monitoring biomarkers in adults, developed through machine learning (ML). Articles were searched for using MEDLINE, EMBASE, and the Cochrane Register. Included study participants were adult patients with high grade glioma who had undergone standard treatment (maximal resection, radiotherapy with concomitant and adjuvant temozolomide) and subsequently underwent follow-up imaging to determine treatment response status. Risk of bias and applicability was assessed with QUADAS 2 methodology. Contingency tables were created for hold-out test sets and recall, specificity, precision, F1-score, balanced accuracy calculated. Fifteen studies were included with 1038 patients in training sets and 233 in test sets. To determine whether there was progression or a mimic, the reference standard combination of follow-up imaging and histopathology at re-operation was applied in 67% of studies. The small numbers of patient included in studies, the high risk of bias and concerns of applicability in the study designs (particularly in relation to the reference standard and patient selection due to confounding), and the low level of evidence, suggest that limited conclusions can be drawn from the data. There is likely good diagnostic performance of machine learning models that use MRI features to distinguish between progression and mimics. The diagnostic performance of ML using implicit features did not appear to be superior to ML using explicit features. There are a range of ML-based solutions poised to become treatment response monitoring biomarkers for glioblastoma. To achieve this, the development and validation of ML models require large, well-annotated datasets where the potential for confounding in the study design has been carefully considered. △ Less

Submitted 15 April, 2021; originally announced April 2021.

Journal ref: Kia S.M. et al. (eds) Machine Learning in Clinical Neuroimaging and Radiogenomics in Neuro-oncology. MLCN 2020, RNO-AI 2020. Lecture Notes in Computer Science, vol 12449

arXiv:2008.04137 [pdf, other]

SplitNN-driven Vertical Partitioning

Authors: Iker Ceballos, Vivek Sharma, Eduardo Mugica, Abhishek Singh, Alberto Roman, Praneeth Vepakomma, Ramesh Raskar

Abstract: In this work, we introduce SplitNN-driven Vertical Partitioning, a configuration of a distributed deep learning method called SplitNN to facilitate learning from vertically distributed features. SplitNN does not share raw data or model details with collaborating institutions. The proposed configuration allows training among institutions holding diverse sources of data without the need of complex e… ▽ More In this work, we introduce SplitNN-driven Vertical Partitioning, a configuration of a distributed deep learning method called SplitNN to facilitate learning from vertically distributed features. SplitNN does not share raw data or model details with collaborating institutions. The proposed configuration allows training among institutions holding diverse sources of data without the need of complex encryption algorithms or secure computation protocols. We evaluate several configurations to merge the outputs of the split models, and compare performance and resource efficiency. The method is flexible and allows many different configurations to tackle the specific challenges posed by vertically split datasets. △ Less

Submitted 7 August, 2020; originally announced August 2020.

Comments: First version, please provide feedback

arXiv:1910.12086 [pdf, other]

A holistic approach to polyphonic music transcription with neural networks

Authors: Miguel A. Román, Antonio Pertusa, Jorge Calvo-Zaragoza

Abstract: We present a framework based on neural networks to extract music scores directly from polyphonic audio in an end-to-end fashion. Most previous Automatic Music Transcription (AMT) methods seek a piano-roll representation of the pitches, that can be further transformed into a score by incorporating tempo estimation, beat tracking, key estimation or rhythm quantization. Unlike these methods, our appr… ▽ More We present a framework based on neural networks to extract music scores directly from polyphonic audio in an end-to-end fashion. Most previous Automatic Music Transcription (AMT) methods seek a piano-roll representation of the pitches, that can be further transformed into a score by incorporating tempo estimation, beat tracking, key estimation or rhythm quantization. Unlike these methods, our approach generates music notation directly from the input audio in a single stage. For this, we use a Convolutional Recurrent Neural Network (CRNN) with Connectionist Temporal Classification (CTC) loss function which does not require annotated alignments of audio frames with the score rhythmic information. We trained our model using as input Haydn, Mozart, and Beethoven string quartets and Bach chorales synthesized with different tempos and expressive performances. The output is a textual representation of four-voice music scores based on **kern format. Although the proposed approach is evaluated in a simplified scenario, results show that this model can learn to transcribe scores directly from audio signals, opening a promising avenue towards complete AMT. △ Less

Submitted 26 October, 2019; originally announced October 2019.

Comments: Source code available at https://github.com/mangelroman/audio2score

arXiv:1412.0799 [pdf, ps, other]

Complexity of Road Coloring with Prescribed Reset Words

Authors: Vojtěch Vorel, Adam Roman

Abstract: By the Road Coloring Theorem (Trahtman, 2008), the edges of any aperiodic directed multigraph with a constant out-degree can be colored such that the resulting automaton admits a reset word. There may also be a need for a particular reset word to be admitted. For certain words it is NP-complete to decide whether there is a suitable coloring of a given multigraph. We present a classification of all… ▽ More By the Road Coloring Theorem (Trahtman, 2008), the edges of any aperiodic directed multigraph with a constant out-degree can be colored such that the resulting automaton admits a reset word. There may also be a need for a particular reset word to be admitted. For certain words it is NP-complete to decide whether there is a suitable coloring of a given multigraph. We present a classification of all words over the binary alphabet that separates such words from those that make the problem solvable in polynomial time. We show that the classification becomes different if we consider only strongly connected multigraphs. In this restricted setting the classification remains incomplete. △ Less

Submitted 2 December, 2014; originally announced December 2014.

Comments: To be presented at LATA 2015

arXiv:1403.4749 [pdf, ps, other]

Parameterized Complexity of Synchronization and Road Coloring

Authors: Vojtěch Vorel, Adam Roman

Abstract: First, we close the multivariate analysis of a canonical problem concerning short reset words (SYN), as it was started by Fernau et al. (2013). Namely, we prove that the problem, parameterized by the number of states, does not admit a polynomial kernel unless the polynomial hierarchy collapses. Second, we consider a related canonical problem concerning synchronizing road colorings (SRCP). Here we… ▽ More First, we close the multivariate analysis of a canonical problem concerning short reset words (SYN), as it was started by Fernau et al. (2013). Namely, we prove that the problem, parameterized by the number of states, does not admit a polynomial kernel unless the polynomial hierarchy collapses. Second, we consider a related canonical problem concerning synchronizing road colorings (SRCP). Here we give a similar complete multivariate analysis. Namely, we show that the problem, parameterized by the number of states, admits a polynomial kernel and we close the previous research of restrictions to particular values of both the alphabet size and the maximum word length. △ Less

Submitted 23 June, 2014; v1 submitted 19 March, 2014; originally announced March 2014.

ACM Class: F.1.1; F.2.2

arXiv:1201.0418 [pdf, other]

A New Family of Bounded Divergence Measures and Application to Signal Detection

Authors: Shivakumar Jolad, Ahmed Roman, Mahesh C. Shastry, Mihir Gadgil, Ayanendranath Basu

Abstract: We introduce a new one-parameter family of divergence measures, called bounded Bhattacharyya distance (BBD) measures, for quantifying the dissimilarity between probability distributions. These measures are bounded, symmetric and positive semi-definite and do not require absolute continuity. In the asymptotic limit, BBD measure approaches the squared Hellinger distance. A generalized BBD measure fo… ▽ More We introduce a new one-parameter family of divergence measures, called bounded Bhattacharyya distance (BBD) measures, for quantifying the dissimilarity between probability distributions. These measures are bounded, symmetric and positive semi-definite and do not require absolute continuity. In the asymptotic limit, BBD measure approaches the squared Hellinger distance. A generalized BBD measure for multiple distributions is also introduced. We prove an extension of a theorem of Bradt and Karlin for BBD relating Bayes error probability and divergence ranking. We show that BBD belongs to the class of generalized Csiszar f-divergence and derive some properties such as curvature and relation to Fisher Information. For distributions with vector valued parameters, the curvature matrix is related to the Fisher-Rao metric. We derive certain inequalities between BBD and well known measures such as Hellinger and Jensen-Shannon divergence. We also derive bounds on the Bayesian error probability. We give an application of these measures to the problem of signal detection where we compare two monochromatic signals buried in white noise and differing in frequency and amplitude. △ Less

Submitted 9 April, 2016; v1 submitted 1 January, 2012; originally announced January 2012.

Comments: 12 pages, 4 figures

MSC Class: 94A17; 94A12; 94B70; 97K50 ACM Class: G.3; H.1.1

Journal ref: Proceedings of the 5th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2016), pages 72-83, ISBN: 978-989-758-173-1

Showing 1–24 of 24 results for author: Roman, A