-
Cross-Language Evolution of Divergent Collective Memory Around the Arab Spring
Authors:
H. Laurie Jones,
Brian C. Keegan
Abstract:
The Arab Spring was a historic set of protests beginning in 2011 that toppled governments and led to major conflicts. Collective memories of events like these can vary significantly across social contexts in response to political, cultural, and linguistic factors. While Wikipedia plays an important role in documenting both historic and current events, little attention has been given to how Wikipedia articles, created in the aftermath of major events, continue to evolve over years or decades. Using the archived content of Arab Spring-related topics across the Arabic and English Wikipedias between 2011 and 2024, we define and evaluate multilingual measures of event salience, deliberation, contextualization, and consolidation of collective memory surrounding the Arab Spring. Our findings about the temporal evolution of the Wikipedia articles' content similarity across languages have implications for theorizing about online collective memory processes and evaluating linguistic models trained on these data.
Submitted 16 April, 2024;
originally announced April 2024.
-
Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games
Authors:
Lukas Schäfer,
Logan Jones,
Anssi Kanervisto,
Yuhan Cao,
Tabish Rashid,
Raluca Georgescu,
Dave Bignell,
Siddhartha Sen,
Andrea Treviño Gavito,
Sam Devlin
Abstract:
Video games have served as useful benchmarks for the decision making community, but going beyond Atari games towards training agents in modern games has been prohibitively expensive for the vast majority of the research community. Recent progress in the research, development and open release of large vision models has the potential to amortize some of these costs across the community. However, it is currently unclear which of these models have learnt representations that retain information critical for sequential decision making. Towards enabling wider participation in the research of gameplaying agents in modern games, we present a systematic study of imitation learning with publicly available visual encoders compared to the typical, task-specific, end-to-end training approach in Minecraft, Minecraft Dungeons and Counter-Strike: Global Offensive.
Submitted 4 December, 2023;
originally announced December 2023.
-
Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation
Authors:
Llion Jones,
Richard Sproat,
Haruko Ishikawa,
Alexander Gutkin
Abstract:
If one sees the place name Houston Mercer Dog Run in New York, how does one know how to pronounce it? Assuming one knows that Houston in New York is pronounced "how-ston" and not like the Texas city, then one can probably guess that "how-ston" is also used in the name of the dog park. We present a novel architecture that learns to use the pronunciations of neighboring names in order to guess the pronunciation of a given target feature. Applied to Japanese place names, we demonstrate the utility of the model for finding and proposing corrections for errors in Google Maps.
To demonstrate the utility of this approach to structurally similar problems, we also report on an application to a totally different task: Cognate reflex prediction in comparative historical linguistics. A version of the code has been open-sourced (https://github.com/google-research/google-research/tree/master/cognate_inpaint_neighbors).
Submitted 18 October, 2022;
originally announced October 2022.
-
MetaCon: Unified Predictive Segments System with Trillion Concept Meta-Learning
Authors:
Keqian Li,
Yifan Hu,
Logan Palanisamy,
Lisa Jones,
Akshay Gupta,
Jason Grigsby,
Ili Selinger,
Matt Gillingham,
Fei Tan
Abstract:
Accurate understanding of users in terms of predictive segments plays an essential role in the day-to-day operation of modern internet enterprises. Nevertheless, there are significant challenges that limit the quality of data, especially on long-tail predictive tasks. In this work, we present MetaCon, our unified predictive segments system with scalable, trillion-concept meta-learning that addresses these challenges. It builds on top of a flat concept representation that summarizes entities' heterogeneous digital footprint, jointly considers the entire spectrum of predictive tasks as a single learning task, and leverages a principled meta-learning approach with an efficient first-order meta-optimization procedure under a provable performance guarantee in order to solve the learning task. Experiments on both proprietary production datasets and public structured learning tasks demonstrate that MetaCon can lead to substantial improvements over state-of-the-art recommendation and ranking approaches.
Submitted 9 March, 2022;
originally announced March 2022.
-
SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays
Authors:
Thi Ngoc Tho Nguyen,
Douglas L. Jones,
Karn N. Watcharasupat,
Huy Phan,
Woon-Seng Gan
Abstract:
Polyphonic sound event localization and detection (SELD) has many practical applications in acoustic sensing and monitoring. However, the development of real-time SELD has been limited by the demanding computational requirement of most recent SELD systems. In this work, we introduce SALSA-Lite, a fast and effective feature for polyphonic SELD using microphone array inputs. SALSA-Lite is a lightweight variation of a previously proposed SALSA feature for polyphonic SELD. SALSA, which stands for Spatial Cue-Augmented Log-Spectrogram, consists of multichannel log-spectrograms stacked channelwise with the normalized principal eigenvectors of the spectrotemporally corresponding spatial covariance matrices. In contrast to SALSA, which uses eigenvector-based spatial features, SALSA-Lite uses normalized inter-channel phase differences as spatial features, allowing a 30-fold speedup compared to the original SALSA feature. Experimental results on the TAU-NIGENS Spatial Sound Events 2021 dataset showed that the SALSA-Lite feature achieved competitive performance compared to the full SALSA feature, and significantly outperformed the traditional feature set of multichannel log-mel spectrograms with generalized cross-correlation spectra. Specifically, using SALSA-Lite features increased localization-dependent F1 score and class-dependent localization recall by 15% and 5%, respectively, compared to using multichannel log-mel spectrograms with generalized cross-correlation spectra.
Submitted 4 May, 2022; v1 submitted 15 November, 2021;
originally announced November 2021.
-
SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection
Authors:
Thi Ngoc Tho Nguyen,
Karn N. Watcharasupat,
Ngoc Khanh Nguyen,
Douglas L. Jones,
Woon-Seng Gan
Abstract:
Sound event localization and detection (SELD) consists of two subtasks, which are sound event detection and direction-of-arrival estimation. While sound event detection mainly relies on time-frequency patterns to distinguish different sound classes, direction-of-arrival estimation uses amplitude and/or phase differences between microphones to estimate source directions. As a result, it is often difficult to jointly optimize these two subtasks. We propose a novel feature called Spatial cue-Augmented Log-SpectrogrAm (SALSA) with exact time-frequency mapping between the signal power and the source directional cues, which is crucial for resolving overlapping sound sources. The SALSA feature consists of multichannel log-spectrograms stacked along with the normalized principal eigenvector of the spatial covariance matrix at each corresponding time-frequency bin. Depending on the microphone array format, the principal eigenvector can be normalized differently to extract amplitude and/or phase differences between the microphones. As a result, SALSA features are applicable for different microphone array formats such as first-order ambisonics (FOA) and multichannel microphone array (MIC). Experimental results on the TAU-NIGENS Spatial Sound Events 2021 dataset with directional interferences showed that SALSA features outperformed other state-of-the-art features. Specifically, the use of SALSA features in the FOA format increased the F1 score and localization recall by 6% each, compared to the multichannel log-mel spectrograms with intensity vectors. For the MIC format, using SALSA features increased F1 score and localization recall by 16% and 7%, respectively, compared to using multichannel log-mel spectrograms with generalized cross-correlation spectra.
Submitted 6 June, 2022; v1 submitted 1 October, 2021;
originally announced October 2021.
-
Evaluating the COVID-19 Identification ResNet (CIdeR) on the INTERSPEECH COVID-19 from Audio Challenges
Authors:
Alican Akman,
Harry Coppock,
Alexander Gaskell,
Panagiotis Tzirakis,
Lyn Jones,
Björn W. Schuller
Abstract:
We report on cross-running the recent COVID-19 Identification ResNet (CIdeR) on the two Interspeech 2021 COVID-19 diagnosis from cough and speech audio challenges: ComParE and DiCOVA. CIdeR is an end-to-end deep learning neural network originally designed to classify whether an individual is COVID-positive or COVID-negative based on coughing and breathing audio recordings from a published crowdsourced dataset. In the current study, we demonstrate the potential of CIdeR at binary COVID-19 diagnosis from both the COVID-19 Cough and Speech Sub-Challenges of INTERSPEECH 2021, ComParE and DiCOVA. CIdeR achieves significant improvements over several baselines.
Submitted 30 July, 2021;
originally announced July 2021.
-
Improving Polyphonic Sound Event Detection on Multichannel Recordings with the Sørensen-Dice Coefficient Loss and Transfer Learning
Authors:
Karn N. Watcharasupat,
Thi Ngoc Tho Nguyen,
Ngoc Khanh Nguyen,
Zhen Jian Lee,
Douglas L. Jones,
Woon Seng Gan
Abstract:
The Sørensen--Dice Coefficient has recently seen rising popularity as a loss function (also known as Dice loss) due to its robustness in tasks where the number of negative samples significantly exceeds that of positive samples, such as semantic segmentation, natural language processing, and sound event detection. Conventional training of polyphonic sound event detection systems with binary cross-entropy loss often results in suboptimal detection performance as the training is often overwhelmed by updates from negative samples. In this paper, we investigated the effect of the Dice loss, intra- and inter-modal transfer learning, data augmentation, and recording formats, on the performance of polyphonic sound event detection systems with multichannel inputs. Our analysis showed that polyphonic sound event detection systems trained with Dice loss consistently outperformed those trained with cross-entropy loss across different training settings and recording formats in terms of F1 score and error rate. We achieved further performance gains via the use of transfer learning and an appropriate combination of different data augmentation techniques.
Submitted 2 October, 2021; v1 submitted 22 July, 2021;
originally announced July 2021.
-
What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis
Authors:
Thi Ngoc Tho Nguyen,
Karn N. Watcharasupat,
Zhen Jian Lee,
Ngoc Khanh Nguyen,
Douglas L. Jones,
Woon Seng Gan
Abstract:
Sound event localization and detection (SELD) is an emerging research topic that aims to unify the tasks of sound event detection and direction-of-arrival estimation. As a result, SELD inherits the challenges of both tasks, such as noise, reverberation, interference, polyphony, and non-stationarity of sound sources. Furthermore, SELD often faces an additional challenge of assigning correct correspondences between the detected sound classes and directions of arrival to multiple overlapping sound events. Previous studies have shown that unknown interferences in reverberant environments often cause major degradation in the performance of SELD systems. To further understand the challenges of the SELD task, we performed a detailed error analysis on two of our SELD systems, which both ranked second in the team category of the DCASE SELD Challenge, one in 2020 and one in 2021. Experimental results indicate polyphony as the main challenge in SELD, due to the difficulty in detecting all sound events of interest. In addition, the SELD systems tend to make fewer errors for the polyphonic scenario that is dominant in the training set.
Submitted 2 October, 2021; v1 submitted 22 July, 2021;
originally announced July 2021.
-
DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement
Authors:
Yuma Koizumi,
Shigeki Karita,
Scott Wisdom,
Hakan Erdogan,
John R. Hershey,
Llion Jones,
Michiel Bacchiani
Abstract:
Single-channel speech enhancement (SE) is an important task in speech processing. A widely used framework combines an analysis/synthesis filterbank with a mask prediction network, such as the Conv-TasNet architecture. In such systems, the denoising performance and computational efficiency are mainly affected by the structure of the mask prediction network. In this study, we aim to improve the sequential modeling ability of Conv-TasNet architectures by integrating Conformer layers into a new mask prediction network. To make the model computationally feasible, we extend the Conformer using linear complexity attention and stacked 1-D dilated depthwise convolution layers. We trained the model on 3,396 hours of noisy speech data, and show that (i) the use of linear complexity attention avoids high computational complexity, and (ii) our model achieves higher scale-invariant signal-to-noise ratio than the improved time-dilated convolution network (TDCN++), an extended version of Conv-TasNet.
Submitted 5 August, 2021; v1 submitted 30 June, 2021;
originally announced June 2021.
-
DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection
Authors:
Thi Ngoc Tho Nguyen,
Karn Watcharasupat,
Ngoc Khanh Nguyen,
Douglas L. Jones,
Woon Seng Gan
Abstract:
Sound event localization and detection consists of two subtasks, which are sound event detection and direction-of-arrival estimation. While sound event detection mainly relies on time-frequency patterns to distinguish different sound classes, direction-of-arrival estimation uses magnitude or phase differences between microphones to estimate source directions. Therefore, it is often difficult to train these two subtasks jointly. We propose a novel feature called spatial cue-augmented log-spectrogram (SALSA) with exact time-frequency mapping between the signal power and the source direction-of-arrival. The feature includes multichannel log-spectrograms stacked along with the estimated direct-to-reverberant ratio and a normalized version of the principal eigenvector of the spatial covariance matrix at each time-frequency bin on the spectrograms. Experimental results on the DCASE 2021 dataset for sound event localization and detection with directional interference showed that the deep learning-based models trained on this new feature outperformed the DCASE challenge baseline by a large margin. We combined several models with slightly different architectures that were trained on the new feature to further improve the system performance for the DCASE sound event localization and detection challenge.
Submitted 29 June, 2021;
originally announced June 2021.
-
A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition
Authors:
Shigeki Karita,
Yotaro Kubo,
Michiel Adriaan Unico Bacchiani,
Llion Jones
Abstract:
End-to-end (E2E) modeling is advantageous for automatic speech recognition (ASR), especially for Japanese, since word-based tokenization of Japanese is not trivial and E2E modeling is able to model character sequences directly. This paper focuses on the latest E2E modeling techniques and investigates their performance on character-based Japanese ASR by conducting comparative experiments. The results are analyzed and discussed in order to understand the relative advantages of long short-term memory (LSTM) and Conformer models in combination with connectionist temporal classification, transducer, and attention-based loss functions. Furthermore, the paper investigates the effectiveness of recent training techniques such as data augmentation (SpecAugment), variational noise injection, and exponential moving average. The best configuration found in the paper achieved state-of-the-art character error rates of 4.1%, 3.2%, and 3.5% for the Corpus of Spontaneous Japanese (CSJ) eval1, eval2, and eval3 tasks, respectively. The system is also shown to be computationally efficient thanks to the efficiency of Conformer transducers.
Submitted 9 June, 2021;
originally announced June 2021.
-
Scaling Scaling Laws with Board Games
Authors:
Andy L. Jones
Abstract:
The largest experiments in machine learning now require resources far beyond the budget of all but a few institutions. Fortunately, it has recently been shown that the results of these huge experiments can often be extrapolated from the results of a sequence of far smaller, cheaper experiments. In this work, we show that not only can the extrapolation be done based on the size of the model, but on the size of the problem as well. By conducting a sequence of experiments using AlphaZero and Hex, we show that the performance achievable with a fixed amount of compute degrades predictably as the game gets larger and harder. Along with our main result, we further show that the test-time and train-time compute available to an agent can be traded off while maintaining performance.
Submitted 15 April, 2021; v1 submitted 7 April, 2021;
originally announced April 2021.
-
CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing
Authors:
Ahmed Elnaggar,
Wei Ding,
Llion Jones,
Tom Gibbs,
Tamas Feher,
Christoph Angerer,
Silvia Severini,
Florian Matthes,
Burkhard Rost
Abstract:
Currently, a growing number of mature natural language processing applications make people's lives more convenient. Such applications are built from source code - the language of software engineering. However, applications for understanding source code language to ease the software engineering process are under-researched. Simultaneously, the transformer model, especially its combination with transfer learning, has been proven to be a powerful technique for natural language processing tasks. These breakthroughs point to a promising direction for processing source code and cracking software engineering tasks. This paper describes CodeTrans - an encoder-decoder transformer model for tasks in the software engineering domain, that explores the effectiveness of encoder-decoder transformer models for six software engineering tasks, including thirteen sub-tasks. Moreover, we have investigated the effect of different training strategies, including single-task learning, transfer learning, multi-task learning, and multi-task learning with fine-tuning. CodeTrans outperforms the state-of-the-art models on all the tasks. To expedite future work in the software engineering domain, we have published our pre-trained models of CodeTrans.
https://github.com/agemagician/CodeTrans
Submitted 12 May, 2021; v1 submitted 6 April, 2021;
originally announced April 2021.
-
End-2-End COVID-19 Detection from Breath & Cough Audio
Authors:
Harry Coppock,
Alexander Gaskell,
Panagiotis Tzirakis,
Alice Baird,
Lyn Jones,
Björn W. Schuller
Abstract:
Our main contributions are as follows: (I) We demonstrate the first attempt to diagnose COVID-19 using end-to-end deep learning from a crowd-sourced dataset of audio samples, achieving ROC-AUC of 0.846; (II) Our model, the COVID-19 Identification ResNet (CIdeR), has potential for rapid scalability, minimal cost and improving performance as more data becomes available. This could enable regular COVID-19 testing at a population scale; (III) We introduce a novel modelling strategy using a custom deep neural network to diagnose COVID-19 from a joint breath and cough representation; (IV) We release our four stratified folds for cross parameter optimisation and validation on a standard public corpus and details on the models for reproducibility and future reference.
Submitted 6 January, 2021;
originally announced February 2021.
-
A New Mathematical Model for Controlled Pandemics Like COVID-19 : AI Implemented Predictions
Authors:
Liam Dowling Jones,
Malik Magdon-Ismail,
Laura Mersini-Houghton,
Steven Meshnick
Abstract:
We present a new mathematical model to explicitly capture the effects that three restriction measures (the lockdown date and duration, social distancing and masks, and school and border closures) have in controlling the spread of COVID-19 infections $i(r, t)$. Before restrictions were introduced, the random spread of infections as described by the SEIR model grew exponentially. The addition of control measures introduces a mixing of order and disorder in the system's evolution, which falls under a different mathematical class of models that can eventually lead to critical phenomena. A generic analytical solution is hard to obtain. We use machine learning to solve the new equations for $i(r,t)$, the infections $i$ in any region $r$ at time $t$, and derive predictions for the spread of infections over time as a function of the strength of the specific measures taken and their duration. The machine is trained on all of the published COVID-19 data for each region, county, state, and country in the world. It utilizes optimization to learn the best-fit values of the model's parameters from past data in each region in the world, and it updates the predicted infection curves for any future restrictions that may be added or relaxed anywhere. We hope this interdisciplinary effort, a new mathematical model that predicts the impact of each measure in slowing down infection spread combined with the solving power of machine learning, is a useful tool in the fight against the current pandemic and potentially future ones.
Submitted 24 August, 2020;
originally announced August 2020.
-
ProtTrans: Towards Cracking the Language of Life's Code Through Self-Supervised Deep Learning and High Performance Computing
Authors:
Ahmed Elnaggar,
Michael Heinzinger,
Christian Dallago,
Ghalia Rihawi,
Yu Wang,
Llion Jones,
Tom Gibbs,
Tamas Feher,
Christoph Angerer,
Martin Steinegger,
Debsindhu Bhowmik,
Burkhard Rost
Abstract:
Computational biology and bioinformatics provide vast data gold-mines from protein sequences, ideal for Language Models taken from NLP. These LMs reach for new prediction frontiers at low inference costs. Here, we trained two auto-regressive models (Transformer-XL, XLNet) and four auto-encoder models (BERT, Albert, Electra, T5) on data from UniRef and BFD containing up to 393 billion amino acids. The LMs were trained on the Summit supercomputer using 5616 GPUs and TPU Pod up-to 1024 cores. Dimensionality reduction revealed that the raw protein LM-embeddings from unlabeled data captured some biophysical features of protein sequences. We validated the advantage of using the embeddings as exclusive input for several subsequent tasks. The first was a per-residue prediction of protein secondary structure (3-state accuracy Q3=81%-87%); the second were per-protein predictions of protein sub-cellular localization (ten-state accuracy: Q10=81%) and membrane vs. water-soluble (2-state accuracy Q2=91%). For the per-residue predictions the transfer of the most informative embeddings (ProtT5) for the first time outperformed the state-of-the-art without using evolutionary information thereby bypassing expensive database searches. Taken together, the results implied that protein LMs learned some of the grammar of the language of life. To facilitate future work, we released our models at https://github.com/agemagician/ProtTrans.
Submitted 4 May, 2021; v1 submitted 13 July, 2020;
originally announced July 2020.
-
A two-step system for sound event localization and detection
Authors:
T. N. T. Nguyen,
D. L. Jones,
R. Ranjan,
S. Jayabalan,
W. S. Gan
Abstract:
Sound event detection and sound event localization require different features from audio input signals. While sound event detection mainly relies on time-frequency patterns to distinguish different event classes, sound event localization uses magnitude or phase differences between microphones to estimate source directions. Therefore, we propose a two-step system for sound event localization and detection. In the first step, we detect the sound events and estimate the directions-of-arrival separately. In the second step, we combine the results of the event detector and the direction-of-arrival estimator. The obtained results show a significant improvement over the baseline solution for sound event localization and detection in the DCASE 2019 task 3 challenge. Using the evaluation dataset, the proposed system achieved an F1 score of 93.4% for sound event detection and an error of 5.4 degrees for direction-of-arrival estimation, while the winning solution achieved an F1 score of 94.7% and an angle error of 3.7 degrees, respectively.
Submitted 26 November, 2019;
originally announced November 2019.
-
Ensuring Reliable Monte Carlo Estimates of Network Properties
Authors:
Haema Nilakanta,
Zack W. Almquist,
Galin L. Jones
Abstract:
The literature in social network analysis has largely focused on methods and models which require complete network data; however there exist many networks which can only be studied via sampling methods due to the scale or complexity of the network, access limitations, or because the population of interest is hard to reach. In such cases, the application of random walk-based Markov chain Monte Carlo (MCMC) methods to estimate multiple network features is common. However, the reliability of these estimates has been largely ignored. We consider and further develop multivariate MCMC output analysis methods in the context of network sampling to directly address the reliability of the multivariate estimation. This approach yields principled, computationally efficient, and broadly applicable methods for assessing the Monte Carlo estimation procedure. In particular, with respect to two random-walk algorithms, a simple random walk and a Metropolis-Hastings random walk, we construct and compare network parameter estimates, effective sample sizes, coverage probabilities, and stopping rules, all of which speak to the estimation reliability.
Submitted 21 November, 2019; v1 submitted 19 November, 2019;
originally announced November 2019.
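The entry above compares a simple random walk with a Metropolis-Hastings random walk for network sampling. As a rough illustration only (this is not the authors' code), the sketch below shows one common form of each walk. It assumes an undirected graph stored as an adjacency dictionary and a uniform target distribution over nodes for the Metropolis-Hastings variant; the function names `simple_random_walk` and `mh_random_walk` and the toy star graph are hypothetical.

```python
import random

def simple_random_walk(graph, start, steps, rng):
    """Move to a uniformly chosen neighbour at every step."""
    path = [start]
    for _ in range(steps):
        path.append(rng.choice(graph[path[-1]]))
    return path

def mh_random_walk(graph, start, steps, rng):
    """Propose a uniform neighbour; accept with min(1, deg(current) / deg(proposal)).

    Assumption: the target distribution is uniform over nodes, which gives
    this degree-ratio acceptance rule. A rejected proposal repeats the
    current node in the path.
    """
    path = [start]
    for _ in range(steps):
        current = path[-1]
        proposal = rng.choice(graph[current])
        accept_prob = min(1.0, len(graph[current]) / len(graph[proposal]))
        path.append(proposal if rng.random() < accept_prob else current)
    return path

# Toy star graph: node 0 is the hub, nodes 1-3 are leaves.
graph = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
rng = random.Random(0)
srw_path = simple_random_walk(graph, 0, 1000, rng)
mh_path = mh_random_walk(graph, 0, 1000, rng)
```

On this toy graph the simple walk returns to the hub on every second step, which reflects its tendency to visit nodes in proportion to their degree; the acceptance step in the Metropolis-Hastings walk is what corrects for that bias.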
-
PizzaBox: Studying Internet Connected Physical Object Manipulation based Food Ordering
Authors:
Luke Jones,
Charith Perera
Abstract:
This paper presents the designing and testing of PizzaBox, a 3D printed, interactive food ordering system that aims to differ from conventional food ordering systems and provide an entertaining and unique experience when ordering a pizza by incorporating underlying technologies that support ubiquitous computing. The PizzaBox has gone through both low and medium fidelity testing while working collaboratively with participants to co-design and refine a product that is approachable to all age groups while maintaining a simple process for ordering food from start to finish. Final testing was conducted at an independent pizzeria where interviews with participants led us to develop four discussion themes: 1) usability and end user engagement, 2) towards connected real-time products and services, 3) healthy eating, 4) evolution of food ordering systems. Our interviews show that in general, PizzaBox would have a greater appeal to a younger audience by providing a fantasy of helping in the creation and baking of the pizza but also has a novelty value that all ages would enjoy. We investigate the effect that the PizzaBox has in encouraging new healthy habits or promoting a healthier lifestyle as well as how we can improve PizzaBox to better encourage these lifestyle changes.
Submitted 8 June, 2019;
originally announced June 2019.
-
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
Authors:
Jonathan Shen,
Patrick Nguyen,
Yonghui Wu,
Zhifeng Chen,
Mia X. Chen,
Ye Jia,
Anjuli Kannan,
Tara Sainath,
Yuan Cao,
Chung-Cheng Chiu,
Yanzhang He,
Jan Chorowski,
Smit Hinsu,
Stella Laurenzo,
James Qin,
Orhan Firat,
Wolfgang Macherey,
Suyog Gupta,
Ankur Bapna,
Shuyuan Zhang,
Ruoming Pang,
Ron J. Weiss,
Rohit Prabhavalkar,
Qiao Liang,
Benoit Jacob
, et al. (66 additional authors not shown)
Abstract:
Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily extensible, and experiment configurations are centralized and highly customizable. Distributed training and quantized inference are supported directly within the framework, and it contains existing implementations of a large number of utilities, helper functions, and the newest research ideas. Lingvo has been used in collaboration by dozens of researchers in more than 20 papers over the last two years. This document outlines the underlying design of Lingvo and serves as an introduction to the various pieces of the framework, while also offering examples of advanced features that showcase the capabilities of the framework.
Submitted 21 February, 2019;
originally announced February 2019.
-
Character-Level Language Modeling with Deeper Self-Attention
Authors:
Rami Al-Rfou,
Dokook Choe,
Noah Constant,
Mandy Guo,
Llion Jones
Abstract:
LSTMs and other RNN variants have shown strong performance on character-level language modeling. These models are typically trained using truncated backpropagation through time, and it is common to assume that their success stems from their ability to remember long-term contexts. In this paper, we show that a deep (64-layer) transformer model with fixed context outperforms RNN variants by a large margin, achieving state of the art on two popular benchmarks: 1.13 bits per character on text8 and 1.06 on enwik8. To get good results at this depth, we show that it is important to add auxiliary losses, both at intermediate network layers and intermediate sequence positions.
Submitted 10 December, 2018; v1 submitted 9 August, 2018;
originally announced August 2018.
-
Optimal Record and Replay under Causal Consistency
Authors:
Russell L. Jones,
Muhammad S. Khan,
Nitin H. Vaidya
Abstract:
We investigate the minimum record needed to replay executions of processes that share causally consistent memory. For a version of causal consistency, we identify optimal records under both offline and online recording setting. Under the offline setting, a central authority has information about every process' view of the execution and can decide what information to record for each process. Under the online setting, each process has to decide on the record at runtime as the operations are observed.
Submitted 29 October, 2018; v1 submitted 22 May, 2018;
originally announced May 2018.
-
The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation
Authors:
Mia Xu Chen,
Orhan Firat,
Ankur Bapna,
Melvin Johnson,
Wolfgang Macherey,
George Foster,
Llion Jones,
Niki Parmar,
Mike Schuster,
Zhifeng Chen,
Yonghui Wu,
Macduff Hughes
Abstract:
The past year has witnessed rapid advances in sequence-to-sequence (seq2seq) modeling for Machine Translation (MT). The classic RNN-based approaches to MT were first out-performed by the convolutional seq2seq model, which was then out-performed by the more recent Transformer model. Each of these new approaches consists of a fundamental architecture accompanied by a set of modeling and training techniques that are in principle applicable to other seq2seq architectures. In this paper, we tease apart the new architectures and their accompanying techniques in two ways. First, we identify several key modeling and training techniques, and apply them to the RNN architecture, yielding a new RNMT+ model that outperforms all of the three fundamental architectures on the benchmark WMT'14 English to French and English to German tasks. Second, we analyze the properties of each fundamental seq2seq architecture and devise new hybrid architectures intended to combine their strengths. Our hybrid models obtain further improvements, outperforming the RNMT+ model on both benchmark datasets.
Submitted 26 April, 2018; v1 submitted 25 April, 2018;
originally announced April 2018.
-
Tensor2Tensor for Neural Machine Translation
Authors:
Ashish Vaswani,
Samy Bengio,
Eugene Brevdo,
Francois Chollet,
Aidan N. Gomez,
Stephan Gouws,
Llion Jones,
Łukasz Kaiser,
Nal Kalchbrenner,
Niki Parmar,
Ryan Sepassi,
Noam Shazeer,
Jakob Uszkoreit
Abstract:
Tensor2Tensor is a library for deep learning models that is well-suited for neural machine translation and includes the reference implementation of the state-of-the-art Transformer model.
Submitted 16 March, 2018;
originally announced March 2018.
-
One Model To Learn Them All
Authors:
Lukasz Kaiser,
Aidan N. Gomez,
Noam Shazeer,
Ashish Vaswani,
Niki Parmar,
Llion Jones,
Jakob Uszkoreit
Abstract:
Deep learning yields great results across many fields, from speech recognition, image classification, to translation. But for each problem, getting a deep model to work well involves research into the architecture and a long period of tuning. We present a single model that yields good results on a number of problems spanning multiple domains. In particular, this single model is trained concurrently on ImageNet, multiple translation tasks, image captioning (COCO dataset), a speech recognition corpus, and an English parsing task. Our model architecture incorporates building blocks from multiple domains. It contains convolutional layers, an attention mechanism, and sparsely-gated layers. Each of these computational blocks is crucial for a subset of the tasks we train on. Interestingly, even if a block is not crucial for a task, we observe that adding it never hurts performance and in most cases improves it on all tasks. We also show that tasks with less data benefit largely from joint training with other tasks, while performance on large tasks degrades only slightly if at all.
Submitted 15 June, 2017;
originally announced June 2017.
-
Attention Is All You Need
Authors:
Ashish Vaswani,
Noam Shazeer,
Niki Parmar,
Jakob Uszkoreit,
Llion Jones,
Aidan N. Gomez,
Lukasz Kaiser,
Illia Polosukhin
Abstract:
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
Submitted 1 August, 2023; v1 submitted 12 June, 2017;
originally announced June 2017.
-
Guided-Processing Outperforms Duty-Cycling for Energy-Efficient Systems
Authors:
Long N. Le,
Douglas L. Jones
Abstract:
Energy-efficiency is highly desirable for sensing systems in the Internet of Things (IoT). A common approach to achieve low-power systems is duty-cycling, where components in a system are turned off periodically to meet an energy budget. However, this work shows that such an approach is not necessarily optimal in energy-efficiency, and proposes \textit{guided-processing} as a fundamentally better alternative. The proposed approach offers 1) explicit modeling of performance uncertainties in system internals, 2) a realistic resource consumption model, and 3) a key insight into the superiority of guided-processing over duty-cycling. Generalization from the cascade structure to the more general graph-based one is also presented. Once applied to optimize a large-scale audio sensing system with a practical detection application, empirical results show that the proposed approach significantly improves the detection performance (up to $1.7\times$ and $4\times$ reduction in false-alarm and miss rate, respectively) for the same energy consumption, when compared to the duty-cycling approach.
Submitted 1 May, 2017;
originally announced May 2017.
-
Feature-Sharing in Cascade Detection Systems with Multiple Applications
Authors:
Long N. Le,
Douglas L. Jones
Abstract:
Traditional distributed detection systems are often designed for a single target application. However, with the emergence of the Internet of Things (IoT) paradigm, next-generation systems are expected to be a shared infrastructure for multiple applications. To this end, we propose a modular, cascade design for resource-efficient, multi-task detection systems. Two (classes of) applications are considered in the system, a primary and a secondary one. The primary application has universal features that can be shared with other applications, to reduce the overall feature extraction cost, while the secondary application does not. In this setting, the two applications can collaborate via feature sharing. We provide a method to optimize the operation of the multi-application cascade system based on an accurate resource consumption model. In addition, the inherent uncertainties in feature models are articulated and taken into account. For evaluation, the twin-comparison argument is invoked, and it is shown that, with the optimal feature sharing strategy, a system can achieve 9$\times$ resource saving and 1.43$\times$ improvement in detection performance.
Submitted 1 May, 2017;
originally announced May 2017.
-
Canonical Correlation Analysis for Analyzing Sequences of Medical Billing Codes
Authors:
Corinne L. Jones,
Sham M. Kakade,
Lucas W. Thornblade,
David R. Flum,
Abraham D. Flaxman
Abstract:
We propose using canonical correlation analysis (CCA) to generate features from sequences of medical billing codes. Applying this novel use of CCA to a database of medical billing codes for patients with diverticulitis, we first demonstrate that the CCA embeddings capture meaningful relationships among the codes. We then generate features from these embeddings and establish their usefulness in predicting future elective surgery for diverticulitis, an important marker in efforts for reducing costs in healthcare.
Submitted 6 January, 2017; v1 submitted 1 December, 2016;
originally announced December 2016.
-
WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia
Authors:
Daniel Hewlett,
Alexandre Lacoste,
Llion Jones,
Illia Polosukhin,
Andrew Fandrianto,
Jay Han,
Matthew Kelcey,
David Berthelot
Abstract:
We present WikiReading, a large-scale natural language understanding task and publicly-available dataset with 18 million instances. The task is to predict textual values from the structured knowledge base Wikidata by reading the text of the corresponding Wikipedia articles. The task contains a rich variety of challenging classification and extraction sub-tasks, making it well-suited for end-to-end models such as deep neural networks (DNNs). We compare various state-of-the-art DNN-based architectures for document classification, information extraction, and question answering. We find that models supporting a rich answer space, such as word or character sequences, perform best. Our best-performing model, a word-level sequence to sequence model with a mechanism to copy out-of-vocabulary words, obtains an accuracy of 71.8%.
Submitted 15 March, 2017; v1 submitted 11 August, 2016;
originally announced August 2016.
-
Towards a relation extraction framework for cyber-security concepts
Authors:
Corinne L. Jones,
Robert A. Bridges,
Kelly Huffer,
John Goodall
Abstract:
In order to assist security analysts in obtaining information pertaining to their network, such as novel vulnerabilities, exploits, or patches, information retrieval methods tailored to the security domain are needed. As labeled text data is scarce and expensive, we follow developments in semi-supervised Natural Language Processing and implement a bootstrapping algorithm for extracting security entities and their relationships from text. The algorithm requires little input data, specifically, a few relations or patterns (heuristics for identifying relations), and incorporates an active learning component which queries the user on the most important decisions to prevent drifting from the desired relations. Preliminary testing on a small corpus shows promising results, obtaining a precision of 0.82.
Submitted 16 April, 2015;
originally announced April 2015.
-
Optimized Network-coded Scalable Video Multicasting over eMBMS Networks
Authors:
Andrea Tassi,
Ioannis Chatzigeorgiou,
Dejan Vukobratović,
Andrew L. Jones
Abstract:
Delivery of multicast video services over fourth generation (4G) networks such as 3GPP Long Term Evolution-Advanced (LTE-A) is gaining momentum. In this paper, we address the issue of efficiently multicasting layered video services by defining a novel resource allocation framework that aims to maximize the service coverage whilst keeping the radio resource footprint low. A key point in the proposed system model is that the reliability of multicast video services is ensured by means of an Unequal Error Protection implementation of the Network Coding (UEP-NC) scheme. In addition, both the communication parameters and the UEP-NC scheme are jointly optimized by the proposed resource allocation framework. Numerical results show that the proposed allocation framework can significantly increase the service coverage when compared to a conventional Multi-rate Transmission (MrT) strategy.
Submitted 20 January, 2015; v1 submitted 14 January, 2015;
originally announced January 2015.
-
Binary Systematic Network Coding for Progressive Packet Decoding
Authors:
Andrew L. Jones,
Ioannis Chatzigeorgiou,
Andrea Tassi
Abstract:
We consider binary systematic network codes and investigate their capability of decoding a source message either in full or in part. We carry out a probability analysis, derive closed-form expressions for the decoding probability and show that systematic network coding outperforms conventional network coding. We also develop an algorithm based on Gaussian elimination that allows progressive decoding of source packets. Simulation results show that the proposed decoding algorithm can achieve the theoretical optimal performance. Furthermore, we demonstrate that systematic network codes equipped with the proposed algorithm are good candidates for progressive packet recovery owing to their overall decoding delay characteristics.
Submitted 14 January, 2015;
originally announced January 2015.
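The abstract above mentions a Gaussian-elimination-based algorithm that decodes source packets progressively. The sketch below is not that algorithm; it is a generic illustration of incremental Gaussian elimination over GF(2), assuming each received packet carries a coefficient bitmask (bit i set means source packet i is XORed in) and an integer payload. The class name `ProgressiveDecoder` and its methods are hypothetical.

```python
class ProgressiveDecoder:
    """Incremental Gaussian elimination over GF(2), kept in reduced form."""

    def __init__(self):
        # pivot bit index -> (coefficient mask, payload)
        self.rows = {}

    def receive(self, coeff, payload):
        """Add one packet; return False if it is linearly dependent."""
        # Cancel every pivot position already covered by a stored row.
        for pivot, (c, p) in self.rows.items():
            if (coeff >> pivot) & 1:
                coeff ^= c
                payload ^= p
        if coeff == 0:
            return False
        pivot = coeff.bit_length() - 1
        # Clear the new pivot from stored rows so the system stays reduced.
        for other, (c, p) in list(self.rows.items()):
            if (c >> pivot) & 1:
                self.rows[other] = (c ^ coeff, p ^ payload)
        self.rows[pivot] = (coeff, payload)
        return True

    def decoded(self):
        """Source packets recovered so far: rows with a single coefficient."""
        return {i: p for i, (c, p) in self.rows.items() if c == 1 << i}

# Three source packets with payloads 5, 9 and 12.
decoder = ProgressiveDecoder()
decoder.receive(0b001, 5)        # systematic packet: source 0 arrives uncoded
decoder.receive(0b110, 9 ^ 12)   # coded packet: source 1 XOR source 2
partial = decoder.decoded()      # only source 0 is known at this point
decoder.receive(0b010, 9)        # systematic packet: source 1
complete = decoder.decoded()     # sources 0, 1 and 2 are now all recovered
```

Keeping the stored rows fully reduced after every arrival is what makes the decoding progressive: a source packet becomes available as soon as its row collapses to a single coefficient, without waiting for a full-rank system.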
-
Optimal Simultaneous Detection and Signal and Noise Power Estimation
Authors:
Long Le,
Douglas L. Jones
Abstract:
Simultaneous detection and estimation is important in many engineering applications. In particular, there are many applications where it is important to perform signal detection and Signal-to-Noise-Ratio (SNR) estimation jointly. Application of existing frameworks in the literature that handle simultaneous detection and estimation is not straightforward for this class of application. This paper therefore aims at bridging the gap between an existing framework, specifically the work by Middleton et al., and the mentioned application class by presenting a jointly optimal detector and signal and noise power estimators. The detector and estimators are given for the Gaussian observation model with appropriate conjugate priors on the signal and noise power. Simulation results affirm the superior performance of the optimal solution compared to the separate detection and estimation approaches.
Submitted 15 October, 2014;
originally announced October 2014.
-
Automatic Labeling for Entity Extraction in Cyber Security
Authors:
Robert A. Bridges,
Corinne L. Jones,
Michael D. Iannacone,
Kelly M. Testa,
John R. Goodall
Abstract:
Timely analysis of cyber-security information necessitates automated information extraction from unstructured text. While state-of-the-art extraction methods produce extremely accurate results, they require ample training data, which is generally unavailable for specialized applications, such as detecting security related entities; moreover, manual annotation of corpora is very costly and often not a viable solution. In response, we develop a very precise method to automatically label text from several data sources by leveraging related, domain-specific, structured data and provide public access to a corpus annotated with cyber-security entities. Next, we implement a Maximum Entropy Model trained with the average perceptron on a portion of our corpus ($\sim$750,000 words) and achieve near perfect precision, recall, and accuracy, with training times under 17 seconds.
Submitted 9 June, 2014; v1 submitted 22 August, 2013;
originally announced August 2013.
-
Ganga: a tool for computational-task management and easy access to Grid resources
Authors:
J. T. Mościcki,
F. Brochu,
J. Ebke,
U. Egede,
J. Elmsheuser,
K. Harrison,
R. W. L. Jones,
H. C. Lee,
D. Liko,
A. Maier,
A. Muraru,
G. N. Patrick,
K. Pajchel,
W. Reece,
B. H. Samset,
M. W. Slater,
A. Soroko,
C. L. Tan,
D. C. Vanderster,
M. Williams
Abstract:
In this paper, we present the computational task-management tool Ganga, which allows for the specification, submission, bookkeeping and post-processing of computational tasks on a wide set of distributed resources. Ganga has been developed to solve a problem increasingly common in scientific projects, which is that researchers must regularly switch between different processing systems, each with its own command set, to complete their computational tasks. Ganga provides a homogeneous environment for processing data on heterogeneous resources. We give examples from High Energy Physics, demonstrating how an analysis can be developed on a local system and then transparently moved to a Grid system for processing of all available data. Ganga has an API that can be used via an interactive interface, in scripts, or through a GUI. Specific knowledge about types of tasks or computational resources is provided at run-time through a plugin system, making new developments easy to integrate. We give an overview of the Ganga architecture, give examples of current use, and demonstrate how Ganga can be used in many different areas of science.
Submitted 9 June, 2009; v1 submitted 16 February, 2009;
originally announced February 2009.
-
Social Browsing on Flickr
Authors:
Kristina Lerman,
Laurie Jones
Abstract:
The new social media sites - blogs, wikis, del.icio.us and Flickr, among others - underscore the transformation of the Web to a participatory medium in which users are actively creating, evaluating and distributing information. The photo-sharing site Flickr, for example, allows users to upload photographs, view photos created by others, comment on those photos, etc. As is common to other social media sites, Flickr allows users to designate others as "contacts" and to track their activities in real time. The contacts (or friends) lists form the social network backbone of social media sites. We claim that these social networks facilitate new ways of interacting with information, e.g., through what we call social browsing. The contacts interface on Flickr enables users to see the latest images submitted by their friends. Through an extensive analysis of Flickr data, we show that social browsing through the contacts' photo streams is one of the primary methods by which users find new images on Flickr. This finding has implications for creating personalized recommendation systems based on the user's declared contacts lists.
Submitted 7 December, 2006;
originally announced December 2006.
-
GANGA: a user-Grid interface for Atlas and LHCb
Authors:
K. Harrison,
W. T. L. P. Lavrijsen,
P. Mato,
A. Soroko,
C. L. Tan,
C. E. Tull,
N. Brook,
R. W. L. Jones
Abstract:
The Gaudi/Athena and Grid Alliance (GANGA) is a front-end for the configuration, submission, monitoring, bookkeeping, output collection, and reporting of computing jobs run on a local batch system or on the grid. In particular, GANGA handles jobs that use applications written for the Gaudi software framework shared by the Atlas and LHCb experiments. GANGA exploits the commonality of Gaudi-based computing jobs, while insulating against grid-, batch- and framework-specific technicalities, to maximize end-user productivity in defining, configuring, and executing jobs. Designed for a Python-based component architecture, GANGA has a modular underpinning and is therefore well placed for contributing to, and benefiting from, work in related projects. Its functionality is accessible both from a scriptable command-line interface, for expert users and automated tasks, and through a graphical interface, which simplifies the interaction with GANGA for beginning and casual users.
This paper presents the GANGA design and implementation, the development of the underlying software bus architecture, and the functionality of the first public GANGA release.
Submitted 13 June, 2003;
originally announced June 2003.
-
Towards Experimental Nanosound Using Almost Disjoint Set Theory
Authors:
Cameron L Jones
Abstract:
Music composition using digital audio sequence editors is increasingly performed in a visual workspace where sound complexes are built from discrete sound objects, called gestures, that are arranged in time and space to generate a continuous composition. The visual workspace, common to most industry-standard audio loop sequencing software, is premised on the arrangement of gestures defined with geometric shape properties. Here, one aspect of fractal set theory was validated using audio-frequency sets to evaluate self-affine scaling behavior when new sound complexes are built through union and intersection operations on discrete musical gestures. Results showed that the intersection of two sets revealed lower complexity compared with the union operator, meaning that the intersection of two sound gestures is an almost disjoint set, in accord with formal logic. These results are also discussed with reference to fuzzy sets, cellular automata, nanotechnology and self-organization to further explore the link between sequenced notation and complexity.
Submitted 12 March, 2002;
originally announced March 2002.