-
Enhancing Single-Slice Segmentation with 3D-to-2D Unpaired Scan Distillation
Authors:
Xin Yu,
Qi Yang,
Han Liu,
Ho Hin Lee,
Yucheng Tang,
Lucas W. Remedios,
Michael Kim,
Shunxing Bao,
Ann Xenobia Moore,
Luigi Ferrucci,
Bennett A. Landman
Abstract:
2D single-slice abdominal computed tomography (CT) enables the assessment of body habitus and organ health with low radiation exposure. However, single-slice data necessitates the use of 2D networks for segmentation, and these networks often struggle to capture contextual information effectively. Consequently, even when trained on identical datasets, 3D networks typically achieve superior segmentation results. In this work, we propose a novel 3D-to-2D distillation framework, leveraging pre-trained 3D models to enhance 2D single-slice segmentation. Specifically, we extract the prediction distribution centroid from the 3D representations to guide the 2D student by learning intra- and inter-class correlation. Unlike traditional knowledge distillation methods that require the same data input, our approach employs unpaired 3D CT scans with any contrast to guide the 2D student model. Experiments conducted on 707 subjects from the single-slice Baltimore Longitudinal Study of Aging (BLSA) dataset demonstrate that state-of-the-art 2D multi-organ segmentation methods can benefit from the 3D teacher model, achieving enhanced performance in single-slice multi-organ segmentation. Notably, our approach demonstrates considerable efficacy in low-data regimes, outperforming the model trained with all available training subjects even when utilizing only 200 training subjects. Thus, this work underscores the potential to alleviate manual annotation burdens.
Submitted 18 June, 2024;
originally announced June 2024.
-
Technical Development of a Semi-Autonomous Robotic Partition
Authors:
Binh Vinh Duc Nguyen,
Andrew Vande Moere
Abstract:
This technical description details the design and engineering process of a semi-autonomous robotic partition. This robotic partition prototype was subsequently employed in a longer-term evaluation in-the-wild study conducted by the authors in a real-world office setting.
Submitted 25 March, 2024;
originally announced March 2024.
-
Research Challenges for Adaptive Architecture: Empowering Occupants of Multi-Occupancy Buildings
Authors:
Binh Vinh Duc Nguyen,
Andrew Vande Moere
Abstract:
This position paper outlines our vision of 'adaptive architecture', which involves the integration of robotic technology to physically change an architectural space to support the changing needs of its occupants, in response to the CHI'24 workshop "HabiTech - Inhabiting Buildings, Data & Technology" call on "How do new technologies enable and empower the inhabitants of multi-occupancy buildings?". Specifically, while adaptive architecture holds promise for enhancing occupant satisfaction, comfort, and overall health and well-being, there remains a range of research challenges of (1) how it can effectively support individual occupants, while (2) mediating the conflicting needs of collocated others, and (3) integrating meaningfully into the sociocultural characteristics of their building community.
Submitted 25 March, 2024;
originally announced March 2024.
-
The Adaptive Workplace: Orchestrating Architectural Services around the Wellbeing of Individual Occupants
Authors:
Andrew Vande Moere,
Sara Arko,
Alena Safrova Drasilova,
Tomáš Ondráček,
Ilaria Pigliautile,
Benedetta Pioppi,
Anna Laura Pisello,
Jakub Prochazka,
Paula Acuna Roncancio,
Davide Schaumann,
Marcel Schweiker,
Binh Vinh Duc Nguyen
Abstract:
As the academic consortium members of the EU Horizon project SONATA ("Situation-aware OrchestratioN of AdapTive Architecture"), we respond to the workshop call for "Office Wellbeing by Design: Don't Stand for Anything Less" by proposing the "Adaptive Workplace" concept. In essence, our vision aims to adapt a workplace to the ever-changing needs of individual occupants, rather than expecting occupants to adapt to their workplace.
Submitted 25 March, 2024;
originally announced March 2024.
-
How do Older Adults Set Up Voice Assistants? Lessons Learned from a Deployment Experience for Older Adults to Set Up Standalone Voice Assistants
Authors:
Chen Chen,
Ella T. Lifset,
Yichen Han,
Arkajyoti Roy,
Michael Hogarth,
Alison A. Moore,
Emilia Farcas,
Nadir Weibel
Abstract:
While standalone Voice Assistants (VAs) are promising to support older adults' daily routine and wellbeing management, onboarding and setting up these devices can be challenging. Although some older adults choose to seek assistance from technicians and adult children, easy setup processes that facilitate independent use are still critical, especially for those who do not have access to external resources. We aim to understand older adults' experience while setting up commercially available voice-only and voice-first screen-based VAs. Rooted in participant observations and semi-structured interviews, we designed a within-subject study with 10 older adults using Amazon Echo Dot and Echo Show. We identified the values of the built-in touchscreen and the instruction documents, as well as the impact of form factors, and outline important directions to support older adult independence with VAs.
Submitted 13 March, 2024;
originally announced March 2024.
-
The Full-scale Assembly Simulation Testbed (FAST) Dataset
Authors:
Alec G. Moore,
Tiffany D. Do,
Nayan N. Chawla,
Antonia Jimenez Iriarte,
Ryan P. McMahan
Abstract:
In recent years, numerous researchers have begun investigating how virtual reality (VR) tracking and interaction data can be used for a variety of machine learning purposes, including user identification, predicting cybersickness, and estimating learning gains. One constraint for this research area is the dearth of open datasets. In this paper, we present a new open dataset captured with our VR-based Full-scale Assembly Simulation Testbed (FAST). This dataset consists of data collected from 108 participants (50 females, 56 males, 2 non-binary) learning how to assemble two distinct full-scale structures in VR. In addition to explaining how the dataset was collected and describing the data included, we discuss how the dataset may be used by future researchers.
Submitted 13 March, 2024;
originally announced March 2024.
-
The Adaptive Architectural Layout: How the Control of a Semi-Autonomous Mobile Robotic Partition was Shared to Mediate the Environmental Demands and Resources of an Open-Plan Office
Authors:
Binh Vinh Duc Nguyen,
Andrew Vande Moere
Abstract:
A typical open-plan office layout is unable to optimally host multiple collocated work activities, personal needs, and situational events, as its space exerts a range of environmental demands on workers in terms of maintaining their acoustic, visual or privacy comfort. As we hypothesise that these demands could be coped with by optimising the environmental resources of the architectural layout, we deployed a mobile robotic partition that autonomously manoeuvres between predetermined locations. During a five-week in-the-wild study within a real-world open-plan office, we studied how 13 workers adopted four distinct adaptation strategies when sharing the spatiotemporal control of the robotic partition. Based on their logged and self-reported reasoning, we present six initiation regulating factors that determine the appropriateness of each adaptation strategy. This study thus contributes to how future human-building interaction could autonomously improve the experience, comfort, performance, and even the health and wellbeing of multiple workers that share the same workplace.
Submitted 25 January, 2024;
originally announced January 2024.
-
A Human-Powered Public Display that Nudges Social Biking via Motion Gesturing
Authors:
Binh Vinh Duc Nguyen,
Andrew Vande Moere
Abstract:
The WeWatt bike serves as an energy station that enables passers-by to charge their mobile devices through physical activity. However, despite multiple people using it simultaneously, the bike is typically used individually. To address this limitation, we developed the WeWattTree, an installation utilising human-powered energy to filter environmental air. Through the orchestration of subtle motion gestures, our goal is to entice passers-by to participate and encourage them to socially interact, synchronising their pace. In this work-in-progress, we provide insights into the prototyping process, combining physical experimentation and computational simulation, and delve into the underlying concepts of our grammar of motion gestures. We highlight how a single design effectively merged multiple functionalities, how the role of material characteristics shaped the interaction design, and discuss the potential for social performances as captivating public displays.
Submitted 15 January, 2024;
originally announced January 2024.
-
ChemTime: Rapid and Early Classification for Multivariate Time Series Classification of Chemical Sensors
Authors:
Alexander M. Moore,
Randy C. Paffenroth,
Kenneth T. Ngo,
Joshua R. Uzarski
Abstract:
Multivariate time series data are ubiquitous in the application of machine learning to problems in the physical sciences. Chemiresistive sensor arrays are highly promising in chemical detection tasks relevant to industrial, safety, and military applications. Sensor arrays are an inherently multivariate time series data collection tool which demand rapid and accurate classification of arbitrary chemical analytes. Previous research has benchmarked data-agnostic multivariate time series classifiers across diverse multivariate time series supervised tasks in order to find general-purpose classification algorithms. To our knowledge, there has yet to be an effort to survey machine learning and time series classification approaches to chemiresistive hardware sensor arrays for the detection of chemical analytes. In addition to benchmarking existing approaches to multivariate time series classifiers, we incorporate findings from a model survey to propose the novel ChemTime approach to sensor array classification for chemical sensing. We design experiments addressing the unique challenges of hardware sensor array classification, including the rapid classification ability of classifiers and minimization of inference time while maintaining performance for deployed lightweight hardware sensing devices. We find that ChemTime is uniquely positioned for the chemical sensing task by combining rapid and early classification of time series with beneficial inference and high accuracy.
Submitted 15 December, 2023;
originally announced December 2023.
-
Subspace Hybrid MVDR Beamforming for Augmented Hearing
Authors:
Sina Hafezi,
Alastair H. Moore,
Pierre H. Guiraud,
Patrick A. Naylor,
Jacob Donley,
Vladimir Tourbabin,
Thomas Lunner
Abstract:
Signal-dependent beamformers are advantageous over signal-independent beamformers when the acoustic scenario - be it real-world or simulated - is straightforward in terms of the number of sound sources, the ambient sound field and their dynamics. However, in the context of augmented reality audio using head-worn microphone arrays, the acoustic scenarios encountered are often far from straightforward. The design of robust, high-performance, adaptive beamformers for such scenarios is an on-going challenge. This is due to the violation of the typically required assumptions on the noise field caused by, for example, rapid variations resulting from complex acoustic environments, and/or rotations of the listener's head. This work proposes a multi-channel speech enhancement algorithm which utilises the adaptability of signal-dependent beamformers while still benefiting from the computational efficiency and robust performance of signal-independent super-directive beamformers. The algorithm has two stages. (i) The first stage is a hybrid beamformer based on a dictionary of weights corresponding to a set of noise field models. (ii) The second stage is a wide-band subspace post-filter to remove any artifacts resulting from (i). The algorithm is evaluated using both real-world recordings and simulations of a cocktail-party scenario. Noise suppression, intelligibility and speech quality results show a significant performance improvement by the proposed algorithm compared to the baseline super-directive beamformer. A data-driven implementation of the noise field dictionary is shown to provide more noise suppression, and similar speech intelligibility and quality, compared to a parametric dictionary.
Submitted 30 November, 2023;
originally announced November 2023.
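For orientation, the textbook MVDR beamformer named in the title above can be written as follows. This is the standard single-model form, not the paper's hybrid dictionary scheme. The symbols (noise covariance matrix R, steering vector d toward the target, microphone signal vector x, output y) are notation assumed here and do not appear in the abstract.

```latex
\mathbf{w}_{\mathrm{MVDR}}
  = \arg\min_{\mathbf{w}} \; \mathbf{w}^{H}\mathbf{R}\,\mathbf{w}
  \quad \text{subject to} \quad \mathbf{w}^{H}\mathbf{d} = 1,
\qquad
\mathbf{w}_{\mathrm{MVDR}}
  = \frac{\mathbf{R}^{-1}\mathbf{d}}{\mathbf{d}^{H}\mathbf{R}^{-1}\mathbf{d}},
\qquad
y = \mathbf{w}_{\mathrm{MVDR}}^{H}\,\mathbf{x}.
```

In this notation, a signal-independent design fixes R to a model of the noise field, while a signal-dependent design estimates R from the recordings. A dictionary of weights for a set of noise field models, as the abstract describes, would correspond to precomputing one such weight vector per modelled R.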
-
Towards Designing Spatial Robots that are Architecturally Motivated
Authors:
Binh Vinh Duc Nguyen,
Andrew Vande Moere
Abstract:
While robots are increasingly integrated into the built environment, little is known about how their qualities can meaningfully influence our spaces to facilitate enjoyable and agreeable interaction, rather than robotic settings that are driven by functional goals. Motivated by the premise that future robots should be aware of architectural sensitivities, we developed a set of exploratory studies that combine methods from both architectural and interaction design. While we empirically discovered that dynamically moving spatial elements, which we coin as spatial robots, can indeed create unique life-sized affordances that encourage or resist human activities, we also encountered many unforeseen design challenges that originated from how ordinary users and experts perceived spatial robots. This discussion thus could inform similar design studies in the areas of human-building interaction (HBI) or responsive and interactive architecture.
Submitted 27 November, 2023;
originally announced November 2023.
-
Design of General Purpose Minimal-Auxiliary Ising Machines
Authors:
Isaac K. Martin,
Andrew G. Moore,
John T. Daly,
Jess J. Meyer,
Teresa M. Ranadive
Abstract:
Ising machines are a form of quantum-inspired processing-in-memory computer which has shown great promise for overcoming the limitations of traditional computing paradigms while operating at a fraction of the energy use. The process of designing Ising machines is known as the reverse Ising problem. Unfortunately, this problem is in general computationally intractable: it is a nonconvex mixed-integer linear programming problem which cannot be naively brute-forced except in the simplest cases due to exponential scaling of runtime with number of spins. We prove new theoretical results which allow us to reduce the search space to one with quadratic scaling. We utilize this theory to develop general purpose algorithmic solutions to the reverse Ising problem. In particular, we demonstrate Ising formulations of 3-bit and 4-bit integer multiplication which use fewer total spins than previously known methods by a factor of more than three. Our results increase the practicality of implementing such circuits on modern Ising hardware, where spins are at a premium.
Submitted 24 October, 2023;
originally announced October 2023.
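The exponential scaling mentioned in the abstract above can be illustrated with a minimal brute-force sketch of the forward problem: given fields and couplings, enumerate all 2^n spin states and keep the lowest-energy ones. The energy convention (H = sum of h_i s_i plus sum of J_ij s_i s_j) and the names `energy` and `ground_states` are assumptions made for illustration. This is not the paper's method, which addresses the reverse problem of choosing the parameters.

```python
from itertools import product

def energy(spins, h, J):
    """Ising energy under the assumed convention H = sum h_i s_i + sum J_ij s_i s_j."""
    e = sum(h[i] * s for i, s in enumerate(spins))
    e += sum(w * spins[i] * spins[j] for (i, j), w in J.items())
    return e

def ground_states(h, J, tol=1e-12):
    """Enumerate all 2**n spin assignments and return (min energy, minimisers)."""
    best, states = None, []
    for spins in product((-1, 1), repeat=len(h)):
        e = energy(spins, h, J)
        if best is None or e < best - tol:
            best, states = e, [spins]       # strictly better: restart the list
        elif abs(e - best) <= tol:
            states.append(spins)            # tie: keep every minimiser
    return best, states

# Two spins with a negative coupling prefer to align.
best, states = ground_states(h=[0.0, 0.0], J={(0, 1): -1.0})
```

Each added spin doubles the number of states visited, which is why naive enumeration is only feasible for very small systems.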
-
Microscaling Data Formats for Deep Learning
Authors:
Bita Darvish Rouhani,
Ritchie Zhao,
Ankit More,
Mathew Hall,
Alireza Khodamoradi,
Summer Deng,
Dhruv Choudhary,
Marius Cornea,
Eric Dellinger,
Kristof Denolf,
Stosic Dusan,
Venmugil Elango,
Maximilian Golub,
Alexander Heinecke,
Phil James-Roxby,
Dharmesh Jani,
Gaurav Kolhe,
Martin Langhammer,
Ada Li,
Levi Melnick,
Maral Mesmakhosroshahi,
Andres Rodriguez,
Michael Schulte,
Rasoul Shafipour,
Lei Shao,
et al. (8 additional authors not shown)
Abstract:
Narrow bit-width data formats are key to reducing the computational and storage costs of modern deep learning applications. This paper evaluates Microscaling (MX) data formats that combine a per-block scaling factor with narrow floating-point and integer types for individual elements. MX formats balance the competing needs of hardware efficiency, model accuracy, and user friction. Empirical results on over two dozen benchmarks demonstrate the practicality of MX data formats as a drop-in replacement for baseline FP32 for AI inference and training with low user friction. We also show the first instance of training generative language models at sub-8-bit weights, activations, and gradients with minimal accuracy loss and no modifications to the training recipe.
Submitted 19 October, 2023; v1 submitted 16 October, 2023;
originally announced October 2023.
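The idea of pairing a per-block scaling factor with narrow element types, as described in the abstract above, can be sketched roughly as follows. This is a simplified illustration and not the MX format definition: the power-of-two shared scale, the signed-integer elements, the 8-bit default, and the function names are all assumptions made here.

```python
import math

def quantize_block(values, elem_bits=8):
    """Quantize one block to signed integers sharing a single power-of-two scale.

    Returns (exp, q) where the shared scale is 2**exp (illustrative layout only).
    """
    qmax = 2 ** (elem_bits - 1) - 1              # 127 for 8-bit elements
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0, [0] * len(values)
    # Smallest exponent such that the largest magnitude still fits in qmax.
    exp = math.ceil(math.log2(amax / qmax))
    scale = 2.0 ** exp
    # Round to the nearest integer and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return exp, q

def dequantize_block(exp, q):
    """Recover approximate values from the shared exponent and integer elements."""
    scale = 2.0 ** exp
    return [x * scale for x in q]

block = [0.5, -1.0, 0.25, 3.0]
exp, q = quantize_block(block)
restored = dequantize_block(exp, q)
```

Storing one exponent per block instead of one per element is what keeps the per-element cost low; the trade-off is that small values in a block dominated by one large value lose precision.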
-
Deep conditional generative models for longitudinal single-slice abdominal computed tomography harmonization
Authors:
Xin Yu,
Qi Yang,
Yucheng Tang,
Riqiang Gao,
Shunxing Bao,
Leon Y. Cai,
Ho Hin Lee,
Yuankai Huo,
Ann Zenobia Moore,
Luigi Ferrucci,
Bennett A. Landman
Abstract:
Two-dimensional single-slice abdominal computed tomography (CT) provides a detailed tissue map with high resolution allowing quantitative characterization of relationships between health conditions and aging. However, longitudinal analysis of body composition changes using these scans is difficult due to positional variation between slices acquired in different years, which leads to different organs/tissues being captured. To address this issue, we propose C-SliceGen, which takes an arbitrary axial slice in the abdominal region as a condition and generates a pre-defined vertebral level slice by estimating structural changes in the latent space. Our experiments on 2608 volumetric CT data from two in-house datasets and 50 subjects from the 2015 Multi-Atlas Abdomen Labeling Challenge (BTCV) dataset demonstrate that our model can generate high-quality images that are realistic and similar. We further evaluate our method's capability to harmonize longitudinal positional variation on 1033 subjects from the Baltimore Longitudinal Study of Aging (BLSA) dataset, which contains longitudinal single abdominal slices, and confirm that our method can harmonize the slice positional variance in terms of visceral fat area. This approach provides a promising direction for mapping slices from different vertebral levels to a target slice and reducing positional variance for single-slice longitudinal analysis. The source code is available at: https://github.com/MASILab/C-SliceGen.
Submitted 17 September, 2023;
originally announced September 2023.
-
Streamlined Data Fusion: Unleashing the Power of Linear Combination with Minimal Relevance Judgments
Authors:
Qiuyu Xu,
Yidong Huang,
Shengli Wu,
Adrian Moore
Abstract:
Linear combination is a potent data fusion method in information retrieval tasks, thanks to its ability to adjust weights for diverse scenarios. However, achieving optimal weight training has traditionally required manual relevance judgments on a large percentage of documents, a labor-intensive and expensive process. In this study, we investigate the feasibility of obtaining near-optimal weights using a mere 20%-50% of relevant documents. Through experiments on four TREC datasets, we find that weights trained with multiple linear regression using this reduced set closely rival those obtained with TREC's official "qrels." Our findings unlock the potential for more efficient and affordable data fusion, empowering researchers and practitioners to reap its full benefits with significantly less effort.
Submitted 21 September, 2023; v1 submitted 10 September, 2023;
originally announced September 2023.
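As a rough illustration of the linear combination approach described in the abstract above, the sketch below fits one weight per retrieval system by ordinary least squares on a set of judged documents and then fuses scores as a weighted sum. It is a minimal sketch under stated assumptions: no intercept term, scores already normalised, and hypothetical function names `fit_weights` and `fuse`. The paper's exact regression setup is not given in the abstract.

```python
def fit_weights(scores, relevance):
    """Least-squares weights for fusing retrieval systems.

    scores:    one row per judged document, one score per system.
    relevance: one relevance judgment per judged document.
    Solves the normal equations (S^T S) w = S^T r by Gaussian elimination.
    """
    n = len(scores[0])
    a = [[sum(row[i] * row[j] for row in scores) for j in range(n)]
         for i in range(n)]
    b = [sum(row[i] * r for row, r in zip(scores, relevance)) for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - tail) / a[i][i]
    return w

def fuse(doc_scores, weights):
    """Fused score of one document: weighted sum of its per-system scores."""
    return sum(w * s for w, s in zip(weights, doc_scores))

# Toy data: two systems, four judged documents.
judged = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
labels = [0.7, 0.3, 1.0, 0.41]
weights = fit_weights(judged, labels)
```

Training on a subset of the judgments simply means passing fewer rows to `fit_weights`; the fused ranking is then obtained by sorting documents by `fuse`.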
-
Evolution of ESG-focused DLT Research: An NLP Analysis of the Literature
Authors:
Walter Hernandez,
Kamil Tylinski,
Alastair Moore,
Niall Roche,
Nikhil Vadgama,
Horst Treiblmaier,
Jiangbo Shangguan,
Paolo Tasca,
Jiahua Xu
Abstract:
As Distributed Ledger Technologies (DLTs) rapidly evolve, their impacts extend beyond technology, influencing environmental and societal aspects. This evolution has increased publications, making manual literature analysis increasingly challenging. We address this with a Natural Language Processing (NLP)-based systematic literature review method to explore the intersection of Distributed Ledger Technology (DLT) with its Environmental, Social, and Governance (ESG) aspects. Our approach involves building and refining a directed citation network from 107 seed papers to a corpus of 24,539 publications and fine-tuning a transformer-based language model for Named Entity Recognition (NER) on DLT and ESG domains. Applying this model, we distilled the corpus to 505 key publications, enabling an inaugural literature review and temporal graph analysis of DLT's evolution in ESG contexts. Our contributions include an adaptable and scalable NLP-driven systematic literature review methodology and a unique NER dataset of 54,808 entities, tailored for DLT and ESG research. Our inaugural literature review demonstrates their applicability and effectiveness in analyzing DLT's evolution and impacts, proving invaluable for stakeholders in the DLT domain.
Submitted 5 February, 2024; v1 submitted 23 August, 2023;
originally announced August 2023.
-
Screen or No Screen? Lessons Learnt from a Real-World Deployment Study of Using Voice Assistants With and Without Touchscreen for Older Adults
Authors:
Chen Chen,
Ella T. Lifset,
Yichen Han,
Arkajyoti Roy,
Michael Hogarth,
Alison A. Moore,
Emilia Farcas,
Nadir Weibel
Abstract:
While voice user interfaces offer increased accessibility due to hands-free and eyes-free interactions, older adults often have challenges such as constructing structured requests and perceiving how such devices operate. Voice-first user interfaces have the potential to address these challenges by enabling multimodal interactions. Standalone voice + touchscreen Voice Assistants (VAs), such as Echo Show, are specific types of devices that adopt such interfaces and are gaining popularity. However, the affordances of the additional touchscreen for older adults are unknown. Through a 40-day real-world deployment with older adults living independently, we present a within-subjects study (N = 16; age M = 82.5, SD = 7.77, min. = 70, max. = 97) to understand how a built-in touchscreen might benefit older adults during device setup, self-report diary surveys, and general use. We found that while participants appreciated the visual outputs, they still preferred to respond via speech instead of touch. We identified six design implications that can inform future innovations of senior-friendly VAs for managing healthcare and improving quality of life.
Submitted 15 July, 2023;
originally announced July 2023.
-
With Shared Microexponents, A Little Shifting Goes a Long Way
Authors:
Bita Rouhani,
Ritchie Zhao,
Venmugil Elango,
Rasoul Shafipour,
Mathew Hall,
Maral Mesmakhosroshahi,
Ankit More,
Levi Melnick,
Maximilian Golub,
Girish Varatkar,
Lei Shao,
Gaurav Kolhe,
Dimitry Melts,
Jasmine Klar,
Renee L'Heureux,
Matt Perry,
Doug Burger,
Eric Chung,
Zhaoxia Deng,
Sam Naghshineh,
Jongsoo Park,
Maxim Naumov
Abstract:
This paper introduces Block Data Representations (BDR), a framework for exploring and evaluating a wide spectrum of narrow-precision formats for deep learning. It enables comparison of popular quantization standards, and through BDR, new formats based on shared microexponents (MX) are identified, which outperform other state-of-the-art quantization approaches, including narrow-precision floating-point and block floating-point. MX utilizes multiple levels of quantization scaling with ultra-fine scaling factors based on shared microexponents in the hardware. The effectiveness of MX is demonstrated on real-world models including large-scale generative pretraining and inferencing, and production-scale recommendation systems.
Submitted 12 April, 2023; v1 submitted 15 February, 2023;
originally announced February 2023.
-
ChemVise: Maximizing Out-of-Distribution Chemical Detection with the Novel Application of Zero-Shot Learning
Authors:
Alexander M. Moore,
Randy C. Paffenroth,
Ken T. Ngo,
Joshua R. Uzarski
Abstract:
Accurate chemical sensors are vital in medical, military, and home safety applications. Training machine learning models to be accurate on real-world chemical sensor data requires performing many diverse, costly experiments in controlled laboratory settings to create a data set. In practice even expensive, large data sets may be insufficient for generalization of a trained model to a real-world testing distribution. Rather than perform greater numbers of experiments requiring exhaustive mixtures of chemical analytes, this research proposes learning approximations of complex exposures from training sets of simple ones by using single-analyte exposure signals as building blocks of a multiple-analyte space. We demonstrate that this approach to synthetic sensor responses surprisingly improves the detection of out-of-distribution obscured chemical analytes. Further, we pair these synthetic signals to targets in an information-dense representation space utilizing a large corpus of chemistry knowledge. Through utilization of semantically meaningful analyte representation spaces along with synthetic targets we achieve rapid analyte classification in the presence of obscurants without corresponding obscured-analyte training data. Transfer learning for supervised learning with molecular representations makes assumptions about the input data. Instead, we borrow from the natural language and natural image processing literature for a novel approach to chemical sensor signal classification using molecular semantics for arbitrary chemical sensor hardware designs.
Submitted 9 February, 2023;
originally announced February 2023.
-
Single Slice Thigh CT Muscle Group Segmentation with Domain Adaptation and Self-Training
Authors:
Qi Yang,
Xin Yu,
Ho Hin Lee,
Leon Y. Cai,
Kaiwen Xu,
Shunxing Bao,
Yuankai Huo,
Ann Zenobia Moore,
Sokratis Makrogiannis,
Luigi Ferrucci,
Bennett A. Landman
Abstract:
Objective: Thigh muscle group segmentation is important for assessment of muscle anatomy, metabolic disease and aging. Many efforts have been put into quantifying muscle tissues with magnetic resonance (MR) imaging, including manual annotation of individual muscles. However, leveraging publicly available annotations in MR images to achieve muscle group segmentation on single-slice computed tomography (CT) thigh images is challenging.
Method: We propose an unsupervised domain adaptation pipeline with self-training to transfer labels from 3D MR to single CT slices. First, we transform the image appearance from MR to CT with CycleGAN and feed the synthesized CT images to a segmenter simultaneously. Single CT slices are divided into hard and easy cohorts based on the entropy of pseudo labels inferred by the segmenter. After refining easy cohort pseudo labels based on anatomical assumptions, self-training with easy and hard splits is applied to fine-tune the segmenter.
Results: On 152 withheld single CT thigh images, the proposed pipeline achieved a mean Dice of 0.888 (0.041) across all muscle groups, including the sartorius, hamstrings, quadriceps femoris and gracilis muscles.
Conclusion: To the best of our knowledge, this is the first pipeline to achieve thigh imaging domain adaptation from MR to CT. The proposed pipeline is effective and robust in extracting muscle groups on 2D single-slice CT thigh images. The container is available for public use at https://github.com/MASILab/DA_CT_muscle_seg
Submitted 30 November, 2022;
originally announced December 2022.
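For readers unfamiliar with the Dice score reported in the entry above, the following is a minimal sketch of how the metric is commonly computed for one muscle group. It assumes binary masks stored as NumPy arrays; the function name `dice_score` and the empty-mask convention are illustrative assumptions, not taken from the authors' container.

```python
import numpy as np


def dice_score(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks.

    `pred` and `target` are hypothetical inputs: array-likes of the same
    shape where non-zero entries mark the segmented structure.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * intersection / denom


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(round(dice_score(pred, target), 4))
```

A mean Dice across muscle groups would then be the average of this value over each labeled structure and each test image.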
-
Conversion of Legal Agreements into Smart Legal Contracts using NLP
Authors:
Eason Chen,
Niall Roche,
Yuen-Hsien Tseng,
Walter Hernandez,
Jiangbo Shangguan,
Alastair Moore
Abstract:
A Smart Legal Contract (SLC) is a specialized digital agreement comprising natural language and computable components. The Accord Project provides an open-source SLC framework containing three main modules: Cicero, Concerto, and Ergo. Currently, we need lawyers, programmers, and clients to work together with great effort to create a usable SLC using the Accord Project. This paper proposes a pipeline to automate the SLC creation process with several Natural Language Processing (NLP) models to convert law contracts to the Accord Project's Concerto model. After evaluating the proposed pipeline, we discovered that our NER pipeline accurately detects CiceroMark from Accord Project template text with an accuracy of 0.8. Additionally, our Question Answering method can extract one-third of the Concerto variables from the template text. We also delve into some limitations and possible future research for the proposed pipeline. Finally, we describe a web interface enabling users to build SLCs. This interface leverages the proposed pipeline to convert text documents to Smart Legal Contracts by using NLP models.
Submitted 5 April, 2023; v1 submitted 27 August, 2022;
originally announced October 2022.
-
Reducing Positional Variance in Cross-sectional Abdominal CT Slices with Deep Conditional Generative Models
Authors:
Xin Yu,
Qi Yang,
Yucheng Tang,
Riqiang Gao,
Shunxing Bao,
Leon Y. Cai,
Ho Hin Lee,
Yuankai Huo,
Ann Zenobia Moore,
Luigi Ferrucci,
Bennett A. Landman
Abstract:
2D low-dose single-slice abdominal computed tomography (CT) enables direct measurements of body composition, which are critical to quantitatively characterizing health relationships in aging. However, longitudinal analysis of body composition changes using 2D abdominal slices is challenging due to positional variance between longitudinal slices acquired in different years. To reduce the positional variance, we extend the conditional generative models to our C-SliceGen, which takes an arbitrary axial slice in the abdominal region as the condition and generates a defined vertebral level slice by estimating the structural changes in the latent space. Experiments on 1170 subjects from an in-house dataset and 50 subjects from the BTCV MICCAI Challenge 2015 show that our model can generate high-quality images in terms of realism and similarity. External experiments on 20 subjects from the Baltimore Longitudinal Study of Aging (BLSA) dataset, which contains longitudinal single abdominal slices, validate that our method can harmonize the slice positional variance in terms of muscle and visceral fat area. Our approach provides a promising direction of mapping slices from different vertebral levels to a target slice to reduce positional variance for single-slice longitudinal analysis. The source code is available at: https://github.com/MASILab/C-SliceGen.
Submitted 28 September, 2022;
originally announced September 2022.
-
Longitudinal Variability Analysis on Low-dose Abdominal CT with Deep Learning-based Segmentation
Authors:
Xin Yu,
Yucheng Tang,
Qi Yang,
Ho Hin Lee,
Riqiang Gao,
Shunxing Bao,
Ann Zenobia Moore,
Luigi Ferrucci,
Bennett A. Landman
Abstract:
Metabolic health is increasingly implicated as a risk factor across conditions from cardiology to neurology, and efficient assessment of body composition is critical to quantitatively characterizing these relationships. 2D low-dose single-slice computed tomography (CT) provides a high-resolution, quantitative tissue map, albeit with a limited field of view. Although numerous potential analyses have been proposed in quantifying image context, there has been no comprehensive study of low-dose single-slice CT longitudinal variability with automated segmentation. We studied a total of 1816 slices from 1469 subjects of the Baltimore Longitudinal Study of Aging (BLSA) abdominal dataset using supervised deep learning-based segmentation and an unsupervised clustering method. 300 of the 1469 subjects, those with a two-year gap between their first two scans, were selected to evaluate longitudinal variability with measurements including intraclass correlation coefficient (ICC) and coefficient of variation (CV) in terms of tissue/organ size and mean intensity. We showed that our segmentation methods are stable in longitudinal settings, with Dice ranging from 0.821 to 0.962 for thirteen target abdominal tissue structures. We observed high variability in most organs, with ICC<0.5, and low variability in the area of muscle, abdominal wall, fat and body mask, with average ICC>0.8. We found that the variability in organs is highly related to the cross-sectional position of the 2D slice. Our efforts pave the way for quantitative exploration and quality control to reduce uncertainties in longitudinal analysis.
Submitted 28 September, 2022;
originally announced September 2022.
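The entry above summarizes longitudinal variability with ICC and CV. The abstract does not say which ICC form or CV definition was used, so the following is only a sketch under stated assumptions: a one-way random-effects ICC(1,1) over an array of shape (subjects, repeated scans), and CV as sample standard deviation divided by mean. The function names are hypothetical.

```python
import numpy as np


def coefficient_of_variation(values):
    """CV = sample standard deviation / mean (assumed definition)."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean()


def icc_oneway(measurements):
    """One-way random-effects ICC(1,1); an assumed choice of ICC form.

    `measurements` has shape (n_subjects, k_repeats), e.g. an organ area
    measured on each subject's first and second scan (k_repeats = 2).
    """
    x = np.asarray(measurements, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    subject_means = x.mean(axis=1)
    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * ((subject_means - grand_mean) ** 2).sum() / (n - 1)
    msw = ((x - subject_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


if __name__ == "__main__":
    areas = [[10.0, 10.5], [20.0, 19.0], [30.0, 31.0]]  # made-up values
    print(icc_oneway(areas))
    print(coefficient_of_variation([10.0, 10.5]))
```

Under this form, an ICC near 1 means repeated scans of the same subject agree closely relative to the spread between subjects, which matches the abstract's reading of ICC>0.8 as low variability.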
-
Towards Visualization of Time-Series Ecological Momentary Assessment (EMA) Data on Standalone Voice-First Virtual Assistants
Authors:
Yichen Han,
Christopher Bo Han,
Chen Chen,
Peng Wei Lee,
Michael Hogarth,
Alison A. Moore,
Nadir Weibel,
Emilia Farcas
Abstract:
Population aging is an increasingly important consideration for health care in the 21st century, and continuing to access and interact with digital health information is a key challenge for aging populations. Voice-based Intelligent Virtual Assistants (IVAs) are promising for improving the Quality of Life (QoL) of older adults, and coupled with Ecological Momentary Assessments (EMA) they can be effective for collecting important health information from older adults, especially when it comes to repeated time-based events. However, this same EMA data is hard for older adults to access: although the newest IVAs are equipped with a display, the effectiveness of visualizing time-series based EMA data on standalone IVAs has not been explored. To investigate the potential opportunities for visualizing time-series based EMA data on standalone IVAs, we designed a prototype system in which older adults are able to query and examine the time-series EMA data on Amazon Echo Show, a widely used, commercially available standalone screen-based IVA. We conducted a preliminary semi-structured interview with a geriatrician and an older adult, and identified three findings that should be carefully considered when designing such visualizations.
Submitted 30 July, 2022;
originally announced August 2022.
-
Mapping the landscape of histomorphological cancer phenotypes using self-supervised learning on unlabeled, unannotated pathology slides
Authors:
Adalberto Claudio Quiros,
Nicolas Coudray,
Anna Yeaton,
Xinyu Yang,
Bojing Liu,
Hortense Le,
Luis Chiriboga,
Afreen Karimkhan,
Navneet Narula,
David A. Moore,
Christopher Y. Park,
Harvey Pass,
Andre L. Moreira,
John Le Quesne,
Aristotelis Tsirigos,
Ke Yuan
Abstract:
Definitive cancer diagnosis and management depend upon the extraction of information from microscopy images by pathologists. These images contain complex information requiring time-consuming expert human interpretation that is prone to human bias. Supervised deep learning approaches have proven powerful for classification tasks, but they are inherently limited by the cost and quality of annotations used for training these models. To address this limitation of supervised methods, we developed Histomorphological Phenotype Learning (HPL), a fully self-supervised methodology that requires no expert labels or annotations and operates via the automatic discovery of discriminatory image features in small image tiles. Tiles are grouped into morphologically similar clusters which constitute a library of histomorphological phenotypes, revealing trajectories from benign to malignant tissue via inflammatory and reactive phenotypes. These clusters have distinct features which can be identified using orthogonal methods, linking histologic, molecular and clinical phenotypes. Applied to lung cancer tissues, we show that they align closely with patient survival, with histopathologically recognised tumor types and growth patterns, and with transcriptomic measures of immunophenotype. We then demonstrate that these properties are maintained in a multi-cancer study. These results show the clusters represent recurrent host responses and modes of tumor growth emerging under natural selection. Code, pre-trained models, learned embeddings, and documentation are available to the community at https://github.com/AdalbertoCq/Histomorphological-Phenotype-Learning
Submitted 1 September, 2023; v1 submitted 4 May, 2022;
originally announced May 2022.
-
Domain Decomposition in space-time of 4D-VAR Data Assimilation problem: a case study on the ROMS software
Authors:
L. D'Amore,
R. Cacciapuoti,
A. Moore
Abstract:
Domain Decomposition of 4D-VAR Data Assimilation (DD-4DVAR) is made up of decomposition of the space-time domain, solution of the reduced forecast model and minimization of local 4D-VAR functionals. Relying on the existing implementation of the ROMS software, we describe the main components of the DD-4DVAR DA method, highlighting the topics that we should address both on the mathematical problem underlying ROMS and the MPI-based code implementation of the ROMS-IS4DVAR formulation.
Submitted 22 December, 2021;
originally announced February 2022.
-
Understanding Barriers and Design Opportunities to Improve Healthcare and QOL for Older Adults through Voice Assistants
Authors:
Chen Chen,
Janet G. Johnson,
Kemeberly Charles,
Alice Lee,
Ella T. Lifset,
Michael Hogarth,
Alison A. Moore,
Emilia Farcas,
Nadir Weibel
Abstract:
Voice based Intelligent Virtual Assistants (IVAs) promise to improve healthcare management and Quality of Life (QOL) by introducing the paradigm of hands free and eye free interactions. However, there has been little understanding regarding the challenges for designing such systems for older adults, especially when it comes to healthcare related tasks. To tackle this, we consider the processes of care delivery and QOL enhancements for older adults as a collaborative task between patients and providers. By interviewing 16 older adults living independently or semi independently and 5 providers, we identified 12 barriers that older adults might encounter during daily routine and while managing health. We ultimately highlighted key design challenges and opportunities that might be introduced when integrating voice based IVAs into the life of older adults. Our work will benefit practitioners who study and attempt to create full fledged IVA powered smart devices to deliver better care and support an increased QOL for aging populations.
Submitted 5 November, 2021;
originally announced November 2021.
-
Deep Learning based Novel View Synthesis
Authors:
Amit More,
Subhasis Chaudhuri
Abstract:
Predicting novel views of a scene from real-world images has always been a challenging task. In this work, we propose a deep convolutional neural network (CNN) which learns to predict novel views of a scene from a given collection of images. In comparison to prior deep learning based approaches, which can handle only a fixed number of input images to predict a novel view, the proposed approach works with different numbers of input images. The proposed model explicitly performs feature extraction and matching from a given pair of input images and estimates, at each pixel, the probability distribution (pdf) over possible depth levels in the scene. This pdf is then used for estimating the novel view. The model estimates multiple predictions of the novel view, one estimate per input image pair, from the given image collection. The model also estimates an occlusion mask and combines multiple novel view estimates into a single optimal prediction. The finite number of depth levels used in the analysis may cause occasional blurriness in the estimated view. We mitigate this issue with a simple multi-resolution analysis which improves the quality of the estimates. We substantiate the performance on different datasets and show competitive performance.
Submitted 14 July, 2021;
originally announced July 2021.
-
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Authors:
Yao Lu,
Max Bartolo,
Alastair Moore,
Sebastian Riedel,
Pontus Stenetorp
Abstract:
When primed with only a handful of training samples, very large, pretrained language models such as GPT-3 have shown competitive results when compared to fully-supervised, fine-tuned, large, pretrained language models. We demonstrate that the order in which the samples are provided can make the difference between near state-of-the-art and random guess performance: essentially some permutations are "fantastic" and some not. We analyse this phenomenon in detail, establishing that: it is present across model sizes (even for the largest current models), it is not related to a specific subset of samples, and that a given good permutation for one model is not transferable to another. While one could use a development set to determine which permutations are performant, this would deviate from the true few-shot setting as it requires additional annotated data. Instead, we use the generative nature of language models to construct an artificial development set and based on entropy statistics of the candidate permutations on this set, we identify performant prompts. Our method yields a 13% relative improvement for GPT-family models across eleven different established text classification tasks.
Submitted 3 March, 2022; v1 submitted 18 April, 2021;
originally announced April 2021.
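The entry above selects prompt orderings using entropy statistics computed on an artificial development set. The abstract does not define the statistic, so the following is a hedged sketch of one plausible variant: score each candidate permutation by the entropy of the labels the model predicts on the probing set, on the assumption that poor orderings collapse onto a single label. The names `label_entropy` and `rank_permutations`, and the input dictionary, are hypothetical.

```python
import math
from collections import Counter


def label_entropy(predicted_labels):
    """Shannon entropy (in nats) of the empirical predicted-label distribution."""
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def rank_permutations(predictions_by_permutation):
    """Order permutation ids from highest to lowest predicted-label entropy.

    `predictions_by_permutation` maps a permutation id to the list of labels
    the model predicted on the probing set under that ordering (assumed input).
    Preferring higher entropy is an assumption of this sketch.
    """
    return sorted(
        predictions_by_permutation,
        key=lambda pid: label_entropy(predictions_by_permutation[pid]),
        reverse=True,
    )


if __name__ == "__main__":
    preds = {
        "order_a": ["pos", "pos", "pos", "pos"],  # degenerate: always one label
        "order_b": ["pos", "neg", "pos", "neg"],  # balanced predictions
    }
    print(rank_permutations(preds))
```

Because the probing set is generated by the language model itself, a procedure of this shape needs no additional annotated data, which is the constraint the abstract emphasizes.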
-
Multi-task Learning of Negation and Speculation for Targeted Sentiment Classification
Authors:
Andrew Moore,
Jeremy Barnes
Abstract:
The majority of work in targeted sentiment analysis has concentrated on finding better methods to improve the overall results. Within this paper we show that these models are not robust to linguistic phenomena, specifically negation and speculation. In this paper, we propose a multi-task learning method to incorporate information from syntactic and semantic auxiliary tasks, including negation and speculation scope detection, to create English-language models that are more robust to these phenomena. Further we create two challenge datasets to evaluate model performance on negated and speculative samples. We find that multi-task models and transfer learning via language modelling can improve performance on these challenge datasets, but the overall performances indicate that there is still much room for improvement. We release both the datasets and the source code at https://github.com/jerbarnes/multitask_negation_for_targeted_sentiment.
Submitted 31 March, 2021; v1 submitted 16 October, 2020;
originally announced October 2020.
-
PIUMA: Programmable Integrated Unified Memory Architecture
Authors:
Sriram Aananthakrishnan,
Nesreen K. Ahmed,
Vincent Cave,
Marcelo Cintra,
Yigit Demir,
Kristof Du Bois,
Stijn Eyerman,
Joshua B. Fryman,
Ivan Ganev,
Wim Heirman,
Hans-Christian Hoppe,
Jason Howard,
Ibrahim Hur,
MidhunChandra Kodiyath,
Samkit Jain,
Daniel S. Klowden,
Marek M. Landowski,
Laurent Montigny,
Ankit More,
Przemyslaw Ossowski,
Robert Pawlowski,
Nick Pepperling,
Fabrizio Petrini,
Mariusz Sikora,
Balasubramanian Seshasayee
, et al. (6 additional authors not shown)
Abstract:
High performance large scale graph analytics is essential to timely analyze relationships in big data sets. Conventional processor architectures suffer from inefficient resource usage and bad scaling on graph workloads. To enable efficient and scalable graph analysis, Intel developed the Programmable Integrated Unified Memory Architecture (PIUMA). PIUMA consists of many multi-threaded cores, fine-grained memory and network accesses, a globally shared address space and powerful offload engines. This paper presents the PIUMA architecture, and provides initial performance estimations, projecting that a PIUMA node will outperform a conventional compute node by one to two orders of magnitude. Furthermore, PIUMA continues to scale across multiple nodes, which is a challenge in conventional multinode setups.
Submitted 13 October, 2020;
originally announced October 2020.
-
A Model-Free Sampling Method for Estimating Basins of Attraction Using Hybrid Active Learning (HAL)
Authors:
Xue-She Wang,
Samuel A. Moore,
James D. Turner,
Brian P. Mann
Abstract:
Understanding the basins of attraction (BoA) is often a paramount consideration for nonlinear systems. Most existing approaches to determining a high-resolution BoA require prior knowledge of the system's dynamical model (e.g., differential equation or point mapping for continuous systems, cell mapping for discrete systems, etc.), which allows derivation of approximate analytical solutions or parallel computing on a multi-core computer to find the BoA efficiently. However, these methods are typically impractical when the BoA must be determined experimentally or when the system's model is unknown. This paper introduces a model-free sampling method for BoA. The proposed method is based upon hybrid active learning (HAL) and is designed to find and label the "informative" samples, which efficiently determine the boundary of BoA. It consists of three primary parts: 1) additional sampling on trajectories (AST) to maximize the number of samples obtained from each simulation or experiment; 2) an active learning (AL) algorithm to exploit the local boundary of BoA; and 3) a density-based sampling (DBS) method to explore the global boundary of BoA. An example of estimating the BoA for a bistable nonlinear system is presented to show the high efficiency of our HAL sampling method.
Submitted 10 May, 2022; v1 submitted 24 March, 2020;
originally announced March 2020.
-
FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms
Authors:
Henry B. Moss,
Andrew Moore,
David S. Leslie,
Paul Rayson
Abstract:
We present FIESTA, a model selection approach that significantly reduces the computational resources required to reliably identify state-of-the-art performance from large collections of candidate models. Despite being known to produce unreliable comparisons, it is still common practice to compare model evaluations based on single choices of random seeds. We show that reliable model selection also requires evaluations based on multiple train-test splits (contrary to common practice in many shared tasks). Using bandit theory from the statistics literature, we are able to adaptively determine appropriate numbers of data splits and random seeds used to evaluate each model, focusing computational resources on the evaluation of promising models whilst avoiding wasting evaluations on models with lower performance. Furthermore, our user-friendly Python implementation produces confidence guarantees of correctly selecting the optimal model. We evaluate our algorithms by selecting between 8 target-dependent sentiment analysis methods using dramatically fewer model evaluations than current model selection approaches.
Submitted 28 June, 2019;
originally announced June 2019.
-
No Delay: Latency-Driven, Application Performance-Aware, Cluster Scheduling
Authors:
Diana Andreea Popescu,
Andrew W. Moore
Abstract:
Given the network latency variability observed in data centers, applications' performance is also determined by their placement within the data centre. We present NoMora, a cluster scheduling architecture whose core is represented by a latency-driven, application performance-aware, cluster scheduling policy. The policy places the tasks of an application taking into account the expected performance based on the measured network latency between pairs of hosts in the data center. Furthermore, if a tenant's application experiences increased network latency, and thus lower application performance, their application may be migrated to a better placement. Preliminary results show that our policy improves the overall average application performance by up to 13.4% and by up to 42% if preemption is enabled, and improves the task placement latency by a factor of 1.79x and the median algorithm runtime by 1.16x compared to a random policy on the Google cluster workload. This demonstrates that application performance can be improved by exploiting the relationship between network latency and application performance, and the current network conditions in a data center, while preserving the demands of low-latency cluster scheduling.
Submitted 18 August, 2019; v1 submitted 17 March, 2019;
originally announced March 2019.
-
Multiple source direction of arrival estimation using subspace pseudointensity vectors
Authors:
Alastair H. Moore
Abstract:
The recently proposed subspace pseudointensity method for direction of arrival estimation is applied in the context of Tasks 1 and 2 of the LOCATA Challenge using the Eigenmike recordings. Specific implementation details are described and results reported for the development dataset, for which the ground truth source directions are available. For both single and multiple source scenarios, the average absolute error angle is about 9 degrees.
Submitted 28 November, 2018;
originally announced November 2018.
-
Learning to fail: Predicting fracture evolution in brittle material models using recurrent graph convolutional neural networks
Authors:
Max Schwarzer,
Bryce Rogan,
Yadong Ruan,
Zhengming Song,
Diana Y. Lee,
Allon G. Percus,
Viet T. Chau,
Bryan A. Moore,
Esteban Rougier,
Hari S. Viswanathan,
Gowri Srinivasan
Abstract:
We propose a machine learning approach to address a key challenge in materials science: predicting how fractures propagate in brittle materials under stress, and how these materials ultimately fail. Our methods use deep learning and train on simulation data from high-fidelity models, emulating the results of these models while avoiding the overwhelming computational demands associated with running a statistically significant sample of simulations. We employ a graph convolutional network that recognizes features of the fracturing material and a recurrent neural network that models the evolution of these features, along with a novel form of data augmentation that compensates for the modest size of our training data. We simultaneously generate predictions for qualitatively distinct material properties. Results on fracture damage and length are within 3% of their simulated values, and results on time to material failure, which is notoriously difficult to predict even with high-fidelity models, are within approximately 15% of simulated values. Once trained, our neural networks generate predictions within seconds, rather than the hours needed to run a single simulation.
Submitted 15 March, 2019; v1 submitted 14 October, 2018;
originally announced October 2018.
-
Bringing replication and reproduction together with generalisability in NLP: Three reproduction studies for Target Dependent Sentiment Analysis
Authors:
Andrew Moore,
Paul Rayson
Abstract:
Lack of repeatability and generalisability are two significant threats to continuing scientific development in Natural Language Processing. Language models and learning methods are so complex that scientific conference papers no longer contain enough space for the technical depth required for replication or reproduction. Taking Target Dependent Sentiment Analysis as a case study, we show how recent work in the field has not consistently released code, or described settings for learning methods in enough detail, and lacks comparability and generalisability in train, test or validation data. To investigate generalisability and to enable state of the art comparative evaluations, we carry out the first reproduction studies of three groups of complementary methods and perform the first large-scale mass evaluation on six different English datasets. Reflecting on our experiences, we recommend that future replication or reproduction experiments should always consider a variety of datasets alongside documenting and releasing their methods and published code in order to minimise the barriers to both repeatability and generalisability. We have released our code with a model zoo on GitHub with Jupyter Notebooks to aid understanding and full documentation, and we recommend that others do the same with their papers at submission time through an anonymised GitHub account.
Submitted 6 August, 2018; v1 submitted 13 June, 2018;
originally announced June 2018.
-
Reduced-Order Modeling through Machine Learning Approaches for Brittle Fracture Applications
Authors:
A. Hunter,
B. A. Moore,
M. K. Mudunuru,
V. T. Chau,
R. L. Miller,
R. B. Tchoua,
C. Nyshadham,
S. Karra,
D. O. Malley,
E. Rougier,
H. S. Viswanathan,
G. Srinivasan
Abstract:
In this paper, five different approaches for reduced-order modeling of brittle fracture in geomaterials, specifically concrete, are presented and compared. Four of the five methods rely on machine learning (ML) algorithms to approximate important aspects of the brittle fracture problem. In addition to the ML algorithms, each method incorporates different physics-based assumptions in order to reduce the computational complexity while maintaining the physics as much as possible. This work specifically focuses on using the ML approaches to model a 2D concrete sample under low strain rate pure tensile loading conditions with 20 preexisting cracks present. A high-fidelity finite element-discrete element model is used to both produce a training dataset of 150 simulations and an additional 35 simulations for validation. Results from the ML approaches are directly compared against the results from the high-fidelity model. Strengths and weaknesses of each approach are discussed and the most important conclusion is that a combination of physics-informed and data-driven features is necessary for emulating the physics of crack propagation, interaction and coalescence. All of the models presented here have runtimes that are orders of magnitude faster than the original high-fidelity model and pave the path for developing accurate reduced order models that could be used to inform larger length-scale models with important sub-scale physics that often cannot be accounted for due to computational cost.
Submitted 5 June, 2018;
originally announced June 2018.
-
Seek and Push: Detecting Large Traffic Aggregates in the Dataplane
Authors:
Jan Kučera,
Diana Andreea Popescu,
Gianni Antichi,
Jan Kořenek,
Andrew W. Moore
Abstract:
High level goals such as bandwidth provisioning, accounting and network anomaly detection can be easily met if high-volume traffic clusters are detected in real time. This paper presents Elastic Trie, an alternative to approaches leveraging controller-dataplane architectures.
Our solution is a novel push-based network monitoring approach that allows detection, within the dataplane, of high-volume traffic clusters. Notifications from the switch to the controller can be sent only as required, avoiding the transmission or processing of unnecessary data. Furthermore, the dataplane can iteratively refine the responsible IP prefixes, allowing a controller to receive information at a flexible granularity. We report and discuss an evaluation of our P4-based prototype, showing that our solution can detect hierarchical heavy hitters and superspreaders with 95% precision using less than 8KB or 80KB of active memory, respectively. Finally, Elastic Trie can identify changes in the network traffic patterns, symptomatic of Denial-of-Service attack events.
Submitted 15 May, 2018;
originally announced May 2018.
-
Deep Recurrent Neural Networks for Product Attribute Extraction in eCommerce
Authors:
Bodhisattwa Prasad Majumder,
Aditya Subramanian,
Abhinandan Krishnan,
Shreyansh Gandhi,
Ajinkya More
Abstract:
Extracting accurate attribute qualities from product titles is a vital component in delivering eCommerce customers with a rewarding online shopping experience via an enriched faceted search. We demonstrate the potential of Deep Recurrent Networks in this domain, primarily models such as Bidirectional LSTMs and Bidirectional LSTM-CRF with or without an attention mechanism. These have improved overall F1 scores, as compared to the previous benchmarks (More et al.) by at least 0.0391, showcasing an overall precision of 97.94%, recall of 94.12% and the F1 score of 0.9599. This has made us achieve a significant coverage of important facets or attributes of products which not only shows the efficacy of deep recurrent models over previous machine learning benchmarks but also greatly enhances the overall customer experience while shopping online.
Submitted 29 March, 2018;
originally announced March 2018.
-
A Review of Literature on Parallel Constraint Solving
Authors:
Ian P. Gent,
Ciaran McCreesh,
Ian Miguel,
Neil C. A. Moore,
Peter Nightingale,
Patrick Prosser,
Chris Unsworth
Abstract:
As multicore computing is now standard, it seems irresponsible for constraints researchers to ignore the implications of it. Researchers need to address a number of issues to exploit parallelism, such as: investigating which constraint algorithms are amenable to parallelisation; whether to use shared memory or distributed computation; whether to use static or dynamic decomposition; and how to best exploit portfolios and cooperating search. We review the literature, and see that we can sometimes do quite well, some of the time, on some instances, but we are far from a general solution. Yet there seems to be little overall guidance that can be given on how best to exploit multicore computers to speed up constraint solving. We hope at least that this survey will provide useful pointers to future researchers wishing to correct this situation.
Under consideration in Theory and Practice of Logic Programming (TPLP).
Submitted 29 March, 2018;
originally announced March 2018.
-
Meta-Learning MCMC Proposals
Authors:
Tongzhou Wang,
Yi Wu,
David A. Moore,
Stuart J. Russell
Abstract:
Effective implementations of sampling-based probabilistic inference often require manually constructed, model-specific proposals. Inspired by recent progress in meta-learning for training learning agents that can generalize to unseen environments, we propose a meta-learning approach to building effective and generalizable MCMC proposals. We parametrize the proposal as a neural network to provide fast approximations to block Gibbs conditionals. The learned neural proposals generalize to occurrences of common structural motifs across different models, allowing for the construction of a library of learned inference primitives that can accelerate inference on unseen models with no model-specific training required. We explore several applications including open-universe Gaussian mixture models, in which our learned proposals outperform a hand-tuned sampler, and a real-world named entity recognition task, in which our sampler yields higher final F1 scores than classical single-site Gibbs sampling.
Submitted 1 January, 2019; v1 submitted 20 August, 2017;
originally announced August 2017.
-
FixMyStreet Brussels: Socio-Demographic Inequality in Crowdsourced Civic Participation
Authors:
Burak Pak,
Alvin Chua,
Andrew Vande Moere
Abstract:
FixMyStreet (FMS) is a web-based civic participation platform that allows inhabitants to report environmental defects like potholes and damaged pavements to the government. In this paper, we examine the use of FMS in Brussels, the capital city of Belgium. Analyzing a total of 30,041 reports since its inception in 2013, we demonstrate how civic participation on FMS varies between the ethnically diverse districts in Brussels. We compare FMS use to a range of sociodemographic indicators derived from official city statistics as well as geotagged social media data from Twitter. Our statistical analysis revealed several significant differences between the districts that suggested that crowdsourced civic participation platforms tend to marginalize low-income and ethnically diverse communities. In this respect, our findings provide timely evidence to inform the design of more inclusive crowdsourced, civic participation platforms in the future.
Submitted 7 August, 2017;
originally announced August 2017.
-
Extending programs with debug-related features, with application to hardware development
Authors:
Nik Sultana,
Salvator Galea,
David Greaves,
Marcin Wojcik,
Noa Zilberman,
Richard Clegg,
Luo Mai,
Richard Mortier,
Peter Pietzuch,
Jon Crowcroft,
Andrew W Moore
Abstract:
The capacity and programmability of reconfigurable hardware such as FPGAs have improved steadily over the years, but they do not readily provide any mechanisms for monitoring or debugging running programs. Such mechanisms need to be written into the program itself. This is done using ad hoc methods and primitive tools when compared to CPU programming. This complicates the programming and debugging of reconfigurable hardware. We introduce Program-hosted Directability (PhD), the extension of programs to interpret direction commands at runtime to enable debugging, monitoring and profiling. Normally in hardware development such features are fixed at compile time. We present a language of directing commands, specify its semantics in terms of a simple controller that is embedded with programs, and implement a prototype for directing network programs running in hardware. We show that this approach affords significant flexibility with low impact on hardware utilisation and performance.
Submitted 28 May, 2017;
originally announced May 2017.
-
Lancaster A at SemEval-2017 Task 5: Evaluation metrics matter: predicting sentiment from financial news headlines
Authors:
Andrew Moore,
Paul Rayson
Abstract:
This paper describes our participation in Task 5 track 2 of SemEval 2017 to predict the sentiment of financial news headlines for a specific company on a continuous scale between -1 and 1. We tackled the problem using a number of approaches, utilising a Support Vector Regression (SVR) and a Bidirectional Long Short-Term Memory (BLSTM). We found an improvement of 4-6% using the LSTM model over the SVR and came fourth in the track. We report a number of different evaluations using a finance specific word embedding model and reflect on the effects of using different evaluation metrics.
Submitted 1 May, 2017;
originally announced May 2017.
-
Signal-based Bayesian Seismic Monitoring
Authors:
David A. Moore,
Stuart J. Russell
Abstract:
Detecting weak seismic events from noisy sensors is a difficult perceptual task. We formulate this task as Bayesian inference and propose a generative model of seismic events and signals across a network of spatially distributed stations. Our system, SIGVISA, is the first to directly model seismic waveforms, allowing it to incorporate a rich representation of the physics underlying the signal generation process. We use Gaussian processes over wavelet parameters to predict detailed waveform fluctuations based on historical events, while degrading smoothly to simple parametric envelopes in regions with no historical seismicity. Evaluating on data from the western US, we recover three times as many events as previous work, and reduce mean location errors by a factor of four while greatly increasing sensitivity to low-magnitude events.
Submitted 1 March, 2017;
originally announced March 2017.
-
Prototyping RISC Based, Reconfigurable Networking Applications in Open Source
Authors:
Jong Hun Han,
Noa Zilberman,
Bjoern A. Zeeb,
Andreas Fiessler,
Andrew W. Moore
Abstract:
In the last decade we have witnessed a rapid growth in data center systems, requiring new and highly complex networking devices. The need to refresh networking infrastructure whenever new protocols or functions are introduced, and the increasing costs that this entails, are of concern to all data center providers. New generations of Systems on Chip (SoC), integrating microprocessors and higher bandwidth interfaces, are an emerging solution to this problem. These devices permit entirely new systems and architectures that can obviate the replacement of existing networking devices while enabling seamless functionality change. In this work, we explore open source, RISC based, SoC architectures with high performance networking capabilities. The prototype architectures are implemented on the NetFPGA-SUME platform. Beyond details of the architecture, we also describe the hardware implementation and the porting of operating systems to the platform. The platform can be exploited for the development of practical networking appliances, and we provide use case examples.
Submitted 16 December, 2016;
originally announced December 2016.
-
Survey of resampling techniques for improving classification performance in unbalanced datasets
Authors:
Ajinkya More
Abstract:
A number of classification problems need to deal with data imbalance between classes. Often it is desired to have a high recall on the minority class while maintaining a high precision on the majority class. In this paper, we review a number of resampling techniques proposed in literature to handle unbalanced datasets and study their effect on classification performance.
Submitted 22 August, 2016;
originally announced August 2016.
-
Attribute Extraction from Product Titles in eCommerce
Authors:
Ajinkya More
Abstract:
This paper presents a named entity extraction system for detecting attributes in product titles of eCommerce retailers like Walmart. The absence of syntactic structure in such short pieces of text makes extracting attribute values a challenging problem. We find that combining sequence labeling algorithms such as Conditional Random Fields and Structured Perceptron with a curated normalization scheme produces an effective system for the task of extracting product attribute values from titles. To keep the discussion concrete, we will illustrate the mechanics of the system from the point of view of a particular attribute - brand. We also discuss the importance of an attribute extraction system in the context of retail websites with large product catalogs, compare our approach to other potential approaches to this problem and end the paper with a discussion of the performance of our system for extracting attributes.
Submitted 14 August, 2016;
originally announced August 2016.
-
Acoustic Characterization of Environments (ACE) Challenge Results Technical Report
Authors:
James Eaton,
Nikolay D. Gaubitch,
Alastair H. Moore,
Patrick A. Naylor
Abstract:
This document provides the results of the tests of acoustic parameter estimation algorithms on the Acoustic Characterization of Environments (ACE) Challenge Evaluation dataset which were subsequently submitted and written up into papers for the Proceedings of the ACE Challenge. This document is supporting material for a forthcoming journal paper on the ACE Challenge which will provide further analysis of the results.
Submitted 27 June, 2017; v1 submitted 17 December, 2015;
originally announced June 2016.