-
Measuring Stochastic Data Complexity with Boltzmann Influence Functions
Authors:
Nathan Ng,
Roger Grosse,
Marzyeh Ghassemi
Abstract:
Estimating the uncertainty of a model's prediction on a test point is a crucial part of ensuring reliability and calibration under distribution shifts. A minimum description length approach to this problem uses the predictive normalized maximum likelihood (pNML) distribution, which considers every possible label for a data point, and decreases confidence in a prediction if other labels are also co…
▽ More
Estimating the uncertainty of a model's prediction on a test point is a crucial part of ensuring reliability and calibration under distribution shifts. A minimum description length approach to this problem uses the predictive normalized maximum likelihood (pNML) distribution, which considers every possible label for a data point, and decreases confidence in a prediction if other labels are also consistent with the model and training data. In this work we propose IF-COMP, a scalable and efficient approximation of the pNML distribution that linearizes the model with a temperature-scaled Boltzmann influence function. IF-COMP can be used to produce well-calibrated predictions on test points as well as measure complexity in both labelled and unlabelled settings. We experimentally validate IF-COMP on uncertainty calibration, mislabel detection, and OOD detection tasks, where it consistently matches or beats strong baseline methods.
△ Less
Submitted 4 June, 2024;
originally announced June 2024.
-
Improving Black-box Robustness with In-Context Rewriting
Authors:
Kyle O'Brien,
Nathan Ng,
Isha Puri,
Jorge Mendez,
Hamid Palangi,
Yoon Kim,
Marzyeh Ghassemi,
Thomas Hartvigsen
Abstract:
Machine learning models often excel on in-distribution (ID) data but struggle with unseen out-of-distribution (OOD) inputs. Most techniques for improving OOD robustness are not applicable to settings where the model is effectively a black box, such as when the weights are frozen, retraining is costly, or the model is leveraged via an API. Test-time augmentation (TTA) is a simple post-hoc technique…
▽ More
Machine learning models often excel on in-distribution (ID) data but struggle with unseen out-of-distribution (OOD) inputs. Most techniques for improving OOD robustness are not applicable to settings where the model is effectively a black box, such as when the weights are frozen, retraining is costly, or the model is leveraged via an API. Test-time augmentation (TTA) is a simple post-hoc technique for improving robustness that sidesteps black-box constraints by aggregating predictions across multiple augmentations of the test input. TTA has seen limited use in NLP due to the challenge of generating effective natural language augmentations. In this work, we propose LLM-TTA, which uses LLM-generated augmentations as TTA's augmentation function. LLM-TTA outperforms conventional augmentation functions across sentiment, toxicity, and news classification tasks for BERT and T5 models, with BERT's OOD robustness improving by an average of 4.30 percentage points without regressing average ID performance. We explore selectively augmenting inputs based on prediction entropy to reduce the rate of expensive LLM augmentations, allowing us to maintain performance gains while reducing the average number of generated augmentations by 57.76%. LLM-TTA is agnostic to the task model architecture, does not require OOD labels, and is effective across low and high-resource settings. We share our data, models, and code for reproducibility.
△ Less
Submitted 15 February, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Blind Biological Sequence Denoising with Self-Supervised Set Learning
Authors:
Nathan Ng,
Ji Won Park,
Jae Hyeon Lee,
Ryan Lewis Kelly,
Stephen Ra,
Kyunghyun Cho
Abstract:
Biological sequence analysis relies on the ability to denoise the imprecise output of sequencing platforms. We consider a common setting where a short sequence is read out repeatedly using a high-throughput long-read platform to generate multiple subreads, or noisy observations of the same sequence. Denoising these subreads with alignment-based approaches often fails when too few subreads are avai…
▽ More
Biological sequence analysis relies on the ability to denoise the imprecise output of sequencing platforms. We consider a common setting where a short sequence is read out repeatedly using a high-throughput long-read platform to generate multiple subreads, or noisy observations of the same sequence. Denoising these subreads with alignment-based approaches often fails when too few subreads are available or error rates are too high. In this paper, we propose a novel method for blindly denoising sets of sequences without directly observing clean source sequence labels. Our method, Self-Supervised Set Learning (SSSL), gathers subreads together in an embedding space and estimates a single set embedding as the midpoint of the subreads in both the latent and sequence spaces. This set embedding represents the "average" of the subreads and can be decoded into a prediction of the clean sequence. In experiments on simulated long-read DNA data, SSSL methods denoise small reads of $\leq 6$ subreads with 17% fewer errors and large reads of $>6$ subreads with 8% fewer errors compared to the best baseline. On a real dataset of antibody sequences, SSSL improves over baselines on two self-supervised metrics, with a significant improvement on difficult small reads that comprise over 60% of the test set. By accurately denoising these reads, SSSL promises to better realize the potential of high-throughput DNA sequencing data for downstream scientific applications.
△ Less
Submitted 4 September, 2023;
originally announced September 2023.
-
In-situ crack and keyhole pore detection in laser directed energy deposition through acoustic signal and deep learning
Authors:
Lequn Chen,
Xiling Yao,
Chaolin Tan,
Weiyang He,
**long Su,
Fei Weng,
Youxiang Chew,
Nicholas Poh Huat Ng,
Seung Ki Moon
Abstract:
Cracks and keyhole pores are detrimental defects in alloys produced by laser directed energy deposition (LDED). Laser-material interaction sound may hold information about underlying complex physical events such as crack propagation and pores formation. However, due to the noisy environment and intricate signal content, acoustic-based monitoring in LDED has received little attention. This paper pr…
▽ More
Cracks and keyhole pores are detrimental defects in alloys produced by laser directed energy deposition (LDED). Laser-material interaction sound may hold information about underlying complex physical events such as crack propagation and pores formation. However, due to the noisy environment and intricate signal content, acoustic-based monitoring in LDED has received little attention. This paper proposes a novel acoustic-based in-situ defect detection strategy in LDED. The key contribution of this study is to develop an in-situ acoustic signal denoising, feature extraction, and sound classification pipeline that incorporates convolutional neural networks (CNN) for online defect prediction. Microscope images are used to identify locations of the cracks and keyhole pores within a part. The defect locations are spatiotemporally registered with acoustic signal. Various acoustic features corresponding to defect-free regions, cracks, and keyhole pores are extracted and analysed in time-domain, frequency-domain, and time-frequency representations. The CNN model is trained to predict defect occurrences using the Mel-Frequency Cepstral Coefficients (MFCCs) of the lasermaterial interaction sound. The CNN model is compared to various classic machine learning models trained on the denoised acoustic dataset and raw acoustic dataset. The validation results shows that the CNN model trained on the denoised dataset outperforms others with the highest overall accuracy (89%), keyhole pore prediction accuracy (93%), and AUC-ROC score (98%). Furthermore, the trained CNN model can be deployed into an in-house developed software platform for online quality monitoring. The proposed strategy is the first study to use acoustic signals with deep learning for insitu defect detection in LDED process.
△ Less
Submitted 10 April, 2023;
originally announced April 2023.
-
Tuning the Tail Latency of Distributed Queries Using Replication
Authors:
Nathan Ng,
Hung Le,
Marco Serafini
Abstract:
Querying graph data with low latency is an important requirement in application domains such as social networks and knowledge graphs. Graph queries perform multiple hops between vertices. When data is partitioned and stored across multiple servers, queries executing at one server often need to hop to vertices stored by another server. Such distributed traversals represent a performance bottleneck…
▽ More
Querying graph data with low latency is an important requirement in application domains such as social networks and knowledge graphs. Graph queries perform multiple hops between vertices. When data is partitioned and stored across multiple servers, queries executing at one server often need to hop to vertices stored by another server. Such distributed traversals represent a performance bottleneck for low-latency queries. To reduce query latency, one can replicate remote data to make distributed traversals unnecessary, but replication is expensive and should be minimized. In this paper, we introduce the problem of finding data replication schemes that satisfy arbitrary user-defined query latency constraints with minimal replication cost. We propose a novel workload model to express data access causality, propose a family of heuristics, and introduce non-trivial sufficient conditions for their correctness. Our evaluation on two representative benchmarks show that our algorithms enable fine-tuning query latency with data replication and can find sweet spots in the latency/replication design space.
△ Less
Submitted 20 December, 2022;
originally announced December 2022.
-
If Influence Functions are the Answer, Then What is the Question?
Authors:
Juhan Bae,
Nathan Ng,
Alston Lo,
Marzyeh Ghassemi,
Roger Grosse
Abstract:
Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters. While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks. In this work, we investigate the specific factors that cause this discrepancy by decomposing it into five separate…
▽ More
Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters. While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks. In this work, we investigate the specific factors that cause this discrepancy by decomposing it into five separate terms. We study the contributions of each term on a variety of architectures and datasets and how they vary with factors such as network width and training time. While practical influence function estimates may be a poor match to leave-one-out retraining for nonlinear networks, we show they are often a good approximation to a different object we term the proximal Bregman response function (PBRF). Since the PBRF can still be used to answer many of the questions motivating influence functions, such as identifying influential or mislabeled examples, our results suggest that current algorithms for influence function estimation give more informative results than previous error analyses would suggest.
△ Less
Submitted 12 September, 2022;
originally announced September 2022.
-
Predicting Out-of-Domain Generalization with Neighborhood Invariance
Authors:
Nathan Ng,
Neha Hulkund,
Kyunghyun Cho,
Marzyeh Ghassemi
Abstract:
Develo** and deploying machine learning models safely depends on the ability to characterize and compare their abilities to generalize to new environments. Although recent work has proposed a variety of methods that can directly predict or theoretically bound the generalization capacity of a model, they rely on strong assumptions such as matching train/test distributions and access to model grad…
▽ More
Develo** and deploying machine learning models safely depends on the ability to characterize and compare their abilities to generalize to new environments. Although recent work has proposed a variety of methods that can directly predict or theoretically bound the generalization capacity of a model, they rely on strong assumptions such as matching train/test distributions and access to model gradients. In order to characterize generalization when these assumptions are not satisfied, we propose neighborhood invariance, a measure of a classifier's output invariance in a local transformation neighborhood. Specifically, we sample a set of transformations and given an input test point, calculate the invariance as the largest fraction of transformed points classified into the same class. Crucially, our measure is simple to calculate, does not depend on the test point's true label, makes no assumptions about the data distribution or model, and can be applied even in out-of-domain (OOD) settings where existing methods cannot, requiring only selecting a set of appropriate data transformations. In experiments on robustness benchmarks in image classification, sentiment analysis, and natural language inference, we demonstrate a strong and robust correlation between our neighborhood invariance measure and actual OOD generalization on over 4,600 models evaluated on over 100 unique train/test domain pairs.
△ Less
Submitted 17 July, 2023; v1 submitted 5 July, 2022;
originally announced July 2022.
-
Model-Agnostic Hybrid Numerical Weather Prediction and Machine Learning Paradigm for Solar Forecasting in the Tropics
Authors:
Nigel Yuan Yun Ng,
Harish Gopalan,
Venugopalan S. G. Raghavan,
Chin Chun Ooi
Abstract:
Numerical weather prediction (NWP) and machine learning (ML) methods are popular for solar forecasting. However, NWP models have multiple possible physical parameterizations, which requires site-specific NWP optimization. This is further complicated when regional NWP models are used with global climate models with different possible parameterizations. In this study, an alternative approach is prop…
▽ More
Numerical weather prediction (NWP) and machine learning (ML) methods are popular for solar forecasting. However, NWP models have multiple possible physical parameterizations, which requires site-specific NWP optimization. This is further complicated when regional NWP models are used with global climate models with different possible parameterizations. In this study, an alternative approach is proposed and evaluated for four radiation models. Weather Research and Forecasting (WRF) model is run in both global and regional mode to provide an estimate for solar irradiance. This estimate is then post-processed using ML to provide a final prediction. Normalized root-mean-square error from WRF is reduced by up to 40-50% with this ML error correction model. Results obtained using CAM, GFDL, New Goddard and RRTMG radiation models were comparable after this correction, negating the need for WRF parameterization tuning. Other models incorporating nearby locations and sensor data are also evaluated, with the latter being particularly promising.
△ Less
Submitted 9 December, 2021;
originally announced December 2021.
-
The Multiscenario Multienvironment BioSecure Multimodal Database (BMDB)
Authors:
Javier Ortega-Garcia,
Julian Fierrez,
Fernando Alonso-Fernandez,
Javier Galbally,
Manuel R Freire,
Joaquin Gonzalez-Rodriguez,
Carmen Garcia-Mateo,
Jose-Luis Alba-Castro,
Elisardo Gonzalez-Agulla,
Enrique Otero-Muras,
Sonia Garcia-Salicetti,
Lorene Allano,
Bao Ly-Van,
Bernadette Dorizzi,
Josef Kittler,
Thirimachos Bourlai,
Norman Poh,
Farzin Deravi,
Ming NR Ng,
Michael Fairhurst,
Jean Hennebert,
Andreas Humm,
Massimo Tistarelli,
Linda Brodo,
Jonas Richiardi
, et al. (7 additional authors not shown)
Abstract:
A new multimodal biometric database designed and acquired within the framework of the European BioSecure Network of Excellence is presented. It is comprised of more than 600 individuals acquired simultaneously in three scenarios: 1) over the Internet, 2) in an office environment with desktop PC, and 3) in indoor/outdoor environments with mobile portable hardware. The three scenarios include a comm…
▽ More
A new multimodal biometric database designed and acquired within the framework of the European BioSecure Network of Excellence is presented. It is comprised of more than 600 individuals acquired simultaneously in three scenarios: 1) over the Internet, 2) in an office environment with desktop PC, and 3) in indoor/outdoor environments with mobile portable hardware. The three scenarios include a common part of audio/video data. Also, signature and fingerprint data have been acquired both with desktop PC and mobile portable hardware. Additionally, hand and iris data were acquired in the second scenario using desktop PC. Acquisition has been conducted by 11 European institutions. Additional features of the BioSecure Multimodal Database (BMDB) are: two acquisition sessions, several sensors in certain modalities, balanced gender and age distributions, multimodal realistic scenarios with simple and quick tasks per modality, cross-European diversity, availability of demographic data, and compatibility with other multimodal databases. The novel acquisition conditions of the BMDB allow us to perform new challenging research and evaluation of either monomodal or multimodal biometric systems, as in the recent BioSecure Multimodal Evaluation campaign. A description of this campaign including baseline results of individual modalities from the new database is also given. The database is expected to be available for research purposes through the BioSecure Association during 2008
△ Less
Submitted 17 November, 2021;
originally announced November 2021.
-
Improving Dialogue Breakdown Detection with Semi-Supervised Learning
Authors:
Nathan Ng,
Marzyeh Ghassemi,
Narendran Thangarajan,
Jiacheng Pan,
Qi Guo
Abstract:
Building user trust in dialogue agents requires smooth and consistent dialogue exchanges. However, agents can easily lose conversational context and generate irrelevant utterances. These situations are called dialogue breakdown, where agent utterances prevent users from continuing the conversation. Building systems to detect dialogue breakdown allows agents to recover appropriately or avoid breakd…
▽ More
Building user trust in dialogue agents requires smooth and consistent dialogue exchanges. However, agents can easily lose conversational context and generate irrelevant utterances. These situations are called dialogue breakdown, where agent utterances prevent users from continuing the conversation. Building systems to detect dialogue breakdown allows agents to recover appropriately or avoid breakdown entirely. In this paper we investigate the use of semi-supervised learning methods to improve dialogue breakdown detection, including continued pre-training on the Reddit dataset and a manifold-based data augmentation method. We demonstrate the effectiveness of these methods on the Dialogue Breakdown Detection Challenge (DBDC) English shared task. Our submissions to the 2020 DBDC5 shared task place first, beating baselines and other submissions by over 12\% accuracy. In ablations on DBDC4 data from 2019, our semi-supervised learning methods improve the performance of a baseline BERT model by 2\% accuracy. These methods are applicable generally to any dialogue task and provide a simple way to improve model performance.
△ Less
Submitted 19 January, 2023; v1 submitted 30 October, 2020;
originally announced November 2020.
-
SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness
Authors:
Nathan Ng,
Kyunghyun Cho,
Marzyeh Ghassemi
Abstract:
Models that perform well on a training domain often fail to generalize to out-of-domain (OOD) examples. Data augmentation is a common method used to prevent overfitting and improve OOD generalization. However, in natural language, it is difficult to generate new examples that stay on the underlying data manifold. We introduce SSMBA, a data augmentation method for generating synthetic training exam…
▽ More
Models that perform well on a training domain often fail to generalize to out-of-domain (OOD) examples. Data augmentation is a common method used to prevent overfitting and improve OOD generalization. However, in natural language, it is difficult to generate new examples that stay on the underlying data manifold. We introduce SSMBA, a data augmentation method for generating synthetic training examples by using a pair of corruption and reconstruction functions to move randomly on a data manifold. We investigate the use of SSMBA in the natural language domain, leveraging the manifold assumption to reconstruct corrupted text with masked language models. In experiments on robustness benchmarks across 3 tasks and 9 datasets, SSMBA consistently outperforms existing data augmentation methods and baseline models on both in-domain and OOD data, achieving gains of 0.8% accuracy on OOD Amazon reviews, 1.8% accuracy on OOD MNLI, and 1.4 BLEU on in-domain IWSLT14 German-English.
△ Less
Submitted 4 October, 2020; v1 submitted 21 September, 2020;
originally announced September 2020.
-
Simple and Effective Noisy Channel Modeling for Neural Machine Translation
Authors:
Kyra Yee,
Nathan Ng,
Yann N. Dauphin,
Michael Auli
Abstract:
Previous work on neural noisy channel modeling relied on latent variable models that incrementally process the source and target sentence. This makes decoding decisions based on partial source prefixes even though the full source is available. We pursue an alternative approach based on standard sequence to sequence models which utilize the entire source. These models perform remarkably well as cha…
▽ More
Previous work on neural noisy channel modeling relied on latent variable models that incrementally process the source and target sentence. This makes decoding decisions based on partial source prefixes even though the full source is available. We pursue an alternative approach based on standard sequence to sequence models which utilize the entire source. These models perform remarkably well as channel models, even though they have neither been trained on, nor designed to factor over incomplete target sentences. Experiments with neural language models trained on billions of words show that noisy channel models can outperform a direct model by up to 3.2 BLEU on WMT'17 German-English translation. We evaluate on four language-pairs and our channel models consistently outperform strong alternatives such right-to-left reranking models and ensembles of direct models.
△ Less
Submitted 15 August, 2019;
originally announced August 2019.
-
Facebook FAIR's WMT19 News Translation Task Submission
Authors:
Nathan Ng,
Kyra Yee,
Alexei Baevski,
Myle Ott,
Michael Auli,
Sergey Edunov
Abstract:
This paper describes Facebook FAIR's submission to the WMT19 shared news translation task. We participate in two language pairs and four language directions, English <-> German and English <-> Russian. Following our submission from last year, our baseline systems are large BPE-based transformer models trained with the Fairseq sequence modeling toolkit which rely on sampled back-translations. This…
▽ More
This paper describes Facebook FAIR's submission to the WMT19 shared news translation task. We participate in two language pairs and four language directions, English <-> German and English <-> Russian. Following our submission from last year, our baseline systems are large BPE-based transformer models trained with the Fairseq sequence modeling toolkit which rely on sampled back-translations. This year we experiment with different bitext data filtering schemes, as well as with adding filtered back-translated data. We also ensemble and fine-tune our models on domain-specific data, then decode using noisy channel model reranking. Our submissions are ranked first in all four directions of the human evaluation campaign. On En->De, our system significantly outperforms other systems as well as human translations. This system improves upon our WMT'18 submission by 4.5 BLEU points.
△ Less
Submitted 15 July, 2019;
originally announced July 2019.
-
Embryo staging with weakly-supervised region selection and dynamically-decoded predictions
Authors:
Tingfung Lau,
Nathan Ng,
Julian Gingold,
Nina Desai,
Julian McAuley,
Zachary C. Lipton
Abstract:
To optimize clinical outcomes, fertility clinics must strategically select which embryos to transfer. Common selection heuristics are formulas expressed in terms of the durations required to reach various developmental milestones, quantities historically annotated manually by experienced embryologists based on time-lapse EmbryoScope videos. We propose a new method for automatic embryo staging that…
▽ More
To optimize clinical outcomes, fertility clinics must strategically select which embryos to transfer. Common selection heuristics are formulas expressed in terms of the durations required to reach various developmental milestones, quantities historically annotated manually by experienced embryologists based on time-lapse EmbryoScope videos. We propose a new method for automatic embryo staging that exploits several sources of structure in this time-lapse data. First, noting that in each image the embryo occupies a small subregion, we jointly train a region proposal network with the downstream classifier to isolate the embryo. Notably, because we lack ground-truth bounding boxes, our we weakly supervise the region proposal network optimizing its parameters via reinforcement learning to improve the downstream classifier's loss. Moreover, noting that embryos reaching the blastocyst stage progress monotonically through earlier stages, we develop a dynamic-programming-based decoder that post-processes our predictions to select the most likely monotonic sequence of developmental stages. Our methods outperform vanilla residual networks and rival the best numbers in contemporary papers, as measured by both per-frame accuracy and transition prediction error, despite operating on smaller data than many.
△ Less
Submitted 8 April, 2019;
originally announced April 2019.
-
Multiparty Session Type-safe Web Development with Static Linearity
Authors:
Jonathan King,
Nicholas Ng,
Nobuko Yoshida
Abstract:
Modern web applications can now offer desktop-like experiences from within the browser, thanks to technologies such as WebSockets, which enable low-latency duplex communication between the browser and the server. While these advances are great for the user experience, they represent a new responsibility for web developers who now need to manage and verify the correctness of more complex and potent…
▽ More
Modern web applications can now offer desktop-like experiences from within the browser, thanks to technologies such as WebSockets, which enable low-latency duplex communication between the browser and the server. While these advances are great for the user experience, they represent a new responsibility for web developers who now need to manage and verify the correctness of more complex and potentially stateful interactions in their application. In this paper, we present a technique for develo** interactive web applications that are statically guaranteed to communicate following a given protocol. First, the global interaction protocol is described in the Scribble protocol language -- based on multiparty session types. Scribble protocols are checked for well-formedness, and then each role is projected to a Finite State Machine representing the structure of communication from the perspective of the role. We use source code generation and a novel type-level encoding of FSMs using multi-parameter type classes to leverage the type system of the target language and guarantee only programs that communicate following the protocol will type check.
Our work targets PureScript -- a functional language that compiles to JavaScript -- which crucially has an expressive enough type system to provide static linearity guarantees. We demonstrate the effectiveness of our approach through a web-based Battleship game where communication is performed through WebSocket connections.
△ Less
Submitted 2 April, 2019;
originally announced April 2019.
-
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Authors:
Myle Ott,
Sergey Edunov,
Alexei Baevski,
Angela Fan,
Sam Gross,
Nathan Ng,
David Grangier,
Michael Auli
Abstract:
fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports distributed training across multiple GPUs and machines. We also support fast mixed-precision training and inference on modern GPUs. A demo video can be found…
▽ More
fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports distributed training across multiple GPUs and machines. We also support fast mixed-precision training and inference on modern GPUs. A demo video can be found at https://www.youtube.com/watch?v=OtgDdWtHvto
△ Less
Submitted 1 April, 2019;
originally announced April 2019.
-
Predicting Surgery Duration with Neural Heteroscedastic Regression
Authors:
Nathan Ng,
Rodney A Gabriel,
Julian McAuley,
Charles Elkan,
Zachary C Lipton
Abstract:
Scheduling surgeries is a challenging task due to the fundamental uncertainty of the clinical environment, as well as the risks and costs associated with under- and over-booking. We investigate neural regression algorithms to estimate the parameters of surgery case durations, focusing on the issue of heteroscedasticity. We seek to simultaneously estimate the duration of each surgery, as well as a…
▽ More
Scheduling surgeries is a challenging task due to the fundamental uncertainty of the clinical environment, as well as the risks and costs associated with under- and over-booking. We investigate neural regression algorithms to estimate the parameters of surgery case durations, focusing on the issue of heteroscedasticity. We seek to simultaneously estimate the duration of each surgery, as well as a surgery-specific notion of our uncertainty about its duration. Estimating this uncertainty can lead to more nuanced and effective scheduling strategies, as we are able to schedule surgeries more efficiently while allowing an informed and case-specific margin of error. Using surgery records %from the UC San Diego Health System, from a large United States health system we demonstrate potential improvements on the order of 20% (in terms of minutes overbooked) compared to current scheduling techniques. Moreover, we demonstrate that surgery durations are indeed heteroscedastic. We show that models that estimate case-specific uncertainty better fit the data (log likelihood). Additionally, we show that the heteroscedastic predictions can more optimally trade off between over and under-booking minutes, especially when idle minutes and scheduling collisions confer disparate costs.
△ Less
Submitted 12 July, 2017; v1 submitted 17 February, 2017;
originally announced February 2017.
-
Fencing off Go: Liveness and Safety for Channel-based Programming (extended version)
Authors:
Julien Lange,
Nicholas Ng,
Bernardo Toninho,
Nobuko Yoshida
Abstract:
Go is a production-level statically typed programming language whose design features explicit message-passing primitives and lightweight threads, enabling (and encouraging) programmers to develop concurrent systems where components interact through communication more so than by lock-based shared memory concurrency. Go can only detect global deadlocks at runtime, but provides no compile-time protec…
▽ More
Go is a production-level statically typed programming language whose design features explicit message-passing primitives and lightweight threads, enabling (and encouraging) programmers to develop concurrent systems where components interact through communication more so than by lock-based shared memory concurrency. Go can only detect global deadlocks at runtime, but provides no compile-time protection against all too common communication mismatches or partial deadlocks. This work develops a static verification framework for liveness and safety in Go programs, able to detect communication errors and partial deadlocks in a general class of realistic concurrent programs, including those with dynamic channel creation, unbounded thread creation and recursion. Our approach infers from a Go program a faithful representation of its communication patterns as a behavioural type. By checking a syntactic restriction on channel usage, dubbed fencing, we ensure that programs are made up of finitely many different communication patterns that may be repeated infinitely many times. This restriction allows us to implement procedures to check for liveness and safety in types which in turn approximates liveness and safety in Go programs. We have implemented a type inference and liveness and safety checks in a tool-chain and tested it against publicly available Go programs.
Note: This is a revised extended version of a paper that appeared in POPL 2017, see page 13 for details.
△ Less
Submitted 28 February, 2017; v1 submitted 27 October, 2016;
originally announced October 2016.
-
Towards deductive verification of MPI programs against session types
Authors:
Eduardo R. B. Marques,
Francisco Martins,
Vasco T. Vasconcelos,
Nicholas Ng,
Nuno Martins
Abstract:
The Message Passing Interface (MPI) is the de facto standard message-passing infrastructure for develo** parallel applications. Two decades after the first version of the library specification, MPI-based applications are nowadays routinely deployed on super and cluster computers. These applications, written in C or Fortran, exhibit intricate message passing behaviours, making it hard to statical…
▽ More
The Message Passing Interface (MPI) is the de facto standard message-passing infrastructure for develo** parallel applications. Two decades after the first version of the library specification, MPI-based applications are nowadays routinely deployed on super and cluster computers. These applications, written in C or Fortran, exhibit intricate message passing behaviours, making it hard to statically verify important properties such as the absence of deadlocks. Our work builds on session types, a theory for describing protocols that provides for correct-by-construction guarantees in this regard. We annotate MPI primitives and C code with session type contracts, written in the language of a software verifier for C. Annotated code is then checked for correctness with the software verifier. We present preliminary results and discuss the challenges that lie ahead for verifying realistic MPI program compliance against session types.
△ Less
Submitted 10 December, 2013;
originally announced December 2013.
-
The second laws of quantum thermodynamics
Authors:
Fernando G. S. L. Brandao,
MichaĆ Horodecki,
Nelly Huei Ying Ng,
Jonathan Oppenheim,
Stephanie Wehner
Abstract:
The second law of thermodynamics tells us which state transformations are so statistically unlikely that they are effectively forbidden. Its original formulation, due to Clausius, states that "Heat can never pass from a colder to a warmer body without some other change, connected therewith, occurring at the same time". The second law applies to systems composed of many particles interacting; howev…
▽ More
The second law of thermodynamics tells us which state transformations are so statistically unlikely that they are effectively forbidden. Its original formulation, due to Clausius, states that "Heat can never pass from a colder to a warmer body without some other change, connected therewith, occurring at the same time". The second law applies to systems composed of many particles interacting; however, we are seeing that one can make sense of thermodynamics in the regime where we only have a small number of particles interacting with a heat bath. Is there a second law of thermodynamics in this regime? Here, we find that for processes which are cyclic or very close to cyclic, the second law for microscopic systems takes on a very different form than it does at the macroscopic scale, imposing not just one constraint on what state transformations are possible, but an entire family of constraints. In particular, we find a family of free energies which generalise the traditional one, and show that they can never increase. We further find that there are three regimes which determine which family of second laws govern state transitions, depending on how cyclic the process is. In one regime one can cause an apparent violation of the usual second law, through a process of embezzling work from a large system which remains arbitrarily close to its original state. These second laws are not only relevant for small systems, but also apply to individual macroscopic systems interacting via long-range interactions, which only satisfy the ordinary second law on average. By making precise the definition of thermal operations, the laws of thermodynamics take on a simple form with the first law defining the class of thermal operations, the zeroeth law emerging as a unique condition ensuring the theory is nontrivial, and the remaining laws being a monotonicity property of our generalised free energies.
△ Less
Submitted 25 September, 2014; v1 submitted 22 May, 2013;
originally announced May 2013.