Showing 101–148 of 148 results for author: Lipton, Z

  1. arXiv:1907.00943  [pdf, other]

    cs.CV eess.IV q-bio.QM

    Estimating brain age based on a healthy population with deep learning and structural MRI

    Authors: Xinyang Feng, Zachary C. Lipton, Jie Yang, Scott A. Small, Frank A. Provenzano

    Abstract: Numerous studies have established that estimated brain age, as derived from statistical models trained on healthy populations, constitutes a valuable biomarker that is predictive of cognitive decline and various neurological diseases. In this work, we curate a large-scale heterogeneous dataset (N = 10,158, age range 18 - 97) of structural brain MRIs in a healthy population from multiple publicly-a…

    Submitted 1 July, 2019; originally announced July 2019.

    Comments: 32 pages, 9 figures, 6 tables

  2. arXiv:1906.10437  [pdf, other]

    cs.LG stat.ML

    Learning Causal State Representations of Partially Observable Environments

    Authors: Amy Zhang, Zachary C. Lipton, Luis Pineda, Kamyar Azizzadenesheli, Anima Anandkumar, Laurent Itti, Joelle Pineau, Tommaso Furlanello

    Abstract: Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially-observable Markov decision processes (POMDP). Our method learns approximate causal state representations from RNNs trained to predi…

    Submitted 8 February, 2021; v1 submitted 25 June, 2019; originally announced June 2019.

    Comments: 35 pages, 8 figures

  3. arXiv:1905.13549  [pdf, other]

    cs.CV

    Learning Robust Global Representations by Penalizing Local Predictive Power

    Authors: Haohan Wang, Songwei Ge, Eric P. Xing, Zachary C. Lipton

    Abstract: Despite their renowned predictive power on i.i.d. data, convolutional neural networks are known to rely more on high-frequency patterns that humans deem superficial than on low-frequency patterns that agree better with intuitions about what constitutes category membership. This paper proposes a method for training robust convolutional networks by penalizing the predictive power of the local repres…

    Submitted 4 November, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

    Journal ref: NeurIPS 2019

  4. arXiv:1905.11361  [pdf, ps, other]

    cs.LG cs.CY stat.ML

    Efficient candidate screening under multiple tests and implications for fairness

    Authors: Lee Cohen, Zachary C. Lipton, Yishay Mansour

    Abstract: When recruiting job candidates, employers rarely observe their underlying skill level directly. Instead, they must administer a series of interviews and/or collate other noisy signals in order to estimate the worker's skill. Traditional economics papers address screening models where employers access worker skill via a single noisy signal. In this paper, we extend this theoretical analysis to a mu…

    Submitted 27 May, 2019; originally announced May 2019.

  5. arXiv:1905.11268  [pdf, other]

    cs.CL cs.CR cs.LG

    Combating Adversarial Misspellings with Robust Word Recognition

    Authors: Danish Pruthi, Bhuwan Dhingra, Zachary C. Lipton

    Abstract: To combat adversarial spelling mistakes, we propose placing a word recognition model in front of the downstream classifier. Our word recognition models build upon the RNN semi-character architecture, introducing several new backoff strategies for handling rare and unseen words. Trained to recognize words corrupted by random adds, drops, swaps, and keyboard mistakes, our method achieves 32% relativ…

    Submitted 29 August, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: ACL 2019, long paper

  6. arXiv:1904.12206  [pdf, other]

    cs.LG q-bio.QM stat.ML

    Temporal-Clustering Invariance in Irregular Healthcare Time Series

    Authors: Mohammad Taha Bahadori, Zachary Chase Lipton

    Abstract: Electronic records contain sequences of events, some of which take place all at once in a single visit, and others that are dispersed over multiple visits, each with a different timestamp. We postulate that fine temporal detail, e.g., whether a series of blood tests are completed at once or in rapid succession should not alter predictions based on this data. Motivated by this intuition, we propose…

    Submitted 27 April, 2019; originally announced April 2019.

  7. arXiv:1904.04419  [pdf, other]

    cs.CV cs.LG

    Embryo staging with weakly-supervised region selection and dynamically-decoded predictions

    Authors: Tingfung Lau, Nathan Ng, Julian Gingold, Nina Desai, Julian McAuley, Zachary C. Lipton

    Abstract: To optimize clinical outcomes, fertility clinics must strategically select which embryos to transfer. Common selection heuristics are formulas expressed in terms of the durations required to reach various developmental milestones, quantities historically annotated manually by experienced embryologists based on time-lapse EmbryoScope videos. We propose a new method for automatic embryo staging that…

    Submitted 8 April, 2019; originally announced April 2019.

  8. arXiv:1903.06256  [pdf, other]

    cs.CV cs.LG

    Learning Robust Representations by Projecting Superficial Statistics Out

    Authors: Haohan Wang, Zexue He, Zachary C. Lipton, Eric P. Xing

    Abstract: Despite impressive performance as evaluated on i.i.d. holdout data, deep neural networks depend heavily on superficial statistics of the training data and are liable to break under distribution shift. For example, subtle changes to the background or texture of an image can break a seemingly powerful classifier. Building on previous work on domain generalization, we hope to produce a classifier tha…

    Submitted 1 March, 2019; originally announced March 2019.

    Comments: To appear at ICLR 2019. Implementation: https://github.com/HaohanWang/HEX

  9. arXiv:1903.01689  [pdf, other]

    cs.LG stat.ML

    Domain Adaptation with Asymmetrically-Relaxed Distribution Alignment

    Authors: Yifan Wu, Ezra Winston, Divyansh Kaushik, Zachary Lipton

    Abstract: Domain adaptation addresses the common problem when the target distribution generating our test data drifts from the source (training) distribution. While absent assumptions, domain adaptation is impossible, strict conditions, e.g. covariate or label shift, enable principled algorithms. Recently-proposed domain-adversarial approaches consist of aligning source and target encodings, often motivatin…

    Submitted 11 March, 2019; v1 submitted 5 March, 2019; originally announced March 2019.

  10. arXiv:1812.03372  [pdf, other]

    cs.LG stat.ML

    What is the Effect of Importance Weighting in Deep Learning?

    Authors: Jonathon Byrd, Zachary C. Lipton

    Abstract: Importance-weighted risk minimization is a key ingredient in many machine learning algorithms for causal inference, domain adaptation, class imbalance, and off-policy reinforcement learning. While the effect of importance weighting is well-characterized for low-capacity misspecified models, little is known about how it impacts over-parameterized, deep neural networks. This work is inspired by rece…

    Submitted 13 June, 2019; v1 submitted 8 December, 2018; originally announced December 2018.

  11. arXiv:1810.11953  [pdf, other]

    stat.ML cs.LG

    Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift

    Authors: Stephan Rabanser, Stephan Günnemann, Zachary C. Lipton

    Abstract: We might hope that when faced with unexpected inputs, well-designed software systems would fire off warnings. Machine learning (ML) systems, however, which depend strongly on properties of their inputs (e.g. the i.i.d. assumption), tend to fail silently. This paper explores the problem of building ML systems that fail loudly, investigating methods for detecting dataset shift, identifying exemplars…

    Submitted 28 October, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: Advances in Neural Information Processing Systems (NeurIPS) 2019

  12. arXiv:1808.05697  [pdf, other]

    cs.CL cs.LG stat.ML

    Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study

    Authors: Aditya Siddhant, Zachary C. Lipton

    Abstract: Several recent papers investigate Active Learning (AL) for mitigating the data dependence of deep learning for natural language processing. However, the applicability of AL to real-world problems remains an open question. While in supervised learning, practitioners can try many different methods, evaluating each against a validation set before selecting a model, AL affords no such luxury. Over the…

    Submitted 24 September, 2018; v1 submitted 16 August, 2018; originally announced August 2018.

    Comments: To be presented at EMNLP 2018

  13. arXiv:1808.04926  [pdf, ps, other]

    cs.CL cs.AI cs.LG stat.ML

    How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks

    Authors: Divyansh Kaushik, Zachary C. Lipton

    Abstract: Many recent papers address reading comprehension, where examples consist of (question, passage, answer) tuples. Presumably, a model must combine information from both questions and passages to predict corresponding answers. However, despite intense interest in the topic, with hundreds of published papers vying for leaderboard dominance, basic questions about the difficulty of many popular benchmar…

    Submitted 21 August, 2018; v1 submitted 14 August, 2018; originally announced August 2018.

    Comments: To appear in EMNLP 2018

  14. arXiv:1807.06610  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Learning Noise-Invariant Representations for Robust Speech Recognition

    Authors: Davis Liang, Zhiheng Huang, Zachary C. Lipton

    Abstract: Despite rapid advances in speech recognition, current models remain brittle to superficial perturbations to their inputs. Small amounts of noise can destroy the performance of an otherwise state-of-the-art model. To harden models against background noise, practitioners often perform data augmentation, adding artificially-noised examples to the training set, carrying over the original label. In thi…

    Submitted 17 July, 2018; originally announced July 2018.

    Comments: Under Review at IEEE SLT 2018

  15. arXiv:1807.04801  [pdf, other]

    cs.LG stat.ML

    Practical Obstacles to Deploying Active Learning

    Authors: David Lowell, Zachary C. Lipton, Byron C. Wallace

    Abstract: Active learning (AL) is a widely-used training strategy for maximizing predictive performance subject to a fixed annotation budget. In AL one iteratively selects training examples for annotation, often those for which the current model is most uncertain (by some measure). The hope is that active sampling leads to better performance than would be achieved under independent and identically distribut…

    Submitted 1 November, 2019; v1 submitted 12 July, 2018; originally announced July 2018.

  16. arXiv:1807.03341  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Troubling Trends in Machine Learning Scholarship

    Authors: Zachary C. Lipton, Jacob Steinhardt

    Abstract: Collectively, machine learning (ML) researchers are engaged in the creation and dissemination of knowledge about data-driven algorithms. In a given paper, researchers might aspire to any subset of the following goals, among others: to theoretically characterize what is learnable, to obtain understanding through empirically rigorous experiments, or to build a working system that has high predictive…

    Submitted 26 July, 2018; v1 submitted 9 July, 2018; originally announced July 2018.

    Comments: Presented at ICML 2018: The Debates

  17. arXiv:1806.05780  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Surprising Negative Results for Generative Adversarial Tree Search

    Authors: Kamyar Azizzadenesheli, Brandon Yang, Weitang Liu, Zachary C Lipton, Animashree Anandkumar

    Abstract: While many recent advances in deep reinforcement learning (RL) rely on model-free methods, model-based approaches remain an alluring prospect for their potential to exploit unsupervised data to learn environment model. In this work, we provide an extensive study on the design of deep generative models for RL environments and propose a sample efficient and robust method to learn the model of Atari…

    Submitted 4 September, 2019; v1 submitted 14 June, 2018; originally announced June 2018.

  18. arXiv:1805.04770  [pdf, other]

    stat.ML cs.AI cs.LG

    Born Again Neural Networks

    Authors: Tommaso Furlanello, Zachary C. Lipton, Michael Tschannen, Laurent Itti, Anima Anandkumar

    Abstract: Knowledge Distillation (KD) consists of transferring “knowledge” from one machine learning model (the teacher) to another (the student). Commonly, the teacher is a high-capacity model with formidable performance, while the student is more compact. By transferring knowledge, one hopes to benefit from the student’s compactness, without sacrificing too much performance. We study KD from a new p…

    Submitted 29 June, 2018; v1 submitted 12 May, 2018; originally announced May 2018.

    Comments: Published @ICML 2018

  19. arXiv:1803.04477  [pdf, other]

    cs.CV

    Correction by Projection: Denoising Images with Generative Adversarial Networks

    Authors: Subarna Tripathi, Zachary C. Lipton, Truong Q. Nguyen

    Abstract: Generative adversarial networks (GANs) transform low-dimensional latent vectors into visually plausible images. If the real dataset contains only clean images, then ostensibly, the manifold learned by the GAN should contain only clean images. In this paper, we propose to denoise corrupted images by finding the nearest point on the GAN manifold, recovering latent vectors by minimizing distances in…

    Submitted 12 March, 2018; originally announced March 2018.

  20. arXiv:1803.01442  [pdf, other]

    cs.LG stat.ML

    Stochastic Activation Pruning for Robust Adversarial Defense

    Authors: Guneet S. Dhillon, Kamyar Azizzadenesheli, Zachary C. Lipton, Jeremy Bernstein, Jean Kossaifi, Aran Khanna, Anima Anandkumar

    Abstract: Neural networks are known to be vulnerable to adversarial examples. Carefully chosen perturbations to real images, while imperceptible to humans, induce misclassification and threaten the reliability of deep learning systems in the wild. To guard against adversarial examples, we take inspiration from game theory and cast the problem as a minimax zero-sum game between the adversary and the model. I…

    Submitted 4 March, 2018; originally announced March 2018.

    Comments: ICLR 2018

  21. arXiv:1802.07427  [pdf, other]

    cs.LG

    Active Learning with Partial Feedback

    Authors: Peiyun Hu, Zachary C. Lipton, Anima Anandkumar, Deva Ramanan

    Abstract: While many active learning papers assume that the learner can simply ask for a label and receive it, real annotation often presents a mismatch between the form of a label (say, one among many classes), and the form of an annotation (typically yes/no binary feedback). To annotate examples corpora for multiclass classification, we might need to ask multiple yes/no questions, exploiting a label hiera…

    Submitted 8 July, 2019; v1 submitted 21 February, 2018; originally announced February 2018.

    Comments: ICLR 2019

  22. arXiv:1802.03916  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Detecting and Correcting for Label Shift with Black Box Predictors

    Authors: Zachary C. Lipton, Yu-Xiang Wang, Alex Smola

    Abstract: Faced with distribution shift between training and test set, we wish to detect and quantify the shift, and to correct our classifiers without test set labels. Motivated by medical diagnosis, where diseases (targets) cause symptoms (observations), we focus on label shift, where the label marginal $p(y)$ changes but the conditional $p(x| y)$ does not. We propose Black Box Shift Estimation (BBSE) to…

    Submitted 26 July, 2018; v1 submitted 12 February, 2018; originally announced February 2018.

    Comments: Published at the International Conference on Machine Learning (ICML) 2018

  23. arXiv:1712.04577  [pdf, other]

    cs.LG

    Learning From Noisy Singly-labeled Data

    Authors: Ashish Khetan, Zachary C. Lipton, Anima Anandkumar

    Abstract: Supervised learning depends on annotated examples, which are taken to be the ground truth. But these labels often come from noisy crowdsourcing platforms, like Amazon Mechanical Turk. Practitioners typically collect multiple labels per example and aggregate the results to mitigate noise (the classic crowdsourcing problem). Given a fixed annotation budget and unlimited unlabeled data, redund…

    Submitted 20 May, 2018; v1 submitted 12 December, 2017; originally announced December 2017.

    Comments: 18 pages, 3 figures

  24. arXiv:1711.08037  [pdf, ps, other]

    stat.ML

    The Doctor Just Won't Accept That!

    Authors: Zachary C. Lipton

    Abstract: Calls to arms to build interpretable models express a well-founded discomfort with machine learning. Should a software agent that does not even know what a loan is decide who qualifies for one? Indeed, we ought to be cautious about injecting machine learning (or anything else, for that matter) into applications where there may be a significant risk of causing social harm. However, claims that stak…

    Submitted 24 November, 2017; v1 submitted 19 November, 2017; originally announced November 2017.

    Comments: Presented at NIPS 2017 Interpretable ML Symposium

  25. arXiv:1711.07076  [pdf, other]

    stat.ML cs.LG

    Does mitigating ML's impact disparity require treatment disparity?

    Authors: Zachary C. Lipton, Alexandra Chouldechova, Julian McAuley

    Abstract: Following related work in law and policy, two notions of disparity have come to shape the study of fairness in algorithmic decision-making. Algorithms exhibit treatment disparity if they formally treat members of protected subgroups differently; algorithms exhibit impact disparity when outcomes differ across subgroups, even if the correlation arises unintentionally. Naturally, we can achieve impac…

    Submitted 11 January, 2019; v1 submitted 19 November, 2017; originally announced November 2017.

  26. arXiv:1711.05715   

    cs.AI cs.CL cs.LG

    BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems

    Authors: Zachary Lipton, Xiujun Li, Jianfeng Gao, Lihong Li, Faisal Ahmed, Li Deng

    Abstract: We present a new algorithm that significantly improves the efficiency of exploration for deep Q-learning agents in dialogue systems. Our agents explore via Thompson sampling, drawing Monte Carlo samples from a Bayes-by-Backprop neural network. Our algorithm learns much faster than common exploration strategies such as ε-greedy, Boltzmann, bootstrapping, and intrinsic-reward-based ones. Additionall…

    Submitted 19 November, 2017; v1 submitted 15 November, 2017; originally announced November 2017.

    Comments: Duplicate of article already in the arXiv: arXiv:1608.05081

  27. arXiv:1711.04837  [pdf, other]

    stat.ML cs.LG cs.NE

    Improving Factor-Based Quantitative Investing by Forecasting Company Fundamentals

    Authors: John Alberg, Zachary C. Lipton

    Abstract: On a periodic basis, publicly traded companies are required to report fundamentals: financial data such as revenue, operating income, debt, among others. These data points provide some insight into the financial health of a company. Academic research has identified some factors, i.e. computed features of the reported data, that are known through retrospective analysis to outperform the market aver…

    Submitted 25 April, 2018; v1 submitted 13 November, 2017; originally announced November 2017.

  28. arXiv:1707.08308  [pdf, other]

    cs.LG

    Tensor Regression Networks

    Authors: Jean Kossaifi, Zachary C. Lipton, Arinbjorn Kolbeinsson, Aran Khanna, Tommaso Furlanello, Anima Anandkumar

    Abstract: Convolutional neural networks typically consist of many convolutional layers followed by one or more fully connected layers. While convolutional layers map between high-order activation tensors, the fully connected layers operate on flattened activation vectors. Despite empirical success, this approach has notable drawbacks. Flattening followed by fully connected layers discards multilinear struct…

    Submitted 20 July, 2020; v1 submitted 26 July, 2017; originally announced July 2017.

  29. arXiv:1707.05928  [pdf, other]

    cs.CL

    Deep Active Learning for Named Entity Recognition

    Authors: Yanyao Shen, Hyokun Yun, Zachary C. Lipton, Yakov Kronrod, Animashree Anandkumar

    Abstract: Deep learning has yielded state-of-the-art performance on many natural language processing tasks including named entity recognition (NER). However, this typically requires large amounts of labeled data. In this work, we demonstrate that the amount of labeled training data can be drastically reduced when deep learning is combined with active learning. While active learning is sample-efficient, it c…

    Submitted 3 February, 2018; v1 submitted 18 July, 2017; originally announced July 2017.

  30. arXiv:1706.00439  [pdf, other]

    cs.LG

    Tensor Contraction Layers for Parsimonious Deep Nets

    Authors: Jean Kossaifi, Aran Khanna, Zachary C. Lipton, Tommaso Furlanello, Anima Anandkumar

    Abstract: Tensors offer a natural representation for many kinds of data frequently encountered in machine learning. Images, for example, are naturally represented as third order tensors, where the modes correspond to height, width, and channels. Tensor methods are noted for their ability to discover multi-dimensional dependencies, and tensor decompositions in particular, have been used to produce compact lo…

    Submitted 1 June, 2017; originally announced June 2017.

  31. arXiv:1705.07904  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    Semantically Decomposing the Latent Spaces of Generative Adversarial Networks

    Authors: Chris Donahue, Zachary C. Lipton, Akshay Balsubramani, Julian McAuley

    Abstract: We propose a new algorithm for training generative adversarial networks that jointly learns latent codes for both identities (e.g. individual humans) and observations (e.g. specific photographs). By fixing the identity portion of the latent codes, we can generate diverse images of the same subject, and by fixing the observation portion, we can traverse the manifold of subjects while maintaining co…

    Submitted 22 February, 2018; v1 submitted 22 May, 2017; originally announced May 2017.

    Comments: Published as a conference paper at ICLR 2018

  32. arXiv:1703.06891  [pdf, other]

    cs.LG cs.MM cs.NE cs.SD stat.ML

    Dance Dance Convolution

    Authors: Chris Donahue, Zachary C. Lipton, Julian McAuley

    Abstract: Dance Dance Revolution (DDR) is a popular rhythm-based video game. Players perform steps on a dance platform in synchronization with music as directed by on-screen step charts. While many step charts are available in standardized packs, players may grow tired of existing charts, or wish to dance to a song for which no chart exists. We introduce the task of learning to choreograph. Given a raw audi…

    Submitted 20 June, 2017; v1 submitted 20 March, 2017; originally announced March 2017.

    Comments: Published as a conference paper at ICML 2017

  33. arXiv:1702.05386  [pdf, other]

    stat.ML cs.LG cs.NE

    Predicting Surgery Duration with Neural Heteroscedastic Regression

    Authors: Nathan Ng, Rodney A Gabriel, Julian McAuley, Charles Elkan, Zachary C Lipton

    Abstract: Scheduling surgeries is a challenging task due to the fundamental uncertainty of the clinical environment, as well as the risks and costs associated with under- and over-booking. We investigate neural regression algorithms to estimate the parameters of surgery case durations, focusing on the issue of heteroscedasticity. We seek to simultaneously estimate the duration of each surgery, as well as a…

    Submitted 12 July, 2017; v1 submitted 17 February, 2017; originally announced February 2017.

  34. arXiv:1702.04782  [pdf, other]

    cs.LG cs.NE stat.ML

    Precise Recovery of Latent Vectors from Generative Adversarial Networks

    Authors: Zachary C. Lipton, Subarna Tripathi

    Abstract: Generative adversarial networks (GANs) transform latent vectors into visually plausible images. It is generally thought that the original GAN formulation gives no out-of-the-box method to reverse the mapping, projecting images back into latent space. We introduce a simple, gradient-based technique called stochastic clipping. In experiments, for images generated by the GAN, we precisely recover the…

    Submitted 16 February, 2017; v1 submitted 15 February, 2017; originally announced February 2017.

  35. arXiv:1612.05688  [pdf, other]

    cs.LG cs.AI cs.CL

    A User Simulator for Task-Completion Dialogues

    Authors: Xiujun Li, Zachary C. Lipton, Bhuwan Dhingra, Lihong Li, Jianfeng Gao, Yun-Nung Chen

    Abstract: Despite widespread interests in reinforcement-learning for task-oriented dialogue systems, several obstacles can frustrate research and development progress. First, reinforcement learners typically require interaction with the environment, so conventional dialogue corpora cannot be used directly. Second, each task presents specific challenges, requiring separate corpus of task-specific annotated d…

    Submitted 13 November, 2017; v1 submitted 16 December, 2016; originally announced December 2016.

    Comments: 14 pages, 2 Figures

  36. arXiv:1611.01211  [pdf, other]

    cs.LG cs.NE stat.ML

    Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear

    Authors: Zachary C. Lipton, Kamyar Azizzadenesheli, Abhishek Kumar, Lihong Li, Jianfeng Gao, Li Deng

    Abstract: Many practical environments contain catastrophic states that an optimal agent would visit infrequently or never. Even on toy problems, Deep Reinforcement Learning (DRL) agents tend to periodically revisit these states upon forgetting their existence under a new policy. We introduce intrinsic fear (IF), a learned reward shaping that guards DRL agents against periodic catastrophes. IF agents possess…

    Submitted 13 March, 2018; v1 submitted 3 November, 2016; originally announced November 2016.

  37. arXiv:1608.05081  [pdf, other]

    cs.LG cs.NE stat.ML

    BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems

    Authors: Zachary C. Lipton, Xiujun Li, Jianfeng Gao, Lihong Li, Faisal Ahmed, Li Deng

    Abstract: We present a new algorithm that significantly improves the efficiency of exploration for deep Q-learning agents in dialogue systems. Our agents explore via Thompson sampling, drawing Monte Carlo samples from a Bayes-by-Backprop neural network. Our algorithm learns much faster than common exploration strategies such as $ε$-greedy, Boltzmann, bootstrapping, and intrinsic-reward-based ones. Additiona…

    Submitted 23 November, 2017; v1 submitted 17 August, 2016; originally announced August 2016.

    Comments: 13 pages, 9 figures

  38. arXiv:1607.04648  [pdf, other]

    cs.CV

    Context Matters: Refining Object Detection in Video with Recurrent Neural Networks

    Authors: Subarna Tripathi, Zachary C. Lipton, Serge Belongie, Truong Nguyen

    Abstract: Given the vast amounts of video available online, and recent breakthroughs in object detection with static images, object detection in video offers a promising new frontier. However, motion blur and compression artifacts cause substantial frame-level variability, even in videos that appear smooth to the eye. Additionally, video datasets tend to have sparsely annotated frames. We present a new fram…

    Submitted 18 July, 2016; v1 submitted 15 July, 2016; originally announced July 2016.

    Comments: To appear in BMVC 2016

  39. arXiv:1606.04130  [pdf, other]

    cs.LG cs.IR cs.NE stat.ML

    Modeling Missing Data in Clinical Time Series with RNNs

    Authors: Zachary C. Lipton, David C. Kale, Randall Wetzel

    Abstract: We demonstrate a simple strategy to cope with missing data in sequential inputs, addressing the task of multilabel classification of diagnoses given clinical time series. Collected from the pediatric intensive care unit (PICU) at Children's Hospital Los Angeles, our data consists of multivariate time series of observations. The measurements are irregularly spaced, leading to missingness patterns i…

    Submitted 11 November, 2016; v1 submitted 13 June, 2016; originally announced June 2016.

  40. arXiv:1606.03490  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    The Mythos of Model Interpretability

    Authors: Zachary C. Lipton

    Abstract: Supervised machine learning models boast remarkable predictive capabilities. But can you trust your model? Will it work in deployment? What else can it tell you about the world? We want models to be not only good, but interpretable. And yet the task of interpretation appears underspecified. Papers provide diverse and sometimes non-overlapping motivations for interpretability, and offer myriad noti…

    Submitted 6 March, 2017; v1 submitted 10 June, 2016; originally announced June 2016.

    Comments: presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY

  41. arXiv:1602.07320  [pdf, other]

    cs.LG

    Stuck in a What? Adventures in Weight Space

    Authors: Zachary C. Lipton

    Abstract: Deep learning researchers commonly suggest that converged models are stuck in local minima. More recently, some researchers observed that under reasonable assumptions, the vast majority of critical points are saddle points, not true minima. Both descriptions suggest that weights converge around a point in weight space, be it a local optima or merely a critical point. However, it's possible that ne…

    Submitted 23 February, 2016; originally announced February 2016.

  42. arXiv:1511.03683  [pdf, other]

    cs.CL cs.LG

    Generative Concatenative Nets Jointly Learn to Write and Classify Reviews

    Authors: Zachary C. Lipton, Sharad Vikram, Julian McAuley

    Abstract: A recommender system's basic task is to estimate how users will respond to unseen items. This is typically modeled in terms of how a user might rate a product, but here we aim to extend such approaches to model how a user would write about the product. To do so, we design a character-level Recurrent Neural Network (RNN) that generates personalized product reviews. The network convincingly learns s…

    Submitted 7 April, 2016; v1 submitted 11 November, 2015; originally announced November 2015.

  43. arXiv:1511.03677  [pdf, other]

    cs.LG

    Learning to Diagnose with LSTM Recurrent Neural Networks

    Authors: Zachary C. Lipton, David C. Kale, Charles Elkan, Randall Wetzel

    Abstract: Clinical medical data, especially in the intensive care unit (ICU), consist of multivariate time series of observations. For each patient visit (or episode), sensor data and lab test results are recorded in the patient's Electronic Health Record (EHR). While potentially containing a wealth of insights, the data is difficult to mine effectively, owing to varying length, irregular sampling and missi…

    Submitted 21 March, 2017; v1 submitted 11 November, 2015; originally announced November 2015.

  44. arXiv:1510.07641  [pdf, other]

    cs.LG

    Phenotyping of Clinical Time Series with LSTM Recurrent Neural Networks

    Authors: Zachary C. Lipton, David C. Kale, Randall C. Wetzel

    Abstract: We present a novel application of LSTM recurrent neural networks to multilabel classification of diagnoses given variable-length time series of clinical measurements. Our method outperforms a strong baseline on a variety of metrics.

    Submitted 21 March, 2017; v1 submitted 26 October, 2015; originally announced October 2015.

  45. arXiv:1506.00019  [pdf, other]

    cs.LG cs.NE

    A Critical Review of Recurrent Neural Networks for Sequence Learning

    Authors: Zachary C. Lipton, John Berkowitz, Charles Elkan

    Abstract: Countless learning tasks require dealing with sequential data. Image captioning, speech synthesis, and music generation all require that a model produce outputs that are sequences. In other domains, such as time series prediction, video analysis, and musical information retrieval, a model must learn from inputs that are sequences. Interactive tasks, such as translating natural language, engaging i…

    Submitted 17 October, 2015; v1 submitted 29 May, 2015; originally announced June 2015.

  46. arXiv:1505.06449  [pdf, ps, other]

    cs.LG

    Efficient Elastic Net Regularization for Sparse Linear Models

    Authors: Zachary C. Lipton, Charles Elkan

    Abstract: This paper presents an algorithm for efficient training of sparse linear models with elastic net regularization. Extending previous work on delayed updates, the new algorithm applies stochastic gradient updates to non-zero features only, bringing weights current as needed with closed-form updates. Closed-form delayed updates for the $\ell_1$, $\ell_{\infty}$, and rarely used $\ell_2$ regularizers…

    Submitted 2 July, 2015; v1 submitted 24 May, 2015; originally announced May 2015.

  47. arXiv:1412.7584  [pdf, ps, other]

    cs.LG cs.CR cs.DB

    Differential Privacy and Machine Learning: a Survey and Review

    Authors: Zhanglong Ji, Zachary C. Lipton, Charles Elkan

    Abstract: The objective of machine learning is to extract useful information from data, while privacy is preserved by concealing information. Thus it seems hard to reconcile these competing interests. However, they frequently must be balanced when mining sensitive data. For example, medical research represents an important application where it is necessary both to extract useful information and protect pati…

    Submitted 23 December, 2014; originally announced December 2014.

  48. arXiv:1402.1892  [pdf, other]

    stat.ML cs.IR cs.LG

    Thresholding Classifiers to Maximize F1 Score

    Authors: Zachary Chase Lipton, Charles Elkan, Balakrishnan Narayanaswamy

    Abstract: This paper provides new insight into maximizing F1 scores in the context of binary classification and also in the context of multilabel classification. The harmonic mean of precision and recall, F1 score is widely used to measure the success of a binary classifier when one class is rare. Micro average, macro average, and per instance average F1 scores are used in multilabel classification. For any…

    Submitted 13 May, 2014; v1 submitted 8 February, 2014; originally announced February 2014.