-
A Precise Characterization of SGD Stability Using Loss Surface Geometry
Authors:
Gregory Dexter,
Borja Ocejo,
Sathiya Keerthi,
Aman Gupta,
Ayan Acharya,
Rajiv Khanna
Abstract:
Stochastic Gradient Descent (SGD) stands as a cornerstone optimization algorithm with proven real-world empirical successes but relatively limited theoretical understanding. Recent research has illuminated a key factor contributing to its practical efficacy: the implicit regularization it instigates. Several studies have investigated the linear stability property of SGD in the vicinity of a stationary point as a predictive proxy for sharpness and generalization error in overparameterized neural networks (Wu et al., 2022; Jastrzebski et al., 2019; Cohen et al., 2021). In this paper, we delve deeper into the relationship between linear stability and sharpness. More specifically, we meticulously delineate the necessary and sufficient conditions for linear stability, contingent on hyperparameters of SGD and the sharpness at the optimum. Towards this end, we introduce a novel coherence measure of the loss Hessian that encapsulates pertinent geometric properties of the loss function that are relevant to the linear stability of SGD. It enables us to provide a simplified sufficient condition for identifying linear instability at an optimum. Notably, compared to previous works, our analysis relies on significantly milder assumptions and is applicable for a broader class of loss functions than known before, encompassing not only mean-squared error but also cross-entropy loss.
Submitted 22 January, 2024;
originally announced January 2024.
-
Knowledge Graph Reasoning Based on Attention GCN
Authors:
Meera Gupta,
Ravi Khanna,
Divya Choudhary,
Nandini Rao
Abstract:
We propose a novel technique to enhance Knowledge Graph Reasoning by combining Graph Convolution Neural Network (GCN) with the Attention Mechanism. This approach utilizes the Attention Mechanism to examine the relationships between entities and their neighboring nodes, which helps to develop detailed feature vectors for each entity. The GCN uses shared parameters to effectively represent the characteristics of adjacent entities. We first learn the similarity of entities for node representation learning. By integrating the attributes of the entities and their interactions, this method generates extensive implicit feature vectors for each entity, improving performance in tasks including entity classification and link prediction, outperforming traditional neural network models. To conclude, this work provides crucial methodological support for a range of applications, such as search engines, question-answering systems, recommendation systems, and data integration tasks.
Submitted 27 January, 2024; v1 submitted 2 December, 2023;
originally announced December 2023.
-
On Memorization and Privacy Risks of Sharpness Aware Minimization
Authors:
Young In Kim,
Pratiksha Agrawal,
Johannes O. Royset,
Rajiv Khanna
Abstract:
In many recent works, there is an increased focus on designing algorithms that seek flatter optima for neural network loss optimization as there is empirical evidence that it leads to better generalization performance in many datasets. In this work, we dissect these performance gains through the lens of data memorization in overparameterized models. We define a new metric that helps us identify the data points on which algorithms seeking flatter optima do better than vanilla SGD. We find that the generalization gains achieved by Sharpness Aware Minimization (SAM) are particularly pronounced for atypical data points, which necessitate memorization. This insight helps us unearth higher privacy risks associated with SAM, which we verify through exhaustive empirical evaluations. Finally, we propose mitigation strategies to achieve a more desirable accuracy vs privacy tradeoff.
Submitted 3 January, 2024; v1 submitted 30 September, 2023;
originally announced October 2023.
-
Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI
Authors:
Hangjie Shi,
Leslie Ball,
Govind Thattai,
Desheng Zhang,
Lucy Hu,
Qiaozi Gao,
Suhaila Shakiah,
Xiaofeng Gao,
Aishwarya Padmakumar,
Bofei Yang,
Cadence Chung,
Dinakar Guthy,
Gaurav Sukhatme,
Karthika Arumugam,
Matthew Wen,
Osman Ipek,
Patrick Lange,
Rohan Khanna,
Shreyas Pansare,
Vasu Sharma,
Chao Zhang,
Cris Flagg,
Daniel Pressel,
Lavina Vaz,
Luke Dai
, et al. (17 additional authors not shown)
Abstract:
The Alexa Prize program has empowered numerous university students to explore, experiment, and showcase their talents in building conversational agents through challenges like the SocialBot Grand Challenge and the TaskBot Challenge. As conversational agents increasingly appear in multimodal and embodied contexts, it is important to explore the affordances of conversational interaction augmented with computer vision and physical embodiment. This paper describes the SimBot Challenge, a new challenge in which university teams compete to build robot assistants that complete tasks in a simulated physical environment. This paper provides an overview of the SimBot Challenge, which included both online and offline challenge phases. We describe the infrastructure and support provided to the teams including Alexa Arena, the simulated environment, and the ML toolkit provided to teams to accelerate their building of vision and language models. We summarize the approaches the participating teams took to overcome research challenges and extract key lessons learned. Finally, we provide analysis of the performance of the competing SimBots during the competition.
Submitted 9 August, 2023;
originally announced August 2023.
-
Generalization Guarantees via Algorithm-dependent Rademacher Complexity
Authors:
Sarah Sachs,
Tim van Erven,
Liam Hodgkinson,
Rajiv Khanna,
Umut Simsekli
Abstract:
Algorithm- and data-dependent generalization bounds are required to explain the generalization behavior of modern machine learning algorithms. In this context, there exist information-theoretic generalization bounds that involve (various forms of) mutual information, as well as bounds based on hypothesis set stability. We propose a conceptually related, but technically distinct complexity measure to control generalization error, which is the empirical Rademacher complexity of an algorithm- and data-dependent hypothesis class. Combining standard properties of Rademacher complexity with the convenient structure of this class, we are able to (i) obtain novel bounds based on the finite fractal dimension, which (a) extend previous fractal dimension-type bounds from continuous to finite hypothesis classes, and (b) avoid a mutual information term that was required in prior work; (ii) we greatly simplify the proof of a recent dimension-independent generalization bound for stochastic gradient descent; and (iii) we easily recover results for VC classes and compression schemes, similar to approaches based on conditional mutual information.
Submitted 4 July, 2023;
originally announced July 2023.
-
Feature Space Sketching for Logistic Regression
Authors:
Gregory Dexter,
Rajiv Khanna,
Jawad Raheel,
Petros Drineas
Abstract:
We present novel bounds for coreset construction, feature selection, and dimensionality reduction for logistic regression. All three approaches can be thought of as sketching the logistic regression inputs. On the coreset construction front, we resolve open problems from prior work and present novel bounds for the complexity of coreset construction methods. On the feature selection and dimensionality reduction front, we initiate the study of forward error bounds for logistic regression. Our bounds are tight up to constant factors and our forward error bounds can be extended to Generalized Linear Models.
Submitted 24 March, 2023;
originally announced March 2023.
-
Alexa Arena: A User-Centric Interactive Platform for Embodied AI
Authors:
Qiaozi Gao,
Govind Thattai,
Suhaila Shakiah,
Xiaofeng Gao,
Shreyas Pansare,
Vasu Sharma,
Gaurav Sukhatme,
Hangjie Shi,
Bofei Yang,
Desheng Zheng,
Lucy Hu,
Karthika Arumugam,
Shui Hu,
Matthew Wen,
Dinakar Guthy,
Cadence Chung,
Rohan Khanna,
Osman Ipek,
Leslie Ball,
Kate Bland,
Heather Rocker,
Yadunandana Rao,
Michael Johnston,
Reza Ghanadan,
Arindam Mandal
, et al. (2 additional authors not shown)
Abstract:
We introduce Alexa Arena, a user-centric simulation platform for Embodied AI (EAI) research. Alexa Arena provides a variety of multi-room layouts and interactable objects, for the creation of human-robot interaction (HRI) missions. With user-friendly graphics and control mechanisms, Alexa Arena supports the development of gamified robotic tasks readily accessible to general human users, thus opening a new venue for high-efficiency HRI data collection and EAI system evaluation. Along with the platform, we introduce a dialog-enabled instruction-following benchmark and provide baseline results for it. We make Alexa Arena publicly available to facilitate research in building generalizable and assistive embodied agents.
Submitted 7 June, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization
Authors:
Kayhan Behdin,
Qingquan Song,
Aman Gupta,
Sathiya Keerthi,
Ayan Acharya,
Borja Ocejo,
Gregory Dexter,
Rajiv Khanna,
David Durfee,
Rahul Mazumder
Abstract:
Modern deep learning models are over-parameterized, where different optima can result in widely varying generalization performance. The Sharpness-Aware Minimization (SAM) technique modifies the fundamental loss function that steers gradient descent methods toward flatter minima, which are believed to exhibit enhanced generalization prowess. Our study delves into a specific variant of SAM known as micro-batch SAM (mSAM). This variation involves aggregating updates derived from adversarial perturbations across multiple shards (micro-batches) of a mini-batch during training. We extend a recently developed and well-studied general framework for flatness analysis to theoretically show that SAM achieves flatter minima than SGD, and mSAM achieves even flatter minima than SAM. We provide a thorough empirical evaluation of various image classification and natural language processing tasks to substantiate this theoretical advancement. We also show that contrary to previous work, mSAM can be implemented in a flexible and parallelizable manner without significantly increasing computational costs. Our implementation of mSAM yields superior generalization performance across a wide range of tasks compared to SAM, further supporting our theoretical framework.
Submitted 30 September, 2023; v1 submitted 19 February, 2023;
originally announced February 2023.
-
Fast Feature Selection with Fairness Constraints
Authors:
Francesco Quinzan,
Rajiv Khanna,
Moshik Hershcovitch,
Sarel Cohen,
Daniel G. Waddington,
Tobias Friedrich,
Michael W. Mahoney
Abstract:
We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions. The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work. Furthermore, our extension allows the use of downward-closed constraints, which can be used to encode certain fairness criteria into the feature selection process. We prove strong approximation guarantees for the algorithm based on standard assumptions. These guarantees are applicable to many parametric models, including Generalized Linear Models. Finally, we demonstrate empirically that the proposed algorithm competes favorably with state-of-the-art techniques for feature selection, on real-world and synthetic datasets.
Submitted 3 February, 2023; v1 submitted 28 February, 2022;
originally announced February 2022.
-
Identifying Reasoning Flaws in Planning-Based RL Using Tree Explanations
Authors:
Kin-Ho Lam,
Zhengxian Lin,
Jed Irvine,
Jonathan Dodge,
Zeyad T Shureih,
Roli Khanna,
Minsuk Kahng,
Alan Fern
Abstract:
Enabling humans to identify potential flaws in an agent's decision making is an important Explainable AI application. We consider identifying such flaws in a planning-based deep reinforcement learning (RL) agent for a complex real-time strategy game. In particular, the agent makes decisions via tree search using a learned model and evaluation function over interpretable states and actions. This gives the potential for humans to identify flaws at the level of reasoning steps in the tree, even if the entire reasoning process is too complex to understand. However, it is unclear whether humans will be able to identify such flaws due to the size and complexity of trees. We describe a user interface and case study, where a small group of AI experts and developers attempt to identify reasoning flaws due to inaccurate agent learning. Overall, the interface allowed the group to identify a number of significant flaws of varying types, demonstrating the promise of this approach.
Submitted 28 September, 2021;
originally announced September 2021.
-
Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers
Authors:
Liam Hodgkinson,
Umut Şimşekli,
Rajiv Khanna,
Michael W. Mahoney
Abstract:
Despite the ubiquitous use of stochastic optimization algorithms in machine learning, the precise impact of these algorithms and their dynamics on generalization performance in realistic non-convex settings is still poorly understood. While recent work has revealed connections between generalization and heavy-tailed behavior in stochastic optimization, this work mainly relied on continuous-time approximations; and a rigorous treatment for the original discrete-time iterations is yet to be performed. To bridge this gap, we present novel bounds linking generalization to the lower tail exponent of the transition kernel associated with the optimizer around a local minimum, in both discrete- and continuous-time settings. To achieve this, we first prove a data- and algorithm-dependent generalization bound in terms of the celebrated Fernique-Talagrand functional applied to the trajectory of the optimizer. Then, we specialize this result by exploiting the Markovian structure of stochastic optimizers, and derive bounds in terms of their (data-dependent) transition kernels. We support our theory with empirical results from a variety of neural networks, showing correlations between generalization error and lower tail exponents.
Submitted 11 July, 2022; v1 submitted 2 August, 2021;
originally announced August 2021.
-
LocalNewton: Reducing Communication Bottleneck for Distributed Learning
Authors:
Vipul Gupta,
Avishek Ghosh,
Michal Derezinski,
Rajiv Khanna,
Kannan Ramchandran,
Michael Mahoney
Abstract:
To address the communication bottleneck problem in distributed optimization within a master-worker framework, we propose LocalNewton, a distributed second-order algorithm with local averaging. In LocalNewton, the worker machines update their model in every iteration by finding a suitable second-order descent direction using only the data and model stored in their own local memory. We let the workers run multiple such iterations locally and communicate the models to the master node only once every few (say L) iterations. LocalNewton is highly practical since it requires only one hyperparameter, the number L of local iterations. We use novel matrix concentration-based techniques to obtain theoretical guarantees for LocalNewton, and we validate them with detailed empirical evaluation. To enhance practicability, we devise an adaptive scheme to choose L, and we show that this reduces the number of local iterations in worker machines between two model synchronizations as the training proceeds, successively refining the model quality at the master. Via extensive experiments using several real-world datasets with AWS Lambda workers and an AWS EC2 master, we show that LocalNewton requires fewer than 60% of the communication rounds (between master and workers) and less than 40% of the end-to-end running time, compared to state-of-the-art algorithms, to reach the same training loss.
Submitted 15 May, 2021;
originally announced May 2021.
-
Counterfactual State Explanations for Reinforcement Learning Agents via Generative Deep Learning
Authors:
Matthew L. Olson,
Roli Khanna,
Lawrence Neal,
Fuxin Li,
Weng-Keen Wong
Abstract:
Counterfactual explanations, which deal with "why not?" scenarios, can provide insightful explanations to an AI agent's behavior. In this work, we focus on generating counterfactual explanations for deep reinforcement learning (RL) agents which operate in visual input environments like Atari. We introduce counterfactual state explanations, a novel example-based approach to counterfactual explanations based on generative deep learning. Specifically, a counterfactual state illustrates what minimal change is needed to an Atari game image such that the agent chooses a different action. We also evaluate the effectiveness of counterfactual states on human participants who are not machine learning experts. Our first user study investigates if humans can discern if the counterfactual state explanations are produced by the actual game or produced by a generative deep learning approach. Our second user study investigates if counterfactual state explanations can help non-expert participants identify a flawed agent; we compare against a baseline approach based on a nearest neighbor explanation which uses images from the actual game. Our results indicate that counterfactual state explanations have sufficient fidelity to the actual game images to enable non-experts to more effectively identify a flawed RL agent compared to the nearest neighbor baseline and to having no explanation at all.
Submitted 29 January, 2021;
originally announced January 2021.
-
Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification
Authors:
Francisco Utrera,
Evan Kravitz,
N. Benjamin Erichson,
Rajiv Khanna,
Michael W. Mahoney
Abstract:
Transfer learning has emerged as a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains. This process consists of taking a neural network pre-trained on a large feature-rich source dataset, freezing the early layers that encode essential generic image properties, and then fine-tuning the last few layers in order to capture specific information related to the target situation. This approach is particularly useful when only limited or weakly labeled data are available for the new task. In this work, we demonstrate that adversarially-trained models transfer better than non-adversarially-trained models, especially if only limited data are available for the new domain task. Further, we observe that adversarial training biases the learnt representations to retaining shapes, as opposed to textures, which impacts the transferability of the source models. Finally, through the lens of influence functions, we discover that transferred adversarially-trained models contain more human-identifiable semantic information, which explains -- at least partly -- why adversarially-trained models transfer better.
Submitted 23 April, 2021; v1 submitted 11 July, 2020;
originally announced July 2020.
-
Boundary thickness and robustness in learning models
Authors:
Yaoqing Yang,
Rajiv Khanna,
Yaodong Yu,
Amir Gholami,
Kurt Keutzer,
Joseph E. Gonzalez,
Kannan Ramchandran,
Michael W. Mahoney
Abstract:
Robustness of machine learning models to various adversarial and non-adversarial corruptions continues to be of interest. In this paper, we introduce the notion of the boundary thickness of a classifier, and we describe its connection with and usefulness for model robustness. Thick decision boundaries lead to improved performance, while thin decision boundaries lead to overfitting (e.g., measured by the robust generalization gap between training and testing) and lower robustness. We show that a thicker boundary helps improve robustness against adversarial examples (e.g., improving the robust test accuracy of adversarial training) as well as so-called out-of-distribution (OOD) transforms, and we show that many commonly-used regularization and data augmentation procedures can increase boundary thickness. On the theoretical side, we establish that maximizing boundary thickness during training is akin to the so-called mixup training. Using these observations, we show that noise-augmentation on mixup training further increases boundary thickness, thereby combating vulnerability to various forms of adversarial attacks and OOD transforms. We can also show that the performance improvement in several lines of recent work happens in conjunction with a thicker boundary.
Submitted 12 January, 2021; v1 submitted 9 July, 2020;
originally announced July 2020.
-
Bayesian Coresets: Revisiting the Nonconvex Optimization Perspective
Authors:
Jacky Y. Zhang,
Rajiv Khanna,
Anastasios Kyrillidis,
Oluwasanmi Koyejo
Abstract:
Bayesian coresets have emerged as a promising approach for implementing scalable Bayesian inference. The Bayesian coreset problem involves selecting a (weighted) subset of the data samples, such that the posterior inference using the selected subset closely approximates the posterior inference using the full dataset. This manuscript revisits Bayesian coresets through the lens of sparsity constrained optimization. Leveraging recent advances in accelerated optimization methods, we propose and analyze a novel algorithm for coreset selection. We provide explicit convergence rate guarantees and present an empirical evaluation on a variety of benchmark datasets to highlight our proposed algorithm's superior performance compared to state-of-the-art on speed and accuracy.
Submitted 25 February, 2021; v1 submitted 1 July, 2020;
originally announced July 2020.
-
ForecastQA: A Question Answering Challenge for Event Forecasting with Temporal Text Data
Authors:
Woojeong Jin,
Rahul Khanna,
Suji Kim,
Dong-Ho Lee,
Fred Morstatter,
Aram Galstyan,
Xiang Ren
Abstract:
Event forecasting is a challenging, yet important task, as humans seek to constantly plan for the future. Existing automated forecasting studies rely mostly on structured data, such as time-series or event-based knowledge graphs, to help predict future events. In this work, we aim to formulate a task, construct a dataset, and provide benchmarks for developing methods for event forecasting with large volumes of unstructured text data. To simulate the forecasting scenario on temporal news documents, we formulate the problem as a restricted-domain, multiple-choice, question-answering (QA) task. Unlike existing QA tasks, our task limits accessible information, and thus a model has to make a forecasting judgement. To showcase the usefulness of this task formulation, we introduce ForecastQA, a question-answering dataset consisting of 10,392 event forecasting questions, which have been collected and verified via crowdsourcing efforts. We present our experiments on ForecastQA using BERT-based models and find that our best model achieves 60.1% accuracy on the dataset, which still lags behind human performance by about 19%. We hope ForecastQA will support future research efforts in bridging this gap.
Submitted 7 June, 2021; v1 submitted 2 May, 2020;
originally announced May 2020.
-
RICA: Evaluating Robust Inference Capabilities Based on Commonsense Axioms
Authors:
Pei Zhou,
Rahul Khanna,
Seyeon Lee,
Bill Yuchen Lin,
Daniel Ho,
Jay Pujara,
Xiang Ren
Abstract:
Pre-trained language models (PTLMs) have achieved impressive performance on commonsense inference benchmarks, but their ability to employ commonsense to make robust inferences, which is crucial for effective communications with humans, is debated. In the pursuit of advancing fluid human-AI communication, we propose a new challenge, RICA: Robust Inference capability based on Commonsense Axioms, that evaluates robust commonsense inference despite textual perturbations. To generate data for this challenge, we develop a systematic and scalable procedure using commonsense knowledge bases and probe PTLMs across two different evaluation settings. Extensive experiments on our generated probe sets with more than 10k statements show that PTLMs perform no better than random guessing on the zero-shot setting, are heavily impacted by statistical biases, and are not robust to perturbation attacks. We also find that fine-tuning on similar statements offers limited gains, as PTLMs still fail to generalize to unseen inferences. Our new large-scale benchmark exposes a significant gap between PTLMs and human-level language understanding and offers a new challenge for PTLMs to demonstrate commonsense.
Submitted 9 September, 2021; v1 submitted 2 May, 2020;
originally announced May 2020.
-
Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models
Authors:
Bill Yuchen Lin,
Seyeon Lee,
Rahul Khanna,
Xiang Ren
Abstract:
Recent works show that pre-trained language models (PTLMs), such as BERT, possess certain commonsense and factual knowledge. They suggest that it is promising to use PTLMs as "neural knowledge bases" via predicting masked words. Surprisingly, we find that this may not work for numerical commonsense knowledge (e.g., a bird usually has two legs). In this paper, we investigate whether and to what extent we can induce numerical commonsense knowledge from PTLMs as well as the robustness of this process. To study this, we introduce a novel probing task with a diagnostic dataset, NumerSense, containing 13.6k masked-word-prediction probes (10.5k for fine-tuning and 3.1k for testing). Our analysis reveals that: (1) BERT and its stronger variant RoBERTa perform poorly on the diagnostic dataset prior to any fine-tuning; (2) fine-tuning with distant supervision brings some improvement; (3) the best supervised model still performs poorly as compared to human performance (54.06% vs 96.3% in accuracy).
Submitted 17 September, 2020; v1 submitted 1 May, 2020;
originally announced May 2020.
-
LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation
Authors:
Dong-Ho Lee,
Rahul Khanna,
Bill Yuchen Lin,
Jamin Chen,
Seyeon Lee,
Qinyuan Ye,
Elizabeth Boschee,
Leonardo Neves,
Xiang Ren
Abstract:
Successfully training a deep neural network demands a huge corpus of labeled data. However, each label only provides limited information to learn from and collecting the requisite number of labels involves massive human effort. In this work, we introduce LEAN-LIFE, a web-based, Label-Efficient AnnotatioN framework for sequence labeling and classification tasks, with an easy-to-use UI that not only allows an annotator to provide the needed labels for a task, but also enables LearnIng From Explanations for each labeling decision. Such explanations enable us to generate useful additional labeled data from unlabeled instances, bolstering the pool of available training data. On three popular NLP tasks (named entity recognition, relation extraction, sentiment analysis), we find that using this enhanced supervision allows our models to surpass competitive baseline F1 scores by more than 5-10 percentage points, while using 2X fewer labeled instances. Our framework is the first to utilize this enhanced supervision technique and does so for three important tasks -- thus providing improved annotation recommendations to users and an ability to build datasets of (data, label, explanation) triples instead of the regular (data, label) pair.
Submitted 16 April, 2020;
originally announced April 2020.
-
A1: A Distributed In-Memory Graph Database
Authors:
Chiranjeeb Buragohain,
Knut Magne Risvik,
Paul Brett,
Miguel Castro,
Wonhee Cho,
Joshua Cowhig,
Nikolas Gloy,
Karthik Kalyanaraman,
Richendra Khanna,
John Pao,
Matthew Renzelmann,
Alex Shamis,
Timothy Tan,
Shuheng Zheng
Abstract:
A1 is an in-memory distributed database used by the Bing search engine to support complex queries over structured data. The key enablers for A1 are availability of cheap DRAM and high speed RDMA (Remote Direct Memory Access) networking in commodity hardware. A1 uses FaRM as its underlying storage layer and builds the graph abstraction and query engine on top. The combination of in-memory storage and RDMA access requires rethinking how data is allocated, organized and queried in a large distributed system. A single A1 cluster can store tens of billions of vertices and edges and support a throughput of 350+ million vertex reads per second with end-to-end query latency in single-digit milliseconds. In this paper we describe the A1 data model, RDMA optimized data structures and query execution.
Submitted 12 April, 2020;
originally announced April 2020.
-
Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method
Authors:
Michał Dereziński,
Rajiv Khanna,
Michael W. Mahoney
Abstract:
The Column Subset Selection Problem (CSSP) and the Nyström method are among the leading tools for constructing small low-rank approximations of large datasets in machine learning and scientific computing. A fundamental question in this area is: how well can a data subset of size k compete with the best rank k approximation? We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees which go beyond the standard worst-case analysis. Our approach leads to significantly better bounds for datasets with known rates of singular value decay, e.g., polynomial or exponential decay. Our analysis also reveals an intriguing phenomenon: the approximation factor as a function of k may exhibit multiple peaks and valleys, which we call a multiple-descent curve. A lower bound we establish shows that this behavior is not an artifact of our analysis, but rather it is an inherent property of the CSSP and Nyström tasks. Finally, using the example of a radial basis function (RBF) kernel, we show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
Submitted 18 December, 2020; v1 submitted 20 February, 2020;
originally announced February 2020.
-
The quotient Unimodular Vector group is nilpotent
Authors:
Reema Khanna,
Selby Jose,
Sampat Sharma,
Ravi A. Rao
Abstract:
Jose-Rao introduced and studied the Special Unimodular Vector group $SUm_r(R)$ and $EUm_r(R)$, its Elementary Unimodular Vector subgroup. They proved that for $r \geq 2$, $EUm_r(R)$ is a normal subgroup of $SUm_r(R)$. The Jose-Rao theorem says that the quotient Unimodular Vector group, $SUm_r(R)/EUm_r(R)$, for $r \geq 2$, is a subgroup of the orthogonal quotient group $SO_{2(r+1)}(R)/EO_{2(r + 1)}(R)$. The latter group is known to be nilpotent by the work of Hazrat-Vavilov, following methods of A. Bak; and so is the former.
In this article we give a direct proof, following ideas of A. Bak, to show that the quotient Unimodular Vector group is nilpotent of class $\leq d = \dim(R)$. We also use the Quillen-Suslin theory, inspired by A. Bak's method, to prove that if $R = A[X]$, with $A$ a local ring, then the quotient Unimodular Vector group is abelian.
Submitted 20 January, 2020;
originally announced January 2020.
-
Building an Aerial-Ground Robotics System for Precision Farming: An Adaptable Solution
Authors:
Alberto Pretto,
Stéphanie Aravecchia,
Wolfram Burgard,
Nived Chebrolu,
Christian Dornhege,
Tillmann Falck,
Freya Fleckenstein,
Alessandra Fontenla,
Marco Imperoli,
Raghav Khanna,
Frank Liebisch,
Philipp Lottes,
Andres Milioto,
Daniele Nardi,
Sandro Nardi,
Johannes Pfeifer,
Marija Popović,
Ciro Potena,
Cédric Pradalier,
Elisa Rothacker-Feder,
Inkyu Sa,
Alexander Schaefer,
Roland Siegwart,
Cyrill Stachniss,
Achim Walter
, et al. (3 additional authors not shown)
Abstract:
The application of autonomous robots in agriculture is gaining increasing popularity thanks to the high impact it may have on food security, sustainability, resource use efficiency, reduction of chemical treatments, and the optimization of human effort and yield. With this vision, the Flourish research project aimed to develop an adaptable robotic solution for precision farming that combines the aerial survey capabilities of small autonomous unmanned aerial vehicles (UAVs) with targeted intervention performed by multi-purpose unmanned ground vehicles (UGVs). This paper presents an overview of the scientific and technological advances and outcomes obtained in the project. We introduce multi-spectral perception algorithms and aerial and ground-based systems developed for monitoring crop density, weed pressure, crop nitrogen nutrition status, and to accurately classify and locate weeds. We then introduce the navigation and mapping systems tailored to our robots in the agricultural environment, as well as the modules for collaborative mapping. We finally present the ground intervention hardware, software solutions, and interfaces we implemented and tested in different field conditions and with different crops. We describe a real use case in which a UAV collaborates with a UGV to monitor the field and to perform selective spraying without human intervention.
Submitted 7 June, 2022; v1 submitted 8 November, 2019;
originally announced November 2019.
-
Learning Sparse Distributions using Iterative Hard Thresholding
Authors:
Jacky Y. Zhang,
Rajiv Khanna,
Anastasios Kyrillidis,
Oluwasanmi Koyejo
Abstract:
Iterative hard thresholding (IHT) is a projected gradient descent algorithm, known to achieve state of the art performance for a wide range of structured estimation problems, such as sparse inference. In this work, we consider IHT as a solution to the problem of learning sparse discrete distributions. We study the hardness of using IHT on the space of measures. As a practical alternative, we propose a greedy approximate projection which simultaneously captures appropriate notions of sparsity in distributions, while satisfying the simplex constraint, and investigate the convergence behavior of the resulting procedure in various settings. Our results show, both in theory and practice, that IHT can achieve state of the art results for learning sparse distributions.
Submitted 30 January, 2020; v1 submitted 29 October, 2019;
originally announced October 2019.
-
An Efficient Sampling-based Method for Online Informative Path Planning in Unknown Environments
Authors:
Lukas Schmid,
Michael Pantic,
Raghav Khanna,
Lionel Ott,
Roland Siegwart,
Juan Nieto
Abstract:
The ability to plan informative paths online is essential to robot autonomy. In particular, sampling-based approaches are often used as they are capable of using arbitrary information gain formulations. However, they are prone to local minima, resulting in sub-optimal trajectories, and sometimes do not reach global coverage. In this paper, we present a new RRT*-inspired online informative path planning algorithm. Our method continuously expands a single tree of candidate trajectories and rewires segments to maintain the tree and refine intermediate trajectories. This allows the algorithm to achieve global coverage and maximize the utility of a path in a global context, using a single objective function. We demonstrate the algorithm's capabilities in the applications of autonomous indoor exploration as well as accurate Truncated Signed Distance Field (TSDF)-based 3D reconstruction on-board a Micro Aerial Vehicle (MAV). We study the impact of commonly used information gain and cost formulations in these scenarios and propose a novel TSDF-based 3D reconstruction gain and cost-utility formulation. Detailed evaluation in realistic simulation environments shows that our approach outperforms state of the art methods in these tasks. Experiments on a real MAV demonstrate the ability of our method to robustly plan in real-time, exploring an indoor environment solely with on-board sensing and computation. We make our framework available for future research.
Submitted 14 January, 2020; v1 submitted 20 September, 2019;
originally announced September 2019.
-
Geometric Rates of Convergence for Kernel-based Sampling Algorithms
Authors:
Rajiv Khanna,
Liam Hodgkinson,
Michael W. Mahoney
Abstract:
The rate of convergence of weighted kernel herding (WKH) and sequential Bayesian quadrature (SBQ), two kernel-based sampling algorithms for estimating integrals with respect to some target probability measure, is investigated. Under verifiable conditions on the chosen kernel and target measure, we establish a near-geometric rate of convergence for target measures that are nearly atomic. Furthermore, we show these algorithms perform comparably to the theoretical best possible sampling algorithm under the maximum mean discrepancy. An analysis is also conducted in a distributed setting. Our theoretical developments are supported by empirical observations on simulated data as well as a real world application.
Submitted 31 October, 2021; v1 submitted 19 July, 2019;
originally announced July 2019.
-
Interpreting Black Box Predictions using Fisher Kernels
Authors:
Rajiv Khanna,
Been Kim,
Joydeep Ghosh,
Oluwasanmi Koyejo
Abstract:
Research in both machine learning and psychology suggests that salient examples can help humans to interpret learning models. To this end, we take a novel look at black box interpretation of test predictions in terms of training examples. Our goal is to ask `which training examples are most responsible for a given set of predictions'? To answer this question, we make use of Fisher kernels as the defining feature embedding of each data point, combined with Sequential Bayesian Quadrature (SBQ) for efficient selection of examples. In contrast to prior work, our method is able to seamlessly handle any sized subset of test predictions in a principled way. We theoretically analyze our approach, providing novel convergence bounds for SBQ over discrete candidate atoms. Our approach recovers the application of influence functions for interpretability as a special case yielding novel insights from this connection. We also present applications of the proposed approach to three use cases: cleaning training data, fixing mislabeled examples and data summarization.
Submitted 23 October, 2018;
originally announced October 2018.
-
The Pillars of Relative Quillen--Suslin Theory
Authors:
Rabeya Basu,
Reema Khanna,
Ravi A. Rao
Abstract:
We deduce the relative version of the equivalences relating the relative Local Global Principle and the Normality of the relative Elementary subgroups of the traditional classical groups, viz. general linear, symplectic and orthogonal groups. This generalizes our previous result for the absolute case.
Submitted 8 October, 2018;
originally announced October 2018.
-
AgriColMap: Aerial-Ground Collaborative 3D Mapping for Precision Farming
Authors:
Ciro Potena,
Raghav Khanna,
Juan Nieto,
Roland Siegwart,
Daniele Nardi,
Alberto Pretto
Abstract:
The combination of aerial survey capabilities of Unmanned Aerial Vehicles with targeted intervention abilities of agricultural Unmanned Ground Vehicles can significantly improve the effectiveness of robotic systems applied to precision agriculture. In this context, building and updating a common map of the field is an essential but challenging task. The maps built using robots of different types show differences in size, resolution and scale, the associated geolocation data may be inaccurate and biased, while the repetitiveness of both visual appearance and geometric structures found within agricultural contexts render classical map merging techniques ineffective. In this paper we propose AgriColMap, a novel map registration pipeline that leverages a grid-based multimodal environment representation which includes a vegetation index map and a Digital Surface Model. We cast the data association problem between maps built from UAVs and UGVs as a multimodal, large displacement dense optical flow estimation. The dominant, coherent flows, selected using a voting scheme, are used as point-to-point correspondences to infer a preliminary non-rigid alignment between the maps. A final refinement is then performed, by exploiting only meaningful parts of the registered maps. We evaluate our system using real world data for 3 fields with different crop species. The results show that our method outperforms several state of the art map registration and matching techniques by a large margin, and has a higher tolerance to large initial misalignments. We release an implementation of the proposed approach along with the acquired datasets with this paper.
Submitted 14 March, 2019; v1 submitted 30 September, 2018;
originally announced October 2018.
-
WeedMap: A large-scale semantic weed mapping framework using aerial multispectral imaging and deep neural network for precision farming
Authors:
Inkyu Sa,
Marija Popovic,
Raghav Khanna,
Zetao Chen,
Philipp Lottes,
Frank Liebisch,
Juan Nieto,
Cyrill Stachniss,
Achim Walter,
Roland Siegwart
Abstract:
We present a novel weed segmentation and mapping framework that processes multispectral images obtained from an unmanned aerial vehicle (UAV) using a deep neural network (DNN). Most studies on crop/weed semantic segmentation only consider single images for processing and classification. Images taken by UAVs often cover only a few hundred square meters with either color only or color and near-infrared (NIR) channels. Computing a single large and accurate vegetation map (e.g., crop/weed) using a DNN is non-trivial due to difficulties arising from: (1) limited ground sample distances (GSDs) in high-altitude datasets, (2) sacrificed resolution resulting from downsampling high-fidelity images, and (3) multispectral image alignment. To address these issues, we adopt a stand sliding window approach that operates on only small portions of multispectral orthomosaic maps (tiles), which are channel-wise aligned and calibrated radiometrically across the entire map. We define the tile size to be the same as that of the DNN input to avoid resolution loss. Compared to our baseline model (i.e., SegNet with 3 channel RGB inputs) yielding an area under the curve (AUC) of [background=0.607, crop=0.681, weed=0.576], our proposed model with 9 input channels achieves [0.839, 0.863, 0.782]. Additionally, we provide an extensive analysis of 20 trained models, both qualitatively and quantitatively, in order to evaluate the effects of varying input channels and tunable network hyperparameters. Furthermore, we release a large sugar beet/weed aerial dataset with expertly guided annotations for further research in the fields of remote sensing, precision agriculture, and agricultural robotics.
Submitted 6 September, 2018; v1 submitted 31 July, 2018;
originally announced August 2018.
-
Boosting Black Box Variational Inference
Authors:
Francesco Locatello,
Gideon Dresdner,
Rajiv Khanna,
Isabel Valera,
Gunnar Rätsch
Abstract:
Approximating a probability density in a tractable manner is a central task in Bayesian statistics. Variational Inference (VI) is a popular technique that achieves tractability by choosing a relatively simple variational family. Borrowing ideas from the classic boosting framework, recent approaches attempt to \emph{boost} VI by replacing the selection of a single density with a greedily constructed mixture of densities. In order to guarantee convergence, previous works impose stringent assumptions that require significant effort for practitioners. Specifically, they require a custom implementation of the greedy step (called the LMO) for every probabilistic model with respect to an unnatural variational family of truncated distributions. Our work fixes these issues with novel theoretical and algorithmic insights. On the theoretical side, we show that boosting VI satisfies a relaxed smoothness assumption which is sufficient for the convergence of the functional Frank-Wolfe (FW) algorithm. Furthermore, we rephrase the LMO problem and propose to maximize the Residual ELBO (RELBO) which replaces the standard ELBO optimization in VI. These theoretical enhancements allow for black box implementation of the boosting subroutine. Finally, we present a stopping criterion drawn from the duality gap in the classic FW analyses and exhaustive experiments to illustrate the usefulness of our theoretical and algorithmic contributions.
Submitted 28 November, 2018; v1 submitted 6 June, 2018;
originally announced June 2018.
-
IHT dies hard: Provable accelerated Iterative Hard Thresholding
Authors:
Rajiv Khanna,
Anastasios Kyrillidis
Abstract:
We study --both in theory and practice-- the use of momentum motions in classic iterative hard thresholding (IHT) methods. By simply modifying plain IHT, we investigate its convergence behavior on convex optimization criteria with non-convex constraints, under standard assumptions. In diverse scenaria, we observe that acceleration in IHT leads to significant improvements, compared to state of the art projected gradient descent and Frank-Wolfe variants. As a byproduct of our inspection, we study the impact of selecting the momentum parameter: similar to convex settings, two modes of behavior are observed --"rippling" and linear-- depending on the level of momentum.
Submitted 13 September, 2019; v1 submitted 26 December, 2017;
originally announced December 2017.
-
Autonomous Electric Race Car Design
Authors:
Niklas Funk,
Nikhilesh Alatur,
Robin Deuber,
Frederick Gonon,
Nico Messikommer,
Julian Nubert,
Moritz Patriarca,
Simon Schaefer,
Dominic Scotoni,
Nicholas Bünger,
Renaud Dube,
Raghav Khanna,
Mark Pfeiffer,
Erik Wilhelm,
Roland Siegwart
Abstract:
Autonomous driving and electric vehicles are nowadays very active research and development areas. In this paper we present the conversion of a standard Kyburz eRod into an autonomous vehicle that can be operated in challenging environments such as Swiss mountain passes. The overall hardware and software architectures are described in detail with a special emphasis on the sensor requirements for autonomous vehicles operating in partially structured environments. Furthermore, the design process itself and the finalized system architecture are presented. The work shows state of the art results in localization and controls for self-driving high-performance electric vehicles. Test results of the overall system are presented, which show the importance of generalizable state estimation algorithms to handle a plethora of conditions.
Submitted 1 November, 2017;
originally announced November 2017.
-
weedNet: Dense Semantic Weed Classification Using Multispectral Images and MAV for Smart Farming
Authors:
Inkyu Sa,
Zetao Chen,
Marija Popovic,
Raghav Khanna,
Frank Liebisch,
Juan Nieto,
Roland Siegwart
Abstract:
Selective weed treatment is a critical step in autonomous crop management as related to crop health and yield. However, a key challenge is reliable and accurate weed detection to minimize damage to surrounding plants. In this paper, we present an approach for dense semantic weed classification with multispectral images collected by a micro aerial vehicle (MAV). We use the recently developed encoder-decoder cascaded Convolutional Neural Network (CNN), Segnet, that infers dense semantic classes while allowing any number of input image channels and class balancing with our sugar beet and weed datasets. To obtain training datasets, we established an experimental field with varying herbicide levels resulting in field plots containing only either crop or weed, enabling us to use the Normalized Difference Vegetation Index (NDVI) as a distinguishable feature for automatic ground truth generation. We train 6 models with different numbers of input channels and condition (fine-tune) them to achieve about 0.8 F1-score and 0.78 Area Under the Curve (AUC) classification metrics. For model deployment, an embedded GPU system (Jetson TX2) is tested for MAV integration. The dataset used in this paper is released to support the community and future work.
Submitted 11 September, 2017;
originally announced September 2017.
-
Build Your Own Visual-Inertial Drone: A Cost-Effective and Open-Source Autonomous Drone
Authors:
Inkyu Sa,
Mina Kamel,
Michael Burri,
Michael Bloesch,
Raghav Khanna,
Marija Popovic,
Juan Nieto,
Roland Siegwart
Abstract:
This paper describes an approach to building a cost-effective and research grade visual-inertial odometry aided vertical taking-off and landing (VTOL) platform. We utilize an off-the-shelf visual-inertial sensor, an onboard computer, and a quadrotor platform that are factory-calibrated and mass-produced, thereby sharing similar hardware and sensor specifications (e.g., mass, dimensions, intrinsic and extrinsic of camera-IMU systems, and signal-to-noise ratio). We then perform a system calibration and identification enabling the use of our visual-inertial odometry, multi-sensor fusion, and model predictive control frameworks with the off-the-shelf products. This implies that we can partially avoid tedious parameter tuning procedures for building a full system. The complete system is extensively evaluated both indoors using a motion capture system and outdoors using a laser tracker while performing hover and step responses, and trajectory following tasks in the presence of external wind disturbances. We achieve root-mean-square (RMS) pose errors between a reference and actual trajectories of 0.036m, while performing hover. We also conduct relatively long distance flight (~180m) experiments on a farm site and achieve 0.82% drift error of the total distance flight. This paper conveys the insights we acquired about the platform and sensor module and returns to the community as open-source code with tutorial documentation.
Submitted 6 September, 2018; v1 submitted 22 August, 2017;
originally announced August 2017.
-
Boosting Variational Inference: an Optimization Perspective
Authors:
Francesco Locatello,
Rajiv Khanna,
Joydeep Ghosh,
Gunnar Rätsch
Abstract:
Variational inference is a popular technique to approximate a possibly intractable Bayesian posterior with a more tractable one. Recently, boosting variational inference has been proposed as a new paradigm to approximate the posterior by a mixture of densities by greedily adding components to the mixture. However, as is the case with many other variational inference algorithms, its theoretical properties have not been studied. In the present work, we study the convergence properties of this approach from a modern optimization viewpoint by establishing connections to the classic Frank-Wolfe algorithm. Our analysis yields novel theoretical insights regarding the sufficient conditions for convergence, explicit rates, and algorithmic simplifications. Since much of the focus in previous work on variational inference has been on tractability, our work is especially important as a much-needed attempt to bridge the gap between probabilistic models and their corresponding theoretical properties.
Submitted 7 March, 2018; v1 submitted 5 August, 2017;
originally announced August 2017.
-
Scalable Greedy Feature Selection via Weak Submodularity
Authors:
Rajiv Khanna,
Ethan Elenberg,
Alexandros G. Dimakis,
Sahand Negahban,
Joydeep Ghosh
Abstract:
Greedy algorithms are widely used for problems in machine learning such as feature selection and set function optimization. Unfortunately, for large datasets, the running time of even greedy algorithms can be quite high. This is because for each greedy step we need to refit a model or calculate a function using the previously selected choices and the new candidate.
Two algorithms that are faster approximations to the greedy forward selection were introduced recently ([Mirzasoleiman et al. 2013, 2015]). They achieve better performance by exploiting distributed computation and stochastic evaluation respectively. Both algorithms have provable performance guarantees for submodular functions.
In this paper we show that, contrary to previously held opinion, submodularity is not required to obtain approximation guarantees for these two algorithms. Specifically, we show that a generalized concept of weak submodularity suffices to give multiplicative approximation guarantees. Our result extends the applicability of these algorithms to a larger class of functions. Furthermore, we show that a bounded submodularity ratio can be used to provide data-dependent bounds that can sometimes be tighter even for submodular functions. We empirically validate our work by showing superior performance of fast greedy approximations versus several established baselines on artificial and real datasets.
Submitted 8 March, 2017;
originally announced March 2017.
-
On Approximation Guarantees for Greedy Low Rank Optimization
Authors:
Rajiv Khanna,
Ethan Elenberg,
Alexandros G. Dimakis,
Sahand Negahban
Abstract:
We provide new approximation guarantees for greedy low rank matrix estimation under standard assumptions of restricted strong convexity and smoothness. Our novel analysis also uncovers previously unknown connections between low rank estimation and combinatorial optimization, so much so that our bounds are reminiscent of corresponding approximation bounds in submodular maximization. Additionally, we provide statistical recovery guarantees. Finally, we present an empirical comparison of greedy estimation with established baselines on two important real-world problems.
Submitted 8 March, 2017;
originally announced March 2017.
-
A Unified Optimization View on Generalized Matching Pursuit and Frank-Wolfe
Authors:
Francesco Locatello,
Rajiv Khanna,
Michael Tschannen,
Martin Jaggi
Abstract:
Two of the most fundamental prototypes of greedy optimization are the matching pursuit and Frank-Wolfe algorithms. In this paper, we take a unified view on both classes of methods, leading to the first explicit convergence rates of matching pursuit methods in an optimization sense, for general sets of atoms. We derive sublinear ($1/t$) convergence for both classes on general smooth objectives, and linear convergence on strongly convex objectives, as well as a clear correspondence of algorithm variants. Our presented algorithms and rates are affine invariant, and do not need any incoherence or sparsity assumptions.
Submitted 7 March, 2017; v1 submitted 21 February, 2017;
originally announced February 2017.
-
Dynamic System Identification, and Control for a cost effective open-source VTOL MAV
Authors:
Inkyu Sa,
Mina Kamel,
Raghav Khanna,
Marija Popovic,
Juan Nieto,
Roland Siegwart
Abstract:
This paper describes dynamic system identification and full control of a cost-effective vertical take-off and landing (VTOL) multi-rotor micro-aerial vehicle (MAV), the DJI Matrice 100. The dynamics of the vehicle and autopilot controllers are identified using only a built-in IMU and utilized to design a subsequent model predictive controller (MPC). Experimental results for the control performance are evaluated using a motion capture system while performing hover, step response, and trajectory following tasks in the presence of external wind disturbances. We achieve root-mean-square (RMS) errors between the reference and actual trajectory of x=0.021m, y=0.016m, z=0.029m, roll=0.392deg, pitch=0.618deg, and yaw=1.087deg while performing hover. This paper also conveys the insights we have gained about the platform, which we return to the community through open-source code and documentation.
Submitted 9 March, 2017; v1 submitted 30 January, 2017;
originally announced January 2017.
-
Restricted Strong Convexity Implies Weak Submodularity
Authors:
Ethan R. Elenberg,
Rajiv Khanna,
Alexandros G. Dimakis,
Sahand Negahban
Abstract:
We connect high-dimensional subset selection and submodular maximization. Our results extend the work of Das and Kempe (2011) from the setting of linear regression to arbitrary objective functions. For greedy feature selection, this connection allows us to obtain strong multiplicative performance bounds on several methods without statistical modeling assumptions. We also derive recovery guarantees of this form under standard assumptions. Our work shows that greedy algorithms perform within a constant factor of the best possible subset-selection solution for a broad class of general objective functions. Our methods allow direct control over the number of obtained features, as opposed to regularization parameters that only implicitly control sparsity. Our proof technique uses the concept of weak submodularity initially defined by Das and Kempe. We draw a connection between convex analysis and submodular set function theory which may be of independent interest for other statistical learning applications that have combinatorial structure.
Submitted 12 October, 2017; v1 submitted 2 December, 2016;
originally announced December 2016.
-
AstroSat CZT Imager observations of GRB 151006A: timing, spectroscopy, and polarisation study
Authors:
A. R. Rao,
Vikas Chand,
M. K. Hingar,
S. Iyyani,
Rakesh Khanna,
A. P. K. Kutty,
J. P. Malkar,
D. Paul,
V. B. Bhalerao,
D. Bhattacharya,
G. C. Dewangan,
Pramod Pawar,
A. M. Vibhute,
T. Chattopadhyay,
N. P. S. Mithun,
S. V. Vadawale,
N. Vagshette,
R. Basak,
P. Pradeep,
Essy Samuel,
S. Sreekumar,
P. Vinod,
K. H. Navalgund,
R. Pandiyan,
K. S. Sarma
, et al. (2 additional authors not shown)
Abstract:
AstroSat is a multi-wavelength satellite launched on 2015 September 28. On its very first day of operation, the CZT Imager of AstroSat detected a long duration gamma-ray burst (GRB), namely GRB 151006A. Using the off-axis imaging and spectral response of the instrument, we demonstrate that CZT Imager can localise this GRB to within a few degrees and that it can provide, in conjunction with Swift, spectral parameters similar to those obtained from Fermi/GBM. Hence CZT Imager would be a useful addition to the currently operating GRB instruments (Swift and Fermi). Specifically, we argue that the CZT Imager will be most useful for the short hard GRBs by providing localisation for those detected by Fermi and spectral information for those detected only by Swift. We also provide preliminary results on a new exciting capability of this instrument: CZT Imager is able to identify Compton scattered events, thereby providing polarisation information for bright GRBs. GRB 151006A, in spite of being relatively faint, shows hints of a polarisation signal at 100-300 keV (though at a low significance level). We point out that CZT Imager should provide significant time-resolved polarisation measurements for GRBs that have fluence 3 times higher than that of GRB 151006A. We estimate that the number of such bright GRBs detectable by CZT Imager is 5 - 6 per year. CZT Imager can also act as a good hard X-ray monitoring device for possible electromagnetic counterparts of Gravitational Wave events.
Submitted 26 August, 2016;
originally announced August 2016.
-
Charged Particle Monitor on the AstroSat mission
Authors:
A. R. Rao,
M. H. Patil,
Yash Bhargava,
Rakesh Khanna,
M. K. Hingar,
A. P. K. Kutty,
J. P. Malkar,
Rupal Basak,
S. Sreekumar,
Essy Samuel,
P. Priya,
P. Vinod,
D. Bhattacharya,
V. Bhalerao,
S. V. Vadawale,
N. P. S. Mithun,
R. Pandiyan,
K. Subbarao,
S. Seetha,
K. Suryanarayana Sarma
Abstract:
The Charged Particle Monitor (CPM) on board the AstroSat satellite is an instrument designed to detect the flux of charged particles at the satellite location. A Cesium Iodide Thallium (CsI(Tl)) crystal is used with a Kapton window to detect protons with energies greater than 1 MeV. The ground calibration of CPM was done using gamma-rays from radioactive sources and protons from particle accelerators. Based on the ground calibration results, energy depositions above 1 MeV are accepted and particle counts are recorded. It is found that CPM counts are steady and that the signals for the onset and exit of the South Atlantic Anomaly (SAA) region are generated in a very reliable and stable manner.
Submitted 21 August, 2016;
originally announced August 2016.
-
The Cadmium Zinc Telluride Imager on AstroSat
Authors:
V. Bhalerao,
D. Bhattacharya,
A. Vibhute,
P. Pawar,
A. R. Rao,
M. K. Hingar,
Rakesh Khanna,
A. P. K. Kutty,
J. P. Malkar,
M. H. Patil,
Y. K. Arora,
S. Sinha,
P. Priya,
Essy Samuel,
S. Sreekumar,
P. Vinod,
N. P. S. Mithun,
S. V. Vadawale,
N. Vagshette,
K. H. Navalgund,
K. S. Sarma,
R. Pandiyan,
S. Seetha,
K. Subbarao
Abstract:
The Cadmium Zinc Telluride Imager (CZTI) is a high energy, wide-field imaging instrument on AstroSat. CZT's namesake Cadmium Zinc Telluride detectors cover an energy range from 20 keV to > 200 keV, with 11% energy resolution at 60 keV. The coded aperture mask attains an angular resolution of 17' over a 4.6 deg x 4.6 deg (FWHM) field of view. CZTI functions as an open detector above 100 keV, continuously sensitive to GRBs and other transients in about 30% of the sky. The pixellated detectors are sensitive to polarisation above ~100 keV, with exciting possibilities for polarisation studies of transients and bright persistent sources. In this paper, we provide details of the complete CZTI instrument, detectors, coded aperture mask, mechanical and electronic configuration, as well as data and products.
Submitted 11 August, 2016;
originally announced August 2016.
-
Information Projection and Approximate Inference for Structured Sparse Variables
Authors:
Rajiv Khanna,
Joydeep Ghosh,
Russell Poldrack,
Oluwasanmi Koyejo
Abstract:
Approximate inference via information projection has been recently introduced as a general-purpose approach for efficient probabilistic inference given sparse variables. This manuscript goes beyond classical sparsity by proposing efficient algorithms for approximate inference via information projection that are applicable to any structure on the set of variables that admits enumeration using a matroid. We show that the resulting information projection can be reduced to combinatorial submodular optimization subject to matroid constraints. Further, leveraging recent advances in submodular optimization, we provide an efficient greedy algorithm with strong optimization-theoretic guarantees. The class of probabilistic models that can be expressed in this way is quite broad and, as we show, includes group sparse regression, group sparse principal components analysis and sparse canonical correlation analysis, among others. Moreover, empirical results on simulated data and high dimensional neuroimaging data highlight the superior performance of the information projection approach as compared to established baselines for a range of probabilistic models.
Submitted 11 July, 2016;
originally announced July 2016.
-
Pursuits in Structured Non-Convex Matrix Factorizations
Authors:
Rajiv Khanna,
Michael Tschannen,
Martin Jaggi
Abstract:
Efficiently representing real-world data in a succinct and parsimonious manner is of central importance in many fields. We present a generalized greedy pursuit framework, allowing us to efficiently solve structured matrix factorization problems, where the factors are allowed to be from arbitrary sets of structured vectors. Such structure may include sparsity, non-negativeness, order, or a combination thereof. The algorithm approximates a given matrix by a linear combination of a few rank-1 matrices, each factorized into an outer product of two vector atoms of the desired structure. For the non-convex subproblems of obtaining good rank-1 structured matrix atoms, we employ and analyze a general atomic power method. In addition to the above applications, we prove linear convergence for generalized pursuit variants in Hilbert spaces - for the task of approximation over the linear span of arbitrary dictionaries - which generalizes OMP and is useful beyond matrix problems. Our experiments on real datasets confirm both the efficiency and the broad applicability of our framework in practice.
Submitted 12 February, 2016;
originally announced February 2016.
-
Towards a Better Understanding of Predict and Count Models
Authors:
S. Sathiya Keerthi,
Tobias Schnabel,
Rajiv Khanna
Abstract:
In a recent paper, Levy and Goldberg pointed out an interesting connection between prediction-based word embedding models and count models based on pointwise mutual information. Under certain conditions, they showed that both models end up optimizing equivalent objective functions. This paper explores this connection in more detail and lays out the factors leading to differences between these models. We find that the most relevant differences from an optimization perspective are (i) predict models work in a low dimensional space where embedding vectors can interact heavily; (ii) since predict models have fewer parameters, they are less prone to overfitting.
Motivated by the insight of our analysis, we show how count models can be regularized in a principled manner and provide closed-form solutions for L1 and L2 regularization. Finally, we propose a new embedding model with a convex objective and the additional benefit of being intelligible.
Submitted 6 November, 2015;
originally announced November 2015.
-
DPM: A State Space Model for Large-Scale Direct Marketing
Authors:
Yubin Park,
Rajiv Khanna,
Joydeep Ghosh,
Daniel Mihalko
Abstract:
We propose a novel statistical model to answer three challenges in direct marketing: which channel to use, which offer to make, and when to offer. There are several potential applications for the proposed model, for example, developing personalized marketing strategies and monitoring members' needs. Furthermore, the results from the model can complement and can be integrated with other existing models.
The proposed model, named Dynamic Propensity Model, is a latent variable time series model that utilizes both marketing and purchase histories of a customer. The latent variable in the model represents the customer's propensity to buy a product. The propensity derives from purchases and other observable responses. Marketing touches increase a member's propensity, and propensity score attenuates and propagates over time as governed by data-driven parameters. To estimate the parameters of the model, a new statistical methodology has been developed. This methodology makes use of particle methods with a stochastic gradient descent approach, resulting in fast estimation of the model coefficients even from big datasets. The model is validated using six months' marketing records from one of the largest insurance companies in the U.S. Experimental results indicate that the effects of marketing touches vary depending on both channels and products. We compare the predictive performance of the proposed model with lagged variable logistic regression. Limitations and extensions of the proposed algorithm are also discussed.
Submitted 4 July, 2015;
originally announced July 2015.
-
Kinetically engendered sub-spinodal length scales in spontaneous dewetting of thin liquid films
Authors:
TirumalaRao Kotni,
Jayati Sarkar,
Rajesh Khanna
Abstract:
Numerical simulations of spontaneous dewetting of non-slipping, variable viscosity unstable thin liquid films on homogeneous substrates reveal the existence of sub-spinodal lengthscales through formation of satellite holes, a marker of nucleated dewetting and/or heterogeneous substrates, in the late stages of dewetting if the liquid viscosity decreases continually with decreasing film thickness. These films also show established signatures of slipping films such as faster rupture and flatter morphologies in the early stages even without invoking any slippage.
Submitted 11 March, 2014;
originally announced March 2014.