Search | arXiv e-print repository

Views Can Be Deceiving: Improved SSL Through Feature Space Augmentation

Authors: Kimia Hamidieh, Haoran Zhang, Swami Sankaranarayanan, Marzyeh Ghassemi

Abstract: Supervised learning methods have been found to exhibit inductive biases favoring simpler features. When such features are spuriously correlated with the label, this can result in suboptimal performance on minority subgroups. Despite the growing popularity of methods which learn from unlabeled data, the extent to which these representations rely on spurious features for prediction is unclear. In this work, we explore the impact of spurious features on Self-Supervised Learning (SSL) for visual representation learning. We first empirically show that commonly used augmentations in SSL can cause undesired invariances in the image space, and illustrate this with a simple example. We further show that classical approaches in combating spurious correlations, such as dataset re-sampling during SSL, do not consistently lead to invariant representations. Motivated by these findings, we propose LateTVG to remove spurious information from these representations during pre-training, by regularizing later layers of the encoder via pruning. We find that our method produces representations which outperform the baselines on several benchmarks, without the need for group or label information during SSL.

Submitted 28 May, 2024; originally announced June 2024.

arXiv:2405.16344 [pdf, other]

Large Language Models Enable Automated Formative Feedback in Human-Robot Interaction Tasks

Authors: Emily Jensen, Sriram Sankaranarayanan, Bradley Hayes

Abstract: We claim that LLMs can be paired with formal analysis methods to provide accessible, relevant feedback for HRI tasks. While logic specifications are useful for defining and assessing a task, these representations are not easily interpreted by non-experts. Luckily, LLMs are adept at generating easy-to-understand text that explains difficult concepts. By integrating task assessment outcomes and other contextual information into an LLM prompt, we can effectively synthesize a useful set of recommendations for the learner to improve their performance.

Submitted 25 May, 2024; originally announced May 2024.

Comments: Presented at Human-LLM Interaction Workshop at HRI 2024

arXiv:2405.15982 [pdf, other]

Automated Assessment and Adaptive Multimodal Formative Feedback Improves Psychomotor Skills Training Outcomes in Quadrotor Teleoperation

Authors: Emily Jensen, Sriram Sankaranarayanan, Bradley Hayes

Abstract: The workforce will need to continually upskill in order to meet the evolving demands of industry, especially working with robotic and autonomous systems. Current training methods are not scalable and do not adapt to the skills that learners already possess. In this work, we develop a system that automatically assesses learner skill in a quadrotor teleoperation task using temporal logic task specifications. This assessment is used to generate multimodal feedback based on the principles of effective formative feedback. Participants perceived the feedback positively. Those receiving formative feedback viewed the feedback as more actionable compared to receiving summary statistics. Participants in the multimodal feedback condition were more likely to achieve a safe landing and increased their safe landings more over the experiment compared to other feedback conditions. Finally, we identify themes to improve adaptive feedback and discuss how training for complex psychomotor tasks can be integrated with learning theories.

Submitted 24 May, 2024; originally announced May 2024.

Comments: Under review at Human-Agent Interaction 2024 conference

arXiv:2405.07119 [pdf, ps, other]

Best-response Algorithms for Integer Convex Quadratic Simultaneous Games

Authors: Sriram Sankaranarayanan

Abstract: We evaluate the best-response algorithm in the context of pure-integer convex quadratic games. We provide a sufficient condition that if certain interaction matrices (the product of the inverse of the positive definite matrix defining the convex quadratic terms and the matrix that connects one player's problem to another's) have all their singular values less than 1, then finite termination of the best-response algorithm is guaranteed regardless of the initial point. Termination is triggered through cycling among a finite number of strategies for each player. Our findings indicate that if cycling happens, a relaxed version of the Nash equilibrium can be calculated by identifying a Nash equilibrium of a smaller finite game. Conversely, we prove that if every singular value of the interaction matrices is greater than 1, the algorithm will diverge from a large family of initial points. In addition, we provide an infinite family of examples in which some of the singular values of the interaction matrices are greater than 1, cycling occurs, but any mixed-strategy with support in the strategies where cycling occurs has arbitrarily better deviations. Then, we perform computational tests of our algorithm and compare it with standard algorithms to solve such problems. We notice that our algorithm finds a Nash equilibrium correctly in every instance. Moreover, compared to a state-of-the-art algorithm, our method shows similar performance in two-player games and significantly higher speed when involving three or more players.

Submitted 11 May, 2024; originally announced May 2024.
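
The best-response iteration described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: two players, one integer variable each, player i minimizing 0.5*q_i*x_i^2 + (c_i*x_j + d_i)*x_i with q_i > 0, and sequential (one player after the other) updates. In this scalar setting the "interaction matrix" reduces to the number c_i/q_i, whose only singular value is |c_i/q_i|.

```python
import math

def best_response(q, c, d, other):
    """Integer minimizer of 0.5*q*x**2 + (c*other + d)*x for q > 0.

    The continuous minimizer is -(c*other + d)/q; for a one-dimensional
    convex quadratic, a nearest integer is an integer minimizer.
    """
    return math.floor(-(c * other + d) / q + 0.5)

def iterate_best_response(q, c, d, start, max_rounds=1000):
    """Run sequential best responses until a state repeats.

    Returns the list of states forming the cycle (length 1 means a pure
    equilibrium of this toy game), or None if no repeat is seen.
    """
    x1, x2 = start
    seen = {}      # state -> index of first visit
    history = []
    for k in range(max_rounds):
        state = (x1, x2)
        if state in seen:
            return history[seen[state]:]
        seen[state] = k
        history.append(state)
        x1 = best_response(q[0], c[0], d[0], x2)
        x2 = best_response(q[1], c[1], d[1], x1)
    return None

# Hypothetical data with |c_i / q_i| = 0.5 < 1 for both players.
cycle = iterate_best_response(q=(2, 2), c=(1, 1), d=(-3, 1), start=(10, -7))
print(cycle)
```

A returned cycle of length 1 is a pair of mutual best responses; longer cycles correspond to the cycling case the abstract mentions, which this sketch only detects and does not resolve.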

arXiv:2405.00687 [pdf, other]

Optimal Planning for Timed Partial Order Specifications

Authors: Kandai Watanabe, Georgios Fainekos, Bardh Hoxha, Morteza Lahijanian, Hideki Okamoto, Sriram Sankaranarayanan

Abstract: This paper addresses the challenge of planning a sequence of tasks to be performed by multiple robots while minimizing the overall completion time subject to timing and precedence constraints. Our approach uses the Timed Partial Orders (TPO) model to specify these constraints. We translate this problem into a Traveling Salesman Problem (TSP) variant with timing and precedence constraints, and we solve it as a Mixed Integer Linear Programming (MILP) problem. Our contributions include a general planning framework for TPO specifications, a MILP formulation accommodating time windows and precedence constraints, its extension to multi-robot scenarios, and a method to quantify plan robustness. We demonstrate our framework on several case studies, including an aircraft turnaround task involving three Jackal robots, highlighting the approach's potential applicability to important real-world problems. Our benchmark results show that our MILP method outperforms the state-of-the-art open-source TSP solver OR-Tools.

Submitted 8 March, 2024; originally announced May 2024.

Comments: 2024 IEEE International Conference on Robotics and Automation

arXiv:2404.07170 [pdf, other]

doi 10.1145/3644815.3644989

Worst-Case Convergence Time of ML Algorithms via Extreme Value Theory

Authors: Saeid Tizpaz-Niari, Sriram Sankaranarayanan

Abstract: This paper leverages the statistics of extreme values to predict the worst-case convergence times of machine learning algorithms. Timing is a critical non-functional property of ML systems, and providing the worst-case convergence times is essential to guarantee the availability of ML and its services. However, timing properties such as worst-case convergence times (WCCT) are difficult to verify since (1) they are not encoded in the syntax or semantics of underlying programming languages of AI, (2) their evaluations depend on both algorithmic implementations and underlying systems, and (3) their measurements involve uncertainty and noise. Therefore, prevalent formal methods and statistical models fail to provide rich information on the amounts and likelihood of WCCT. Our key observation is that the timing information we seek represents the extreme tail of execution times. Therefore, extreme value theory (EVT), a statistical discipline that focuses on understanding and predicting the distribution of extreme values in the tail of outcomes, provides an ideal framework to model and analyze WCCT in the training and inference phases of the ML paradigm. Building upon the mathematical tools from EVT, we propose a practical framework to predict the worst-case timing properties of ML. Over a set of linear ML training algorithms, we show that EVT achieves a better accuracy for predicting WCCTs than relevant statistical methods such as the Bayesian factor. On the set of larger machine learning training algorithms and deep neural network inference, we show the feasibility and usefulness of EVT models to accurately predict WCCTs, their expected return periods, and their likelihood.

Submitted 10 April, 2024; originally announced April 2024.

Comments: In 3rd International Conference on AI Engineering: Software Engineering for AI (CAIN 2024)
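
As a rough illustration of the "return period" idea mentioned in this abstract, the sketch below fits a Gumbel distribution to block maxima by the method of moments and evaluates a return level. This is a textbook EVT recipe chosen for brevity; the abstract does not say which EVT model or estimator the paper uses, and the function names and sample timings are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def fit_gumbel_moments(block_maxima):
    """Method-of-moments Gumbel fit: mean = loc + gamma*scale,
    variance = (pi**2 / 6) * scale**2."""
    mean = statistics.fmean(block_maxima)
    std = statistics.stdev(block_maxima)
    scale = std * math.sqrt(6) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale

def return_level(loc, scale, period):
    """Value exceeded on average once every `period` blocks, obtained by
    solving exp(-exp(-(x - loc)/scale)) = 1 - 1/period for x."""
    return loc - scale * math.log(-math.log(1 - 1 / period))

# Hypothetical block maxima of convergence times (seconds), one per batch of runs.
maxima = [12.1, 13.4, 11.8, 15.2, 12.9, 14.0, 13.1, 16.3]
loc, scale = fit_gumbel_moments(maxima)
print(return_level(loc, scale, period=100))
```

A heavier-tailed model such as the generalized extreme value or generalized Pareto distribution would follow the same pattern with a different fit and quantile formula.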

arXiv:2401.09456 [pdf, ps, other]

Parametric Constraints for Bayesian Knowledge Tracing from First Principles

Authors: Denis Shchepakin, Sreecharan Sankaranarayanan, Dawn Zimmaro

Abstract: Bayesian Knowledge Tracing (BKT) is a probabilistic model of a learner's state of mastery corresponding to a knowledge component. It considers the learner's state of mastery as a "hidden" or latent binary variable and updates this state based on the observed correctness of the learner's response using parameters that represent transition probabilities between states. BKT is often represented as a Hidden Markov Model and the Expectation-Maximization (EM) algorithm is used to infer these parameters. However, this algorithm can suffer from several issues including producing multiple viable sets of parameters, settling into a local minimum, producing degenerate parameter values, and a high computational cost during fitting. This paper takes a "from first principles" approach to deriving constraints that can be imposed on the BKT parameter space. Starting from the basic mathematical truths of probability and building up to the behaviors expected of the BKT parameters in real systems, this paper presents a mathematical derivation that results in succinct constraints that can be imposed on the BKT parameter space. Since these constraints are necessary conditions, they can be applied prior to fitting in order to reduce computational cost and the likelihood of issues that can emerge from the EM procedure. In order to see that promise through, the paper further introduces a novel algorithm for estimating BKT parameters subject to the newly defined constraints. While the issue of degenerate parameter values has been reported previously, this paper is the first, to our best knowledge, to derive the constraints from first principles while also presenting an algorithm that respects those constraints.

Submitted 22 December, 2023; originally announced January 2024.

MSC Class: 62F15 (Primary) 62M05; 60J20; 68T30; 91E40 (Secondary)
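
For readers unfamiliar with BKT, the per-response update that the abstract alludes to can be sketched as follows. This is the standard textbook form of the model (Bayes' rule on the observed response, then a learning transition with no forgetting), not the constrained estimation algorithm proposed in the paper; the parameter names p_slip, p_guess, and p_transit are conventional labels assumed here.

```python
def bkt_update(p_mastery, correct, p_slip, p_guess, p_transit):
    """One step of standard Bayesian Knowledge Tracing.

    p_mastery: prior probability that the skill is already mastered
    correct:   whether the observed response was correct
    p_slip:    P(incorrect | mastered)
    p_guess:   P(correct | not mastered)
    p_transit: P(not mastered -> mastered) after the opportunity
    """
    if correct:
        evidence_mastered = p_mastery * (1 - p_slip)
        evidence_total = evidence_mastered + (1 - p_mastery) * p_guess
    else:
        evidence_mastered = p_mastery * p_slip
        evidence_total = evidence_mastered + (1 - p_mastery) * (1 - p_guess)
    # Posterior mastery given the observation (Bayes' rule).
    posterior = evidence_mastered / evidence_total
    # Learning transition: an unmastered skill may become mastered.
    return posterior + (1 - posterior) * p_transit

# Trace mastery over a short, made-up response sequence.
p = 0.5
for observed_correct in [True, True, False]:
    p = bkt_update(p, observed_correct, p_slip=0.1, p_guess=0.2, p_transit=0.1)
    print(round(p, 4))
```

Degenerate parameter values of the kind the abstract mentions show up here as updates that behave counterintuitively, for example a correct answer lowering the mastery estimate when p_guess exceeds 1 - p_slip.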

arXiv:2311.08594 [pdf, other]

Variational Temporal IRT: Fast, Accurate, and Explainable Inference of Dynamic Learner Proficiency

Authors: Yunsung Kim, Sreechan Sankaranarayanan, Chris Piech, Candace Thille

Abstract: Dynamic Item Response Models extend the standard Item Response Theory (IRT) to capture temporal dynamics in learner ability. While these models have the potential to allow instructional systems to actively monitor the evolution of learner proficiency in real time, existing dynamic item response models rely on expensive inference algorithms that scale poorly to massive datasets. In this work, we propose Variational Temporal IRT (VTIRT) for fast and accurate inference of dynamic learner proficiency. VTIRT offers orders of magnitude speedup in inference runtime while still providing accurate inference. Moreover, the proposed algorithm is intrinsically interpretable by virtue of its modular design. When applied to 9 real student datasets, VTIRT consistently yields improvements in predicting future learner performance over other learner proficiency models.

Submitted 14 November, 2023; originally announced November 2023.

Comments: 9 pages, 16th International Conference on Educational Data Mining (EDM'23)

arXiv:2306.02817 [pdf, other]

Integer Programming Games: A Gentle Computational Overview

Authors: Margarida Carvalho, Gabriele Dragotto, Andrea Lodi, Sriram Sankaranarayanan

Abstract: In this tutorial, we present a computational overview on computing Nash equilibria in Integer Programming Games (IPGs), i.e., how to compute solutions for a class of non-cooperative and nonconvex games where each player solves a mixed-integer optimization problem. IPGs are a broad class of games extending the modeling power of mixed-integer optimization to multi-agent settings. This class of games includes, for instance, any finite game and any multi-agent extension of traditional combinatorial optimization problems. After providing some background motivation and context of applications, we systematically review and classify the state-of-the-art algorithms to compute Nash equilibria. We propose an essential taxonomy of the algorithmic ingredients needed to compute equilibria, and we describe the theoretical and practical challenges associated with equilibria computation. Finally, we quantitatively and qualitatively compare a sequential Stackelberg game with a simultaneous IPG to highlight the different properties of their solutions.

Submitted 12 June, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

Comments: To appear in INFORMS TutORials in Operations Research 2023

arXiv:2303.06582 [pdf, other]

Certifiably-correct Control Policies for Safe Learning and Adaptation in Assistive Robotics

Authors: Keyvan Majd, Geoffrey Clark, Tanmay Khandait, Siyu Zhou, Sriram Sankaranarayanan, Georgios Fainekos, Heni Ben Amor

Abstract: Guaranteeing safety in human-centric applications is critical in robot learning as the learned policies may demonstrate unsafe behaviors in formerly unseen scenarios. We present a framework to locally repair an erroneous policy network to satisfy a set of formal safety constraints using Mixed Integer Quadratic Programming (MIQP). Our MIQP formulation explicitly imposes the safety constraints on the learned policy while minimizing the original loss function. The policy network is then verified to be locally safe. We demonstrate the application of our framework to derive safe policies for a robotic lower-leg prosthesis.

Submitted 12 March, 2023; originally announced March 2023.

Comments: Appeared in the 36th Conference on Neural Information Processing Systems (NeurIPS) - Robot Learning Workshop. arXiv admin note: substantial text overlap with arXiv:2303.04431

arXiv:2303.04431 [pdf, other]

Safe Robot Learning in Assistive Devices through Neural Network Repair

Authors: Keyvan Majd, Geoffrey Clark, Tanmay Khandait, Siyu Zhou, Sriram Sankaranarayanan, Georgios Fainekos, Heni Ben Amor

Abstract: Assistive robotic devices are a particularly promising field of application for neural networks (NN) due to the need for personalization and hard-to-model human-machine interaction dynamics. However, NN-based estimators and controllers may produce potentially unsafe outputs over previously unseen data points. In this paper, we introduce an algorithm for updating NN control policies to satisfy a given set of formal safety constraints, while also optimizing the original loss function. Given a set of mixed-integer linear constraints, we define the NN repair problem as a Mixed Integer Quadratic Program (MIQP). In extensive experiments, we demonstrate the efficacy of our repair method in generating safe policies for a lower-leg prosthesis.

Submitted 8 March, 2023; originally announced March 2023.

Journal ref: PMLR 205:2148-2158, 2023

arXiv:2301.01148 [pdf, other]

doi 10.1016/j.apenergy.2023.121323

MERLIN: Multi-agent offline and transfer learning for occupant-centric energy flexible operation of grid-interactive communities using smart meter data and CityLearn

Authors: Kingsley Nweye, Siva Sankaranarayanan, Zoltan Nagy

Abstract: The decarbonization of buildings presents new challenges for the reliability of the electrical grid as a result of the intermittency of renewable energy sources and increase in grid load brought about by end-use electrification. To restore reliability, grid-interactive efficient buildings can provide flexibility services to the grid through demand response. Residential demand response programs are hindered by the need for manual intervention by customers. To maximize the energy flexibility potential of residential buildings, an advanced control architecture is needed. Reinforcement learning is well-suited for the control of flexible resources as it is able to adapt to unique building characteristics compared to expert systems. Yet, factors hindering the adoption of RL in real-world applications include its large data requirements for training, control security and generalizability. Here we address these challenges by proposing the MERLIN framework and using a digital twin of a real-world 17-building grid-interactive residential community in CityLearn. We show that 1) independent RL-controllers for batteries improve building and district level KPIs compared to a reference RBC by tailoring their policies to individual buildings, 2) despite unique occupant behaviours, transferring the RL policy of any one of the buildings to other buildings provides comparable performance while reducing the cost of training, 3) training RL-controllers on limited temporal data that does not capture full seasonality in occupant behaviour has little effect on performance. Although the zero-net-energy (ZNE) condition of the buildings could be maintained or worsened as a result of controlled batteries, KPIs that are typically improved by ZNE condition (electricity price and carbon emissions) are further improved when the batteries are managed by an advanced controller.

Submitted 31 December, 2022; originally announced January 2023.

Comments: under review

arXiv:2211.11031 [pdf, other]

Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors

Authors: Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi

Abstract: Deployed language models decay over time due to shifting inputs, changing user needs, or emergent world-knowledge gaps. When such problems are identified, we want to make targeted edits while avoiding expensive retraining. However, current model editors, which modify such behaviors of pre-trained models, degrade model performance quickly across multiple, sequential edits. We propose GRACE, a lifelong model editing method, which implements spot-fixes on streaming errors of a deployed model, ensuring minimal impact on unrelated inputs. GRACE writes new mappings into a pre-trained model's latent space, creating a discrete, local codebook of edits without altering model weights. This is the first method enabling thousands of sequential edits using only streaming errors. Our experiments on T5, BERT, and GPT models show GRACE's state-of-the-art performance in making and retaining edits, while generalizing to unseen inputs. Our code is available at https://www.github.com/thartvigsen/grace.

Submitted 17 October, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

Comments: Accepted to NeurIPS 2023

arXiv:2211.08194 [pdf]

Machine learning for classifying and interpreting coherent X-ray speckle patterns

Authors: Mingren Shen, Dina Sheyfer, Troy David Loeffler, Subramanian K. R. S. Sankaranarayanan, G. Brian Stephenson, Maria K. Y. Chan, Dane Morgan

Abstract: Speckle patterns produced by coherent X-rays have a close relationship with the internal structure of materials, but quantitative inversion of the relationship to determine structure from speckle patterns is challenging. Here, we investigate the link between coherent X-ray speckle patterns and sample structures using a model 2D disk system and explore the ability of machine learning to learn aspects of the relationship. Specifically, we train a deep neural network to classify the coherent X-ray speckle patterns according to the disk number density in the corresponding structure. It is demonstrated that the classification system is accurate for both non-disperse and disperse size distributions.

Submitted 1 September, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

arXiv:2208.09751 [pdf, other]

doi 10.1109/XLOOP56614.2022.00007

MLExchange: A web-based platform enabling exchangeable machine learning workflows for scientific studies

Authors: Zhuowen Zhao, Tanny Chavez, Elizabeth A. Holman, Guanhua Hao, Adam Green, Harinarayan Krishnan, Dylan McReynolds, Ronald Pandolfi, Eric J. Roberts, Petrus H. Zwart, Howard Yanxon, Nicholas Schwarz, Subramanian Sankaranarayanan, Sergei V. Kalinin, Apurva Mehta, Stuart Campbell, Alexander Hexemer

Abstract: Machine learning (ML) algorithms are showing a growing trend in helping the scientific communities across different disciplines and institutions to address large and diverse data problems. However, many available ML tools are programmatically demanding and computationally costly. The MLExchange project aims to build a collaborative platform equipped with enabling tools that allow scientists and facility users who do not have a profound ML background to use ML and computational resources in scientific discovery. At the high level, we are targeting a full user experience where managing and exchanging ML algorithms, workflows, and data are readily available through web applications. Since each component is an independent container, the whole platform or its individual service(s) can be easily deployed at servers of different scales, ranging from a personal device (laptop, smart phone, etc.) to high performance clusters (HPC) accessed (simultaneously) by many users. Thus, MLExchange supports flexible usage scenarios -- users could either access the services and resources from a remote server or run the whole platform or its individual service(s) within their local network.

Submitted 26 January, 2023; v1 submitted 20 August, 2022; originally announced August 2022.

Comments: The accepted version with DOI and IEEE copyright notice in the first page

Journal ref: 2022 4th IEEE/ACM Annual Workshop on Extreme-scale Experiment-in-the-Loop Computing (XLOOP)

arXiv:2207.10074 [pdf, other]

Semantic uncertainty intervals for disentangled latent spaces

Authors: Swami Sankaranarayanan, Anastasios N. Angelopoulos, Stephen Bates, Yaniv Romano, Phillip Isola

Abstract: Meaningful uncertainty quantification in computer vision requires reasoning about semantic information -- say, the hair color of the person in a photo or the location of a car on the street. To this end, recent breakthroughs in generative modeling allow us to represent semantic information in disentangled latent spaces, but providing uncertainties on the semantic latent variables has remained challenging. In this work, we provide principled uncertainty intervals that are guaranteed to contain the true semantic factors for any underlying generative model. The method does the following: (1) it uses quantile regression to output a heuristic uncertainty interval for each element in the latent space, and (2) it calibrates these uncertainties such that they contain the true value of the latent for a new, unseen input. The endpoints of these calibrated intervals can then be propagated through the generator to produce interpretable uncertainty visualizations for each semantic factor. This technique reliably communicates semantically meaningful, principled, and instance-adaptive uncertainty in inverse problems like image super-resolution and image completion.

Submitted 30 November, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

Comments: Accepted to NeurIPS 2022. Project page: https://swamiviv.github.io/semantic_uncertainty_intervals/

arXiv:2205.12722 [pdf, other]

Mathematical Models of Human Drivers Using Artificial Risk Fields

Authors: Emily Jensen, Maya Luster, Hansol Yoon, Brandon Pitts, Sriram Sankaranarayanan

Abstract: In this paper, we use the concept of artificial risk fields to predict how human operators control a vehicle in response to upcoming road situations. A risk field assigns a non-negative risk measure to the state of the system in order to model how close that state is to violating a safety property, such as hitting an obstacle or exiting the road. Using risk fields, we construct a stochastic model of the operator that maps from states to likely actions. We demonstrate our approach on a driving task wherein human subjects are asked to drive a car inside a realistic driving simulator while avoiding obstacles placed on the road. We show that the most likely risk field given the driving data is obtained by solving a convex optimization problem. Next, we apply the inferred risk fields to generate distinct driving behaviors while comparing predicted trajectories against ground truth measurements. We observe that the risk fields are excellent at predicting future trajectory distributions, with high accuracy for prediction horizons of up to twenty seconds. At the same time, we observe some challenges such as the inability to account for how drivers choose to accelerate/decelerate based on the road conditions.

Submitted 31 August, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

Comments: 8 pages, 4 figures, accepted to Intelligent Transportation Systems Conference

arXiv:2203.17274 [pdf, other]

Exploring Visual Prompts for Adapting Large-Scale Models

Authors: Hyojin Bahng, Ali Jahanian, Swami Sankaranarayanan, Phillip Isola

Abstract: We investigate the efficacy of visual prompting to adapt large-scale models in vision. Following the recent approach from prompt tuning and adversarial reprogramming, we learn a single image perturbation such that a frozen model prompted with this perturbation performs a new task. Through comprehensive experiments, we demonstrate that visual prompting is particularly effective for CLIP and robust to distribution shift, achieving performance competitive with standard linear probes. We further analyze properties of the downstream dataset, prompt design, and output transformation in regard to adaptation performance. The surprising effectiveness of visual prompting provides a new perspective on adapting pre-trained models in vision. Code is available at http://hjbahng.github.io/visual_prompting .

Submitted 3 June, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

Comments: 16 pages, 10 figures

arXiv:2201.04308 [pdf, other]

Cooperative Security Against Interdependent Risks

Authors: Sanjith Gopalakrishnan, Sriram Sankaranarayanan

Abstract: Firms in inter-organizational networks such as supply chains or strategic alliances are exposed to interdependent risks. These are risks that are transferable across partner firms. They can be decomposed into intrinsic risks a firm faces from its own operations and extrinsic risks transferred from its partners. Firms broadly have access to two security strategies: either they can independently eliminate both intrinsic and extrinsic risks by securing their links with partners, or alternatively, firms can cooperate with partners to eliminate sources of intrinsic risk in the network. We develop a graph-theoretic model of interdependent security and demonstrate that the network-optimal security strategy can be computed in polynomial time. Then, we use cooperative game-theoretic tools to examine whether and when firms can sustain the network-optimal security strategy via cost-sharing mechanisms that are stable, fair, computable, and implementable via a series of bilateral cost-sharing arrangements. We consider different informational assumptions in the network and show that, when the players know only their own costs, firms have a clear incentive to cooperate globally whereas, in the presence of public information, there may not exist cost-sharing mechanisms that can sustain network-wide cooperation. We then design a novel cost-sharing mechanism: the agreeable allocation, that is easy to compute, bilaterally implementable, ensures stability, and is fair in a well-defined sense. However, the agreeable allocation need not always exist. We then generalize levels of agreeable allocation, with weaker implementability properties but greater existence guarantees.

Submitted 8 May, 2023; v1 submitted 12 January, 2022; originally announced January 2022.

arXiv:2111.07932 [pdf, other]

ZERO: Playing Mathematical Programming Games

Authors: Gabriele Dragotto, Sriram Sankaranarayanan, Margarida Carvalho, Andrea Lodi

Abstract: We present ZERO, a modular and extensible C++ library interfacing Mathematical Programming and Game Theory. ZERO provides a comprehensive toolkit of modeling interfaces and algorithms for Reciprocally Bilinear Games (RBGs), i.e., simultaneous non-cooperative games where each player solves a mathematical program with a linear objective in the player's variable and bilinear in its opponents' variables. This class of games generalizes the classical problems of Operations Research to a multi-agent setting. ZERO's modular structure gives users all the elementary ingredients to design new game-theoretic models and algorithms for RBGs, and find their Nash equilibria. The library provides additional extended support for integer non-convexities, linear bilevel problems, and linear equilibrium problems with equilibrium constraints. We provide an overview of the software's key components and showcase a Knapsack Game, i.e., a game where each player solves a binary knapsack problem. Aiming to boost practical methodological contributions at the interplay of Mathematical Programming and Game Theory, we release ZERO as open-source software. Source code, documentation and examples are available at www.getzero.one.

Submitted 12 December, 2021; v1 submitted 15 November, 2021; originally announced November 2021.

arXiv:2111.05726 [pdf, other]

The Cut-and-Play Algorithm: Computing Nash Equilibria via Outer Approximations

Authors: Margarida Carvalho, Gabriele Dragotto, Andrea Lodi, Sriram Sankaranarayanan

Abstract: We introduce Cut-and-Play, a practically-efficient algorithm for computing Nash equilibria in simultaneous non-cooperative games where players decide via nonconvex and possibly unbounded optimization problems with separable payoff functions. Our algorithm exploits an intrinsic relationship between the equilibria of the original nonconvex game and the ones of a convexified counterpart. In practice, Cut-and-Play formulates a series of convex approximations of the game and iteratively refines them with cutting planes and branching operations. Our algorithm does not require convexity or continuity of the player's optimization problems and can be integrated with existing optimization software. We test Cut-and-Play on two families of challenging nonconvex games involving discrete decisions and bilevel problems, and we empirically demonstrate that it efficiently computes equilibria while outperforming existing game-specific algorithms.

Submitted 3 May, 2024; v1 submitted 10 November, 2021; originally announced November 2021.

arXiv:2109.14053 [pdf, other]

AutoPhaseNN: Unsupervised Physics-aware Deep Learning of 3D Nanoscale Bragg Coherent Diffraction Imaging

Authors: Yudong Yao, Henry Chan, Subramanian Sankaranarayanan, Prasanna Balaprakash, Ross J. Harder, Mathew J. Cherukara

Abstract: The problem of phase retrieval, or the algorithmic recovery of lost phase information from measured intensity alone, underlies various imaging methods from astronomy to nanoscale imaging. Traditional methods of phase retrieval are iterative in nature, and are therefore computationally expensive and time consuming. More recently, deep learning (DL) models have been developed to either provide learned priors to iterative phase retrieval or in some cases completely replace phase retrieval with networks that learn to recover the lost phase information from measured intensity alone. However, such models require vast amounts of labeled data, which can only be obtained through simulation or performing computationally prohibitive phase retrieval on hundreds of or even thousands of experimental datasets. Using a 3D nanoscale X-ray imaging modality (Bragg Coherent Diffraction Imaging or BCDI) as a representative technique, we demonstrate AutoPhaseNN, a DL-based approach which learns to solve the phase problem without labeled data. By incorporating the physics of the imaging technique into the DL model during training, AutoPhaseNN learns to invert 3D BCDI data from reciprocal space to real space in a single shot without ever being shown real space images. Once trained, AutoPhaseNN is about one hundred times faster than traditional iterative phase retrieval methods while providing comparable image quality.

Submitted 4 April, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

MSC Class: 68T07; 00A79

arXiv:2109.14041 [pdf, other]

Local Repair of Neural Networks Using Optimization

Authors: Keyvan Majd, Siyu Zhou, Heni Ben Amor, Georgios Fainekos, Sriram Sankaranarayanan

Abstract: In this paper, we propose a framework to repair a pre-trained feed-forward neural network (NN) to satisfy a set of properties. We formulate the properties as a set of predicates that impose constraints on the output of NN over the target input domain. We define the NN repair problem as a Mixed Integer Quadratic Program (MIQP) to adjust the weights of a single layer subject to the given predicates while minimizing the original loss function over the original training domain. We demonstrate the application of our framework in bounding an affine transformation, correcting an erroneous NN in classification, and bounding the inputs of a NN controller.

Submitted 28 September, 2021; originally announced September 2021.

arXiv:2108.01227 [pdf, other]

Predictive Runtime Monitoring for Mobile Robots using Logic-Based Bayesian Intent Inference

Authors: Hansol Yoon, Sriram Sankaranarayanan

Abstract: We propose a predictive runtime monitoring framework that forecasts the distribution of future positions of mobile robots in order to detect and avoid impending property violations such as collisions with obstacles or other agents. Our approach uses a restricted class of temporal logic formulas to represent the likely intentions of the agents along with a combination of temporal logic-based optimal cost path planning and Bayesian inference to compute the probability of these intents given the current trajectory of the robot. First, we construct a large but finite hypothesis space of possible intents represented as temporal logic formulas whose atomic propositions are derived from a detailed map of the robot's workspace. Next, our approach uses real-time observations of the robot's position to update a distribution over temporal logic formulae that represent its likely intent. This is performed by using a combination of optimal cost path planning and a Boltzmann noisy rationality model. In this manner, we construct a Bayesian approach to evaluating the posterior probability of various hypotheses given the observed states and actions of the robot. Finally, we predict the future position of the robot by drawing posterior predictive samples using a Monte-Carlo method. We evaluate our framework using two different trajectory datasets that contain multiple scenarios implementing various tasks. The results show that our method can predict future positions precisely and efficiently, so that the computation time for generating a prediction is a tiny fraction of the overall time horizon.

Submitted 2 August, 2021; originally announced August 2021.

Comments: Presented at ICRA 2021
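
The Bayesian update sketched in the abstract above (a Boltzmann noisy rationality likelihood combined with a prior over candidate intents) can be illustrated with a few lines of Python. This is only a minimal sketch, not the authors' implementation: the function name `boltzmann_posterior`, the `extra_costs` input (how much more the observed motion costs than the optimal path for each intent) and the rationality coefficient `beta` are all assumptions introduced for illustration.

```python
import math

def boltzmann_posterior(prior, extra_costs, beta=1.0):
    """Bayesian update over candidate intents (illustrative sketch).

    prior:       dict mapping intent name -> prior probability
    extra_costs: dict mapping intent name -> non-negative extra cost of the
                 observed motion relative to the optimal path for that intent
                 (hypothetical input, assumed to come from a path planner)
    beta:        assumed rationality coefficient; larger values mean the
                 agent is modeled as closer to cost-optimal
    """
    # Boltzmann likelihood: observations that are costly under an intent
    # make that intent exponentially less plausible.
    unnormalized = {
        intent: prior[intent] * math.exp(-beta * extra_costs[intent])
        for intent in prior
    }
    total = sum(unnormalized.values())
    return {intent: w / total for intent, w in unnormalized.items()}

# Two hypothetical intents with a uniform prior; the observed motion is
# optimal for goal_A and two cost units worse than optimal for goal_B.
prior = {"goal_A": 0.5, "goal_B": 0.5}
extra_costs = {"goal_A": 0.0, "goal_B": 2.0}
posterior = boltzmann_posterior(prior, extra_costs, beta=1.0)
```

In this toy case the posterior shifts toward `goal_A`, since the observed motion is cheaper under that hypothesis. Repeating the update as new observations arrive gives a running distribution over intents.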

arXiv:2108.00893 [pdf, other]

Static analysis of ReLU neural networks with tropical polyhedra

Authors: Eric Goubault, Sébastien Palumby, Sylvie Putot, Louis Rustenholz, Sriram Sankaranarayanan

Abstract: This paper studies the problem of range analysis for feedforward neural networks, which is a basic primitive for applications such as robustness of neural networks, compliance to specifications and reachability analysis of neural-network feedback systems. Our approach focuses on ReLU (rectified linear unit) feedforward neural nets that present specific difficulties: approaches that exploit derivatives do not apply in general, the number of patterns of neuron activations can be quite large even for small networks, and convex approximations are generally too coarse. In this paper, we employ set-based methods and abstract interpretation that have been very successful in coping with similar difficulties in classical program verification. We present an approach that abstracts ReLU feedforward neural networks using tropical polyhedra. We show that tropical polyhedra can efficiently abstract ReLU activation function, while being able to control the loss of precision due to linear computations. We show how the connection between ReLU networks and tropical rational functions can provide approaches for range analysis of ReLU neural networks.

Submitted 23 August, 2021; v1 submitted 30 July, 2021; originally announced August 2021.

MSC Class: 68T01; 68N30 ACM Class: F.3.1; I.2.0
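
To make the range-analysis problem from the abstract above concrete, here is a minimal sketch of the simplest set-based method: plain interval arithmetic pushed through one affine layer and one ReLU. This is deliberately not the tropical-polyhedra abstraction the paper proposes; it is the kind of coarse baseline that more precise abstract domains aim to improve on. The helper names and the example weights are made up for illustration.

```python
def affine_bounds(weights, bias, lower, upper):
    """Propagate per-input intervals [lower, upper] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            # A positive weight maps the lower input bound to the lower
            # output bound; a negative weight swaps the roles.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi

def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]

# Hypothetical 2-input, 2-neuron layer with inputs in [-1, 1] x [-1, 1].
W = [[1.0, -1.0], [0.5, 0.5]]
b = [0.0, -1.0]
lo, hi = affine_bounds(W, b, [-1.0, -1.0], [1.0, 1.0])
lo, hi = relu_bounds(lo, hi)
```

Intervals track each neuron independently and so lose all relations between neurons, which is one reason relational domains are of interest for this problem.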

arXiv:2107.00218 [pdf]

Comparing Example-Based Collaborative Reflection to Problem Solving Practice for Learning during Team-Based Software Engineering Projects

Authors: Sreecharan Sankaranarayanan, Siddharth Reddy Kandimalla, Christopher Bogart, R. Charles Murray, Haokang An, Michael Hilton, Majd Sakr, Carolyn Rosé

Abstract: Contributing to the literature on aptitude-treatment interactions between worked examples and problem-solving, this paper addresses differential learning from the two approaches when students are positioned as domain experts learning new concepts. Our evaluation is situated in a team project that is part of an advanced software engineering course. In this course, students who possess foundational domain knowledge but are learning new concepts engage alternatively in programming followed by worked example-based reflection. They are either allowed to finish programming or are curtailed after a pre-specified time to participate in a longer worked example-based reflection. We find significant pre- to post-test learning gains in both conditions. Then, we not only find significantly more learning when students participated in longer worked example-based reflections but also a significant performance improvement on a problem-solving transfer task. These findings suggest that domain experts learning new concepts benefit more from worked example-based reflections than from problem-solving.

Submitted 1 July, 2021; originally announced July 2021.

Comments: 4 pages, 1 image, 1 table, 14th Computer Supported Collaborative Learning (CSCL) Proceedings at the Annual Meeting of the International Society of the Learning Sciences (ISLS)

Journal ref: 14th Computer-Supported Collaborative Learning Proceedings at the Annual Meeting of the International Society of the Learning Sciences 2021, pp. 213-216

arXiv:2006.09441 [pdf]

doi 10.1063/5.0031486

Real-time 3D Nanoscale Coherent Imaging via Physics-aware Deep Learning

Authors: Henry Chan, Youssef S. G. Nashed, Saugat Kandel, Stephan Hruszkewycz, Subramanian Sankaranarayanan, Ross J. Harder, Mathew J. Cherukara

Abstract: Phase retrieval, the problem of recovering lost phase information from measured intensity alone, is an inverse problem that is widely faced in various imaging modalities ranging from astronomy to nanoscale imaging. The current process of phase recovery is iterative in nature. As a result, the image formation is time-consuming and computationally expensive, precluding real-time imaging. Here, we use 3D nanoscale X-ray imaging as a representative example to develop a deep learning model to address this phase retrieval problem. We introduce 3D-CDI-NN, a deep convolutional neural network and differential programming framework trained to predict 3D structure and strain solely from input 3D X-ray coherent scattering data. Our networks are designed to be "physics-aware" in multiple aspects; in that the physics of x-ray scattering process is explicitly enforced in the training of the network, and the training data are drawn from atomistic simulations that are representative of the physics of the material. We further refine the neural network prediction through a physics-based optimization procedure to enable maximum accuracy at lowest computational cost. 3D-CDI-NN can invert a 3D coherent diffraction pattern to real-space structure and strain hundreds of times faster than traditional iterative phase retrieval methods, with negligible loss in accuracy. Our integrated machine learning and differential programming solution to the phase retrieval problem is broadly applicable across inverse problems in other application areas.

Submitted 16 June, 2020; originally announced June 2020.

arXiv:2006.03963 [pdf, other]

Combinatorial Black-Box Optimization with Expert Advice

Authors: Hamid Dadkhahi, Karthikeyan Shanmugam, Jesus Rios, Payel Das, Samuel Hoffman, Troy David Loeffler, Subramanian Sankaranarayanan

Abstract: We consider the problem of black-box function optimization over the boolean hypercube. Despite the vast literature on black-box function optimization over continuous domains, not much attention has been paid to learning models for optimization over combinatorial domains until recently. However, the computational complexity of the recently devised algorithms is prohibitive even for moderate numbers of variables; drawing one sample using the existing algorithms is more expensive than a function evaluation for many black-box functions of interest. To address this problem, we propose a computationally efficient model learning algorithm based on multilinear polynomials and exponential weight updates. In the proposed algorithm, we alternate between simulated annealing with respect to the current polynomial representation and updating the weights using monomial experts' advice. Numerical experiments on various datasets in both unconstrained and sum-constrained boolean optimization indicate the competitive performance of the proposed algorithm, while improving the computational time up to several orders of magnitude compared to state-of-the-art algorithms in the literature.

Submitted 13 October, 2020; v1 submitted 6 June, 2020; originally announced June 2020.

Journal ref: KDD 2020

arXiv:2002.10401 [pdf]

BLAST: Bridging Length/time scales via Atomistic Simulation Toolkit

Authors: Henry Chan, Badri Narayanan, Mathew Cherukara, Troy D. Loeffler, Michael G. Sternberg, Anthony Avarca, Subramanian K. R. S. Sankaranarayanan

Abstract: The ever-increasing power of supercomputers coupled with highly scalable simulation codes has made molecular dynamics an indispensable tool in applications ranging from predictive modeling of materials to computational design and discovery of new materials for a broad range of applications. Multi-fidelity scale bridging between the various flavors of molecular dynamics, i.e., ab-initio, classical and coarse-grained models, has remained a long-standing challenge. Here, we introduce our framework BLAST (Bridging Length/time scales via Atomistic Simulation Toolkit) that leverages machine learning principles to address this challenge. BLAST is a multi-fidelity scale bridging framework that provides users with the capabilities to train and develop their own classical atomistic and coarse-grained interatomic potentials (force fields) for molecular simulations. BLAST is designed to address several long-standing problems in the molecular simulations community, such as unintended misuse of existing force fields due to the knowledge gap between developers and users, bottlenecks in traditional force field development approaches, and other issues relating to the accuracy, efficiency, and transferability of force fields. Here, we discuss several important aspects in force field development and highlight features in BLAST that enable its functionalities and ease of use.

Submitted 21 February, 2020; originally announced February 2020.

arXiv:2001.08088 [pdf, other]

Training Neural Network Controllers Using Control Barrier Functions in the Presence of Disturbances

Authors: Shakiba Yaghoubi, Georgios Fainekos, Sriram Sankaranarayanan

Abstract: Control Barrier Functions (CBF) have been recently utilized in the design of provably safe feedback control laws for nonlinear systems. These feedback control methods typically compute the next control input by solving an online Quadratic Program (QP). Solving a QP in real-time can be a computationally expensive process for resource-constrained systems. In this work, we propose to use imitation learning to learn Neural Network-based feedback controllers which will satisfy the CBF constraints. In the process, we also develop a new class of High Order CBF for systems under external disturbances. We demonstrate the framework on a unicycle model subject to external disturbances, e.g., wind or currents.

Submitted 18 January, 2020; originally announced January 2020.

arXiv:1912.08112 [pdf, other]

A learning-based algorithm to quickly compute good primal solutions for Stochastic Integer Programs

Authors: Yoshua Bengio, Emma Frejinger, Andrea Lodi, Rahul Patel, Sriram Sankaranarayanan

Abstract: We propose a novel approach using supervised learning to obtain near-optimal primal solutions for two-stage stochastic integer programming (2SIP) problems with constraints in the first and second stages. The goal of the algorithm is to predict a "representative scenario" (RS) for the problem such that deterministically solving the 2SIP with the random realization equal to the RS gives a near-optimal solution to the original 2SIP. Predicting an RS, instead of directly predicting a solution, ensures first-stage feasibility of the solution. If the problem is known to have complete recourse, second-stage feasibility is also guaranteed. For computational testing, we learn to find an RS for a two-stage stochastic facility location problem with integer variables and linear constraints in both stages and consistently provide near-optimal solutions. Our computing times are very competitive with those of general-purpose integer programming solvers to achieve a similar solution quality.

Submitted 17 December, 2019; originally announced December 2019.

arXiv:1910.06452 [pdf, other]

When Nash Meets Stackelberg

Authors: Margarida Carvalho, Gabriele Dragotto, Felipe Feijoo, Andrea Lodi, Sriram Sankaranarayanan

Abstract: This article introduces a class of $Nash$ games among $Stackelberg$ players ($NASPs$), namely, a class of simultaneous non-cooperative games where the players solve sequential Stackelberg games. Specifically, each player solves a Stackelberg game where a leader optimizes a (parametrized) linear objective function subject to linear constraints while its followers solve convex quadratic problems subject to the standard optimistic assumption. Although we prove that deciding if a $NASP$ instance admits a Nash equilibrium is generally a $Σ^2_p$-hard decision problem, we devise two exact and computationally-efficient algorithms to compute and select Nash equilibria or certify that no equilibrium exists. We apply $NASPs$ to model the hierarchical interactions of international energy markets where climate-change aware regulators oversee the operations of profit-driven energy producers. By combining real-world data with our models, we find that Nash equilibria provide informative, and often counterintuitive, managerial insights for market regulators.

Submitted 2 November, 2022; v1 submitted 14 October, 2019; originally announced October 2019.

arXiv:1907.10159 [pdf, other]

Efficient Detection and Quantification of Timing Leaks with Neural Networks

Authors: Saeid Tizpaz-Niari, Pavol Cerny, Sriram Sankaranarayanan, Ashutosh Trivedi

Abstract: Detection and quantification of information leaks through timing side channels are important to guarantee confidentiality. Although static analysis remains the prevalent approach for detecting timing side channels, it is computationally challenging for real-world applications. In addition, the detection techniques are usually restricted to 'yes' or 'no' answers. In practice, real-world applications may need to leak information about the secret. Therefore, quantification techniques are necessary to evaluate the resulting threats of information leaks. Since both problems are very difficult or impossible for static analysis techniques, we propose a dynamic analysis method. Our novel approach is to split the problem into two tasks. First, we learn a timing model of the program as a neural network. Second, we analyze the neural network to quantify information leaks. As demonstrated in our experiments, both of these tasks are feasible in practice, making the approach a significant improvement over the state-of-the-art side channel detectors and quantifiers. Our key technical contributions are (a) a neural network architecture that enables side channel discovery and (b) an MILP-based algorithm to estimate the side-channel strength. On a set of micro-benchmarks and real-world applications, we show that neural network models learn timing behaviors of programs with thousands of methods. We also show that neural networks with thousands of neurons can be efficiently analyzed to detect and quantify information leaks through timing side channels.

Submitted 23 July, 2019; originally announced July 2019.

Comments: To Appear in RV'19

arXiv:1902.03680 [pdf, other]

Learning From Noisy Labels By Regularized Estimation Of Annotator Confusion

Authors: Ryutaro Tanno, Ardavan Saeedi, Swami Sankaranarayanan, Daniel C. Alexander, Nathan Silberman

Abstract: The predictive performance of supervised learning algorithms depends on the quality of labels. In a typical label collection process, multiple annotators provide subjective noisy estimates of the "truth" under the influence of their varying skill-levels and biases. Blindly treating these noisy labels as the ground truth limits the accuracy of learning algorithms in the presence of strong disagreement. This problem is critical for applications in domains such as medical imaging where both the annotation cost and inter-observer variability are high. In this work, we present a method for simultaneously learning the individual annotator model and the underlying true label distribution, using only noisy observations. Each annotator is modeled by a confusion matrix that is jointly estimated along with the classifier predictions. We propose to add a regularization term to the loss function that encourages convergence to the true annotator confusion matrix. We provide a theoretical argument as to how the regularization is essential to our approach both for the case of single annotator and multiple annotators. Despite the simplicity of the idea, experiments on image classification tasks with both simulated and real labels show that our method either outperforms or performs on par with the state-of-the-art methods and is capable of estimating the skills of annotators even with a single label available per image.

Submitted 17 June, 2019; v1 submitted 10 February, 2019; originally announced February 2019.

Comments: CVPR 2019, code snippets included
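
The abstract above describes passing classifier predictions through a per-annotator confusion matrix and adding a regularization term to the loss. The sketch below illustrates that idea in plain Python. The abstract does not name the regularizer, so the trace penalty used here is an assumption for illustration, and all function names and the `reg_weight` parameter are hypothetical.

```python
import math


def noisy_label_distribution(class_probs, confusion):
    """Push the classifier's belief over true classes through one annotator.

    confusion[i][j] is the assumed probability that the annotator reports
    class j when the true class is i (each row sums to 1).
    """
    n_reported = len(confusion[0])
    return [
        sum(class_probs[i] * confusion[i][j] for i in range(len(class_probs)))
        for j in range(n_reported)
    ]


def trace(matrix):
    """Sum of the diagonal entries of a square matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def regularized_loss(class_probs, confusions, noisy_labels, reg_weight=0.01):
    """Cross-entropy against each annotator's label plus a trace penalty.

    noisy_labels[r] is annotator r's label for this example, or None if that
    annotator did not label it. The trace penalty is an assumed regularizer.
    """
    total = 0.0
    for confusion, label in zip(confusions, noisy_labels):
        if label is None:
            continue  # this annotator did not see the example
        predicted = noisy_label_distribution(class_probs, confusion)
        total -= math.log(predicted[label])
    total += reg_weight * sum(trace(c) for c in confusions)
    return total
```

In a real training loop the confusion matrices and the classifier would both be learned parameters; here they are fixed inputs so the loss computation can be read on its own.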

arXiv:1806.04552 [pdf, other]

Combining Model-Free Q-Ensembles and Model-Based Approaches for Informed Exploration

Authors: Sreecharan Sankaranarayanan, Raghuram Mandyam Annasamy, Katia Sycara, Carolyn Penstein Rosé

Abstract: Q-Ensembles are a model-free approach where input images are fed into different Q-networks and exploration is driven by the assumption that uncertainty is proportional to the variance of the output Q-values obtained. They have been shown to perform relatively well compared to other exploration strategies. Further, model-based approaches, such as encoder-decoder models, have been used successfully for next frame prediction given previous frames. This paper proposes to integrate the model-free Q-ensembles and model-based approaches with the hope of compounding the benefits of both and achieving superior exploration as a result. Results show that a model-based trajectory memory approach when combined with Q-ensembles produces superior performance when compared to only using Q-ensembles.

Submitted 12 June, 2018; originally announced June 2018.

Comments: Submitted to the Thirty-Second Annual Conference on Neural Information Processing Systems (NIPS 2018)
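
The abstract above treats the variance of Q-values across an ensemble as a proxy for uncertainty. A minimal sketch of how that signal could drive action selection follows. The optimistic "mean plus weighted standard deviation" rule and the names `select_action` and `bonus_weight` are assumptions for illustration, not the selection rule confirmed by the paper.

```python
import math
import statistics


def ensemble_action_stats(q_values_per_network):
    """Return per-action mean and variance across the ensemble.

    q_values_per_network[k][a] is network k's Q-value estimate for action a.
    """
    n_actions = len(q_values_per_network[0])
    means, variances = [], []
    for action in range(n_actions):
        estimates = [q[action] for q in q_values_per_network]
        means.append(statistics.mean(estimates))
        variances.append(statistics.pvariance(estimates))
    return means, variances


def select_action(q_values_per_network, bonus_weight=1.0):
    """Pick the action with the highest mean Q-value plus uncertainty bonus.

    The bonus grows with ensemble disagreement, so actions the networks
    disagree about are explored more often (an assumed, UCB-style rule).
    """
    means, variances = ensemble_action_stats(q_values_per_network)
    scores = [m + bonus_weight * math.sqrt(v) for m, v in zip(means, variances)]
    return max(range(len(scores)), key=scores.__getitem__)
```

Setting `bonus_weight` to zero reduces the rule to greedy selection on the ensemble mean, which makes the effect of the variance term easy to isolate.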

arXiv:1804.05288 [pdf, other]

Path-Following through Control Funnel Functions

Authors: Hadi Ravanbakhsh, Sina Aghli, Christoffer Heckman, Sriram Sankaranarayanan

Abstract: We present an approach to path following using so-called control funnel functions. Synthesizing controllers to "robustly" follow a reference trajectory is a fundamental problem for autonomous vehicles. Robustness, in this context, requires our controllers to handle a specified amount of deviation from the desired trajectory. Our approach considers a timing law that describes how fast to move along a given reference trajectory and a control feedback law for reducing deviations from the reference. We synthesize both feedback laws using "control funnel functions" that jointly encode the control law as well as its correctness argument over a mathematical model of the vehicle dynamics. We adapt a previously described demonstration-based learning algorithm to synthesize a control funnel function as well as the associated feedback law. We implement this law on top of a 1/8th scale autonomous vehicle called the Parkour car. We compare the performance of our path following approach against a trajectory tracking approach by specifying trajectories of varying lengths and curvatures. Our experiments demonstrate the improved robustness obtained from the use of control funnel functions.

Submitted 2 August, 2018; v1 submitted 14 April, 2018; originally announced April 2018.

arXiv:1804.01159 [pdf, other]

Crystal Loss and Quality Pooling for Unconstrained Face Verification and Recognition

Authors: Rajeev Ranjan, Ankan Bansal, Hongyu Xu, Swami Sankaranarayanan, Jun-Cheng Chen, Carlos D. Castillo, Rama Chellappa

Abstract: In recent years, the performance of face verification and recognition systems based on deep convolutional neural networks (DCNNs) has significantly improved. A typical pipeline for face verification includes training a deep network for subject classification with softmax loss, using the penultimate layer output as the feature descriptor, and generating a cosine similarity score given a pair of face images or videos. The softmax loss function does not optimize the features to have higher similarity score for positive pairs and lower similarity score for negative pairs, which leads to a performance gap. In this paper, we propose a new loss function, called Crystal Loss, that restricts the features to lie on a hypersphere of a fixed radius. The loss can be easily implemented using existing deep learning frameworks. We show that integrating this simple step in the training pipeline significantly improves the performance of face verification and recognition systems. We achieve state-of-the-art performance for face verification and recognition on challenging LFW, IJB-A, IJB-B and IJB-C datasets over a large range of false alarm rates (10^-1 to 10^-7).

Submitted 3 February, 2019; v1 submitted 3 April, 2018; originally announced April 2018.

Comments: Previously portions of this work appeared in arXiv:1703.09507, which was a conference version. This version is an extended journal version of it
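
The abstract above says the loss restricts features to a hypersphere of fixed radius before the softmax, and that verification uses cosine similarity. The sketch below shows one way that step could look in plain Python. The radius parameter `alpha`, its default value, and every function name are assumptions for illustration, not the authors' implementation.

```python
import math


def l2_normalize_and_scale(features, alpha):
    """Project a feature vector onto a hypersphere of radius alpha."""
    norm = math.sqrt(sum(x * x for x in features))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [alpha * x / norm for x in features]


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax_cross_entropy(logits, label):
    """Numerically stable cross-entropy of softmax(logits) against label."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def hypersphere_softmax_loss(features, weights, biases, label, alpha=16.0):
    """Softmax loss computed on features constrained to radius alpha.

    weights[c] and biases[c] form the linear classifier row for class c.
    alpha=16.0 is an arbitrary illustrative value.
    """
    scaled = l2_normalize_and_scale(features, alpha)
    logits = [
        sum(w * x for w, x in zip(row, scaled)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax_cross_entropy(logits, label)
```

Because every feature vector ends up with the same norm, the classifier can only separate classes by direction, which is the same quantity the cosine similarity score compares at test time.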

arXiv:1712.00699 [pdf, other]

Improving Network Robustness against Adversarial Attacks with Compact Convolution

Authors: Rajeev Ranjan, Swami Sankaranarayanan, Carlos D. Castillo, Rama Chellappa

Abstract: Though Convolutional Neural Networks (CNNs) have surpassed human-level performance on tasks such as object classification and face verification, they can easily be fooled by adversarial attacks. These attacks add a small perturbation to the input image that causes the network to misclassify the sample. In this paper, we focus on neutralizing adversarial attacks by compact feature learning. In particular, we show that learning features in a closed and bounded space improves the robustness of the network. We explore the effect of L2-Softmax Loss, that enforces compactness in the learned features, thus resulting in enhanced robustness to adversarial perturbations. Additionally, we propose compact convolution, a novel method of convolution that when incorporated in conventional CNNs improves their robustness. Compact convolution ensures feature compactness at every layer such that they are bounded and close to each other. Extensive experiments show that Compact Convolutional Networks (CCNs) neutralize multiple types of attacks, and perform better than existing methods in defending adversarial attacks, without incurring any additional training overhead compared to CNNs.

Submitted 22 March, 2018; v1 submitted 2 December, 2017; originally announced December 2017.

arXiv:1711.10639 [pdf, other]

doi 10.4204/EPTCS.260.6

A Class of Control Certificates to Ensure Reach-While-Stay for Switched Systems

Authors: Hadi Ravanbakhsh, Sriram Sankaranarayanan

Abstract: In this article, we consider the problem of synthesizing switching controllers for temporal properties through the composition of simple primitive reach-while-stay (RWS) properties. Reach-while-stay properties specify that the system states starting from an initial set I, must reach a goal (target) set G in finite time, while remaining inside a safe set S. Our approach synthesizes switched controllers that select between finitely many modes to satisfy the given RWS specification. To do so, we consider control certificates, which are Lyapunov-like functions that represent control strategies to achieve the desired specification. However, for RWS problems, a control Lyapunov-like function is often hard to synthesize in a simple polynomial form. Therefore, we combine control barrier and Lyapunov functions with an additional compatibility condition between them. Using this approach, the controller synthesis problem reduces to one of solving quantified nonlinear constrained problems that are handled using a combination of SMT solvers. The synthesis of controllers is demonstrated through a set of interesting numerical examples drawn from the related work, and compared with the state-of-the-art tool SCOTS. Our evaluation suggests that our approach is computationally feasible, and adds to the growing body of formal approaches to controller synthesis.

Submitted 28 November, 2017; originally announced November 2017.

Comments: In Proceedings SYNT 2017, arXiv:1711.10224

Journal ref: EPTCS 260, 2017, pp. 44-61

arXiv:1711.06969 [pdf, other]

Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation

Authors: Swami Sankaranarayanan, Yogesh Balaji, Arpit Jain, Ser Nam Lim, Rama Chellappa

Abstract: Visual Domain Adaptation is a problem of immense importance in computer vision. Previous approaches showcase the inability of even deep neural networks to learn informative representations across domain shift. This problem is more severe for tasks where acquiring hand labeled data is extremely hard and tedious. In this work, we focus on adapting the representations learned by segmentation networks across synthetic and real domains. Contrary to previous approaches that use a simple adversarial objective or superpixel information to aid the process, we propose an approach based on Generative Adversarial Networks (GANs) that brings the embeddings closer in the learned feature space. To showcase the generality and scalability of our approach, we show that we can achieve state of the art results on two challenging scenarios of synthetic to real domain adaptation. Additional exploratory experiments show that our approach: (1) generalizes to unseen domains and (2) results in improved alignment of source and target distributions.

Submitted 1 April, 2018; v1 submitted 19 November, 2017; originally announced November 2017.

Comments: Accepted as spotlight talk at CVPR 2018. Code available here: https://github.com/swamiviv/LSD-seg

arXiv:1705.07819 [pdf, other]

Regularizing deep networks using efficient layerwise adversarial training

Authors: Swami Sankaranarayanan, Arpit Jain, Rama Chellappa, Ser Nam Lim

Abstract: Adversarial training has been shown to regularize deep neural networks in addition to increasing their robustness to adversarial examples. However, its impact on very deep state of the art networks has not been fully investigated. In this paper, we present an efficient approach to perform adversarial training by perturbing intermediate layer activations and study the use of such perturbations as a regularizer during training. We use these perturbations to train very deep models such as ResNets and show improvement in performance both on adversarial and original test data. Our experiments highlight the benefits of perturbing intermediate layer activations compared to perturbing only the inputs. The results on CIFAR-10 and CIFAR-100 datasets show the merits of the proposed adversarial training approach. Additional results on WideResNets show that our approach provides significant improvement in classification accuracy for a given base model, outperforming dropout and other base models of larger size.

Submitted 28 May, 2018; v1 submitted 22 May, 2017; originally announced May 2017.

Comments: Published at the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18). Official link: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16634

arXiv:1704.05543 [pdf]

Coordinating Collaborative Chat in Massive Open Online Courses

Authors: Gaurav Singh Tomar, Sreecharan Sankaranarayanan, Xu Wang, Carolyn Penstein Rosé

Abstract: An earlier study of a collaborative chat intervention in a Massive Open Online Course (MOOC) identified negative effects on attrition stemming from a requirement for students to be matched with exactly one partner prior to beginning the activity. That study raised questions about how to orchestrate a collaborative chat intervention in a MOOC context in order to provide the benefit of synchronous social engagement without the coordination difficulties. In this paper we present a careful analysis of an intervention designed to overcome coordination difficulties by welcoming students into the chat on a rolling basis as they arrive rather than requiring them to be matched with a partner before beginning. The results suggest the most positive impact when experiencing a chat with exactly one partner rather than more or less. A qualitative analysis of the chat data reveals differential experiences between these configurations that suggests a potential explanation for the effect and raises questions for future research.

Submitted 18 April, 2017; originally announced April 2017.

Comments: 8 pages

Journal ref: Proceedings of the International Conference of the Learning Sciences 2016, Volume 1, pp 607-614

arXiv:1704.01705 [pdf, other]

Generate To Adapt: Aligning Domains using Generative Adversarial Networks

Authors: Swami Sankaranarayanan, Yogesh Balaji, Carlos D. Castillo, Rama Chellappa

Abstract: Domain Adaptation is an actively researched problem in Computer Vision. In this work, we propose an approach that leverages unsupervised data to bring the source and target distributions closer in a learned joint feature space. We accomplish this by inducing a symbiotic relationship between the learned embedding and a generative adversarial network. This is in contrast to methods which use the adversarial framework for realistic data generation and retraining deep models with such data. We demonstrate the strength and generality of our approach by performing experiments on three different tasks with varying levels of difficulty: (1) Digit classification (MNIST, SVHN and USPS datasets) (2) Object recognition using OFFICE dataset and (3) Domain adaptation from synthetic to real data. Our method achieves state-of-the-art performance in most experimental settings and is by far the only GAN-based method that has been shown to work well across different datasets such as OFFICE and DIGITS.

Submitted 12 April, 2018; v1 submitted 6 April, 2017; originally announced April 2017.

Comments: Accepted as spotlight talk at CVPR 2018. Code available here: https://github.com/yogeshbalaji/Generate_To_Adapt

arXiv:1703.07928 [pdf, other]

Self corrective Perturbations for Semantic Segmentation and Classification

Authors: Swami Sankaranarayanan, Arpit Jain, Ser Nam Lim

Abstract: Convolutional Neural Networks have been a subject of great importance over the past decade and great strides have been made in their utility for producing state of the art performance in many computer vision problems. However, the behavior of deep networks is yet to be fully understood and is still an active area of research. In this work, we present an intriguing behavior: pre-trained CNNs can be made to improve their predictions by structurally perturbing the input. We observe that these perturbations - referred to as Guided Perturbations - enable a trained network to improve its prediction performance without any learning or change in network weights. We perform various ablative experiments to understand how these perturbations affect the local context and feature representations. Furthermore, we demonstrate that this idea can improve performance of several existing approaches on semantic segmentation and scene labeling tasks on the PASCAL VOC dataset and supervised classification tasks on MNIST and CIFAR10 datasets.

Submitted 3 August, 2017; v1 submitted 23 March, 2017; originally announced March 2017.

Comments: Accepted to ICCV 2017

arXiv:1702.07103 [pdf, other]

Discriminating Traces with Time

Authors: Saeid Tizpaz-Niari, Pavol Cerny, Bor-Yuh Evan Chang, Sriram Sankaranarayanan, Ashutosh Trivedi

Abstract: What properties about the internals of a program explain the possible differences in its overall running time for different inputs? In this paper, we propose a formal framework for considering this question we dub trace-set discrimination. We show that even though the algorithmic problem of computing maximum likelihood discriminants is NP-hard, approaches based on integer linear programming (ILP) and decision tree learning can be useful in zeroing-in on the program internals. On a set of Java benchmarks, we find that compactly-represented decision trees scalably discriminate with high accuracy, more scalably than maximum likelihood discriminants and with comparable accuracy. We demonstrate on three larger case studies how decision-tree discriminants produced by our tool are useful for debugging timing side-channel vulnerabilities (i.e., where a malicious observer infers secrets simply from passively watching execution times) and availability vulnerabilities.

Submitted 23 February, 2017; originally announced February 2017.

Comments: Published in TACAS 2017

arXiv:1611.01751 [pdf, other]

doi 10.1109/FG.2017.85

Deep Convolutional Neural Network Features and the Original Image

Authors: Connor J. Parde, Carlos Castillo, Matthew Q. Hill, Y. Ivette Colon, Swami Sankaranarayanan, Jun-Cheng Chen, Alice J. O'Toole

Abstract: Face recognition algorithms based on deep convolutional neural networks (DCNNs) have made progress on the task of recognizing faces in unconstrained viewing conditions. These networks operate with compact feature-based face representations derived from learning a very large number of face images. While the learned features produced by DCNNs can be highly robust to changes in viewpoint, illumination, and appearance, little is known about the nature of the face code that emerges at the top level of such networks. We analyzed the DCNN features produced by two face recognition algorithms. In the first set of experiments we used the top-level features from the DCNNs as input into linear classifiers aimed at predicting metadata about the images. The results show that the DCNN features contain surprisingly accurate information about the yaw and pitch of a face, and about whether the face came from a still image or a video frame. In the second set of experiments, we measured the extent to which individual DCNN features operated in a view-dependent or view-invariant manner. We found that view-dependent coding was a characteristic of the identities rather than the DCNN features - with some identities coded consistently in a view-dependent way and others in a view-independent way. In our third analysis, we visualized the DCNN feature space for over 24,000 images of 500 identities. Images in the center of the space were uniformly of low quality (e.g., extreme views, face occlusion, low resolution). Image quality increased monotonically as a function of distance from the origin. This result suggests that image quality information is available in the DCNN features, such that consistently average feature values reflect coding failures that reliably indicate poor or unusable images. Combined, the results offer insight into the coding mechanisms that support robust representation of faces in DCNNs.

Submitted 6 November, 2016; originally announced November 2016.

Comments: Submitted to Face and Gesture Conference, 2017

arXiv:1611.00851 [pdf, other]

An All-In-One Convolutional Neural Network for Face Analysis

Authors: Rajeev Ranjan, Swami Sankaranarayanan, Carlos D. Castillo, Rama Chellappa

Abstract: We present a multi-purpose algorithm for simultaneous face detection, face alignment, pose estimation, gender recognition, smile detection, age estimation and face recognition using a single deep convolutional neural network (CNN). The proposed method employs a multi-task learning framework that regularizes the shared parameters of CNN and builds a synergy among different domains and tasks. Extensive experiments show that the network has a better understanding of face and achieves state-of-the-art result for most of these tasks.

Submitted 2 November, 2016; originally announced November 2016.

arXiv:1605.02686 [pdf, other]

Unconstrained Still/Video-Based Face Verification with Deep Convolutional Neural Networks

Authors: Jun-Cheng Chen, Rajeev Ranjan, Swami Sankaranarayanan, Amit Kumar, Ching-Hui Chen, Vishal M. Patel, Carlos D. Castillo, Rama Chellappa

Abstract: Over the last five years, methods based on Deep Convolutional Neural Networks (DCNNs) have shown impressive performance improvements for object detection and recognition problems. This has been made possible due to the availability of large annotated datasets, a better understanding of the non-linear mapping between input images and class labels as well as the affordability of GPUs. In this paper, we present the design details of a deep learning system for unconstrained face recognition, including modules for face detection, association, alignment and face verification. The quantitative performance evaluation is conducted using the IARPA Janus Benchmark A (IJB-A), the JANUS Challenge Set 2 (JANUS CS2), and the LFW dataset. The IJB-A dataset includes real-world unconstrained faces of 500 subjects with significant pose and illumination variations which are much harder than the Labeled Faces in the Wild (LFW) and Youtube Face (YTF) datasets. JANUS CS2 is the extended version of IJB-A which contains not only all the images/frames of IJB-A but also includes the original videos for evaluating the video-based face verification system. Some open issues regarding DCNNs for face verification problems are then discussed.

Submitted 17 July, 2017; v1 submitted 9 May, 2016; originally announced May 2016.

Comments: accepted by IJCV

arXiv:1604.05417 [pdf, other]

doi 10.1109/BTAS.2016.7791205

Triplet Probabilistic Embedding for Face Verification and Clustering

Authors: Swami Sankaranarayanan, Azadeh Alavi, Carlos Castillo, Rama Chellappa

Abstract: Despite significant progress made over the past twenty five years, unconstrained face verification remains a challenging problem. This paper proposes an approach that couples a deep CNN-based approach with a low-dimensional discriminative embedding learned using triplet probability constraints to solve the unconstrained face verification problem. Aside from yielding performance improvements, this embedding provides significant advantages in terms of memory and for post-processing operations like subject specific clustering. Experiments on the challenging IJB-A dataset show that the proposed algorithm performs comparably or better than the state of the art methods in verification and identification metrics, while requiring much less training data and training time. The superior performance of the proposed method on the CFP dataset shows that the representation learned by our deep CNN is robust to extreme pose variation. Furthermore, we demonstrate the robustness of the deep features to challenges including age, pose, blur and clutter by performing simple clustering experiments on both IJB-A and LFW datasets.

Submitted 17 January, 2017; v1 submitted 18 April, 2016; originally announced April 2016.

Comments: Oral Paper in BTAS 2016; NVIDIA Best paper Award (http://ieee-biometrics.org/btas2016/awards.html)

arXiv:1602.03418 [pdf, ps, other]

Triplet Similarity Embedding for Face Verification

Authors: Swami Sankaranarayanan, Azadeh Alavi, Rama Chellappa

Abstract: In this work, we present an unconstrained face verification algorithm and evaluate it on the recently released IJB-A dataset that aims to push the boundaries of face verification methods. The proposed algorithm couples a deep CNN-based approach with a low-dimensional discriminative embedding learnt using triplet similarity constraints in a large margin fashion. Aside from yielding performance improvement, this embedding provides significant advantages in terms of memory and post-processing operations like hashing and visualization. Experiments on the IJB-A dataset show that the proposed algorithm outperforms state of the art methods in verification and identification metrics, while requiring less training time.

Submitted 13 March, 2016; v1 submitted 10 February, 2016; originally announced February 2016.
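
The abstract above describes a low-dimensional embedding learnt from triplet similarity constraints in a large margin fashion. The sketch below shows one plausible reading: a linear projection followed by a hinge loss on the similarity gap between a matching and a non-matching pair. The exact loss form, the dot-product similarity, and the names `project`, `triplet_hinge_loss`, and `margin` are assumptions for illustration, not the formulation confirmed by the paper.

```python
def project(matrix, vector):
    """Apply a linear embedding: each row of matrix yields one output dim."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def dot(a, b):
    """Dot-product similarity between two embedded vectors."""
    return sum(x * y for x, y in zip(a, b))


def triplet_hinge_loss(matrix, anchor, positive, negative, margin=0.1):
    """Penalize triplets where the negative is not less similar by `margin`.

    anchor and positive share an identity; negative has a different one.
    The loss is zero once sim(anchor, positive) exceeds
    sim(anchor, negative) by at least the margin.
    """
    a = project(matrix, anchor)
    p = project(matrix, positive)
    n = project(matrix, negative)
    return max(0.0, margin + dot(a, n) - dot(a, p))
```

A projection with fewer rows than input dimensions gives the memory saving the abstract mentions, since only the projected vectors need to be stored for later comparison.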

Showing 1–50 of 54 results for author: Sankaranarayanan, S