Showing 1–50 of 55 results for author: Itti, L

  1. arXiv:2405.09546  [pdf, other]

    cs.CV

    BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation

    Authors: Yunhao Ge, Yihe Tang, Jiashu Xu, Cem Gokmen, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu

    Abstract: The systematic evaluation and understanding of computer vision models under varying conditions require large amounts of data with comprehensive and customized labels, which real-world vision datasets rarely satisfy. While current synthetic data generators offer a promising alternative, particularly for embodied AI tasks, they often fall short for computer vision tasks due to low asset and renderin…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: CVPR 2024 (Highlight). Project website: https://behavior-vision-suite.github.io/

  2. arXiv:2312.14216  [pdf, other]

    cs.CV

    DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models

    Authors: Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge

    Abstract: The popularization of Text-to-Image (T2I) diffusion models enables the generation of high-quality images from text descriptions. However, generating diverse customized images with reference visual attributes remains challenging. This work focuses on personalizing T2I diffusion models at a more abstract concept or category level, adapting commonalities from a set of reference images while creating…

    Submitted 21 December, 2023; originally announced December 2023.

  3. arXiv:2312.12339  [pdf, other]

    cs.LG cs.RO

    Value Explicit Pretraining for Learning Transferable Representations

    Authors: Kiran Lekkala, Henghui Bao, Sumedh Sontakke, Laurent Itti

    Abstract: We propose Value Explicit Pretraining (VEP), a method that learns generalizable representations for transfer reinforcement learning. VEP enables learning of new tasks that share similar objectives as previously learned tasks, by learning an encoder for objective-conditioned representations, irrespective of appearance changes and environment dynamics. To pre-train the encoder from a sequence of obs…

    Submitted 7 March, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted at CoRL 2023 Workshop on PRL, Under Review at ICML 2024

  4. arXiv:2312.05277  [pdf, other]

    cs.CV cs.LG

    3D Copy-Paste: Physically Plausible Object Insertion for Monocular 3D Detection

    Authors: Yunhao Ge, Hong-Xing Yu, Cheng Zhao, Yuliang Guo, Xinyu Huang, Liu Ren, Laurent Itti, Jiajun Wu

    Abstract: A major challenge in monocular 3D object detection is the limited diversity and quantity of objects in real datasets. While augmenting real scenes with virtual objects holds promise to improve both the diversity and quantity of the objects, it remains elusive due to the lack of an effective 3D object insertion method in complex real captured scenes. In this work, we study augmenting complex real i…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023. Project website: https://gyhandy.github.io/3D-Copy-Paste/

  5. arXiv:2311.13648  [pdf, other]

    cs.LG

    Evaluating Pretrained models for Deployable Lifelong Learning

    Authors: Kiran Lekkala, Eshan Bhargava, Yunhao Ge, Laurent Itti

    Abstract: We create a novel benchmark for evaluating a Deployable Lifelong Learning system for Visual Reinforcement Learning (RL) that is pretrained on a curated dataset, and propose a novel Scalable Lifelong Learning system capable of retaining knowledge from the previously learnt RL tasks. Our benchmark measures the efficacy of a deployable Lifelong Learning system that is evaluated on scalability, perfor…

    Submitted 17 December, 2023; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: In submission to CoLLA 2024. Also published in the Proceedings of WACV 2024 Workshop on Pretraining

  6. arXiv:2310.18847  [pdf, other]

    cs.RO cs.LG

    Bird's Eye View Based Pretrained World model for Visual Navigation

    Authors: Kiran Lekkala, Chen Liu, Laurent Itti

    Abstract: Sim2Real transfer has gained popularity because it helps transfer from inexpensive simulators to real world. This paper presents a novel system that fuses components in a traditional World Model into a robust system, trained entirely within a simulator, that Zero-Shot transfers to the real world. To facilitate transfer, we use an intermediary representation that is based on Bird's Eye View…

    Submitted 22 March, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

    Comments: Under Review at the IROS 2024; Accepted at NeurIPS 2023, Robot Learning Workshop

  7. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  8. arXiv:2310.07899  [pdf, other]

    cs.AI cs.RO

    RoboCLIP: One Demonstration is Enough to Learn Robot Policies

    Authors: Sumedh A Sontakke, Jesse Zhang, Sébastien M. R. Arnold, Karl Pertsch, Erdem Bıyık, Dorsa Sadigh, Chelsea Finn, Laurent Itti

    Abstract: Reward specification is a notoriously difficult problem in reinforcement learning, requiring extensive expert supervision to design robust reward functions. Imitation learning (IL) methods attempt to circumvent these problems by utilizing expert demonstrations but typically require a large number of in-domain expert demonstrations. Inspired by advances in the field of Video-and-Language Models (VL…

    Submitted 11 October, 2023; originally announced October 2023.

  9. arXiv:2309.05956  [pdf, other]

    cs.CV

    Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation

    Authors: Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

    Abstract: We propose a new paradigm to automatically generate training data with accurate labels at scale using the text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object generation, and contextually coherent background generation. To generate foreground objects, we employ a straightforward textual template,…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: Code in https://github.com/gyhandy/Text2Image-for-Detection

  10. arXiv:2307.11386  [pdf, other]

    cs.CV

    CLR: Channel-wise Lightweight Reprogramming for Continual Learning

    Authors: Yunhao Ge, Yuecheng Li, Shuo Ni, Jiaping Zhao, Ming-Hsuan Yang, Laurent Itti

    Abstract: Continual learning aims to emulate the human ability to continually accumulate knowledge over sequential tasks. The main challenge is to maintain performance on previously learned tasks after learning new tasks, i.e., to avoid catastrophic forgetting. We propose a Channel-wise Lightweight Reprogramming (CLR) approach that helps convolutional neural networks (CNNs) overcome catastrophic forgetting…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: ICCV 2023

  11. arXiv:2305.17207  [pdf, other]

    cs.CV

    Building One-class Detector for Anything: Open-vocabulary Zero-shot OOD Detection Using Text-image Models

    Authors: Yunhao Ge, Jie Ren, Jiaping Zhao, Kaifeng Chen, Andrew Gallagher, Laurent Itti, Balaji Lakshminarayanan

    Abstract: We focus on the challenge of out-of-distribution (OOD) detection in deep learning models, a crucial aspect in ensuring reliability. Despite considerable effort, the problem remains significantly challenging in deep learning models due to their propensity to output over-confident predictions for OOD inputs. We propose a novel one-class open-set OOD detector that leverages text-image pre-trained mod…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 16 pages (including appendix and references), 3 figures

  12. arXiv:2305.16484  [pdf, other]

    cs.LG cs.AI

    Batch Model Consolidation: A Multi-Task Model Consolidation Framework

    Authors: Iordanis Fostiropoulos, Jiaye Zhu, Laurent Itti

    Abstract: In Continual Learning (CL), a model is required to learn a stream of tasks sequentially without significant performance degradation on previously learned tasks. Current approaches fail for a long sequence of tasks from diverse domains and difficulties. Many of the existing CL approaches are difficult to apply in practice due to excessive memory cost or training time, or are tightly coupled to a si…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: Published at CVPR 2023

  13. arXiv:2305.15591  [pdf, other]

    cs.LG

    Lightweight Learner for Shared Knowledge Lifelong Learning

    Authors: Yunhao Ge, Yuecheng Li, Di Wu, Ao Xu, Adam M. Jones, Amanda Sofie Rios, Iordanis Fostiropoulos, Shixian Wen, Po-Hsuan Huang, Zachary William Murdock, Gozde Sahin, Shuo Ni, Kiran Lekkala, Sumedh Anand Sontakke, Laurent Itti

    Abstract: In Lifelong Learning (LL), agents continually learn as they encounter new conditions and tasks. Most current LL is limited to a single agent that learns tasks sequentially. Dedicated LL machinery is then deployed to mitigate the forgetting of old tasks as new tasks are learned. This is inherently slow. We propose a new Shared Knowledge Lifelong Learning (SKILL) challenge, which deploys a decentral…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: Transactions on Machine Learning Research (TMLR) paper

  14. arXiv:2305.12571  [pdf, other]

    cs.LG cs.AI cs.SE

    Reproducibility Requires Consolidated Artifacts

    Authors: Iordanis Fostiropoulos, Bowman Brown, Laurent Itti

    Abstract: Machine learning is facing a 'reproducibility crisis' where a significant number of works report failures when attempting to reproduce previously published results. We evaluate the sources of reproducibility failures using a meta-analysis of 142 replication studies from ReScience C and 204 code repositories. We find that missing experiment details such as hyperparameters are potential causes of un…

    Submitted 21 May, 2023; originally announced May 2023.

  15. arXiv:2212.07629  [pdf, other]

    cs.CV

    EM-Paste: EM-guided Cut-Paste with DALL-E Augmentation for Image-level Weakly Supervised Instance Segmentation

    Authors: Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Laurent Itti, Vibhav Vineet

    Abstract: We propose EM-PASTE: an Expectation Maximization (EM) guided Cut-Paste compositional dataset augmentation approach for weakly-supervised instance segmentation using only image-level supervision. The proposed method consists of three main components. The first component generates high-quality foreground object masks. To this end, an EM-like approach is proposed that iteratively refines an initial se…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: 15 pages (including appendix), 7 figures

  16. arXiv:2212.01758  [pdf, other]

    cs.CV

    Improving Zero-shot Generalization and Robustness of Multi-modal Models

    Authors: Yunhao Ge, Jie Ren, Andrew Gallagher, Yuxiao Wang, Ming-Hsuan Yang, Hartwig Adam, Laurent Itti, Balaji Lakshminarayanan, Jiaping Zhao

    Abstract: Multi-modal image-text models such as CLIP and LiT have demonstrated impressive performance on image classification benchmarks and their zero-shot generalization ability is particularly exciting. While the top-5 zero-shot accuracies of these models are very high, the top-1 accuracies are much lower (over 25% gap in some cases). We investigate the reasons for this performance gap and find that many…

    Submitted 25 May, 2023; v1 submitted 4 December, 2022; originally announced December 2022.

    Comments: CVPR 2023

  17. arXiv:2212.00089  [pdf, other]

    cs.AR cs.ET

    Ferroelectric FET based Context-Switching FPGA Enabling Dynamic Reconfiguration for Adaptive Deep Learning Machines

    Authors: Yixin Xu, Zijian Zhao, Yi Xiao, Tongguang Yu, Halid Mulaosmanovic, Dominik Kleimaier, Stefan Duenkel, Sven Beyer, Xiao Gong, Rajiv Joshi, X. Sharon Hu, Shixian Wen, Amanda Sofie Rios, Kiran Lekkala, Laurent Itti, Eric Homan, Sumitha George, Vijaykrishnan Narayanan, Kai Ni

    Abstract: Field Programmable Gate Array (FPGA) is widely used in acceleration of deep learning applications because of its reconfigurability, flexibility, and fast time-to-market. However, conventional FPGA suffers from the tradeoff between chip area and reconfiguration latency, making efficient FPGA accelerations that require switching between multiple configurations still elusive. In this paper, we perfor…

    Submitted 30 November, 2022; originally announced December 2022.

    Comments: 54 pages, 15 figures

  18. arXiv:2211.14424  [pdf, other]

    cs.LG cs.AI

    Supervised Contrastive Prototype Learning: Augmentation Free Robust Neural Network

    Authors: Iordanis Fostiropoulos, Laurent Itti

    Abstract: Transformations in the input space of Deep Neural Networks (DNN) lead to unintended changes in the feature space. Almost perceptually identical inputs, such as adversarial examples, can have significantly distant feature representations. On the contrary, Out-of-Distribution (OOD) samples can have highly similar feature representations to training set samples. Our theoretical analysis for DNNs trai…

    Submitted 25 November, 2022; originally announced November 2022.

  19. arXiv:2207.11368  [pdf, other]

    cs.CV

    Neural-Sim: Learning to Generate Training Data with NeRF

    Authors: Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet

    Abstract: Training computer vision models usually requires collecting and labeling vast amounts of imagery under a diverse set of scene configurations and properties. This process is incredibly time-consuming, and it is challenging to ensure that the captured data distribution maps well to the target domain of an application scenario. Recently, synthetic data has emerged as a way to address both of these is…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: ECCV 2022

  20. arXiv:2207.09510  [pdf, other]

    cs.CV

    Contributions of Shape, Texture, and Color in Visual Recognition

    Authors: Yunhao Ge, Yao Xiao, Zhi Xu, Xingrui Wang, Laurent Itti

    Abstract: We investigate the contributions of three important features of the human visual system (HVS) - shape, texture, and color - to object classification. We build a humanoid vision engine (HVE) that explicitly and separately computes shape, texture, and color features from images. The resulting feature vectors are then concatenated to support the final classification. We show that HVE can summa…

    Submitted 19 July, 2022; originally announced July 2022.

    Comments: ECCV 2022

  21. arXiv:2206.09592  [pdf, other]

    cs.CV

    DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection

    Authors: Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

    Abstract: We propose a new paradigm to automatically generate training data with accurate labels at scale using the text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object mask generation and background (context) image generation. For foreground object mask generation, we use a simple textual template with obje…

    Submitted 21 December, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: v3 (same as v2) version, update structure (add foreground generation, stable diffusion), add more experiments

  22. arXiv:2206.06469  [pdf]

    cs.LG stat.ML

    Invariant Structure Learning for Better Generalization and Causal Explainability

    Authors: Yunhao Ge, Sercan Ö. Arik, Jinsung Yoon, Ao Xu, Laurent Itti, Tomas Pfister

    Abstract: Learning the causal structure behind data is invaluable for improving generalization and obtaining high-quality explanations. We propose a novel framework, Invariant Structure Learning (ISL), that is designed to improve causal structure discovery by utilizing generalization as an indication. ISL splits the data into different environments, and learns a structure that is invariant to the target acr…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: 16 pages (including Appendix), 4 figures

  23. arXiv:2202.11226  [pdf, other]

    cs.LG cs.AI

    Model2Detector: Widening the Information Bottleneck for Out-of-Distribution Detection using a Handful of Gradient Steps

    Authors: Sumedh A Sontakke, Buvaneswari Ramanan, Laurent Itti, Thomas Woo

    Abstract: Out-of-distribution detection is an important capability that has long eluded vanilla neural networks. Deep Neural networks (DNNs) tend to generate over-confident predictions when presented with inputs that are significantly out-of-distribution (OOD). This can be dangerous when employing machine learning systems in the wild as detecting attacks can thus be difficult. Recent advances inference-time…

    Submitted 22 February, 2022; originally announced February 2022.

    Comments: arXiv admin note: text overlap with arXiv:1807.03888, arXiv:1812.04606 by other authors

    Report number: RAISA/2022/04

  24. arXiv:2201.08098  [pdf, other]

    cs.CV

    What can we learn from misclassified ImageNet images?

    Authors: Shixian Wen, Amanda Sofie Rios, Kiran Lekkala, Laurent Itti

    Abstract: Understanding the patterns of misclassified ImageNet images is particularly important, as it could guide us to design deep neural networks (DNN) that generalize better. However, the richness of ImageNet imposes difficulties for researchers to visually find any useful patterns of misclassification. Here, to help find these patterns, we propose "Superclassing ImageNet dataset". It is a subset of Ima…

    Submitted 20 January, 2022; originally announced January 2022.

  25. arXiv:2112.03163  [pdf, other]

    cs.CV

    Encouraging Disentangled and Convex Representation with Controllable Interpolation Regularization

    Authors: Yunhao Ge, Zhi Xu, Yao Xiao, Gan Xin, Yunkui Pang, Laurent Itti

    Abstract: We focus on controllable disentangled representation learning (C-Dis-RL), where users can control the partition of the disentangled latent space to factorize dataset attributes (concepts) for downstream tasks. Two general problems remain under-explored in current methods: (1) They lack comprehensive disentanglement constraints, especially missing the minimization of mutual information between diff…

    Submitted 23 March, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

    Comments: 17 pages, 19 figures (including appendix)

  26. arXiv:2110.15489  [pdf, other]

    cs.LG cs.AI

    GalilAI: Out-of-Task Distribution Detection using Causal Active Experimentation for Safe Transfer RL

    Authors: Sumedh A Sontakke, Stephen Iota, Zizhao Hu, Arash Mehrjou, Laurent Itti, Bernhard Schölkopf

    Abstract: Out-of-distribution (OOD) detection is a well-studied topic in supervised learning. Extending the successes in supervised learning methods to the reinforcement learning (RL) setting, however, is difficult due to the data generating process - RL agents actively query their environment for data, and the data are a function of the policy followed by the agent. An agent could thus neglect a shift in t…

    Submitted 28 October, 2021; originally announced October 2021.

  27. arXiv:2109.03813  [pdf, other]

    cs.AI

    Video2Skill: Adapting Events in Demonstration Videos to Skills in an Environment using Cyclic MDP Homomorphisms

    Authors: Sumedh A Sontakke, Sumegh Roychowdhury, Mausoom Sarkar, Nikaash Puri, Balaji Krishnamurthy, Laurent Itti

    Abstract: Humans excel at learning long-horizon tasks from demonstrations augmented with textual commentary, as evidenced by the burgeoning popularity of tutorial videos online. Intuitively, this capability can be separated into 2 distinct subtasks - first, dividing a long-horizon demonstration sequence into semantically meaningful events; second, adapting such events into meaningful behaviors in one's own…

    Submitted 9 September, 2021; v1 submitted 8 September, 2021; originally announced September 2021.

  28. arXiv:2105.14639  [pdf, other]

    cs.RO cs.LG cs.NE

    Shaped Policy Search for Evolutionary Strategies using Waypoints

    Authors: Kiran Lekkala, Laurent Itti

    Abstract: In this paper, we try to improve exploration in Blackbox methods, particularly Evolution strategies (ES), when applied to Reinforcement Learning (RL) problems where intermediate waypoints/subgoals are available. Since Evolutionary strategies are highly parallelizable, instead of extracting just a scalar cumulative reward, we use the state-action pairs from the trajectories obtained during rollouts…

    Submitted 3 July, 2023; v1 submitted 30 May, 2021; originally announced May 2021.

    Comments: Presented at the International Conference on Robotics and Automation (ICRA) 2021

  29. arXiv:2105.00290  [pdf, other]

    cs.CV cs.AI cs.LG

    A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts

    Authors: Yunhao Ge, Yao Xiao, Zhi Xu, Meng Zheng, Srikrishna Karanam, Terrence Chen, Laurent Itti, Ziyan Wu

    Abstract: Despite substantial progress in applying neural networks (NN) to a wide variety of areas, they still largely suffer from a lack of transparency and interpretability. While recent developments in explainable artificial intelligence attempt to bridge this gap (e.g., by visualizing the correlation between input pixels and final outputs), these approaches are limited to explaining low-level relationsh…

    Submitted 1 May, 2021; originally announced May 2021.

    Comments: CVPR 2021

  30. arXiv:2011.04783  [pdf, other]

    cs.LG cs.AI

    Lifelong Learning Without a Task Oracle

    Authors: Amanda Rios, Laurent Itti

    Abstract: Supervised deep neural networks are known to undergo a sharp decline in the accuracy of older tasks when new tasks are learned, termed "catastrophic forgetting". Many state-of-the-art solutions to continual learning rely on biasing and/or partitioning a model to accommodate successive tasks incrementally. However, these methods largely depend on the availability of a task-oracle to confer task ide…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: Proceedings of the IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI 2020)

  31. arXiv:2010.03110  [pdf, other]

    cs.LG cs.AI cs.RO

    Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning

    Authors: Sumedh A. Sontakke, Arash Mehrjou, Laurent Itti, Bernhard Schölkopf

    Abstract: Animals exhibit an innate ability to learn regularities of the world through interaction. By performing experiments in their environment, they are able to discern the causal factors of variation and infer how they affect the world's dynamics. Inspired by this, we attempt to equip reinforcement learning agents with the ability to perform experiments that facilitate a categorization of the rolled-ou…

    Submitted 6 August, 2021; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: International Conference on Machine Learning, PMLR 139, 2021

  32. arXiv:2010.02556  [pdf, other]

    cs.LG cs.AI cs.CL

    SHERLock: Self-Supervised Hierarchical Event Representation Learning

    Authors: Sumegh Roychowdhury, Sumedh A. Sontakke, Nikaash Puri, Mausoom Sarkar, Milan Aggarwal, Pinkesh Badjatiya, Balaji Krishnamurthy, Laurent Itti

    Abstract: Temporal event representations are an essential aspect of learning among humans. They allow for succinct encoding of the experiences we have through a variety of sensory inputs. Also, they are believed to be arranged hierarchically, allowing for an efficient representation of complex long-horizon experiences. Additionally, these representations are acquired in a self-supervised manner. Analogously…

    Submitted 22 August, 2022; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: Accepted at ICPR '22

  33. Beneficial Perturbation Network for designing general adaptive artificial intelligence systems

    Authors: Shixian Wen, Amanda Rios, Yunhao Ge, Laurent Itti

    Abstract: The human brain is the gold standard of adaptive learning. It not only can learn and benefit from experience, but also can adapt to new situations. In contrast, deep neural networks only learn one sophisticated but fixed mapping from inputs to outputs. This limits their applicability to more dynamic situations, where input to output mapping may change with different contexts. A salient example is…

    Submitted 1 February, 2021; v1 submitted 26 September, 2020; originally announced September 2020.

    Comments: Accepted at IEEE Transactions on Neural Networks and Learning Systems. Keywords: Adaptive artificial intelligence system, Switch modes, Beneficial perturbations, Continual learning, Adversarial examples

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems 2021

  34. arXiv:2009.12724  [pdf, other]

    cs.LG cs.CR stat.ML

    Beneficial Perturbations Network for Defending Adversarial Examples

    Authors: Shixian Wen, Amanda Rios, Laurent Itti

    Abstract: Deep neural networks can be fooled by adversarial attacks: adding carefully computed small adversarial perturbations to clean inputs can cause misclassification on state-of-the-art machine learning models. The reason is that neural networks fail to accommodate the distribution drift of the input data caused by adversarial perturbations. Here, we present a new solution - Beneficial Perturbation Net…

    Submitted 13 September, 2021; v1 submitted 26 September, 2020; originally announced September 2020.

    Comments: The paper is under consideration at Pattern Recognition Letters

  35. arXiv:2009.06586  [pdf, other]

    cs.CV cs.AI cs.LG

    Zero-shot Synthesis with Group-Supervised Learning

    Authors: Yunhao Ge, Sami Abu-El-Haija, Gan Xin, Laurent Itti

    Abstract: Visual cognition of primates is superior to that of artificial neural networks in its ability to 'envision' a visual object, even a newly-introduced one, in different attributes including pose, position, color, texture, etc. To aid neural networks to envision objects with different attributes, we propose a family of objective functions, expressed on groups of examples, as a novel learning framewor…

    Submitted 16 February, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

    Comments: Published at ICLR 2021 (16 pages including appendix)

  36. arXiv:2006.07438  [pdf, other]

    cs.LG stat.ML

    Attentive Feature Reuse for Multi Task Meta learning

    Authors: Kiran Lekkala, Laurent Itti

    Abstract: We develop new algorithms for simultaneous learning of multiple tasks (e.g., image classification, depth estimation), and for adapting to unseen task/domain distributions within those high-level tasks (e.g., different environments). First, we learn common representations underlying all tasks. We then propose an attention mechanism to dynamically specialize the network, at runtime, for each task. O…

    Submitted 12 June, 2020; originally announced June 2020.

  37. arXiv:2003.08526  [pdf, other]

    cs.CV

    Pose Augmentation: Class-agnostic Object Pose Transformation for Object Recognition

    Authors: Yunhao Ge, Jiaping Zhao, Laurent Itti

    Abstract: Object pose increases intraclass object variance which makes object recognition from 2D images harder. To render a classifier robust to pose variations, most deep neural networks try to eliminate the influence of pose by using large datasets with many poses for each class. Here, we propose a different approach: a class-agnostic object pose transformation network (OPT-Net) can transform an image al…

    Submitted 13 January, 2021; v1 submitted 18 March, 2020; originally announced March 2020.

    Comments: ECCV 2020, with supplementary materials

  38. arXiv:1911.10322  [pdf, other]

    cs.LG cs.AI stat.ML

    Meta Adaptation using Importance Weighted Demonstrations

    Authors: Kiran Lekkala, Sami Abu-El-Haija, Laurent Itti

    Abstract: Imitation learning has gained immense popularity because of its high sample-efficiency. However, in real-world scenarios, where the trajectory distribution of most of the tasks dynamically shifts, model fitting on continuously aggregated data alone would be futile. In some cases, the distribution shifts, so much, that it is difficult for an agent to infer the new task. We propose a novel algorithm…

    Submitted 3 July, 2023; v1 submitted 23 November, 2019; originally announced November 2019.

  39. arXiv:1910.04279  [pdf, other]

    cs.LG stat.ML

    Adversarial Training: embedding adversarial perturbations into the parameter space of a neural network to build a robust system

    Authors: Shixian Wen, Laurent Itti

    Abstract: Adversarial training, in which a network is trained on both adversarial and clean examples, is one of the most trusted defense methods against adversarial attacks. However, there are three major practical difficulties in implementing and deploying this method - expensive in terms of extra memory and computation costs; accuracy trade-off between clean and adversarial examples; and lack of diversity…

    Submitted 9 October, 2019; originally announced October 2019.

  40. arXiv:1906.10528  [pdf, other]

    cs.LG cs.AI

    Beneficial perturbation network for continual learning

    Authors: Shixian Wen, Laurent Itti

    Abstract: Sequential learning of multiple tasks in artificial neural networks using gradient descent leads to catastrophic forgetting, whereby previously learned knowledge is erased during learning of new, disjoint knowledge. Here, we propose a fundamentally new type of method - Beneficial Perturbation Network (BPN). We add task-dependent memory (biasing) units to allow the network to operate in different r…

    Submitted 22 June, 2019; originally announced June 2019.

  41. arXiv:1906.10437  [pdf, other]

    cs.LG stat.ML

    Learning Causal State Representations of Partially Observable Environments

    Authors: Amy Zhang, Zachary C. Lipton, Luis Pineda, Kamyar Azizzadenesheli, Anima Anandkumar, Laurent Itti, Joelle Pineau, Tommaso Furlanello

    Abstract: Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially-observable Markov decision processes (POMDP). Our method learns approximate causal state representations from RNNs trained to predi…

    Submitted 8 February, 2021; v1 submitted 25 June, 2019; originally announced June 2019.

    Comments: 35 pages, 8 figures

  42. arXiv:1811.01146  [pdf, other]

    cs.LG cs.AI stat.ML

    Closed-Loop Memory GAN for Continual Learning

    Authors: Amanda Rios, Laurent Itti

    Abstract: Sequential learning of tasks using gradient descent leads to an unremitting decline in the accuracy of tasks for which training data is no longer available, termed catastrophic forgetting. Generative models have been explored as a means to approximate the distribution of old tasks and bypass storage of real data. Here we propose a cumulative closed-loop memory replay GAN (CloGAN) provided with ext…

    Submitted 28 September, 2020; v1 submitted 2 November, 2018; originally announced November 2018.

    Comments: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-2019). https://doi.org/10.24963/ijcai.2019/462

  43. arXiv:1805.07441  [pdf, other]

    cs.LG stat.ML

    Overcoming catastrophic forgetting problem by weight consolidation and long-term memory

    Authors: Shixian Wen, Laurent Itti

    Abstract: Sequential learning of multiple tasks in artificial neural networks using gradient descent leads to catastrophic forgetting, whereby previously learned knowledge is erased during learning of new, disjoint knowledge. Here, we propose a new approach to sequential learning which leverages the recent discovery of adversarial examples. We use adversarial subspaces from previous tasks to enable learning…

    Submitted 18 May, 2018; originally announced May 2018.

    Comments: for Conference on Neural Information Processing Systems 2018 submission

  44. arXiv:1805.04770  [pdf, other]

    stat.ML cs.AI cs.LG

    Born Again Neural Networks

    Authors: Tommaso Furlanello, Zachary C. Lipton, Michael Tschannen, Laurent Itti, Anima Anandkumar

    Abstract: Knowledge Distillation (KD) consists of transferring “knowledge” from one machine learning model (the teacher) to another (the student). Commonly, the teacher is a high-capacity model with formidable performance, while the student is more compact. By transferring knowledge, one hopes to benefit from the student’s compactness, without sacrificing too much performance. We study KD from a new p…

    Submitted 29 June, 2018; v1 submitted 12 May, 2018; originally announced May 2018.

    Comments: Published @ICML 2018

  45. arXiv:1607.05851  [pdf, other]

    cs.CV

    Learning to Recognize Objects by Retaining other Factors of Variation

    Authors: Jiaping Zhao, Chin-kai Chang, Laurent Itti

    Abstract: Natural images are generated under many factors, including shape, pose, illumination etc. Most existing ConvNets formulate object recognition from natural images as a single task classification problem, and attempt to learn features useful for object categories, but invariant to other factors of variation as much as possible. These architectures do not explicitly learn other factors, like pose and…

    Submitted 22 January, 2017; v1 submitted 20 July, 2016; originally announced July 2016.

    Comments: 9 pages, accepted by WACV 2017

  46. arXiv:1607.05836  [pdf, other]

    cs.CV

    Improved Deep Learning of Object Category using Pose Information

    Authors: Jiaping Zhao, Laurent Itti

    Abstract: Despite significant recent progress, the best available computer vision algorithms still lag far behind human capabilities, even for recognizing individual discrete objects under various poses, illuminations, and backgrounds. Here we present a new approach to using object pose information to improve deep network learning. While existing large-scale datasets, e.g. ImageNet, do not have pose informa…

    Submitted 22 January, 2017; v1 submitted 20 July, 2016; originally announced July 2016.

    Comments: 10 pages, accepted by WACV 2017

  47. arXiv:1606.03628  [pdf, other]

    cs.LG

    metricDTW: local distance metric learning in Dynamic Time Warping

    Authors: Jiaping Zhao, Zerong Xi, Laurent Itti

    Abstract: We propose to learn multiple local Mahalanobis distance metrics to perform k-nearest neighbor (kNN) classification of temporal sequences. Temporal sequences are first aligned by dynamic time warping (DTW); given the alignment path, similarity between two sequences is measured by the DTW distance, which is computed as the accumulated distance between matched temporal point pairs along the alignment…

    Submitted 11 June, 2016; originally announced June 2016.

  48. arXiv:1606.02355  [pdf, other]

    cs.LG cs.AI stat.ML

    Active Long Term Memory Networks

    Authors: Tommaso Furlanello, Jiaping Zhao, Andrew M. Saxe, Laurent Itti, Bosco S. Tjan

    Abstract: Continual Learning in artificial neural networks suffers from interference and forgetting when different tasks are learned sequentially. This paper introduces the Active Long Term Memory Networks (A-LTM), a model of sequential multi-task deep learning that is able to maintain previously learned association between sensory input and behavioral output while acquiring new knowledge. A-LTM exploits t…

    Submitted 7 June, 2016; originally announced June 2016.

  49. arXiv:1606.01601  [pdf, other]

    cs.CV

    shapeDTW: shape Dynamic Time Warping

    Authors: Jiaping Zhao, Laurent Itti

    Abstract: Dynamic Time Warping (DTW) is an algorithm to align temporal sequences with possible local non-linear distortions, and has been widely applied to audio, video and graphics data alignments. DTW is essentially a point-to-point matching method under some boundary and temporal consistency constraints. Although DTW obtains a global optimal solution, it does not necessarily achieve locally sensible matc…

    Submitted 5 June, 2016; originally announced June 2016.

    Comments: 14 pages

  50. Detecting "Smart" Spammers On Social Network: A Topic Model Approach

    Authors: Linqing Liu, Yao Lu, Ye Luo, Renxian Zhang, Laurent Itti, Jianwei Lu

    Abstract: Spammer detection on social networks is a challenging problem. Rigid anti-spam rules have resulted in the emergence of "smart" spammers, who resemble legitimate users and are difficult to identify. In this paper, we present a novel spammer classification approach based on Latent Dirichlet Allocation (LDA), a topic model. Our approach extracts both the local and the global information of topic distr…

    Submitted 9 June, 2016; v1 submitted 28 April, 2016; originally announced April 2016.

    Comments: NAACL-HLT 2016, Student Research Workshop