Showing 1–50 of 56 results for author: Zhao, A

Searching in archive cs.
  1. arXiv:2406.17963  [pdf, other]

    cs.LG cs.HC cs.SI

    Empowering Interdisciplinary Insights with Dynamic Graph Embedding Trajectories

    Authors: Yiqiao Jin, Andrew Zhao, Yeon-Chang Lee, Meng Ye, Ajay Divakaran, Srijan Kumar

    Abstract: We developed DyGETViz, a novel framework for effectively visualizing dynamic graphs (DGs) that are ubiquitous across diverse real-world systems. This framework leverages recent advancements in discrete-time dynamic graph (DTDG) models to adeptly handle the temporal dynamics inherent in dynamic graphs. DyGETViz effectively captures both micro- and macro-level structural shifts within these graphs,…

    Submitted 28 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: 27 pages, 11 figures

  2. arXiv:2405.19026  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR

    DiveR-CT: Diversity-enhanced Red Teaming with Relaxing Constraints

    Authors: Andrew Zhao, Quentin Xu, Matthieu Lin, Shenzhi Wang, Yong-Jin Liu, Zilong Zheng, Gao Huang

    Abstract: Recent advances in large language models (LLMs) have made them indispensable, raising significant concerns over managing their safety. Automated red teaming offers a promising alternative to the labor-intensive and error-prone manual probing for vulnerabilities, providing more consistent and scalable safety evaluations. However, existing approaches often compromise diversity by focusing on maximiz…

    Submitted 29 May, 2024; originally announced May 2024.

  3. arXiv:2404.09445  [pdf, other]

    cs.LG cs.AI cs.CV

    Exploring Text-to-Motion Generation with Human Preference

    Authors: Jenny Sheng, Matthieu Lin, Andrew Zhao, Kevin Pruvost, Yu-Hui Wen, Yangguang Li, Gao Huang, Yong-Jin Liu

    Abstract: This paper presents an exploration of preference learning in text-to-motion generation. We find that current improvements in text-to-motion generation still rely on datasets requiring expert labelers with motion capture systems. Instead, learning from human preference data does not require motion capture systems; a labeler with no expertise simply compares two generated motions. This is particular…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024 HuMoGen Workshop

  4. arXiv:2403.18451  [pdf, other]

    cs.LG cs.AI

    CoRAST: Towards Foundation Model-Powered Correlated Data Analysis in Resource-Constrained CPS and IoT

    Authors: Yi Hu, Jinhang Zuo, Alanis Zhao, Bob Iannucci, Carlee Joe-Wong

    Abstract: Foundation models (FMs) emerge as a promising solution to harness distributed and diverse environmental data by leveraging prior knowledge to understand the complicated temporal and spatial correlations within heterogeneous datasets. Unlike distributed learning frameworks such as federated learning, which often struggle with multimodal data, FMs can transform diverse inputs into embeddings. This p…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: accepted and to be published in 2024 IEEE International Workshop on Foundation Models for Cyber-Physical Systems & Internet of Things (FMSys)

  5. arXiv:2403.13455  [pdf, other]

    cs.RO

    FACT: Fast and Active Coordinate Initialization for Vision-based Drone Swarms

    Authors: Yuan Li, Anke Zhao, Yingjian Wang, Ziyi Xu, Xin Zhou, Jinni Zhou, Chao Xu, Fei Gao

    Abstract: Swarm robots have sparked remarkable developments across a range of fields. While it is necessary for various applications in swarm robots, a fast and robust coordinate initialization in vision-based drone swarms remains elusive. To this end, our paper proposes a complete system to recover a swarm's initial relative pose on platforms with size, weight, and power (SWaP) constraints. To overcome lim…

    Submitted 20 March, 2024; originally announced March 2024.

  6. arXiv:2401.12377  [pdf, other]

    cs.AR

    ACS: Concurrent Kernel Execution on Irregular, Input-Dependent Computational Graphs

    Authors: Sankeerth Durvasula, Adrian Zhao, Raymond Kiguru, Yushi Guan, Zhonghan Chen, Nandita Vijaykumar

    Abstract: GPUs are widely used to accelerate many important classes of workloads today. However, we observe that several important emerging classes of workloads, including simulation engines for deep reinforcement learning and dynamic neural networks, are unable to fully utilize the massive parallelism that GPUs offer. These applications tend to have kernels that are small in size, i.e., have few thread blo…

    Submitted 22 January, 2024; originally announced January 2024.

  7. arXiv:2401.05345  [pdf, other]

    cs.CV cs.GR cs.PF

    DISTWAR: Fast Differentiable Rendering on Raster-based Rendering Pipelines

    Authors: Sankeerth Durvasula, Adrian Zhao, Fan Chen, Ruofan Liang, Pawan Kumar Sanjaya, Nandita Vijaykumar

    Abstract: Differentiable rendering is a technique used in an important emerging class of visual computing applications that involves representing a 3D scene as a model that is trained from 2D images using gradient descent. Recent works (e.g. 3D Gaussian Splatting) use a rasterization pipeline to enable rendering high quality photo-realistic imagery at high speeds from these learned 3D models. These methods…

    Submitted 1 December, 2023; originally announced January 2024.

  8. arXiv:2312.10399  [pdf, other]

    quant-ph cs.IT physics.chem-ph

    Learning, Optimizing, and Simulating Fermions with Quantum Computers

    Authors: Andrew Zhao

    Abstract: Fermions are fundamental particles which obey seemingly bizarre quantum-mechanical principles, yet constitute all the ordinary matter that we inhabit. As such, their study is heavily motivated from both fundamental and practical incentives. In this dissertation, we will explore how the tools of quantum information and computation can assist us on both of these fronts. We primarily do so through th…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: PhD thesis. Includes a background and overview of many-fermion systems, quantum-state learning, and NISQ/error mitigation. Main chapters are based on arXiv:2010.16094 (new: lower bound on sample complexity for local fermionic estimation), arXiv:2310.03071, arXiv:1908.08067 (new: connection between unitary partitioning and matchgate circuits), and arXiv:2301.01778

  9. arXiv:2311.12848  [pdf, other]

    cs.DB cs.AI

    Lightweight Knowledge Representations for Automating Data Analysis

    Authors: Marko Sterbentz, Cameron Barrie, Donna Hooshmand, Shubham Shahi, Abhratanu Dutta, Harper Pack, Andong Li Zhao, Andrew Paley, Alexander Einarsson, Kristian Hammond

    Abstract: The principal goal of data science is to derive meaningful information from data. To do this, data scientists develop a space of analytic possibilities and from it reach their information goals by using their knowledge of the domain, the available data, the operations that can be performed on those data, the algorithms/models that are fed the data, and how all of these facets interweave. In this w…

    Submitted 15 October, 2023; originally announced November 2023.

  10. arXiv:2311.09692  [pdf, other]

    cs.LG cs.AI cs.RO

    Augmenting Unsupervised Reinforcement Learning with Self-Reference

    Authors: Andrew Zhao, Erle Zhu, Rui Lu, Matthieu Lin, Yong-Jin Liu, Gao Huang

    Abstract: Humans possess the ability to draw on past experiences explicitly when learning new tasks and applying them accordingly. We believe this capacity for self-referencing is especially advantageous for reinforcement learning agents in the unsupervised pretrain-then-finetune setting. During pretraining, an agent's past experiences can be explicitly utilized to mitigate the nonstationarity of intrinsic…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: Preprint

  11. arXiv:2310.01320  [pdf, other]

    cs.AI cs.CL cs.CY cs.LG cs.MA

    Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation

    Authors: Shenzhi Wang, Chang Liu, Zilong Zheng, Siyuan Qi, Shuo Chen, Qisen Yang, Andrew Zhao, Chaofei Wang, Shiji Song, Gao Huang

    Abstract: Recent breakthroughs in large language models (LLMs) have brought remarkable success in the field of LLM-as-Agent. Nevertheless, a prevalent assumption is that the information processed by LLMs is consistently honest, neglecting the pervasive deceptive or misleading information in human society and AI-generated content. This oversight makes LLMs susceptible to malicious manipulations, potentially…

    Submitted 24 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: 40 pages

  12. arXiv:2309.08154  [pdf, other]

    cs.CV cs.IR

    Dynamic Visual Semantic Sub-Embeddings and Fast Re-Ranking

    Authors: Wenzhang Wei, Zhipeng Gui, Changguang Wu, Anqi Zhao, Dehua Peng, Huayi Wu

    Abstract: The core of cross-modal matching is to accurately measure the similarity between different modalities in a unified representation space. However, compared to textual descriptions of a certain perspective, the visual modality has more semantic variations. So, images are usually associated with multiple textual captions in databases. Although popular symmetric embedding methods have explored numerou…

    Submitted 20 December, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

  13. arXiv:2309.03851  [pdf, other]

    cs.LG cs.CV

    CenTime: Event-Conditional Modelling of Censoring in Survival Analysis

    Authors: Ahmed H. Shahin, An Zhao, Alexander C. Whitehead, Daniel C. Alexander, Joseph Jacob, David Barber

    Abstract: Survival analysis is a valuable tool for estimating the time until specific events, such as death or cancer recurrence, based on baseline observations. This is particularly useful in healthcare to prognostically predict clinically important events based on patient data. However, existing approaches often have limitations; some focus only on ranking patients by survivability, neglecting to estimate…

    Submitted 10 January, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

  14. arXiv:2308.12625  [pdf]

    cs.LG

    Uncertainty and Explainable Analysis of Machine Learning Model for Reconstruction of Sonic Slowness Logs

    Authors: Hua Wang, Yuqiong Wu, Yushun Zhang, Fuqiang Lai, Zhou Feng, Bing Xie, Ailin Zhao

    Abstract: Logs are valuable information for oil and gas fields as they help to determine the lithology of the formations surrounding the borehole and the location and reserves of subsurface oil and gas reservoirs. However, important logs are often missing in horizontal or old wells, which poses a challenge in field applications. In this paper, we utilize data from the 2020 machine learning competition of th…

    Submitted 24 August, 2023; originally announced August 2023.

  15. arXiv:2308.10144  [pdf, other]

    cs.LG cs.AI cs.CL

    ExpeL: LLM Agents Are Experiential Learners

    Authors: Andrew Zhao, Daniel Huang, Quentin Xu, Matthieu Lin, Yong-Jin Liu, Gao Huang

    Abstract: The recent surge in research interest in applying large language models (LLMs) to decision-making tasks has flourished by leveraging the extensive world knowledge embedded in LLMs. While there is a growing demand to tailor LLMs for custom decision-making tasks, finetuning them for specific tasks is resource-intensive and may diminish the model's generalization capabilities. Moreover, state-of-the-…

    Submitted 17 December, 2023; v1 submitted 19 August, 2023; originally announced August 2023.

    Comments: Accepted by the 38th Annual AAAI Conference on Artificial Intelligence (AAAI-24)

  16. arXiv:2308.00008  [pdf]

    eess.IV cs.LG

    A data-centric deep learning approach to airway segmentation

    Authors: Wing Keung Cheung, Ashkan Pakzad, Nesrin Mogulkoc, Sarah Needleman, Bojidar Rangelov, Eyjolfur Gudmundsson, An Zhao, Mariam Abbas, Davina McLaverty, Dimitrios Asimakopoulos, Robert Chapman, Recep Savas, Sam M Janes, Yipeng Hu, Daniel C. Alexander, John R Hurst, Joseph Jacob

    Abstract: The morphology and distribution of airway tree abnormalities enables diagnosis and disease characterisation across a variety of chronic respiratory conditions. In this regard, airway segmentation plays a critical role in the production of the outline of the entire airway tree to enable estimation of disease extent and severity. In this study, we propose a data-centric deep learning technique to se…

    Submitted 29 July, 2023; originally announced August 2023.

  17. arXiv:2307.14377  [pdf, other]

    cs.CL cs.AI

    How Can Large Language Models Help Humans in Design and Manufacturing?

    Authors: Liane Makatura, Michael Foshey, Bohan Wang, Felix HähnLein, Pingchuan Ma, Bolei Deng, Megan Tjandrasuwita, Andrew Spielberg, Crystal Elaine Owens, Peter Yichen Chen, Allan Zhao, Amy Zhu, Wil J Norton, Edward Gu, Joshua Jacob, Yifei Li, Adriana Schulz, Wojciech Matusik

    Abstract: The advancement of Large Language Models (LLMs), including GPT-4, provides exciting new opportunities for generative design. We investigate the application of this tool across the entire design and manufacturing workflow. Specifically, we scrutinize the utility of LLMs in tasks such as: converting a text-based prompt into a design specification, transforming a design into manufacturing instruction…

    Submitted 25 July, 2023; originally announced July 2023.

  18. arXiv:2303.10789  [pdf, other]

    cs.LG

    A hybrid CNN-RNN approach for survival analysis in a Lung Cancer Screening study

    Authors: Yaozhi Lu, Shahab Aslani, An Zhao, Ahmed Shahin, David Barber, Mark Emberton, Daniel C. Alexander, Joseph Jacob

    Abstract: In this study, we present a hybrid CNN-RNN approach to investigate long-term survival of subjects in a lung cancer screening study. Subjects who died of cardiovascular and respiratory causes were identified whereby the CNN model was used to capture imaging features in the CT scans and the RNN model was used to investigate time series and thus global information. The models were trained on subjects…

    Submitted 19 March, 2023; originally announced March 2023.

  19. arXiv:2301.01778  [pdf, other]

    quant-ph cs.DS math.OC

    Expanding the reach of quantum optimization with fermionic embeddings

    Authors: Andrew Zhao, Nicholas C. Rubin

    Abstract: Quadratic programming over orthogonal matrices encompasses a broad class of hard optimization problems that do not have an efficient quantum representation. Such problems are instances of the little noncommutative Grothendieck problem (LNCG), a generalization of binary quadratic programs to continuous, noncommutative variables. In this work, we establish a natural embedding for this class of LNCG…

    Submitted 9 June, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

    Comments: 48 pages, 4 figures. Title and abstract revised, template changed. Contents largely unmodified from prior version. Comments welcome

  20. arXiv:2212.09035  [pdf, other]

    cs.CV cs.LG

    Minimizing Maximum Model Discrepancy for Transferable Black-box Targeted Attacks

    Authors: Anqi Zhao, Tong Chu, Yahao Liu, Wen Li, Jingjing Li, Lixin Duan

    Abstract: In this work, we study the black-box targeted attack problem from the model discrepancy perspective. On the theoretical side, we present a generalization error bound for black-box targeted attacks, which gives a rigorous theoretical analysis for guaranteeing the success of the attack. We reveal that the attack error on a target model mainly depends on empirical attack error on the substitute model…

    Submitted 18 December, 2022; originally announced December 2022.

  21. arXiv:2212.00476  [pdf, other]

    cs.CV cs.AR

    Efficient stereo matching on embedded GPUs with zero-means cross correlation

    Authors: Qiong Chang, Aolong Zha, Weimin Wang, Xin Liu, Masaki Onishi, Lei Lei, Meng Joo Er, Tsutomu Maruyama

    Abstract: Mobile stereo-matching systems have become an important part of many applications, such as automated-driving vehicles and autonomous robots. Accurate stereo-matching methods usually lead to high computational complexity; however, mobile platforms have only limited hardware resources to keep their power consumption low; this makes it difficult to maintain both an acceptable processing speed and acc…

    Submitted 1 December, 2022; originally announced December 2022.

  22. arXiv:2210.06702  [pdf, other]

    cs.LG cs.AI

    A Mixture of Surprises for Unsupervised Reinforcement Learning

    Authors: Andrew Zhao, Matthieu Gaetan Lin, Yangguang Li, Yong-Jin Liu, Gao Huang

    Abstract: Unsupervised reinforcement learning aims at learning a generalist policy in a reward-free manner for fast adaptation to downstream tasks. Most of the existing methods propose to provide an intrinsic reward based on surprise. Maximizing or minimizing surprise drives the agent to either explore or gain control over its environment. However, both strategies rely on a strong assumption: the entropy of…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022

  23. arXiv:2209.09338  [pdf, other]

    cs.LG

    Revisiting Embeddings for Graph Neural Networks

    Authors: S. Purchase, A. Zhao, R. D. Mullins

    Abstract: Current graph representation learning techniques use Graph Neural Networks (GNNs) to extract features from dataset embeddings. In this work, we examine the quality of these embeddings and assess how changing them can affect the accuracy of GNNs. We explore different embedding extraction techniques for both images and texts; and find that the performance of different GNN architectures is dependent…

    Submitted 29 November, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

  24. arXiv:2208.14866  [pdf, ps, other]

    cs.DM

    A case study of the profit-maximizing multi-vehicle pickup and delivery selection problem for the road networks with the integratable nodes

    Authors: Aolong Zha, Qiong Chang, Naoto Imura, Katsuhiro Nishinari

    Abstract: This paper is a study of an application-based model in profit-maximizing multi-vehicle pickup and delivery selection problem (PPDSP). The graph-theoretic model proposed by existing studies of PPDSP is based on transport requests to define the corresponding nodes (i.e., each request corresponds to a pickup node and a delivery node). In practice, however, there are probably multiple requests coming…

    Submitted 2 September, 2022; v1 submitted 31 August, 2022; originally announced August 2022.

  25. arXiv:2206.03671  [pdf, other]

    eess.IV cs.CV

    COVIDx CXR-3: A Large-Scale, Open-Source Benchmark Dataset of Chest X-ray Images for Computer-Aided COVID-19 Diagnostics

    Authors: Maya Pavlova, Tia Tuinstra, Hossein Aboutalebi, Andy Zhao, Hayden Gunraj, Alexander Wong

    Abstract: After more than two years since the beginning of the COVID-19 pandemic, the pressure of this crisis continues to devastate globally. The use of chest X-ray (CXR) imaging as a complementary screening strategy to RT-PCR testing is not only prevailing but has greatly increased due to its routine clinical use for respiratory complaints. Thus far, many visual perception models have been proposed for CO…

    Submitted 18 November, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: 5 pages, MED-NeurIPS 2022 workshop

  26. arXiv:2205.15701  [pdf, other]

    cs.LG cs.AI

    Provable General Function Class Representation Learning in Multitask Bandits and MDPs

    Authors: Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang

    Abstract: While multitask representation learning has become a popular approach in reinforcement learning (RL) to boost the sample efficiency, the theoretical understanding of why and how it works is still limited. Most previous analytical works could only assume that the representation function is already known to the agent or from linear function class, since analyzing general function class representatio…

    Submitted 21 October, 2022; v1 submitted 31 May, 2022; originally announced May 2022.

  27. arXiv:2203.06425  [pdf, other]

    eess.IV cs.CV

    VAFO-Loss: VAscular Feature Optimised Loss Function for Retinal Artery/Vein Segmentation

    Authors: Yukun Zhou, Moucheng Xu, Yipeng Hu, Stefano B. Blumberg, An Zhao, Siegfried K. Wagner, Pearse A. Keane, Daniel C. Alexander

    Abstract: Estimating clinically-relevant vascular features following vessel segmentation is a standard pipeline for retinal vessel analysis, which provides potential ocular biomarkers for both ophthalmic disease and systemic disease. In this work, we integrate these clinical features into a novel vascular feature optimised loss function (VAFO-Loss), in order to regularise networks to produce segmentation ma…

    Submitted 12 March, 2022; originally announced March 2022.

    Comments: 13 pages, 6 figures, 3 tables

  28. arXiv:2112.03119  [pdf, ps, other]

    cs.AI cs.HC

    Requirements for Open Political Information: Transparency Beyond Open Data

    Authors: Andong Luis Li Zhao, Andrew Paley, Rachel Adler, Harper Pack, Sergio Servantez, Alexander Einarsson, Cameron Barrie, Marko Sterbentz, Kristian Hammond

    Abstract: A politically informed citizenry is imperative for a well-developed democracy. While the US government has pursued policies for open data, these efforts have been insufficient in achieving an open government because only people with technical and domain knowledge can access information in the data. In this work, we conduct user interviews to identify wants and needs among stakeholders. We further u…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: Presented at AAAI FSS-21: Artificial Intelligence in Government and Public Sector, Washington, DC, USA

  29. arXiv:2109.06421  [pdf, other]

    eess.IV cs.CV

    COVID-Net MLSys: Designing COVID-Net for the Clinical Workflow

    Authors: Audrey G. Chung, Maya Pavlova, Hayden Gunraj, Naomi Terhljan, Alexander MacLean, Hossein Aboutalebi, Siddharth Surana, Andy Zhao, Saad Abbasi, Alexander Wong

    Abstract: As the COVID-19 pandemic continues to devastate globally, one promising field of research is machine learning-driven computer vision to streamline various parts of the COVID-19 clinical workflow. These machine learning methods are typically stand-alone models designed without consideration for the integration necessary for real-world application workflows. In this study, we take a machine learning…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: 4 pages

  30. arXiv:2108.03131  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    COVID-Net US: A Tailored, Highly Efficient, Self-Attention Deep Convolutional Neural Network Design for Detection of COVID-19 Patient Cases from Point-of-care Ultrasound Imaging

    Authors: Alexander MacLean, Saad Abbasi, Ashkan Ebadi, Andy Zhao, Maya Pavlova, Hayden Gunraj, Pengcheng Xi, Sonny Kohli, Alexander Wong

    Abstract: The Coronavirus Disease 2019 (COVID-19) pandemic has impacted many aspects of life globally, and a critical factor in mitigating its effects is screening individuals for infections, thereby allowing for both proper treatment for those individuals as well as action to be taken to prevent further spread of the virus. Point-of-care ultrasound (POCUS) imaging has been proposed as a screening tool as i…

    Submitted 5 August, 2021; originally announced August 2021.

    Comments: 12 pages

  31. arXiv:2107.05858  [pdf, other]

    cs.RO

    Multi-Objective Graph Heuristic Search for Terrestrial Robot Design

    Authors: Jie Xu, Andrew Spielberg, Allan Zhao, Daniela Rus, Wojciech Matusik

    Abstract: We present methods for co-designing rigid robots over control and morphology (including discrete topology) over multiple objectives. Previous work has addressed problems in single-objective robot co-design or multi-objective control. However, the joint multi-objective co-design problem is extremely important for generating capable, versatile, algorithmically designed robots. In this work, we prese…

    Submitted 13 July, 2021; originally announced July 2021.

    Comments: IEEE International Conference on Robotics and Automation (ICRA 2021)

  32. arXiv:2105.06640  [pdf, other]

    eess.IV cs.CV cs.LG

    COVID-Net CXR-2: An Enhanced Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-ray Images

    Authors: Maya Pavlova, Naomi Terhljan, Audrey G. Chung, Andy Zhao, Siddharth Surana, Hossein Aboutalebi, Hayden Gunraj, Ali Sabri, Amer Alaref, Alexander Wong

    Abstract: As the COVID-19 pandemic continues to devastate globally, the use of chest X-ray (CXR) imaging as a complementary screening strategy to RT-PCR testing continues to grow given its routine clinical use for respiratory complaint. As part of the COVID-Net open source initiative, we introduce COVID-Net CXR-2, an enhanced deep convolutional neural network design for COVID-19 detection from CXR images bu…

    Submitted 14 May, 2021; originally announced May 2021.

    Comments: 12 pages. arXiv admin note: text overlap with arXiv:2105.00256

  33. arXiv:2103.04301  [pdf, other]

    cs.RO

    DMotion: Robotic Visuomotor Control with Unsupervised Forward Model Learned from Videos

    Authors: Haoqi Yuan, Ruihai Wu, Andrew Zhao, Haipeng Zhang, Zihan Ding, Hao Dong

    Abstract: Learning an accurate model of the environment is essential for model-based control tasks. Existing methods in robotic visuomotor control usually learn from data with heavily labelled actions, object entities or locations, which can be demanding in many cases. To cope with this limitation, we propose a method, dubbed DMotion, that trains a forward model from video data only, via disentangling the m…

    Submitted 26 July, 2021; v1 submitted 7 March, 2021; originally announced March 2021.

    Comments: IROS 2021

  34. arXiv:2101.02469  [pdf, other]

    cs.CV cs.AI

    Multimodal Gait Recognition for Neurodegenerative Diseases

    Authors: Aite Zhao, Jianbo Li, Junyu Dong, Lin Qi, Qianni Zhang, Ning Li, Xin Wang, Huiyu Zhou

    Abstract: In recent years, single modality based gait recognition has been extensively explored in the analysis of medical images or other sensory data, and it is recognised that each of the established approaches has different strengths and weaknesses. As an important motor symptom, gait disturbance is usually used for diagnosis and evaluation of diseases; moreover, the use of multi-modality analysis of th…

    Submitted 7 January, 2021; originally announced January 2021.

  35. arXiv:2101.02458  [pdf, other]

    cs.CV cs.AI

    Associated Spatio-Temporal Capsule Network for Gait Recognition

    Authors: Aite Zhao, Junyu Dong, Jianbo Li, Lin Qi, Huiyu Zhou

    Abstract: It is a challenging task to identify a person based on her/his gait patterns. State-of-the-art approaches rely on the analysis of temporal or spatial characteristics of gait, and gait recognition is usually performed on single modality data (such as images, skeleton joint coordinates, or force signals). Evidence has shown that using multi-modality data is more conducive to gait research. Therefore…

    Submitted 7 January, 2021; originally announced January 2021.

  36. arXiv:2011.07931  [pdf, other]

    cs.IR cs.LG

    Do Offline Metrics Predict Online Performance in Recommender Systems?

    Authors: Karl Krauth, Sarah Dean, Alex Zhao, Wenshuo Guo, Mihaela Curmei, Benjamin Recht, Michael I. Jordan

    Abstract: Recommender systems operate in an inherently dynamical setting. Past recommendations influence future behavior, including which data points are observed and how user preferences change. However, experimenting in production systems with real user dynamics is often infeasible, and existing simulation-based approaches have limited scale. As a result, many state-of-the-art algorithms are designed to s…

    Submitted 6 November, 2020; originally announced November 2020.

  37. Muti-view Mouse Social Behaviour Recognition with Deep Graphical Model

    Authors: Zheheng Jiang, Feixiang Zhou, Aite Zhao, Xin Li, Ling Li, Dacheng Tao, Xuelong Li, Huiyu Zhou

    Abstract: Home-cage social behaviour analysis of mice is an invaluable tool to assess therapeutic efficacy of neurodegenerative diseases. Despite tremendous efforts made within the research community, single-camera video recordings are mainly used for such analysis. Because of the potential to create rich descriptions of mouse social behaviors, the use of multi-view video recordings for rodent observations…

    Submitted 30 June, 2021; v1 submitted 4 November, 2020; originally announced November 2020.

    Comments: 17 pages, 11 figures

  38. Optimizing Short-Time Fourier Transform Parameters via Gradient Descent

    Authors: An Zhao, Krishna Subramani, Paris Smaragdis

    Abstract: The Short-Time Fourier Transform (STFT) has been a staple of signal processing, often being the first step for many audio tasks. A very familiar process when using the STFT is the search for the best STFT parameters, as they often have significant side effects if chosen poorly. These parameters are often defined in terms of an integer number of samples, which makes their optimization non-trivial.…

    Submitted 18 February, 2021; v1 submitted 28 October, 2020; originally announced October 2020.

    Comments: Accepted for ICASSP 2021

  39. arXiv:2008.09697  [pdf, other]

    cs.CV

    Perceptual underwater image enhancement with deep learning and physical priors

    Authors: Long Chen, Zheheng Jiang, Lei Tong, Zhihua Liu, Aite Zhao, Qianni Zhang, Junyu Dong, Huiyu Zhou

    Abstract: Underwater image enhancement, as a pre-processing step to improve the accuracy of the following object detection task, has drawn considerable attention in the field of underwater navigation and ocean exploration. However, most of the existing underwater image enhancement strategies tend to consider enhancement and detection as two independent modules with no interaction, and the practice of separa…

    Submitted 26 September, 2020; v1 submitted 21 August, 2020; originally announced August 2020.

  40. arXiv:2006.09588  [pdf, ps, other]

    math.CO cs.DM

    The No-Flippancy Game

    Authors: Isha Agarwal, Matvey Borodin, Aidan Duncan, Kaylee Ji, Tanya Khovanova, Shane Lee, Boyan Litchev, Anshul Rastogi, Garima Rastogi, Andrew Zhao

    Abstract: We analyze a coin-based game with two players where, before starting the game, each player selects a string of length $n$ comprised of coin tosses. They alternate turns, choosing the outcome of a coin toss according to specific rules. As a result, the game is deterministic. The player whose string appears first wins. If neither player's string occurs, then the game must be infinite. We study sev…

    Submitted 16 June, 2020; originally announced June 2020.

    Comments: 16 pages

    MSC Class: 00A08; 91A05

  41. arXiv:2003.08626  [pdf, other]

    cs.CV

    Domain-Adaptive Few-Shot Learning

    Authors: An Zhao, Mingyu Ding, Zhiwu Lu, Tao Xiang, Yulei Niu, Jiechao Guan, Ji-Rong Wen, Ping Luo

    Abstract: Existing few-shot learning (FSL) methods make the implicit assumption that the few target class samples are from the same domain as the source class samples. However, in practice this assumption is often invalid -- the target classes could come from a different domain. This poses an additional challenge of domain adaptation (DA) with few training samples. In this paper, the problem of domain-adapt…

    Submitted 19 March, 2020; originally announced March 2020.

  42. arXiv:2001.01026  [pdf, other]

    cs.GR cs.CV

    Painting Many Pasts: Synthesizing Time Lapse Videos of Paintings

    Authors: Amy Zhao, Guha Balakrishnan, Kathleen M. Lewis, Frédo Durand, John V. Guttag, Adrian V. Dalca

    Abstract: We introduce a new video synthesis task: synthesizing time lapse videos depicting how a given painting might have been created. Artists paint using unique combinations of brushes, strokes, and colors. There are often many possible ways to create a given painting. Our goal is to learn to capture this rich range of possibilities. Creating distributions of long-term videos is a challenge for learni…

    Submitted 25 April, 2020; v1 submitted 3 January, 2020; originally announced January 2020.

    Comments: 10 pages, CVPR 2020

  43. arXiv:1912.02973  [pdf, other]

    cs.CV cs.LG cs.RO

    SAM: Squeeze-and-Mimic Networks for Conditional Visual Driving Policy Learning

    Authors: Albert Zhao, Tong He, Yitao Liang, Haibin Huang, Guy Van den Broeck, Stefano Soatto

    Abstract: We describe a policy learning approach to map visual inputs to driving controls conditioned on turning command that leverages side tasks on semantics and object affordances via a learned representation trained for driving. To learn this representation, we train a squeeze network to drive using annotations for the side task as input. This representation encodes the driving-relevant information asso…

    Submitted 19 November, 2020; v1 submitted 5 December, 2019; originally announced December 2019.

    Comments: Conference on Robot Learning (CoRL) 2020

  44. arXiv:1909.00475  [pdf, other]

    cs.CV

    Visual Deprojection: Probabilistic Recovery of Collapsed Dimensions

    Authors: Guha Balakrishnan, Adrian V. Dalca, Amy Zhao, John V. Guttag, Fredo Durand, William T. Freeman

    Abstract: We introduce visual deprojection: the task of recovering an image or video that has been collapsed along a dimension. Projections arise in various contexts, such as long-exposure photography, where a dynamic scene is collapsed in time to produce a motion-blurred image, and corner cameras, where reflected light from a scene is collapsed along a spatial dimension because of an edge occluder to yield…

    Submitted 1 September, 2019; originally announced September 2019.

    Comments: ICCV 2019

  45. arXiv:1904.03815  [pdf, other]

    cs.RO

    Quasi-Direct Drive for Low-Cost Compliant Robotic Manipulation

    Authors: David V. Gealy, Stephen McKinley, Brent Yi, Philipp Wu, Phillip R. Downey, Greg Balke, Allan Zhao, Menglong Guo, Rachel Thomasson, Anthony Sinclair, Peter Cuellar, Zoe McCarthy, Pieter Abbeel

    Abstract: Robots must cost less and be force-controlled to enable widespread, safe deployment in unconstrained human environments. We propose Quasi-Direct Drive actuation as a capable paradigm for robotic force-controlled manipulation in human environments at low-cost. Our prototype - Blue - is a human scale 7 Degree of Freedom arm with 2kg payload. Blue can cost less than $5000. We show that Blue has dynam…

    Submitted 11 April, 2019; v1 submitted 7 April, 2019; originally announced April 2019.

    Comments: This is our long version - 8 pages. Our 6 page version without a discussion of thermal limits was accepted to ICRA 2019. 11 Figures

  46. arXiv:1902.09383  [pdf, other]

    cs.CV

    Data augmentation using learned transformations for one-shot medical image segmentation

    Authors: Amy Zhao, Guha Balakrishnan, Frédo Durand, John V. Guttag, Adrian V. Dalca

    Abstract: Image segmentation is an important task in many medical applications. Methods based on convolutional neural networks attain state-of-the-art accuracy; however, they typically rely on supervised training with large labeled datasets. Labeling medical images requires significant expertise and time, and typical hand-tuned approaches for data augmentation fail to capture the complex variations in such…

    Submitted 6 April, 2019; v1 submitted 25 February, 2019; originally announced February 2019.

    Comments: 9 pages, CVPR 2019

  47. arXiv:1812.04429  [pdf, other]

    cs.CV

    Face-Focused Cross-Stream Network for Deception Detection in Videos

    Authors: Mingyu Ding, An Zhao, Zhiwu Lu, Tao Xiang, Ji-Rong Wen

    Abstract: Automated deception detection (ADD) from real-life videos is a challenging task. It specifically needs to address two problems: (1) Both face and body contain useful cues regarding whether a subject is deceptive. How to effectively fuse the two is thus key to the effectiveness of an ADD model. (2) Real-life deceptive samples are hard to collect; learning with limited training data thus challenges…

    Submitted 11 December, 2018; originally announced December 2018.

  48. arXiv:1810.08332  [pdf, other]

    cs.CV cs.LG

    Zero and Few Shot Learning with Semantic Feature Synthesis and Competitive Learning

    Authors: Zhiwu Lu, Jiechao Guan, Aoxue Li, Tao Xiang, An Zhao, Ji-Rong Wen

    Abstract: Zero-shot learning (ZSL) is made possible by learning a projection function between a feature space and a semantic space (e.g., an attribute space). Key to ZSL is thus to learn a projection that is robust against the often large domain gap between the seen and unseen class domains. In this work, this is achieved by unseen class data synthesis and robust projection function learning. Specifically,…

    Submitted 18 October, 2018; originally announced October 2018.

    Comments: Submitted to IEEE TPAMI

  49. arXiv:1810.08326  [pdf, other]

    cs.CV cs.LG

    Domain-Invariant Projection Learning for Zero-Shot Recognition

    Authors: An Zhao, Mingyu Ding, Jiechao Guan, Zhiwu Lu, Tao Xiang, Ji-Rong Wen

    Abstract: Zero-shot learning (ZSL) aims to recognize unseen object classes without any training samples, which can be regarded as a form of transfer learning from seen classes to unseen ones. This is made possible by learning a projection between a feature space and a semantic space (e.g. attribute space). Key to ZSL is thus to learn a projection function that is robust against the often large domain gap be…

    Submitted 18 October, 2018; originally announced October 2018.

    Comments: Accepted to NIPS 2018

  50. VoxelMorph: A Learning Framework for Deformable Medical Image Registration

    Authors: Guha Balakrishnan, Amy Zhao, Mert R. Sabuncu, John Guttag, Adrian V. Dalca

    Abstract: We present VoxelMorph, a fast learning-based framework for deformable, pairwise medical image registration. Traditional registration methods optimize an objective function for each pair of images, which can be time-consuming for large datasets or rich deformation models. In contrast to this approach, and building on recent learning-based methods, we formulate registration as a function that maps a…

    Submitted 1 September, 2019; v1 submitted 13 September, 2018; originally announced September 2018.

    Comments: Accepted to IEEE TMI ( (c) IEEE). This manuscript expands the CVPR 2018 paper (arXiv:1802.02604) by introducing an auxiliary model that uses segmentation maps during training, an amortized optimization analysis, and extensive model analysis. Code available at http://voxelmorph.csail.mit.edu