Showing 1–50 of 73 results for author: Xiao, K

  1. arXiv:2406.11733  [pdf, other]

    stat.ML cs.LG

    A Clipped Trip: the Dynamics of SGD with Gradient Clipping in High-Dimensions

    Authors: Noah Marshall, Ke Liang Xiao, Atish Agarwala, Elliot Paquette

    Abstract: The success of modern machine learning is due in part to the adaptive optimization methods that have been developed to deal with the difficulties of training large models over complex datasets. One such method is gradient clipping: a practical procedure with limited theoretical underpinnings. In this work, we study clipping in a least squares problem under streaming SGD. We develop a theoretical a…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2406.04680  [pdf, other]

    eess.IV cs.CV

    MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome

    Authors: Yixin Huang, Yiqi **, Ke Tao, Kaijian Xia, Jianfeng Gu, Lei Yu, Lan Du, Cunjian Chen

    Abstract: May-Thurner Syndrome (MTS), also known as iliac vein compression syndrome or Cockett's syndrome, is a condition potentially impacting over 20 percent of the population, leading to an increased risk of iliofemoral deep venous thrombosis. In this paper, we present a 3D-based deep learning approach called MTS-Net for diagnosing May-Thurner Syndrome using CT scans. To effectively capture the spatial-t…

    Submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2406.00143  [pdf, other]

    cs.CV

    Diversifying Query: Region-Guided Transformer for Temporal Sentence Grounding

    Authors: Xiaolong Sun, Liushuai Shi, Le Wang, Sanping Zhou, Kun Xia, Yabing Wang, Gang Hua

    Abstract: Temporal sentence grounding is a challenging task that aims to localize the moment spans relevant to a language description. Although recent DETR-based models have achieved notable progress by leveraging multiple learnable moment queries, they suffer from overlapped and redundant proposals, leading to inaccurate predictions. We attribute this limitation to the lack of task-related guidance for the…

    Submitted 31 May, 2024; originally announced June 2024.

  4. arXiv:2405.16003  [pdf, other]

    cs.AI cs.CY cs.LG

    Disentangling Heterogeneous Knowledge Concept Embedding for Cognitive Diagnosis on Untested Knowledge

    Authors: Kui Xiao, Runtian Xing, Miao Zhang, Shunfeng Tan, Ziming Wang, Xiaolian Zhu

    Abstract: Cognitive diagnosis is a fundamental and critical task in learning assessment, which aims to infer students' proficiency on knowledge concepts from their response logs. Current works assume each knowledge concept will certainly be tested and covered by multiple exercises. However, whether online or offline courses, it's hardly feasible to completely cover all knowledge concepts in several exercise…

    Submitted 24 May, 2024; originally announced May 2024.

  5. arXiv:2405.04918  [pdf, other]

    cs.CV cs.AI

    Delve into Base-Novel Confusion: Redundancy Exploration for Few-Shot Class-Incremental Learning

    Authors: Haichen Zhou, Yixiong Zou, Ruixuan Li, Yuhua Li, Kui Xiao

    Abstract: Few-shot class-incremental learning (FSCIL) aims to acquire knowledge from novel classes with limited samples while retaining information about base classes. Existing methods address catastrophic forgetting and overfitting by freezing the feature extractor during novel-class learning. However, these methods usually tend to cause the confusion between base and novel classes, i.e., classifying novel…

    Submitted 8 May, 2024; originally announced May 2024.

  6. arXiv:2404.16416  [pdf, other]

    cs.CV

    Learning Discriminative Spatio-temporal Representations for Semi-supervised Action Recognition

    Authors: Yu Wang, Sanping Zhou, Kun Xia, Le Wang

    Abstract: Semi-supervised action recognition aims to improve spatio-temporal reasoning ability with a few labeled data in conjunction with a large amount of unlabeled data. Albeit recent advancements, existing powerful methods are still prone to making ambiguous predictions under scarce labeled data, embodied as the limitation of distinguishing different actions with similar spatio-temporal information. In…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 10 pages, 6 figures, 6 tables, 56 conferences

    MSC Class: 68U10; 68T45 ACM Class: I.2.10

  7. arXiv:2404.13208  [pdf, other]

    cs.CR cs.CL cs.LG

    The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

    Authors: Eric Wallace, Kai Xiao, Reimar Leike, Lilian Weng, Johannes Heidecke, Alex Beutel

    Abstract: Today's LLMs are susceptible to prompt injections, jailbreaks, and other attacks that allow adversaries to overwrite a model's original instructions with their own malicious prompts. In this work, we argue that one of the primary vulnerabilities underlying these attacks is that LLMs often consider system prompts (e.g., text from an application developer) to be the same priority as text from untrus…

    Submitted 19 April, 2024; originally announced April 2024.

  8. arXiv:2404.12242  [pdf, other]

    cs.CL

    CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News

    Authors: Mengna Zhu, Zijie Xu, Kaisheng Zeng, Kaiming Xiao, Mao Wang, Wenjun Ke, Hongbin Huang

    Abstract: Extracting structured event knowledge, including event triggers and corresponding arguments, from military texts is fundamental to many applications, such as intelligence analysis and decision assistance. However, event extraction in the military field faces the data scarcity problem, which impedes the research of event extraction models in this domain. To alleviate this problem, we propose CMNEE,…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 13 pages, 7 figures, accepted to LREC-COLING 2024

  9. arXiv:2404.01319  [pdf, other]

    cs.SI cs.AI cs.CY

    Information Cascade Prediction under Public Emergencies: A Survey

    Authors: Qi Zhang, Guang Wang, Li Lin, Kaiwen Xia, Shuai Wang

    Abstract: With the advent of the era of big data, massive information, expert experience, and high-accuracy models bring great opportunities to the information cascade prediction of public emergencies. However, the involvement of specialist knowledge from various disciplines has resulted in a primarily application-specific focus (e.g., earthquakes, floods, infectious diseases) for information cascade predic…

    Submitted 16 May, 2024; v1 submitted 27 March, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2007.09815 by other authors

  10. arXiv:2403.11189  [pdf, other]

    cs.CV

    Boosting Semi-Supervised Temporal Action Localization by Learning from Non-Target Classes

    Authors: Kun Xia, Le Wang, Sanping Zhou, Gang Hua, Wei Tang

    Abstract: The crux of semi-supervised temporal action localization (SS-TAL) lies in excavating valuable information from abundant unlabeled videos. However, current approaches predominantly focus on building models that are robust to the error-prone target class (i.e., the predicted class with the highest confidence) while ignoring informative semantics within non-target classes. This paper approaches SS-TAL…

    Submitted 17 March, 2024; originally announced March 2024.

  11. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  12. arXiv:2402.01864  [pdf, other]

    cs.CY cs.AI

    (A)I Am Not a Lawyer, But...: Engaging Legal Experts towards Responsible LLM Policies for Legal Advice

    Authors: Inyoung Cheong, King Xia, K. J. Kevin Feng, Quan Ze Chen, Amy X. Zhang

    Abstract: Large language models (LLMs) are increasingly capable of providing users with advice in a wide range of professional domains, including legal advice. However, relying on LLMs for legal queries raises concerns due to the significant expertise required and the potential real-world consequences of the advice. To explore when and why LLMs should or should not provide advice to users,…

    Submitted 3 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 14 pages

  13. arXiv:2401.02602  [pdf, other]

    cs.LG cs.AI

    Neural Causal Abstractions

    Authors: Kevin Xia, Elias Bareinboim

    Abstract: The abilities of humans to understand the world in terms of cause and effect relationships, as well as to compress information into abstract concepts, are two hallmark features of human intelligence. These two topics have been studied in tandem in the literature under the rubric of causal abstractions theory. In practice, it remains an open problem how to best leverage abstraction theory in real-w…

    Submitted 22 February, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

    Comments: 48 total pages, 20 figures, short version accepted to AAAI-24

  14. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  15. arXiv:2311.17311  [pdf, other]

    cs.CL cs.AI

    Universal Self-Consistency for Large Language Model Generation

    Authors: Xinyun Chen, Renat Aksitov, Uri Alon, Jie Ren, Kefan Xiao, Pengcheng Yin, Sushant Prakash, Charles Sutton, Xuezhi Wang, Denny Zhou

    Abstract: Self-consistency with chain-of-thought prompting (CoT) has demonstrated remarkable performance gains on various challenging tasks, by utilizing multiple reasoning paths sampled from large language models (LLMs). However, self-consistency relies on the answer extraction process to aggregate multiple solutions, which is not applicable to free-form answers. In this work, we propose Universal Self-Con…

    Submitted 28 November, 2023; originally announced November 2023.

  16. arXiv:2311.11086  [pdf]

    eess.IV cs.CV

    LightBTSeg: A lightweight breast tumor segmentation model using ultrasound images via dual-path joint knowledge distillation

    Authors: Hongjiang Guo, Shengwen Wang, Hao Dang, Kangle Xiao, Yaru Yang, Wenpei Liu, Tongtong Liu, Yiying Wan

    Abstract: The accurate segmentation of breast tumors is an important prerequisite for lesion detection, which has significant clinical value for breast tumor research. The mainstream deep learning-based methods have achieved a breakthrough. However, these high-performance segmentation methods are formidable to implement in clinical scenarios since they always embrace high computation complexity, massive par…

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: 7 pages, 7 figures, conference

  17. arXiv:2311.10934  [pdf, other]

    cs.AI cs.CY cs.HC

    Case Repositories: Towards Case-Based Reasoning for AI Alignment

    Authors: K. J. Kevin Feng, Quan Ze Chen, Inyoung Cheong, King Xia, Amy X. Zhang

    Abstract: Case studies commonly form the pedagogical backbone in law, ethics, and many other domains that face complex and ambiguous societal questions informed by human values. Similar complexities and ambiguities arise when we consider how AI should be aligned in practice: when faced with vast quantities of diverse (and sometimes conflicting) values from different individuals and communities, with whose v…

    Submitted 26 November, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: MP2 workshop @ NeurIPS 2023

  18. arXiv:2310.16401  [pdf, other]

    cs.LG

    Graph Neural Networks with a Distribution of Parametrized Graphs

    Authors: See Hian Lee, Feng Ji, Kelin Xia, Wee Peng Tay

    Abstract: Traditionally, graph neural networks have been trained using a single observed graph. However, the observed graph represents only one possible realization. In many applications, the graph may encounter uncertainties, such as having erroneous or missing edges, as well as edge weights that provide little informative value. To address these challenges and capture additional information previously abs…

    Submitted 2 February, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

  19. Machine Learning for Automated Mitral Regurgitation Detection from Cardiac Imaging

    Authors: Ke Xiao, Erik Learned-Miller, Evangelos Kalogerakis, James Priest, Madalina Fiterau

    Abstract: Mitral regurgitation (MR) is a heart valve disease with potentially fatal consequences that can only be forestalled through timely diagnosis and treatment. Traditional diagnosis methods are expensive, labor-intensive and require clinical expertise, posing a barrier to screening for MR. To overcome this impediment, we propose a new semi-supervised model for MR classification called CUSSP. CUSSP ope…

    Submitted 7 October, 2023; originally announced October 2023.

    Comments: 12 pages including references and the appendix. 9 Figures, 2 tables. Accepted at MICCAI (Machine Learning for Automated Mitral Regurgitation Detection from Cardiac Imaging) 2023, Link to Springer at https://link.springer.com/chapter/10.1007/978-3-031-43990-2_23

    ACM Class: I.4.0; I.2.10

    Journal ref: In: Medical Image Computing and Computer Assisted Intervention - MICCAI 2023. pp. 236-246 (2023)

  20. arXiv:2307.04349  [pdf, other]

    cs.AI cs.CL cs.LG

    RLTF: Reinforcement Learning from Unit Test Feedback

    Authors: Jiate Liu, Yiqin Zhu, Kaiwen Xiao, Qiang Fu, Xiao Han, Wei Yang, Deheng Ye

    Abstract: The goal of program synthesis, or code generation, is to generate executable code based on given descriptions. Recently, there has been an increasing number of studies employing reinforcement learning (RL) to improve the performance of large language models (LLMs) for code. However, current representative works either rely solely on offline frameworks, limiting the exploration of new sample spaces…

    Submitted 12 November, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: Accepted by TMLR

  21. arXiv:2306.17733  [pdf, other]

    cs.CL cs.AI

    Token-Event-Role Structure-based Multi-Channel Document-Level Event Extraction

    Authors: Qizhi Wan, Changxuan Wan, Keli Xiao, Hui Xiong, Dexi Liu, Xiping Liu

    Abstract: Document-level event extraction is a long-standing challenging information retrieval problem involving a sequence of sub-tasks: entity extraction, event type judgment, and event type-specific multi-event extraction. However, addressing the problem as multiple learning tasks leads to increased model complexity. Also, existing methods insufficiently utilize the correlation of entities crossing diffe…

    Submitted 30 June, 2023; originally announced June 2023.

  22. arXiv:2306.15065  [pdf, other]

    physics.comp-ph cs.AI cs.LG

    Molecular geometric deep learning

    Authors: Cong Shen, Jiawei Luo, Kelin Xia

    Abstract: Geometric deep learning (GDL) has demonstrated huge power and enormous potential in molecular data analysis. However, a great challenge still remains for highly efficient molecular representations. Currently, covalent-bond-based molecular graphs are the de facto standard for representing molecular topology at the atomic level. Here we demonstrate, for the first time, that molecular graphs construc…

    Submitted 22 June, 2023; originally announced June 2023.

  23. arXiv:2306.13699  [pdf, other]

    q-bio.QM cs.AI cs.LG q-bio.BM

    Curvature-enhanced Graph Convolutional Network for Biomolecular Interaction Prediction

    Authors: Cong Shen, Pingjian Ding, Junjie Wee, Jialin Bi, Jiawei Luo, Kelin Xia

    Abstract: Geometric deep learning has demonstrated a great potential in non-Euclidean data analysis. The incorporation of geometric insights into learning architecture is vital to its success. Here we propose a curvature-enhanced graph convolutional network (CGCN) for biomolecular interaction prediction, for the first time. Our CGCN employs Ollivier-Ricci curvature (ORC) to characterize network local struct…

    Submitted 23 June, 2023; originally announced June 2023.

  24. arXiv:2306.13541  [pdf, other]

    cs.LG cs.AI

    Torsion Graph Neural Networks

    Authors: Cong Shen, Xiang Liu, Jiawei Luo, Kelin Xia

    Abstract: Geometric deep learning (GDL) models have demonstrated a great potential for the analysis of non-Euclidean data. They are developed to incorporate the geometric and topological information of non-Euclidean data into the end-to-end deep learning architectures. Motivated by the recent success of discrete Ricci curvature in graph neural networks (GNNs), we propose TorGNN, an analytic Torsion enhanced…

    Submitted 23 June, 2023; originally announced June 2023.

  25. arXiv:2305.10403  [pdf, other]

    cs.CL cs.AI

    PaLM 2 Technical Report

    Authors: Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, et al. (103 additional authors not shown)

    Abstract: We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on…

    Submitted 13 September, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

  26. Quantifying and Defending against Privacy Threats on Federated Knowledge Graph Embedding

    Authors: Yuke Hu, Wei Liang, Ruofan Wu, Kai Xiao, Weiqiang Wang, Xiaochen Li, Jinfei Liu, Zhan Qin

    Abstract: Knowledge Graph Embedding (KGE) is a fundamental technique that extracts expressive representation from knowledge graph (KG) to facilitate diverse downstream tasks. The emerging federated KGE (FKGE) collaboratively trains from distributed KGs held among clients while avoiding exchanging clients' sensitive raw KGs, which can still suffer from privacy threats as evidenced in other federated model tr…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: Accepted in the ACM Web Conference (WWW 2023)

  27. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  28. arXiv:2302.01973  [pdf, other]

    cs.LG cs.CL cs.PL

    Measuring The Impact Of Programming Language Distribution

    Authors: Gabriel Orlanski, Kefan Xiao, Xavier Garcia, Jeffrey Hui, Joshua Howland, Jonathan Malmaud, Jacob Austin, Rishabh Singh, Michele Catasta

    Abstract: Current benchmarks for evaluating neural code models focus on only a small subset of programming languages, excluding many popular languages such as Go or Rust. To ameliorate this issue, we present the BabelCode framework for execution-based evaluation of any benchmark in any language. BabelCode enables new investigations into the qualitative performance of models' memory, runtime, and individual…

    Submitted 24 May, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: Accepted to ICML 2023, Code and data release: https://github.com/google-research/babelcode

  29. arXiv:2212.09248  [pdf, other]

    cs.CL cs.SE

    Natural Language to Code Generation in Interactive Data Science Notebooks

    Authors: Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Alex Polozov, Charles Sutton

    Abstract: Computational notebooks, such as Jupyter notebooks, are interactive computing environments that are ubiquitous among data scientists to perform data wrangling and analytic tasks. To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: 46 pages. 32 figures

  30. arXiv:2211.05102  [pdf, other]

    cs.LG cs.CL

    Efficiently Scaling Transformer Inference

    Authors: Reiner Pope, Sholto Douglas, Aakanksha Chowdhery, Jacob Devlin, James Bradbury, Anselm Levskaya, Jonathan Heek, Kefan Xiao, Shivani Agrawal, Jeff Dean

    Abstract: We study the problem of efficient generative inference for Transformer models, in one of its most challenging settings: large deep models, with tight latency targets and long sequence lengths. Better understanding of the engineering tradeoffs for inference for large Transformer-based models is important as use cases of these models are growing rapidly throughout application areas. We develop a sim…

    Submitted 9 November, 2022; originally announced November 2022.

  31. arXiv:2211.00007  [pdf, other]

    cs.NI

    Quality-Cost Trade-off on Constructing Logical Views for Vehicular Cyber-Physical Systems: A Deep Reinforcement Learning Approach

    Authors: Junyuan Wu, Xincao Xu, Chuzhao Li, Hao Zhang, Ke Xiao, Kai Liu

    Abstract: With the development of sensing technologies, vehicle-to-everything (V2X) communications, edge computing paradigm, vehicular cyber-physical systems (VCPS) are emerging as the most fundamental platform for realizing future intelligent transportation systems (ITSs). In particular, the construction of logical views at the edge nodes based on heterogeneous information sensing and uploading are critica…

    Submitted 19 September, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2209.12265, arXiv:2210.17386

  32. arXiv:2210.00035  [pdf, other]

    cs.LG

    Neural Causal Models for Counterfactual Identification and Estimation

    Authors: Kevin Xia, Yushu Pan, Elias Bareinboim

    Abstract: Evaluating hypothetical statements about how the world would be had a different course of action been taken is arguably one key capability expected from modern AI systems. Counterfactual reasoning underpins discussions in fairness, the determination of blame and responsibility, credit assignment, and regret. In this paper, we study the evaluation of counterfactual statements through neural models.…

    Submitted 30 September, 2022; originally announced October 2022.

    Comments: 10 pages main body, 57 pages total, 23 figures

  33. Age of View: A New Metric for Evaluating Heterogeneous Information Fusion in Vehicular Cyber-Physical Systems

    Authors: Xincao Xu, Kai Liu, Qisen Zhang, Hao Jiang, Ke Xiao, Jiangtao Luo

    Abstract: Heterogeneous information fusion is one of the most critical issues for realizing vehicular cyber-physical systems (VCPSs). This work makes the first attempt at quantitatively measuring the quality of heterogeneous information fusion in VCPS by designing a new metric called Age of View (AoV). Specifically, we derive a sensing model based on a multi-class M/G/1 priority queue and a transmission mod…

    Submitted 31 July, 2022; originally announced August 2022.

    Journal ref: 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), 2022, pp. 3762-3767

  34. arXiv:2206.11493  [pdf, other]

    cs.CV

    Learning to Refactor Action and Co-occurrence Features for Temporal Action Localization

    Authors: Kun Xia, Le Wang, Sanping Zhou, Nanning Zheng, Wei Tang

    Abstract: The main challenge of Temporal Action Localization is to retrieve subtle human actions from various co-occurring ingredients, e.g., context and background, in an untrimmed video. While prior approaches have achieved substantial progress through devising advanced action detectors, they still suffer from these co-occurring ingredients which often dominate the actual action content in videos. In this…

    Submitted 23 June, 2022; originally announced June 2022.

    Comments: Accepted by CVPR 2022

  35. arXiv:2205.07601  [pdf]

    cond-mat.mtrl-sci cs.CG physics.app-ph

    Volumetric-mapping-based inverse design of 3D architected materials and mobility control by topology reconstruction

    Authors: Kai Xiao, Xiang Zhou, Jaehyung Ju

    Abstract: The recent development of modular origami structures has ushered in a new era for active metamaterials with multiple degrees of freedom (multi-DOF). Notably, no systematic inverse design approach for volumetric modular origami structures has been reported. Moreover, very few topologies of modular origami have been studied for the design of active metamaterials with multi-DOF. Herein, we develop an…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: 36 pages

    Journal ref: Nat Commun 13, 7474 (2022)

  36. arXiv:2204.12363  [pdf, other]

    cs.CV

    Causal Transportability for Visual Recognition

    Authors: Chengzhi Mao, Kevin Xia, James Wang, Hao Wang, Junfeng Yang, Elias Bareinboim, Carl Vondrick

    Abstract: Visual representations underlie object recognition tasks, but they often contain both robust and non-robust features. Our main observation is that image classifiers may perform poorly on out-of-distribution samples because spurious correlations between non-robust features and labels can be changed in a new environment. By analyzing procedures for out-of-distribution generalization with a causal gr…

    Submitted 26 April, 2022; originally announced April 2022.

  37. arXiv:2203.13093  [pdf]

    cs.CV

    A Preliminary Research on Space Situational Awareness Based on Event Cameras

    Authors: Kun Xiao, Pengju Li, Guohui Wang, Zhi Li, Yi Chen, Yongfeng Xie, Yuqiang Fang

    Abstract: Event camera is a new type of sensor that is different from traditional cameras. Each pixel is triggered asynchronously by an event. The trigger event is the change of the brightness irradiated on the pixel. If the increment or decrement is higher than a certain threshold, the event is output. Compared with traditional cameras, event cameras have the advantages of high temporal resolution, low lat…

    Submitted 24 March, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

  38. arXiv:2203.12270  [pdf]

    cs.CV

    Event-Based Dense Reconstruction Pipeline

    Authors: Kun Xiao, Guohui Wang, Yi Chen, **ghong Nan, Yongfeng Xie

    Abstract: Event cameras are a new type of sensors that are different from traditional cameras. Each pixel is triggered asynchronously by event. The trigger event is the change of the brightness irradiated on the pixel. If the increment or decrement of brightness is higher than a certain threshold, an event is output. Compared with traditional cameras, event cameras have the advantages of high dynamic range…

    Submitted 23 March, 2022; originally announced March 2022.

  39. Repurposing Existing Deep Networks for Caption and Aesthetic-Guided Image Cropping

    Authors: Nora Horanyi, Kedi Xia, Kwang Moo Yi, Abhishake Kumar Bojja, Ales Leonardis, Hyung Jin Chang

    Abstract: We propose a novel optimization framework that crops a given image based on user description and aesthetics. Unlike existing image cropping methods, where one typically trains a deep network to regress to crop parameters or cropping actions, we propose to directly optimize for the cropping parameters by repurposing pre-trained networks on image captioning and aesthetic tasks, without any fine-tuni…

    Submitted 6 January, 2022; originally announced January 2022.

    Journal ref: Pattern Recognition, 2022, 108485, ISSN 0031-3203

  40. arXiv:2112.15329  [pdf, other]

    cs.LG cs.CV

    On Distinctive Properties of Universal Perturbations

    Authors: Sung Min Park, Kuo-An Wei, Kai Xiao, Jerry Li, Aleksander Madry

    Abstract: We identify properties of universal adversarial perturbations (UAPs) that distinguish them from standard adversarial perturbations. Specifically, we show that targeted UAPs generated by projected gradient descent exhibit two human-aligned properties: semantic locality and spatial invariance, which standard targeted adversarial perturbations lack. We also demonstrate that UAPs contain significantly…

    Submitted 31 December, 2021; originally announced December 2021.

  41. arXiv:2112.02353  [pdf, other]

    cs.CV cs.LG

    Label Hierarchy Transition: Delving into Class Hierarchies to Enhance Deep Classifiers

    Authors: Renzhen Wang, De cai, Kaiwen Xiao, Xixi Jia, Xiao Han, Deyu Meng

    Abstract: Hierarchical classification aims to sort the object into a hierarchical structure of categories. For example, a bird can be categorized according to a three-level hierarchy of order, family, and species. Existing methods commonly address hierarchical classification by decoupling it into a series of multi-class classification tasks. However, such a multi-task learning strategy fails to fully exploi…

    Submitted 31 October, 2023; v1 submitted 4 December, 2021; originally announced December 2021.

  42. arXiv:2112.00427  [pdf]

    cs.RO cs.AI

    Research on Event Accumulator Settings for Event-Based SLAM

    Authors: Kun Xiao, Guohui Wang, Yi Chen, Yongfeng Xie, Hong Li, Sen Li

    Abstract: Event cameras are a new type of sensors that are different from traditional cameras. Each pixel is triggered asynchronously by event. The trigger event is the change of the brightness irradiated on the pixel. If the increment or decrement of brightness is higher than a certain threshold, an event is output. Compared with traditional cameras, event cameras have the advantages of high dynamic range…

    Submitted 8 February, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: arXiv admin note: text overlap with arXiv:2008.05749 by other authors

  43. arXiv:2111.11652  [pdf, other]

    cs.LG cs.CV

    CoDiM: Learning with Noisy Labels via Contrastive Semi-Supervised Learning

    Authors: Xin Zhang, Zixuan Liu, Kaiwen Xiao, Tian Shen, Junzhou Huang, Wei Yang, Dimitris Samaras, Xiao Han

    Abstract: Labels are costly and sometimes unreliable. Noisy label learning, semi-supervised learning, and contrastive learning are three different strategies for designing learning processes requiring less annotation cost. Semi-supervised learning and contrastive learning have been recently demonstrated to improve learning strategies that address datasets with noisy labels. Still, the inner connections betw…

    Submitted 22 November, 2021; originally announced November 2021.

    Comments: 19 Pages, 9 figures, conference paper

  44. arXiv:2111.09124  [pdf, other]

    cs.LG cs.AI

    Route Optimization via Environment-Aware Deep Network and Reinforcement Learning

    Authors: Pengzhan Guo, Keli Xiao, Zeyang Ye, Wei Zhu

    Abstract: Vehicle mobility optimization in urban areas is a long-standing problem in smart city and spatial data analysis. Given the complex urban scenario and unpredictable social events, our work focuses on developing a mobile sequential recommendation system to maximize the profitability of vehicle service providers (e.g., taxi drivers). In particular, we treat the dynamic route optimization problem as a…

    Submitted 15 November, 2021; originally announced November 2021.

  45. arXiv:2110.14937  [pdf, other]

    cs.LG cs.DC cs.NI

    Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT

    Authors: Shunpu Tang, Lunyuan Chen, Ke He, Junjuan Xia, Lisheng Fan, Arumugam Nallanathan

    Abstract: In this paper, we investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks. In this system, the IoT devices can collaboratively train a shared model without compromising data privacy. However, due to limited resources in the industrial IoT networks, including computational power, bandwidth, and channel state, it is challenging for many dev…

    Submitted 28 October, 2021; originally announced October 2021.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version will be superseded

  46. arXiv:2109.00756  [pdf]

    physics.geo-ph cs.CV

    Learning 3D Mineral Prospectivity from 3D Geological Models Using Convolutional Neural Networks: Application to a Structure-controlled Hydrothermal Gold Deposit

    Authors: Hao Deng, Yang Zheng, Jin Chen, Shuyan Yu, Keyan Xiao, Xiancheng Mao

    Abstract: The three-dimensional (3D) geological models are the typical and key data source in the 3D mineral prospectivity modeling. Identifying prospectivity-informative predictor variables from the 3D geological models is a challenging and tedious task. Motivated by the ability of convolutional neural networks (CNNs) to learn the intrinsic features, in this paper, we present a novel method that leverages…

    Submitted 14 January, 2022; v1 submitted 2 September, 2021; originally announced September 2021.

  47. arXiv:2107.01326  [pdf, other]

    cs.LG cs.AI

    SHORING: Design Provable Conditional High-Order Interaction Network via Symbolic Testing

    Authors: Hui Li, Xing Fu, Ruofan Wu, Jinyu Xu, Kai Xiao, Xiaofu Chang, Weiqiang Wang, Shuai Chen, Leilei Shi, Tao Xiong, Yuan Qi

    Abstract: Deep learning provides a promising way to extract effective representations from raw data in an end-to-end fashion and has proven its effectiveness in various domains such as computer vision, natural language processing, etc. However, in domains such as content/product recommendation and risk management, where sequence of event data is the most used raw data form and experts derived features are m…

    Submitted 2 July, 2021; originally announced July 2021.

    Comments: 18 pages, 4 figures

  48. arXiv:2107.00793  [pdf, other]

    cs.LG cs.AI

    The Causal-Neural Connection: Expressiveness, Learnability, and Inference

    Authors: Kevin Xia, Kai-Zhan Lee, Yoshua Bengio, Elias Bareinboim

    Abstract: One of the central elements of any causal inference is an object called structural causal model (SCM), which represents a collection of mechanisms and exogenous sources of random variation of the system under investigation (Pearl, 2000). An important property of many kinds of neural networks is universal approximability: the ability to approximate any function to arbitrary precision. Given this pr…

    Submitted 3 October, 2022; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: 10 pages main body (53 total pages with references and appendix), 5 figures in main body (20 total figures including appendix)

  49. arXiv:2106.03805  [pdf, other]

    cs.CV cs.LG stat.ML

    3DB: A Framework for Debugging Computer Vision Models

    Authors: Guillaume Leclerc, Hadi Salman, Andrew Ilyas, Sai Vemprala, Logan Engstrom, Vibhav Vineet, Kai Xiao, Pengchuan Zhang, Shibani Santurkar, Greg Yang, Ashish Kapoor, Aleksander Madry

    Abstract: We introduce 3DB: an extendable, unified framework for testing and debugging vision models using photorealistic simulation. We demonstrate, through a wide range of use cases, that 3DB allows users to discover vulnerabilities in computer vision systems and gain insights into how models make decisions. 3DB captures and generalizes many robustness analyses from prior work, and enables one to study th…

    Submitted 7 June, 2021; originally announced June 2021.

  50. arXiv:2104.12086  [pdf, other]

    cs.LG cs.AI cs.CR

    FedSup: A Communication-Efficient Federated Learning Fatigue Driving Behaviors Supervision Framework

    Authors: Chen Zhao, Zhipeng Gao, Qian Wang, Kaile Xiao, Zijia Mo, M. Jamal Deen

    Abstract: With the proliferation of edge smart devices and the Internet of Vehicles (IoV) technologies, intelligent fatigue detection has become one of the most-used methods in our daily driving. To improve the performance of the detection model, a series of techniques have been developed. However, existing work still leaves much to be desired, such as privacy disclosure and communication cost. To address t…

    Submitted 25 April, 2021; originally announced April 2021.