Showing 1–50 of 144 results for author: Hóu, Z

Searching in archive cs.
  1. arXiv:2407.03953  [pdf, other]

    cs.LG cs.SI

    Generalizing Graph Transformers Across Diverse Graphs and Tasks via Pre-Training on Industrial-Scale Data

    Authors: Yufei He, Zhenyu Hou, Yukuo Cen, Feng He, Xu Cheng, Bryan Hooi

    Abstract: Graph pre-training has been concentrated on graph-level on small graphs (e.g., molecular graphs) or learning node representations on a fixed graph. Extending graph pre-trained models to web-scale graphs with billions of nodes in industrial scenarios, while avoiding negative transfer across graphs or tasks, remains a challenge. We aim to develop a general graph pre-trained model with inductive abil…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Work in progress

  2. arXiv:2406.19749  [pdf, other]

    eess.IV cs.CV

    SPIRONet: Spatial-Frequency Learning and Topological Channel Interaction Network for Vessel Segmentation

    Authors: De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Bo-Xian Yao, Zeng-Guang Hou

    Abstract: Automatic vessel segmentation is paramount for developing next-generation interventional navigation systems. However, current approaches suffer from suboptimal segmentation performances due to significant challenges in intraoperative images (i.e., low signal-to-noise ratio, small or slender vessels, and strong interference). In this paper, a novel spatial-frequency learning and topological channel…

    Submitted 28 June, 2024; originally announced June 2024.

  3. arXiv:2406.18616  [pdf, other]

    cs.SE cs.AI cs.CL

    Towards Large Language Model Aided Program Refinement

    Authors: Yufan Cai, Zhe Hou, Xiaokun Luan, David Miguel Sanan Baena, Yun Lin, Jun Sun, Jin Song Dong

    Abstract: Program refinement involves correctness-preserving transformations from formal high-level specification statements into executable programs. Traditional verification tool support for program refinement is highly interactive and lacks automation. On the other hand, the emergence of large language models (LLMs) enables automatic code generations from informal natural language specifications. However…

    Submitted 26 June, 2024; originally announced June 2024.

    ACM Class: K.6.3

  4. arXiv:2406.16317  [pdf]

    cs.SD eess.AS

    SNR-Progressive Model with Harmonic Compensation for Low-SNR Speech Enhancement

    Authors: Zhongshu Hou, Qinwen Hu, Zhanzhong Cao, Ming Tang, Jing Lu

    Abstract: Despite significant progress made in the last decade, deep neural network (DNN) based speech enhancement (SE) still faces the challenge of notable degradation in the quality of recovered speech under low signal-to-noise ratio (SNR) conditions. In this letter, we propose an SNR-progressive speech enhancement model with harmonic compensation for low-SNR SE. Reliable pitch estimation is obtained from…

    Submitted 24 June, 2024; originally announced June 2024.

  5. arXiv:2406.12793  [pdf, other]

    cs.CL

    ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

    Authors: Team GLM, :, Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, et al. (32 additional authors not shown)

    Abstract: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained…

    Submitted 18 June, 2024; originally announced June 2024.

  6. arXiv:2406.06028  [pdf, other]

    cs.CV

    ReCon1M: A Large-scale Benchmark Dataset for Relation Comprehension in Remote Sensing Imagery

    Authors: Xian Sun, Qiwei Yan, Chubo Deng, Chenglong Liu, Yi Jiang, Zhongyan Hou, Wanxuan Lu, Fanglong Yao, Xiaoyu Liu, Lingxiang Hao, Hongfeng Yu

    Abstract: Scene Graph Generation (SGG) is a high-level visual understanding and reasoning task aimed at extracting entities (such as objects) and their interrelationships from images. Significant progress has been made in the study of SGG in natural images in recent years, but its exploration in the domain of remote sensing images remains very limited. The complex characteristics of remote sensing images ne…

    Submitted 10 June, 2024; originally announced June 2024.

  7. arXiv:2406.02953  [pdf, other]

    cs.LG

    GraphAlign: Pretraining One Graph Neural Network on Multiple Graphs via Feature Alignment

    Authors: Zhenyu Hou, Haozhan Li, Yukuo Cen, Jie Tang, Yuxiao Dong

    Abstract: Graph self-supervised learning (SSL) holds considerable promise for mining and learning with graph-structured data. Yet, a significant challenge in graph SSL lies in the feature discrepancy among graphs across different domains. In this work, we aim to pretrain one graph neural network (GNN) on a varied collection of graphs endowed with rich node features and subsequently apply the pretrained GNN…

    Submitted 5 June, 2024; originally announced June 2024.

  8. arXiv:2405.12868  [pdf, other]

    cs.LG cs.AI

    Equivariant Spatio-Temporal Attentive Graph Networks to Simulate Physical Dynamics

    Authors: Liming Wu, Zhichao Hou, Jirui Yuan, Yu Rong, Wenbing Huang

    Abstract: Learning to represent and simulate the dynamics of physical systems is a crucial yet challenging task. Existing equivariant Graph Neural Network (GNN) based methods have encapsulated the symmetry of physics, e.g., translations, rotations, etc., leading to better generalization ability. Nevertheless, their frame-to-frame formulation of the task overlooks the non-Markov property mainly incurre…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: The paper has been published to the conference of NeurIPS 2023

  9. arXiv:2405.08317  [pdf, other]

    cs.CL cs.SD eess.AS

    SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models

    Authors: Raghuveer Peri, Sai Muralidhar Jayanthi, Srikanth Ronanki, Anshu Bhatia, Karel Mundnich, Saket Dingliwal, Nilaksh Das, Zejiang Hou, Goeric Huybrechts, Srikanth Vishnubhotla, Daniel Garcia-Romero, Sundararajan Srinivasan, Kyu J Han, Katrin Kirchhoff

    Abstract: Integrated Speech and Large Language Models (SLMs) that can follow speech instructions and generate relevant text responses have gained popularity lately. However, the safety and robustness of these models remains largely unclear. In this work, we investigate the potential vulnerabilities of such instruction-following speech-language models to adversarial attacks and jailbreaking. Specifically, we…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 9+6 pages, Submitted to ACL 2024

  10. arXiv:2404.16852  [pdf, other]

    cs.LG cs.AI cs.CL eess.IV

    A Disease Labeler for Chinese Chest X-Ray Report Generation

    Authors: Mengwei Wang, Ruixin Yan, Zeyi Hou, Ning Lang, Xiuzhuang Zhou

    Abstract: In the field of medical image analysis, the scarcity of Chinese chest X-ray report datasets has hindered the development of technology for generating Chinese chest X-ray reports. On one hand, the construction of a Chinese chest X-ray report dataset is limited by the time-consuming and costly process of accurate expert disease annotation. On the other hand, a single natural language generation metr…

    Submitted 18 March, 2024; originally announced April 2024.

  11. arXiv:2404.15366  [pdf, other]

    eess.SP cs.LG

    A Weight-aware-based Multi-source Unsupervised Domain Adaptation Method for Human Motion Intention Recognition

    Authors: Xiao-Yin Liu, Guotao Li, Xiao-Hu Zhou, Xu Liang, Zeng-Guang Hou

    Abstract: Accurate recognition of human motion intention (HMI) is beneficial for exoskeleton robots to improve the wearing comfort level and achieve natural human-robot interaction. A classifier trained on labeled source subjects (domains) performs poorly on unlabeled target subject since the difference in individual motor characteristics. The unsupervised domain adaptation (UDA) method has become an effect…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 13 pages, 5 figures

  12. arXiv:2404.02893  [pdf, other]

    cs.CL

    ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline

    Authors: Yifan Xu, Xiao Liu, Xinghan Liu, Zhenyu Hou, Yueyan Li, Xiaohan Zhang, Zihan Wang, Aohan Zeng, Zhengxiao Du, Wenyi Zhao, Jie Tang, Yuxiao Dong

    Abstract: Large language models (LLMs) have shown excellent mastering of human language, but still struggle in real-world applications that require mathematical problem-solving. While many strategies and datasets to enhance LLMs' mathematics are developed, it remains a challenge to simultaneously maintain and improve both language and mathematical capabilities in deployed LLM systems. In this work, we tailor…

    Submitted 3 April, 2024; originally announced April 2024.

  13. arXiv:2404.01695  [pdf, other]

    cs.LG

    Selective Temporal Knowledge Graph Reasoning

    Authors: Zhongni Hou, Xiaolong Jin, Zixuan Li, Long Bai, Jiafeng Guo, Xueqi Cheng

    Abstract: Temporal Knowledge Graph (TKG), which characterizes temporally evolving facts in the form of (subject, relation, object, timestamp), has attracted much attention recently. TKG reasoning aims to predict future facts based on given historical ones. However, existing TKG reasoning models are unable to abstain from predictions they are uncertain, which will inevitably bring risks in real-world applica…

    Submitted 2 April, 2024; originally announced April 2024.

  14. arXiv:2404.00934  [pdf, other]

    cs.CL

    ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback

    Authors: Zhenyu Hou, Yilin Niu, Zhengxiao Du, Xiaohan Zhang, Xiao Liu, Aohan Zeng, Qinkai Zheng, Minlie Huang, Hongning Wang, Jie Tang, Yuxiao Dong

    Abstract: ChatGLM is a free-to-use AI service powered by the ChatGLM family of large language models (LLMs). In this paper, we present the ChatGLM-RLHF pipeline -- a reinforcement learning from human feedback (RLHF) system -- designed to enhance ChatGLM's alignment with human preferences. ChatGLM-RLHF encompasses three major components: the collection of human preference data, the training of the reward mod…

    Submitted 3 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  15. arXiv:2403.18208  [pdf, other]

    cs.CV cs.AI cs.NE

    An Evolutionary Network Architecture Search Framework with Adaptive Multimodal Fusion for Hand Gesture Recognition

    Authors: Yizhang Xia, Shihao Song, Zhanglu Hou, Junwen Xu, Juan Zou, Yuan Liu, Shengxiang Yang

    Abstract: Hand gesture recognition (HGR) based on multimodal data has attracted considerable attention owing to its great potential in applications. Various manually designed multimodal deep networks have performed well in multimodal HGR (MHGR), but most of existing algorithms require a lot of expert experience and time-consuming manual trials. To address these issues, we propose an evolutionary network arc…

    Submitted 26 March, 2024; originally announced March 2024.

  16. arXiv:2403.15191  [pdf, other]

    cs.CR cs.DC

    VORTEX: Real-Time Off-Chain Payments and Cross-Chain Swaps for Cryptocurrencies

    Authors: Di Wu, Jian Liu, Zhengwei Hou, Wu Wen, Kui Ren

    Abstract: In this paper, we present VERTEX, a TEE-based layer-2 solution that tackles two crucial challenges in the realm of cryptocurrencies: off-chain payments and cross-chain swaps. It offers three notable features: - Channel-free off-chain payments: it allows a payer to make direct payments to anyone without requiring any on-chain relationship or intermediary channels. - Real-time yet decentralized cros…

    Submitted 5 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  17. arXiv:2403.09998  [pdf, other]

    cs.CV cs.AI

    FBPT: A Fully Binary Point Transformer

    Authors: Zhixing Hou, Yuzhang Shang, Yan Yan

    Abstract: This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) model which has the potential to be widely applied and expanded in the fields of robotics and mobile devices. By compressing the weights and activations of a 32-bit full-precision network to 1-bit binary values, the proposed binary point cloud Transformer network significantly reduces the storage footprint and computational re…

    Submitted 9 May, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted to ICRA 2024. arXiv admin note: substantial text overlap with arXiv:2303.01166

  18. arXiv:2403.07035  [pdf, other]

    cs.NE cs.LG

    Multiple Population Alternate Evolution Neural Architecture Search

    Authors: Juan Zou, Han Chu, Yizhang Xia, Junwen Xu, Yuan Liu, Zhanglu Hou

    Abstract: The effectiveness of Evolutionary Neural Architecture Search (ENAS) is influenced by the design of the search space. Nevertheless, common methods including the global search space, scalable search space and hierarchical search space have certain limitations. Specifically, the global search space requires a significant amount of computational resources and time, the scalable search space sacrifices…

    Submitted 11 March, 2024; originally announced March 2024.

  19. arXiv:2403.02667  [pdf, other]

    cs.NE

    G-EvoNAS: Evolutionary Neural Architecture Search Based on Network Growth

    Authors: Juan Zou, Weiwei Jiang, Yizhang Xia, Yuan Liu, Zhanglu Hou

    Abstract: The evolutionary paradigm has been successfully applied to neural network search (NAS) in recent years. Due to the vast search complexity of the global space, current research mainly seeks to repeatedly stack partial architectures to build the entire model or to seek the entire model based on manually designed benchmark modules. The above two methods are attempts to reduce the search difficulty by…

    Submitted 5 March, 2024; originally announced March 2024.

  20. arXiv:2403.00092  [pdf, other]

    cs.CL

    PROC2PDDL: Open-Domain Planning Representations from Texts

    Authors: Tianyi Zhang, Li Zhang, Zhaoyi Hou, Ziyu Wang, Yuling Gu, Peter Clark, Chris Callison-Burch, Niket Tandon

    Abstract: Planning in a text-based environment continues to be a major challenge for AI systems. Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representation…

    Submitted 2 July, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

    Comments: In NLRSE 2024, the 2nd Natural Language Reasoning and Structured Explanations Workshop

  21. arXiv:2402.15635  [pdf, other]

    cs.IT cs.CV cs.LG eess.IV stat.AP stat.ML

    Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise

    Authors: Xi Chen, Zhewen Hou, Christopher A. Metzler, Arian Maleki, Shirin Jalali

    Abstract: We investigate both the theoretical and algorithmic aspects of likelihood-based methods for recovering a complex-valued signal from multiple sets of measurements, referred to as looks, affected by speckle (multiplicative) noise. Our theoretical contributions include establishing the first existing theoretical upper bound on the Mean Squared Error (MSE) of the maximum likelihood estimator under the…

    Submitted 23 February, 2024; originally announced February 2024.

  22. arXiv:2402.14853  [pdf, other]

    cs.CL cs.AI

    NL2Formula: Generating Spreadsheet Formulas from Natural Language Queries

    Authors: Wei Zhao, Zhitao Hou, Siyuan Wu, Yan Gao, Haoyu Dong, Yao Wan, Hongyu Zhang, Yulei Sui, Haidong Zhang

    Abstract: Writing formulas on spreadsheets, such as Microsoft Excel and Google Sheets, is a widespread practice among users performing data analysis. However, crafting formulas on spreadsheets remains a tedious and error-prone task for many end-users, particularly when dealing with complex operations. To alleviate the burden associated with writing spreadsheet formulas, this paper introduces a novel benchma…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: To appear at EACL 2024

  23. arXiv:2401.11856  [pdf, other]

    eess.IV cs.CV

    MOSformer: Momentum encoder-based inter-slice fusion transformer for medical image segmentation

    Authors: De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Xiu-Ling Liu, Zeng-Guang Hou

    Abstract: Medical image segmentation takes an important position in various clinical applications. Deep learning has emerged as the predominant solution for automated segmentation of volumetric medical images. 2.5D-based segmentation models bridge computational efficiency of 2D-based models and spatial perception capabilities of 3D-based models. However, prevailing 2.5D-based models often treat each slice e…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: Under Review

  24. arXiv:2401.05012  [pdf, other]

    cs.LG

    HiMTM: Hierarchical Multi-Scale Masked Time Series Modeling for Long-Term Forecasting

    Authors: Shubao Zhao, Ming Jin, Zhaoxiang Hou, Chengyi Yang, Zengxiang Li, Qingsong Wen, Yi Wang

    Abstract: Time series forecasting is crucial and challenging in the real world. The recent surge in interest regarding time series foundation models, which cater to a diverse array of downstream tasks, is noteworthy. However, existing methods often overlook the multi-scale nature of time series, an aspect crucial for precise forecasting. To bridge this gap, we propose HiMTM, a hierarchical multi-scale maske…

    Submitted 10 January, 2024; originally announced January 2024.

  25. arXiv:2312.03991  [pdf, other]

    cs.LG cs.AI

    MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator

    Authors: Xiao-Yin Liu, Xiao-Hu Zhou, Guotao Li, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou

    Abstract: Offline reinforcement learning (RL) faces a significant challenge of distribution shift. Model-free offline RL penalizes the Q value for out-of-distribution (OOD) data or constrains the policy close to the behavior policy to tackle this problem, but this inhibits the exploration of the OOD region. Model-based offline RL, which uses the trained environment model to generate more OOD data and perfo…

    Submitted 17 April, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: Accepted by IJCAI 2024 (the 33rd International Joint Conference on Artificial Intelligence)

  26. arXiv:2312.00978  [pdf]

    cs.NE

    Combining Kernelized Autoencoding and Centroid Prediction for Dynamic Multi-objective Optimization

    Authors: Zhanglu Hou, Juan Zou, Gan Ruan, Yuan Liu, Yizhang Xia

    Abstract: Evolutionary algorithms face significant challenges when dealing with dynamic multi-objective optimization because Pareto optimal solutions and/or Pareto optimal fronts change. This paper proposes a unified paradigm, which combines the kernelized autoencoding evolutionary search and the centroid-based prediction (denoted by KAEP), for solving dynamic multi-objective optimization problems (DMOPs). S…

    Submitted 1 December, 2023; originally announced December 2023.

  27. arXiv:2311.15030  [pdf, other]

    cs.RO

    Tuning-free Quasi-stiffness Control Framework of a Powered Transfemoral Prosthesis for Task-adaptive Walking

    Authors: Teng Ma, Shucong Yin, Zhimin Hou, Binxin Huang, Haoyong Yu, Chenglong Fu

    Abstract: Impedance-based control represents a prevalent strategy in the development of powered transfemoral prostheses. However, creating a task-adaptive, tuning-free controller that effectively generalizes across diverse locomotion modes and terrain conditions continues to be a significant challenge. This letter proposes a tuning-free and task-adaptive quasi-stiffness control framework for powered prosthe…

    Submitted 26 March, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: 8 pages, 10 figures. This work has been submitted to the IEEE-RAL for possible publication

  28. arXiv:2311.14934  [pdf, other]

    cs.LG

    Robust Graph Neural Networks via Unbiased Aggregation

    Authors: Ruiqi Feng, Zhichao Hou, Tyler Derr, Xiaorui Liu

    Abstract: The adversarial robustness of Graph Neural Networks (GNNs) has been questioned due to the false sense of security uncovered by strong adaptive attacks despite the existence of numerous defenses. In this work, we delve into the robustness analysis of representative robust GNNs and provide a unified robust estimation point of view to understand their robustness and limitations. Our novel analysis of…

    Submitted 25 November, 2023; originally announced November 2023.

  29. arXiv:2310.19019  [pdf, other]

    cs.CL cs.AI

    TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise

    Authors: Nan He, Hanyu Lai, Chenyang Zhao, Zirui Cheng, Junting Pan, Ruoyu Qin, Ruofan Lu, Rui Lu, Yunchen Zhang, Gangming Zhao, Zhaohui Hou, Zhiyuan Huang, Shaoqing Lu, Ding Liang, Mingjie Zhan

    Abstract: Large Language Models (LLMs) exhibit impressive reasoning and data augmentation capabilities in various NLP tasks. However, what about small models? In this work, we propose TeacherLM-7.1B, capable of annotating relevant fundamentals, chain of thought, and common mistakes for most NLP samples, which makes annotation more than just an answer, thus allowing other models to learn "why" instead of jus…

    Submitted 31 October, 2023; v1 submitted 29 October, 2023; originally announced October 2023.

    Comments: 5 figures, 15 pages

  30. arXiv:2310.17245  [pdf, other]

    cs.LG cs.AI

    CROP: Conservative Reward for Model-based Offline Policy Optimization

    Authors: Hao Li, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Xiao-Yin Liu, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Bo-Xian Yao, Zeng-Guang Hou

    Abstract: Offline reinforcement learning (RL) aims to optimize policy using collected data without online interactions. Model-based approaches are particularly appealing for addressing offline RL challenges due to their capability to mitigate the limitations of offline data through data generation using models. Prior research has demonstrated that introducing conservatism into the model or Q-function during…

    Submitted 26 October, 2023; originally announced October 2023.

  31. arXiv:2309.11848  [pdf, other]

    cs.RO

    TeachingBot: Robot Teacher for Human Handwriting

    Authors: Zhimin Hou, Cunjun Yu, David Hsu, Haoyong Yu

    Abstract: Teaching physical skills to humans requires one-on-one interaction between the teacher and the learner. With a shortage of human teachers, such a teaching mode faces the challenge of scaling up. Robots, with their replicable nature and physical capabilities, offer a solution. In this work, we present TeachingBot, a robotic system designed for teaching handwriting to human learners. We tackle two p…

    Submitted 21 September, 2023; originally announced September 2023.

  32. arXiv:2309.11737  [pdf, other]

    cs.AI

    Choice-75: A Dataset on Decision Branching in Script Learning

    Authors: Zhaoyi Joey Hou, Li Zhang, Chris Callison-Burch

    Abstract: Script learning studies how stereotypical events unfold, enabling machines to reason about narratives with implicit information. Previous works mostly consider a script as a linear sequence of events while ignoring the potential branches that arise due to people's circumstantial choices. We hence propose Choice-75, the first benchmark that challenges intelligent systems to make decisions given des…

    Submitted 17 March, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: To be published in LREC-COLING-2024

  33. arXiv:2309.08925  [pdf, other]

    cs.LG cs.AI

    DOMAIN: MilDly COnservative Model-BAsed OfflINe Reinforcement Learning

    Authors: Xiao-Yin Liu, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou

    Abstract: Model-based reinforcement learning (RL), which learns environment model from offline dataset and generates more out-of-distribution model data, has become an effective approach to the problem of distribution shift in offline RL. Due to the gap between the learned and actual environment, conservatism should be incorporated into the algorithm to balance accurate offline data and imprecise model data…

    Submitted 25 April, 2024; v1 submitted 16 September, 2023; originally announced September 2023.

    Comments: 16 pages, 7 figures

  34. arXiv:2308.11217  [pdf, other]

    cs.LG cs.AI

    Federated Learning in Big Model Era: Domain-Specific Multimodal Large Models

    Authors: Zengxiang Li, Zhaoxiang Hou, Hui Liu, Ying Wang, Tongzhi Li, Longfei Xie, Chao Shi, Chengyi Yang, Weishan Zhang, Zelei Liu, Liang Xu

    Abstract: Multimodal data, which can comprehensively perceive and recognize the physical world, has become an essential path towards general artificial intelligence. However, multimodal large models trained on public datasets often underperform in specific industrial domains. This paper proposes a multimodal federated learning framework that enables multiple enterprises to utilize private domain data to col…

    Submitted 24 August, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

  35. arXiv:2308.03945  [pdf, other]

    cs.LG

    The Prospect of Enhancing Large-Scale Heterogeneous Federated Learning with Transformers

    Authors: Yulan Gao, Zhaoxiang Hou, Chengyi Yang, Zengxiang Li, Han Yu

    Abstract: Federated learning (FL) addresses data privacy concerns by enabling collaborative training of AI models across distributed data owners. Wide adoption of FL faces the fundamental challenges of data heterogeneity and the large scale of data owners involved. In this paper, we investigate the prospect of Transformer-based FL models for achieving generalization and personalization in this setting. We c…

    Submitted 22 August, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

  36. arXiv:2308.03344  [pdf, other]

    quant-ph cs.DC

    A Parallel and Distributed Quantum SAT Solver Based on Entanglement and Quantum Teleportation

    Authors: Shang-Wei Lin, Tzu-Fan Wang, Yean-Ru Chen, Zhe Hou, David Sanán, Yon Shin Teo

    Abstract: Boolean satisfiability (SAT) solving is a fundamental problem in computer science. Finding efficient algorithms for SAT solving has broad implications in many areas of computer science and beyond. Quantum SAT solvers have been proposed in the literature based on Grover's algorithm. Although existing quantum SAT solvers can consider all possible inputs at once, they evaluate each clause in the form…

    Submitted 7 August, 2023; originally announced August 2023.

  37. arXiv:2307.13716  [pdf, other]

    cs.LG cs.AI

    FedDRL: A Trustworthy Federated Learning Model Fusion Method Based on Staged Reinforcement Learning

    Authors: Leiming Chen, Weishan Zhang, Cihao Dong, Sibo Qiao, Ziling Huang, Yuming Nie, Zhaoxiang Hou, Chee Wei Tan

    Abstract: Traditional federated learning uses the number of samples to calculate the weights of each client model and uses this fixed weight value to fusion the global model. However, in practical scenarios, each client's device and data heterogeneity leads to differences in the quality of each client's model. Thus the contribution to the global model is not wholly determined by the sample size. In addition…

    Submitted 19 March, 2024; v1 submitted 25 July, 2023; originally announced July 2023.

  38. arXiv:2307.07956  [pdf, other]

    cs.LG cs.AI

    Automated Polynomial Filter Learning for Graph Neural Networks

    Authors: Wendi Yu, Zhichao Hou, Xiaorui Liu

    Abstract: Polynomial graph filters have been widely used as guiding principles in the design of Graph Neural Networks (GNNs). Recently, the adaptive learning of the polynomial graph filters has demonstrated promising performance for modeling graph signals on both homophilic and heterophilic graphs, owing to their flexibility and expressiveness. In this work, we conduct a novel preliminary study to explore…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: 10 pages, 3 figures

  39. arXiv:2306.15255  [pdf, other]

    cs.CV cs.CL

    GroundNLQ @ Ego4D Natural Language Queries Challenge 2023

    Authors: Zhijian Hou, Lei Ji, Difei Gao, Wanjun Zhong, Kun Yan, Chao Li, Wing-Kwong Chan, Chong-Wah Ngo, Nan Duan, Mike Zheng Shou

    Abstract: In this report, we present our champion solution for Ego4D Natural Language Queries (NLQ) Challenge in CVPR 2023. Essentially, to accurately ground in a video, an effective egocentric feature extractor and a powerful grounding model are required. Motivated by this, we leverage a two-stage pre-training strategy to train egocentric feature extractors and the grounding model on video narrations, and…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: 5 pages, 2 figures, 4 tables, the champion solution for Ego4D Natural Language Queries Challenge in CVPR 2023

  40. arXiv:2306.13986  [pdf, other]

    cs.CL

    Large Language Models as Sous Chefs: Revising Recipes with GPT-3

    Authors: Alyssa Hwang, Bryan Li, Zhaoyi Hou, Dan Roth

    Abstract: With their remarkably improved text generation and prompting capabilities, large language models can adapt existing written information into forms that are easier to use and understand. In our work, we focus on recipes as an example of complex, diverse, and widely used instructions. We develop a prompt grounded in the original recipe and ingredients list that breaks recipes down into simpler steps…

    Submitted 24 June, 2023; originally announced June 2023.

  41. arXiv:2306.02002  [pdf, other]

    cs.LG cs.AI cs.CR

    Can Directed Graph Neural Networks be Adversarially Robust?

    Authors: Zhichao Hou, Xitong Zhang, Wei Wang, Charu C. Aggarwal, Xiaorui Liu

    Abstract: The existing research on robust Graph Neural Networks (GNNs) fails to acknowledge the significance of directed graphs in providing rich information about networks' inherent structure. This work presents the first investigation into the robustness of GNNs in the context of directed graphs, aiming to harness the profound trust implications offered by directed graphs to bolster the robustness and res…

    Submitted 3 June, 2023; originally announced June 2023.

  42. arXiv:2305.07988  [pdf, other]

    cs.CL

    Reconstruct Before Summarize: An Efficient Two-Step Framework for Condensing and Summarizing Meeting Transcripts

    Authors: Haochen Tan, Han Wu, Wei Shao, Xinyun Zhang, Mingjie Zhan, Zhaohui Hou, Ding Liang, Linqi Song

    Abstract: Meetings typically involve multiple participants and lengthy conversations, resulting in redundant and trivial content. To overcome these challenges, we propose a two-step framework, Reconstruct before Summarize (RbS), for effective and efficient meeting summarization. RbS first leverages a self-supervised paradigm to annotate essential contents by reconstructing the meeting transcripts. Secondly,…

    Submitted 22 October, 2023; v1 submitted 13 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023 main conference

  43. arXiv:2305.05280  [pdf, other]

    cs.CL cs.AI

    VCSUM: A Versatile Chinese Meeting Summarization Dataset

    Authors: Han Wu, Mingjie Zhan, Haochen Tan, Zhaohui Hou, Ding Liang, Linqi Song

    Abstract: Compared to news and chat summarization, the development of meeting summarization is hugely decelerated by the limited data. To this end, we introduce a versatile Chinese meeting summarization dataset, dubbed VCSum, consisting of 239 real-life meetings, with a total duration of over 230 hours. We claim our dataset is versatile because we provide the annotations of topic segmentation, headlines, se…

    Submitted 15 May, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

    Comments: Findings of ACL 2023 (long paper). GitHub: https://github.com/hahahawu/VCSum

  44. arXiv:2304.14070  [pdf, other]

    cs.CV cs.AI cs.LG

    Compositional 3D Human-Object Neural Animation

    Authors: Zhi Hou, Baosheng Yu, Dacheng Tao

    Abstract: Human-object interactions (HOIs) are crucial for human-centric scene understanding applications such as human-centric visual generation, AR/VR, and robotics. Since existing methods mainly explore capturing HOIs, rendering HOI remains less investigated. In this paper, we address this challenge in HOI animation from a compositional perspective, i.e., animating novel HOIs including novel interaction,…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: 14 pages, 6 figures

  45. arXiv:2304.09632  [pdf, other]

    cs.RO

    CASOG: Conservative Actor-critic with SmOoth Gradient for Skill Learning in Robot-Assisted Intervention

    Authors: Hao Li, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Zeng-Guang Hou

    Abstract: Robot-assisted intervention has shown reduced radiation exposure to physicians and improved precision in clinical trials. However, existing vascular robotic systems follow master-slave control mode and entirely rely on manual commands. This paper proposes a novel offline reinforcement learning algorithm, Conservative Actor-critic with SmOoth Gradient (CASOG), to learn manipulation skills from huma…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: 13 pages, 5 figures, preprint

  46. arXiv:2304.04779  [pdf, other]

    cs.LG

    GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner

    Authors: Zhenyu Hou, Yufei He, Yukuo Cen, Xiao Liu, Yuxiao Dong, Evgeny Kharlamov, Jie Tang

    Abstract: Graph self-supervised learning (SSL), including contrastive and generative approaches, offers great potential to address the fundamental challenge of label scarcity in real-world graph data. Among both sets of graph SSL techniques, the masked graph autoencoders (e.g., GraphMAE)--one type of generative method--have recently produced promising results. The idea behind this is to reconstruct the node…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: Accepted to WWW'23

  47. arXiv:2303.01166  [pdf, other]

    cs.CV

    BPT: Binary Point Cloud Transformer for Place Recognition

    Authors: Zhixing Hou, Yuzhang Shang, Tian Gao, Yan Yan

    Abstract: Place recognition, an algorithm to recognize the re-visited places, plays the role of back-end optimization trigger in a full SLAM system. Many works equipped with deep learning tools, such as MLP, CNN, and transformer, have achieved great improvements in this research field. Point cloud transformer is one of the excellent frameworks for place recognition applied in robotics, but with large memory…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: Submitted to the IEEE/RSJ International Conference on Intelligent Robots (IROS 2023)

  48. Human-in-the-Loop Schema Induction

    Authors: Tianyi Zhang, Isaac Tham, Zhaoyi Hou, Jiaxuan Ren, Liyang Zhou, Hainiu Xu, Li Zhang, Lara J. Martin, Rotem Dror, Sha Li, Heng Ji, Martha Palmer, Susan Brown, Reece Suchocki, Chris Callison-Burch

    Abstract: Schema induction builds a graph representation explaining how events unfold in a scenario. Existing approaches have been based on information retrieval (IR) and information extraction (IE), often with limited human curation. We demonstrate a human-in-the-loop schema induction system powered by GPT-3. We first describe the different modules of our system, including prompting to generate schematic el…

    Submitted 25 February, 2023; originally announced February 2023.

    Comments: 10 pages, ACL2023 demo track

  49. arXiv:2302.05693  [pdf]

    cs.SD eess.AS

    Local spectral attention for full-band speech enhancement

    Authors: Zhongshu Hou, Qinwen Hu, Kai Chen, Jing Lu

    Abstract: Attention mechanism has been widely utilized in speech enhancement (SE) because theoretically it can effectively model the inherent connection of signal both in time domain and spectrum domain. Usually, the span of attention is limited in time domain while the attention in frequency domain spans the whole frequency range. In this paper, we notice that the attention over the whole frequency range h…

    Submitted 11 February, 2023; originally announced February 2023.

  50. arXiv:2302.05690  [pdf]

    cs.SD eess.AS

    Attention does not guarantee best performance in speech enhancement

    Authors: Zhongshu Hou, Qinwen Hu, Kai Chen, Jing Lu

    Abstract: Attention mechanism has been widely utilized in speech enhancement (SE) because theoretically it can effectively model the long-term inherent connection of signal both in time domain and spectrum domain. However, the generally used global attention mechanism might not be the best choice since the adjacent information naturally imposes more influence than the far-apart information in speech enhance…

    Submitted 11 February, 2023; originally announced February 2023.