
Showing 1–50 of 359 results for author: Hou, Y

Searching in archive cs.
  1. arXiv:2407.00943  [pdf, other]

    cs.DC

    FedEx: Expediting Federated Learning over Heterogeneous Mobile Devices by Overlapping and Participant Selection

    Authors: Jiaxiang Geng, Boyu Li, Xiaoqi Qin, Yixuan Li, Liang Li, Yanzhao Hou, Miao Pan

    Abstract: Training latency is critical for the success of numerous intrigued applications ignited by federated learning (FL) over heterogeneous mobile devices. By revolutionarily overlapping local gradient transmission with continuous local computing, FL can remarkably reduce its training latency over homogeneous clients, yet encounter severe model staleness, model drifts, memory cost and straggler issues i…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 14 pages, 9 figures, Submitted to Sensys2024

  2. arXiv:2406.19853  [pdf, other]

    cs.CL cs.AI

    YuLan: An Open-source Large Language Model

    Authors: Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, et al. (13 additional authors not shown)

    Abstract: Large language models (LLMs) have become the foundation of many applications, leveraging their extensive capabilities in processing and understanding natural language. While many open-source LLMs have been released with technical reports, the lack of training details hinders further research and development. This paper presents the development of YuLan, a series of open-source LLMs with $12$ billi…

    Submitted 28 June, 2024; originally announced June 2024.

  3. arXiv:2406.16933  [pdf, other]

    eess.SP cs.AI

    SGSM: A Foundation-model-like Semi-generalist Sensing Model

    Authors: Tianjian Yang, Hao Zhou, Shuo Liu, Kaiwen Guo, Yiwen Hou, Haohua Du, Zhi Liu, Xiang-Yang Li

    Abstract: The significance of intelligent sensing systems is growing in the realm of smart services. These systems extract relevant signal features and generate informative representations for particular tasks. However, building the feature extraction component for such systems requires extensive domain-specific expertise or data. The exceptionally rapid development of foundation models is likely to usher i…

    Submitted 15 June, 2024; originally announced June 2024.

  4. arXiv:2406.13805  [pdf, other]

    cs.CL cs.AI cs.LG

    WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia

    Authors: Yufang Hou, Alessandra Pascale, Javier Carnerero-Cano, Tigran Tchrakian, Radu Marinescu, Elizabeth Daly, Inkit Padhi, Prasanna Sattigeri

    Abstract: Retrieval-augmented generation (RAG) has emerged as a promising solution to mitigate the limitations of large language models (LLMs), such as hallucinations and outdated information. However, it remains unclear how LLMs handle knowledge conflicts arising from different augmented retrieved passages, especially when these passages originate from the same source and have equal trustworthiness. In thi…

    Submitted 19 June, 2024; originally announced June 2024.

  5. arXiv:2406.13073  [pdf, other]

    cs.LG cs.CR cs.CV

    NoiSec: Harnessing Noise for Security against Adversarial and Backdoor Attacks

    Authors: Md Hasan Shahriar, Ning Wang, Y. Thomas Hou, Wenjing Lou

    Abstract: The exponential adoption of machine learning (ML) is propelling the world into a future of intelligent automation and data-driven solutions. However, the proliferation of malicious data manipulation attacks against ML, namely adversarial and backdoor attacks, jeopardizes its reliability in safety-critical applications. The existing detection methods against such attacks are built upon assumptions,…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 20 pages, 7 figures

  6. arXiv:2406.09567  [pdf, other]

    stat.ML cs.LG

    Causal Fine-Tuning and Effect Calibration of Non-Causal Predictive Models

    Authors: Carlos Fernández-Loría, Yanfang Hou, Foster Provost, Jennifer Hill

    Abstract: This paper proposes techniques to enhance the performance of non-causal models for causal inference using data from randomized experiments. In domains like advertising, customer retention, and precision medicine, non-causal models that predict outcomes under no intervention are often used to score individuals and rank them according to the expected effectiveness of an intervention (e.g, an ad, a r…

    Submitted 13 June, 2024; originally announced June 2024.

  7. arXiv:2406.05914  [pdf, other]

    eess.AS cs.SD eess.SP

    Soundscape Captioning using Sound Affective Quality Network and Large Language Model

    Authors: Yuanbo Hou, Qiaoqiao Ren, Andrew Mitchell, Wenwu Wang, Jian Kang, Tony Belpaeme, Dick Botteldooren

    Abstract: We live in a rich and varied acoustic world, which is experienced by individuals or communities as a soundscape. Computational auditory scene analysis, disentangling acoustic scenes by detecting and classifying events, focuses on objective attributes of sounds, such as their category and temporal characteristics, ignoring the effect of sounds on people and failing to explore the relationship betwe…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Code: https://github.com/Yuanbo2020/SoundSCaper

  8. arXiv:2406.04216  [pdf, other]

    cs.CL cs.LG

    What Do Language Models Learn in Context? The Structured Task Hypothesis

    Authors: Jiaoda Li, Yifan Hou, Mrinmaya Sachan, Ryan Cotterell

    Abstract: Large language models (LLMs) exhibit an intriguing ability to learn a novel task from in-context examples presented in a demonstration, termed in-context learning (ICL). Understandably, a swath of research has been dedicated to uncovering the theories underpinning ICL. One popular hypothesis explains ICL by task selection. LLMs identify the task based on the demonstration and generalize it to the…

    Submitted 8 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: This work is published in ACL 2024

  9. arXiv:2406.03772  [pdf, other]

    cs.CL

    Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure

    Authors: Yang Hou, Zhenghua Li

    Abstract: Revealing the syntactic structure of sentences in Chinese poses significant challenges for word-level parsers due to the absence of clear word boundaries. To facilitate a transition from word-level to character-level Chinese dependency parsing, this paper proposes modeling latent internal structures within words. In this way, each word-level dependency tree is interpreted as a forest of character-…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Findings of ACL 2024

  10. arXiv:2406.03181  [pdf, other]

    cs.CL

    Missci: Reconstructing Fallacies in Misrepresented Science

    Authors: Max Glockner, Yufang Hou, Preslav Nakov, Iryna Gurevych

    Abstract: Health-related misinformation on social networks can lead to poor decision-making and real-world dangers. Such misinformation often misrepresents scientific publications and cites them as "proof" to gain perceived credibility. To effectively counter such claims automatically, a system must explain how the claim was falsely derived from the cited publication. Current methods for automated fact-chec…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: ACL 2024 (main)

  11. arXiv:2406.02651  [pdf, other]

    cs.LG cs.AI cs.NI

    RoutePlacer: An End-to-End Routability-Aware Placer with Graph Neural Network

    Authors: Yunbo Hou, Haoran Ye, Yingxue Zhang, Siyuan Xu, Guojie Song

    Abstract: Placement is a critical and challenging step of modern chip design, with routability being an essential indicator of placement quality. Current routability-oriented placers typically apply an iterative two-stage approach, wherein the first stage generates a placement solution, and the second stage provides non-differentiable routing results to heuristically improve the solution quality. This metho…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted at KDD 2024

  12. arXiv:2405.20521  [pdf, other]

    cs.CR

    SoK: Public Blockchain Sharding

    Authors: Md Mohaimin Al Barat, Shaoyu Li, Changlai Du, Y. Thomas Hou, Wenjing Lou

    Abstract: Blockchain's decentralization, transparency, and tamper-resistance properties have facilitated the system's use in various application fields. However, the low throughput and high confirmation latency hinder the widespread adoption of Blockchain. Many solutions have been proposed to address these issues, including first-layer solutions (or on-chain solutions) and second-layer solutions (or off-cha…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 18 pages

  13. arXiv:2405.16871  [pdf, other]

    cs.IR

    Multi-Behavior Generative Recommendation

    Authors: Zihan Liu, Yupeng Hou, Julian McAuley

    Abstract: Multi-behavior sequential recommendation (MBSR) aims to incorporate behavior types of interactions for better recommendations. Existing approaches focus on the next-item prediction objective, neglecting the value of integrating the target behavior type into the learning objective. In this paper, we propose MBGen, a novel Multi-Behavior sequential Generative recommendation framework. We formulate t…

    Submitted 27 May, 2024; originally announced May 2024.

  14. arXiv:2405.13326  [pdf, other]

    cs.CL

    Mosaic IT: Enhancing Instruction Tuning with Data Mosaics

    Authors: Ming Li, Pei Chen, Chenguang Wang, Hongyu Zhao, Yijun Liang, Yupeng Hou, Fuxiao Liu, Tianyi Zhou

    Abstract: Finetuning large language models with a variety of instruction-response pairs has enhanced their capability to understand and follow instructions. Current instruction tuning primarily relies on teacher models or human intervention to generate and refine the instructions and responses, which are costly, non-sustainable, and may lack diversity. In this paper, we introduce Mosaic Instruction Tuning (…

    Submitted 22 May, 2024; originally announced May 2024.

  15. arXiv:2405.09708  [pdf, ps, other]

    cs.RO cs.AI stat.CO

    No More Mumbles: Enhancing Robot Intelligibility through Speech Adaptation

    Authors: Qiaoqiao Ren, Yuanbo Hou, Dick Botteldooren, Tony Belpaeme

    Abstract: Spoken language interaction is at the heart of interpersonal communication, and people flexibly adapt their speech to different individuals and environments. It is surprising that robots, and by extension other digital devices, are not equipped to adapt their speech and instead rely on fixed speech parameters, which often hinder comprehension by the user. We conducted a speech comprehension study…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: IEEE Robotics and Automation Letters (IEEE RAL)

  16. arXiv:2405.09365  [pdf, other]

    cs.CV

    SARATR-X: A Foundation Model for Synthetic Aperture Radar Images Target Recognition

    Authors: Weijie L, Wei Yang, Yuenan Hou, Li Liu, Yongxiang Liu, Xiang Li

    Abstract: Synthetic aperture radar (SAR) is essential in actively acquiring information for Earth observation. SAR Automatic Target Recognition (ATR) focuses on detecting and classifying various target categories under different image conditions. The current deep learning-based SAR ATR methods are typically designed for specific datasets and applications. Various target characteristics, scene background inf…

    Submitted 15 May, 2024; originally announced May 2024.

  17. arXiv:2405.08838  [pdf, other]

    cs.SD cs.AI eess.AS

    PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset

    Authors: Yang Hou, Haitao Fu, Chuankai Chen, Zida Li, Haoyu Zhang, Jianjun Zhao

    Abstract: With the rapid advancement of generative AI, multimodal deepfakes, which manipulate both audio and visual modalities, have drawn increasing public concern. Currently, deepfake detection has emerged as a crucial strategy in countering these growing threats. However, as a key factor in training and validating deepfake detectors, most existing deepfake datasets primarily focus on the visual modal, an…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 13 pages, 4 figures

    MSC Class: 68T45 ACM Class: I.4.9

  18. arXiv:2405.04307  [pdf, other]

    cs.RO cs.AI cs.LG

    Improving Offline Reinforcement Learning with Inaccurate Simulators

    Authors: Yiwen Hou, Haoyuan Sun, Jinming Ma, Feng Wu

    Abstract: Offline reinforcement learning (RL) provides a promising approach to avoid costly online interaction with the real environment. However, the performance of offline RL highly depends on the quality of the datasets, which may cause extrapolation error in the learning process. In many robotic applications, an inaccurate simulator is often available. However, the data directly collected from the inacc…

    Submitted 7 May, 2024; originally announced May 2024.

  19. arXiv:2405.02466  [pdf, other]

    cs.CR cs.LG

    ProFLingo: A Fingerprinting-based Intellectual Property Protection Scheme for Large Language Models

    Authors: Heng Jin, Chaoyu Zhang, Shanghao Shi, Wenjing Lou, Y. Thomas Hou

    Abstract: Large language models (LLMs) have attracted significant attention in recent years. Due to their "Large" nature, training LLMs from scratch consumes immense computational resources. Since several major players in the artificial intelligence (AI) field have open-sourced their original LLMs, an increasing number of individual researchers and smaller companies are able to build derivative LLMs based o…

    Submitted 26 June, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: This is the author's pre-print version of the work. It is posted here for your personal use. Not for redistribution

  20. arXiv:2405.00885  [pdf, other]

    cs.LG cs.NI eess.IV

    WHALE-FL: Wireless and Heterogeneity Aware Latency Efficient Federated Learning over Mobile Devices via Adaptive Subnetwork Scheduling

    Authors: Huai-an Su, Jiaxiang Geng, Liang Li, Xiaoqi Qin, Yanzhao Hou, Xin Fu, Miao Pan

    Abstract: As a popular distributed learning paradigm, federated learning (FL) over mobile devices fosters numerous applications, while their practical deployment is hindered by participating devices' computing and communication heterogeneity. Some pioneering research efforts proposed to extract subnetworks from the global model, and assign as large a subnetwork as possible to the device for local training b…

    Submitted 1 May, 2024; originally announced May 2024.

  21. arXiv:2404.19756  [pdf, other]

    cs.LG cond-mat.dis-nn cs.AI stat.ML

    KAN: Kolmogorov-Arnold Networks

    Authors: Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y. Hou, Max Tegmark

    Abstract: Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all -- every weight parameter is replaced by a univariate function parametriz…

    Submitted 16 June, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 48 pages, 20 figures. Codes are available at https://github.com/KindXiaoming/pykan

  22. arXiv:2404.19534  [pdf, other]

    cs.CV

    MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results

    Authors: Yuekun Dai, Dafeng Zhang, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Peiqing Yang, Zhezhu Jin, Guanqun Liu, Chen Change Loy, Lize Zhang, Shuai Liu, Chaoyu Feng, Luyang Wang, Shuan Chen, Guangqi Shao, Xiaotao Wang, Lei Lei, Qirui Yang, Qihua Cheng, Zhiqiang Xu, Yihao Liu, Huanjing Yue, Jingyu Yang, et al. (38 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 27 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Nighttime Flare Removal Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  23. arXiv:2404.18923  [pdf, other]

    cs.CL

    Holmes: Benchmark the Linguistic Competence of Language Models

    Authors: Andreas Waldis, Yotam Perlitz, Leshem Choshen, Yufang Hou, Iryna Gurevych

    Abstract: We introduce Holmes, a benchmark to assess the linguistic competence of language models (LMs) - their ability to grasp linguistic phenomena. Unlike prior prompting-based evaluations, Holmes assesses the linguistic competence of LMs via their internal representations using classifier-based probing. In doing so, we disentangle specific phenomena (e.g., part-of-speech of words) from other cognitive a…

    Submitted 22 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  24. arXiv:2404.18144  [pdf, other]

    cs.LG cs.AI cs.HC

    Generative AI for Visualization: State of the Art and Future Directions

    Authors: Yilin Ye, Jianing Hao, Yihan Hou, Zhan Wang, Shishi Xiao, Yuyu Luo, Wei Zeng

    Abstract: Generative AI (GenAI) has witnessed remarkable progress in recent years and demonstrated impressive performance in various generation tasks in different domains such as computer vision and computational design. Many researchers have attempted to integrate GenAI into visualization framework, leveraging the superior generative capacity for different operations. Concurrently, recent major breakthroug…

    Submitted 28 April, 2024; originally announced April 2024.

  25. arXiv:2404.14591  [pdf, other]

    cs.CE

    Predicting the Temporal Dynamics of Prosthetic Vision

    Authors: Yuchen Hou, Laya Pullela, Jiaxin Su, Sriya Aluru, Shivani Sista, Xiankun Lu, Michael Beyeler

    Abstract: Retinal implants are a promising treatment option for degenerative retinal disease. While numerous models have been developed to simulate the appearance of elicited visual percepts ("phosphenes"), these models often either focus solely on spatial characteristics or inadequately capture the complex temporal dynamics observed in clinical trials, which vary heavily across implant technologies, subjec…

    Submitted 1 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  26. arXiv:2404.14240  [pdf, other]

    cs.IR cs.AI cs.IT cs.LG cs.SI

    Collaborative Filtering Based on Diffusion Models: Unveiling the Potential of High-Order Connectivity

    Authors: Yu Hou, Jin-Duk Park, Won-Yong Shin

    Abstract: A recent study has shown that diffusion models are well-suited for modeling the generative process of user-item interactions in recommender systems due to their denoising nature. However, existing diffusion model-based recommender systems do not explicitly leverage high-order connectivities that contain crucial collaborative signals for accurate recommendations. Addressing this gap, we propose CF-…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 10 pages, 6 figures, 4 tables; 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024) (to appear) (Please cite our conference version.)

  27. arXiv:2404.13377  [pdf, other]

    cs.NE

    Bridging the Gap Between Theory and Practice: Benchmarking Transfer Evolutionary Optimization

    Authors: Yaqing Hou, Wenqiang Ma, Abhishek Gupta, Kavitesh Kumar Bali, Hongwei Ge, Qiang Zhang, Carlos A. Coello Coello, Yew-Soon Ong

    Abstract: In recent years, the field of Transfer Evolutionary Optimization (TrEO) has witnessed substantial growth, fueled by the realization of its profound impact on solving complex problems. Numerous algorithms have emerged to address the challenges posed by transferring knowledge between tasks. However, the recently highlighted "no free lunch theorem" in transfer optimization clarifies that no single…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 17 pages, 18 figures

  28. arXiv:2404.12228  [pdf, other]

    cs.AI cs.LG

    Relationship Discovery for Drug Recommendation

    Authors: Xiang Li, Shunpan Liang, Yu Lei, Chen Li, Yulei Hou, Tengfei Ma

    Abstract: Medication recommendation systems are designed to deliver personalized drug suggestions that are closely aligned with individual patient needs. Previous studies have primarily concentrated on developing medication embeddings, achieving significant progress. Nonetheless, these approaches often fall short in accurately reflecting individual patient profiles, mainly due to challenges in distinguishin…

    Submitted 18 April, 2024; originally announced April 2024.

  29. arXiv:2404.11811  [pdf]

    physics.chem-ph cs.AI cs.LG

    Physics-informed active learning for accelerating quantum chemical simulations

    Authors: Yi-Fan Hou, Lina Zhang, Quanhao Zhang, Fuchun Ge, Pavlo O. Dral

    Abstract: Quantum chemical simulations can be greatly accelerated by constructing machine learning potentials, which is often done using active learning (AL). The usefulness of the constructed potentials is often limited by the high effort required and their insufficient robustness in the simulations. Here we introduce the end-to-end AL for constructing robust data-efficient potentials with affordable inves…

    Submitted 17 April, 2024; originally announced April 2024.

  30. arXiv:2404.01701  [pdf, other]

    cs.CL

    On the Role of Summary Content Units in Text Summarization Evaluation

    Authors: Marcel Nawrath, Agnieszka Nowak, Tristan Ratz, Danilo C. Walenta, Juri Opitz, Leonardo F. R. Ribeiro, João Sedoc, Daniel Deutsch, Simon Mille, Yixin Liu, Lining Zhang, Sebastian Gehrmann, Saad Mahamood, Miruna Clinciu, Khyathi Chandu, Yufang Hou

    Abstract: At the heart of the Pyramid evaluation method for text summarization lie human written summary content units (SCUs). These SCUs are concise sentences that decompose a summary into small facts. Such SCUs can be used to judge the quality of a candidate summary, possibly partially automated via natural language inference (NLI) systems. Interestingly, with the aim to fully automate the Pyramid evaluat…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 10 Pages, 3 Figures, 3 Tables, camera ready version accepted at NAACL 2024

  31. arXiv:2404.00989  [pdf, other]

    cs.CV cs.AI cs.MM cs.SD eess.AS

    360+x: A Panoptic Multi-modal Scene Understanding Dataset

    Authors: Hao Chen, Yuqi Hou, Chenyuan Qu, Irene Testini, Xiaohan Hong, Jianbo Jiao

    Abstract: Human perception of the world is shaped by a multitude of viewpoints and modalities. While many existing datasets focus on scene understanding from a certain perspective (e.g. egocentric or third-person views), our dataset offers a panoptic perspective (i.e. multiple viewpoints with multiple data modalities). Specifically, we encapsulate third-person panoramic and front views, as well as egocentri…

    Submitted 7 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 (Oral Presentation), Project page: https://x360dataset.github.io/

    Journal ref: The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) 2024

  32. "My agent understands me better": Integrating Dynamic Human-like Memory Recall and Consolidation in LLM-Based Agents

    Authors: Yuki Hou, Haruki Tamoto, Homei Miyashita

    Abstract: In this study, we propose a novel human-like memory architecture designed for enhancing the cognitive abilities of large language model based dialogue agents. Our proposed architecture enables agents to autonomously recall memories necessary for response generation, effectively addressing a limitation in the temporal cognition of LLMs. We adopt the human memory cue recall as a trigger for accurate…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Accepted to CHI 2024 Late-Breaking Work

    ACM Class: I.2.4; H.3.3

  33. arXiv:2403.17307  [pdf, other]

    cs.CL cs.IT

    HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification

    Authors: He Zhu, Junran Wu, Ruomei Liu, Yue Hou, Ze Yuan, Shangzhe Li, Yicheng Pan, Ke Xu

    Abstract: Existing self-supervised methods in natural language processing (NLP), especially hierarchical text classification (HTC), mainly focus on self-supervised contrastive learning, extremely relying on human-designed augmentation rules to generate contrastive samples, which can potentially corrupt or distort the original information. In this paper, we tend to investigate the feasibility of a contrastiv…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted by NAACL 2024

  34. Understanding the Impact of Referent Design on Scale Perception in Immersive Data Visualization

    Authors: Yihan Hou, Hao Cui, Rongrong Chen, Wei Zeng

    Abstract: Referents are often used to enhance scale perception in immersive visualizations. Common referent designs include the considerations of referent layout (side-by-side vs. in-situ) and referent size (small vs. medium vs. large). This paper introduces a controlled user study to assess how different referent designs affect the efficiency and accuracy of scale perception across different data scales, o…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: 7 pages, 6 figures, Accepted to Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI EA '24)

  35. arXiv:2403.11128  [pdf, other]

    cs.CL

    Beyond Static Evaluation: A Dynamic Approach to Assessing AI Assistants' API Invocation Capabilities

    Authors: Honglin Mu, Yang Xu, Yunlong Feng, Xiaofeng Han, Yitong Li, Yutai Hou, Wanxiang Che

    Abstract: With the rise of Large Language Models (LLMs), AI assistants' ability to utilize tools, especially through API calls, has advanced notably. This progress has necessitated more accurate evaluation methods. Many existing studies adopt static evaluation, where they assess AI assistants' API call based on pre-defined dialogue histories. However, such evaluation method can be misleading, as an AI assis…

    Submitted 27 March, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  36. arXiv:2403.06447  [pdf, other]

    cs.IR cs.AI

    CoRAL: Collaborative Retrieval-Augmented Large Language Models Improve Long-tail Recommendation

    Authors: Junda Wu, Cheng-Chun Chang, Tong Yu, Zhankui He, Jianing Wang, Yupeng Hou, Julian McAuley

    Abstract: The long-tail recommendation is a challenging task for traditional recommender systems, due to data sparsity and data imbalance issues. The recent development of large language models (LLMs) has shown their abilities in complex reasoning, which can help to deduce users' preferences based on very few previous interactions. However, since most LLM-based systems rely on items' semantic meaning as the…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 11 pages

  37. arXiv:2403.03952  [pdf, other]

    cs.IR

    Bridging Language and Items for Retrieval and Recommendation

    Authors: Yupeng Hou, Jiacheng Li, Zhankui He, An Yan, Xiusi Chen, Julian McAuley

    Abstract: This paper introduces BLaIR, a series of pretrained sentence embedding models specialized for recommendation scenarios. BLaIR is trained to learn correlations between item metadata and potential natural language context, which is useful for retrieving and recommending items. To pretrain BLaIR, we collect Amazon Reviews 2023, a new dataset comprising over 570 million reviews and 48 million items fr…

    Submitted 6 March, 2024; originally announced March 2024.

  38. Dcl-Net: Dual Contrastive Learning Network for Semi-Supervised Multi-Organ Segmentation

    Authors: Lu Wen, Zhenghao Feng, Yun Hou, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang

    Abstract: Semi-supervised learning is a sound measure to relieve the strict demand of abundant annotated datasets, especially for challenging multi-organ segmentation. However, most existing SSL methods predict pixels in a single image independently, ignoring the relations among images and categories. In this paper, we propose a two-stage Dual Contrastive Learning Network for semi-supervised MoS, which uti…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Published at ICASSP 2024

  39. arXiv:2403.02573  [pdf, other]

    cs.LG

    Learning-augmented Online Minimization of Age of Information and Transmission Costs

    Authors: Zhongdong Liu, Keyuan Zhang, Bin Li, Yin Sun, Y. Thomas Hou, Bo Ji

    Abstract: We consider a discrete-time system where a resource-constrained source (e.g., a small sensor) transmits its time-sensitive data to a destination over a time-varying wireless channel. Each transmission incurs a fixed transmission cost (e.g., energy cost), and no transmission results in a staleness cost represented by the Age-of-Information. The source must balance the tradeoff between transmission…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: A preliminary version of this work is to be presented at IEEE INFOCOM 2024 Age and Semantics of Information Workshop

  40. arXiv:2403.00880  [pdf, other]

    cs.IR cs.AI

    Dual-Granularity Medication Recommendation Based on Causal Inference

    Authors: Shunpan Liang, Xiang Li, Xiang Li, Chen Li, Yu Lei, Yulei Hou, Tengfei Ma

    Abstract: As medical demands grow and machine learning technology advances, AI-based diagnostic and treatment systems are garnering increasing attention. Medication recommendation aims to integrate patients' long-term health records with medical knowledge, recommending accuracy and safe medication combinations for specific conditions. However, most existing researches treat medication recommendation systems…

    Submitted 1 March, 2024; originally announced March 2024.

  41. arXiv:2402.14464  [pdf, other]

    cs.CV

    NeRF-Det++: Incorporating Semantic Cues and Perspective-aware Depth Supervision for Indoor Multi-View 3D Detection

    Authors: Chenxi Huang, Yuenan Hou, Weicai Ye, Di Huang, Xiaoshui Huang, Binbin Lin, Deng Cai, Wanli Ouyang

    Abstract: NeRF-Det has achieved impressive performance in indoor multi-view 3D detection by innovatively utilizing NeRF to enhance representation learning. Despite its notable performance, we uncover three decisive shortcomings in its current design, including semantic ambiguity, inappropriate sampling, and insufficient utilization of depth supervision. To combat the aforementioned problems, we present thre…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 7 pages, 2 figures

  42. arXiv:2402.10097  [pdf, other]

    cs.LG cs.NI

    Adaptive Federated Learning in Heterogeneous Wireless Networks with Independent Sampling

    Authors: Jiaxiang Geng, Yanzhao Hou, Xiaofeng Tao, Juncheng Wang, Bing Luo

    Abstract: Federated Learning (FL) algorithms commonly sample a random subset of clients to address the straggler issue and improve communication efficiency. While recent works have proposed various client sampling methods, they have limitations in joint system and data heterogeneity design, which may not align with practical heterogeneous wireless networks. In this work, we advocate a new independent client…

    Submitted 13 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 6 pages, 5 figures, accepted for publication in IEEE International Conference on Communications (ICC)

  43. arXiv:2402.08785  [pdf, other]

    cs.CL

    InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment

    Authors: Jianing Wang, Junda Wu, Yupeng Hou, Yao Liu, Ming Gao, Julian McAuley

    Abstract: Do current large language models (LLMs) better solve graph reasoning and generation tasks with parameter updates? In this paper, we propose InstructGraph, a framework that empowers LLMs with the abilities of graph reasoning and generation by instruction tuning and preference alignment. Specifically, we first propose a structured format verbalizer to unify all graph data into a universal code-like…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 19 pages

  44. arXiv:2402.08221  [pdf, other]

    cs.RO cs.CV

    MetaTra: Meta-Learning for Generalized Trajectory Prediction in Unseen Domain

    Authors: Xiaohe Li, Feilong Huang, Zide Fan, Fangli Mou, Yingyan Hou, Chen Qian, Lijie Wen

    Abstract: Trajectory prediction has garnered widespread attention in different fields, such as autonomous driving and robotic navigation. However, due to the significant variations in trajectory patterns across different scenarios, models trained in known environments often falter in unseen ones. To learn a generalized model that can directly handle unseen domains without requiring any model updating, we pr…

    Submitted 13 February, 2024; originally announced February 2024.

  45. arXiv:2402.03951  [pdf, other]

    cs.CV cs.AI

    Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping

    Authors: Qinliang Lin, Cheng Luo, Zenghao Niu, Xilin He, Weicheng Xie, Yuanbo Hou, Linlin Shen, Siyang Song

    Abstract: Adversarial examples generated by a surrogate model typically exhibit limited transferability to unknown target systems. To address this problem, many transferability enhancement approaches (e.g., input transformation and model augmentation) have been proposed. However, they perform poorly when attacking systems whose model genera differ from that of the surrogate model. In this paper, we propos…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: AAAI 2024

  46. arXiv:2402.03327  [pdf, other]

    cs.CV cs.AI cs.CL

    Uni3D-LLM: Unifying Point Cloud Perception, Generation and Editing with Large Language Models

    Authors: Dingning Liu, Xiaoshui Huang, Yuenan Hou, Zhihui Wang, Zhenfei Yin, Yongshun Gong, Peng Gao, Wanli Ouyang

    Abstract: In this paper, we introduce Uni3D-LLM, a unified framework that leverages a Large Language Model (LLM) to integrate tasks of 3D perception, generation, and editing within point cloud scenes. This framework empowers users to effortlessly generate and modify objects at specified locations within a scene, guided by the versatility of natural language descriptions. Uni3D-LLM harnesses the expressive p…

    Submitted 9 January, 2024; originally announced February 2024.

    Comments: 10 pages, 6 figures

  47. arXiv:2402.01375  [pdf, other]

    cs.CL

    Dive into the Chasm: Probing the Gap between In- and Cross-Topic Generalization

    Authors: Andreas Waldis, Yufang Hou, Iryna Gurevych

    Abstract: Pre-trained language models (LMs) perform well in In-Topic setups, where training and testing data come from the same topics. However, they face challenges in Cross-Topic scenarios where testing data is derived from distinct topics -- such as Gun Control. This study analyzes various LMs with three probing-based experiments to shed light on the reasons behind the In- vs. Cross-Topic generalization…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: EACL 2024

  48. arXiv:2402.01166  [pdf, other]

    cs.CV cs.AI

    A Comprehensive Survey on 3D Content Generation

    Authors: Jian Liu, Xiaoshui Huang, Tianyu Huang, Lu Chen, Yuenan Hou, Shixiang Tang, Ziwei Liu, Wanli Ouyang, Wangmeng Zuo, Junjun Jiang, Xianming Liu

    Abstract: Recent years have witnessed remarkable advances in artificial intelligence generated content (AIGC), with diverse input modalities, e.g., text, image, video, audio and 3D. 3D is the visual modality closest to the real-world 3D environment and carries enormous knowledge. 3D content generation has both academic and practical value while also presenting formidable technical challenges. This…

    Submitted 19 March, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: under review

  49. arXiv:2401.17167  [pdf, other]

    cs.CL

    Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios

    Authors: Shijue Huang, Wanjun Zhong, Jianqiao Lu, Qi Zhu, Jiahui Gao, Weiwen Liu, Yutai Hou, Xingshan Zeng, Yasheng Wang, Lifeng Shang, Xin Jiang, Ruifeng Xu, Qun Liu

    Abstract: The recent trend of using Large Language Models (LLMs) as tool agents in real-world applications underscores the necessity for comprehensive evaluations of their capabilities, particularly in complex scenarios involving planning, creating, and using tools. However, existing benchmarks typically focus on simple synthesized queries that do not reflect real-world complexity, thereby offering limited…

    Submitted 3 June, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: Accepted by ACL2024 Findings

  50. C2Ideas: Supporting Creative Interior Color Design Ideation with Large Language Model

    Authors: Yihan Hou, Manling Yang, Hao Cui, Lei Wang, Jie Xu, Wei Zeng

    Abstract: Interior color design is a creative process that endeavors to allocate colors to furniture and other elements within an interior space. While much research focuses on generating realistic interior designs, these automated approaches often misalign with user intention and disregard design rationales. Informed by a need-finding preliminary study, we develop C2Ideas, an innovative system for designer…

    Submitted 27 January, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: 26 pages, 11 figures