Showing 51–100 of 1,252 results for author: Liang, J

  1. arXiv:2405.05955  [pdf, other]

    cs.CL

    Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning

    Authors: Junzhi Chen, Juhao Liang, Benyou Wang

    Abstract: The emergence of large language models (LLMs) has opened up unprecedented possibilities for automating complex tasks that are often comparable to human performance. Despite their capabilities, LLMs still encounter difficulties in completing tasks that require high levels of accuracy and complexity due to their inherent limitations in handling multifaceted problems single-handedly. This paper intro… ▽ More

    Submitted 23 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2405.05757  [pdf, other]

    cs.ET eess.SY

    Design and Implementation of Energy-Efficient Wireless Tire Sensing System with Delay Analysis for Intelligent Vehicles

    Authors: Shashank Mishra, Jia-Ming Liang

    Abstract: The growing prevalence of Internet of Things (IoT) technologies has led to a rise in the popularity of intelligent vehicles that incorporate a range of sensors to monitor various aspects, such as driving speed, fuel usage, distance proximity and tire anomalies. Nowadays, real-time tire sensing systems play important roles for intelligent vehicles in increasing mileage, reducing fuel consumption, i… ▽ More

    Submitted 27 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2405.04859  [pdf, other]

    cs.RO

    Guarding Force: Safety-Critical Compliant Control for Robot-Environment Interaction

    Authors: Xinming Wang, Jun Yang, Jianliang Mao, Jinzhuo Liang, Shihua Li, Yunda Yan

    Abstract: In this study, we propose a safety-critical compliant control strategy designed to strictly enforce interaction force constraints during the physical interaction of robots with unknown environments. The interaction force constraint is interpreted as a new force-constrained control barrier function (FC-CBF) by exploiting the generalized contact model and the prior information of the environment, i.… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

  4. arXiv:2405.04855  [pdf, other]

    hep-ph

    Revisiting general dark matter-bound-electron interactions

    Authors: Jin-Han Liang, Yi Liao, Xiao-Dong Ma, Hao-Lin Wang

    Abstract: In this letter we revisit general dark matter (DM)-bound-electron interactions studied previously in the influential work of [Catena et al., Phys. Rev. Res. 2, 033195 (2020)]. We derive the DM-electron response functions and find a crucial minus sign was missed for the second atomic response function $W_2$ defined in that work. The minus sign has significant phenomenological consequences when expl… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 6+2 pages, 4 figures

  5. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference… ▽ More

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  6. arXiv:2405.03988  [pdf, other]

    cs.IR cs.AI

    Knowledge Adaptation from Large Language Model to Recommendation for Practical Industrial Application

    Authors: Jian Jia, Yipei Wang, Yan Li, Honggang Chen, Xuehan Bai, Zhaocheng Liu, Jian Liang, Quan Chen, Han Li, Peng Jiang, Kun Gai

    Abstract: Contemporary recommender systems predominantly rely on collaborative filtering techniques, employing ID-embedding to capture latent associations among users and items. However, this approach overlooks the wealth of semantic information embedded within textual descriptions of items, leading to suboptimal performance in cold-start scenarios and long-tail user recommendations. Leveraging the capabili… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 11 pages, 6 figures

  7. arXiv:2404.17186  [pdf, other]

    cs.CV cs.AI cs.LG

    MCSDNet: Mesoscale Convective System Detection Network via Multi-scale Spatiotemporal Information

    Authors: Jiajun Liang, Baoquan Zhang, Yunming Ye, Xutao Li, Chuyao Luo, Xukai Fu

    Abstract: The accurate detection of Mesoscale Convective Systems (MCS) is crucial for meteorological monitoring due to their potential to cause significant destruction through severe weather phenomena such as hail, thunderstorms, and heavy rainfall. However, the existing methods for MCS detection mostly target single-frame detection, which considers only the static characteristics and ignores the tempor… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

  8. arXiv:2404.16348  [pdf, other]

    cs.CV

    Dual Expert Distillation Network for Generalized Zero-Shot Learning

    Authors: Zhijie Rao, Jingcai Guo, Xiaocheng Lu, Jingming Liang, Jie Zhang, Haozhao Wang, Kang Wei, Xiaofeng Cao

    Abstract: Zero-shot learning has consistently yielded remarkable progress via modeling nuanced one-to-one visual-attribute correlation. Existing studies resort to refining a uniform mapping function to align and correlate the sample regions and subattributes, ignoring two crucial issues: 1) the inherent asymmetry of attributes; and 2) the unutilized channel information. This paper addresses these issues by… ▽ More

    Submitted 29 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 9 pages, 4 figures; Accepted to IJCAI 2024

  9. arXiv:2404.16297  [pdf, other]

    cs.SE cs.AI

    When Fuzzing Meets LLMs: Challenges and Opportunities

    Authors: Yu Jiang, Jie Liang, Fuchen Ma, Yuanliang Chen, Chijin Zhou, Yuheng Shen, Zhiyong Wu, Jingzhou Fu, Mingzhe Wang, ShanShan Li, Quan Zhang

    Abstract: Fuzzing, a widely-used technique for bug detection, has seen advancements through Large Language Models (LLMs). Despite their potential, LLMs face specific challenges in fuzzing. In this paper, we identified five major challenges of LLM-assisted fuzzing. To support our findings, we revisited the most recent papers from top-tier conferences, confirming that these challenges are widespread. As a rem… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

  10. arXiv:2404.15846  [pdf, other]

    cs.CL

    From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large Language Models

    Authors: Qianyu He, Jie Zeng, Qianxi He, Jiaqing Liang, Yanghua Xiao

    Abstract: It is imperative for Large language models (LLMs) to follow instructions with elaborate requirements (i.e. Complex Instructions Following). Yet, it remains under-explored how to enhance the ability of LLMs to follow complex instructions with multiple constraints. To bridge the gap, we initially study what training data is effective in enhancing complex constraints following abilities. We found tha… ▽ More

    Submitted 18 June, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  11. arXiv:2404.15672  [pdf, other]

    cs.CV

    Representing Part-Whole Hierarchies in Foundation Models by Learning Localizability, Composability, and Decomposability from Anatomy via Self-Supervision

    Authors: Mohammad Reza Hosseinzadeh Taher, Michael B. Gotway, Jianming Liang

    Abstract: Humans effortlessly interpret images by parsing them into part-whole hierarchies; deep learning excels in learning multi-level feature spaces, but they often lack explicit coding of part-whole relations, a prominent property of medical imaging. To overcome this limitation, we introduce Adam-v2, a new self-supervised learning framework extending Adam [79] by explicitly incorporating part-whole hier… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR 2024 [main conference]

  12. arXiv:2404.14713  [pdf, other]

    eess.SY

    Enhancing High-Speed Cruising Performance of Autonomous Vehicles through Integrated Deep Reinforcement Learning Framework

    Authors: Jinhao Liang, Kaidi Yang, Chaopeng Tan, Jinxiang Wang, Guodong Yin

    Abstract: High-speed cruising scenarios with mixed traffic greatly challenge the road safety of autonomous vehicles (AVs). Unlike existing works that only look at fundamental modules in isolation, this work enhances AV safety in mixed-traffic high-speed cruising scenarios by proposing an integrated framework that synthesizes three fundamental modules, i.e., behavioral decision-making, path-planning, and mot… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  13. arXiv:2404.14225  [pdf]

    gr-qc

    GravitoMagneto-Hydrodynamics and Spacetime Turbulence in Early Universe

    Authors: Jiaxiang Liang, Minghui Du, Peng Xu

    Abstract: Based on the gravitoelectromagnetic formalism and inspired by the rich analogies between electrodynamics and general relativity, we try one step further along this line and suggest a new counterpart in the gravitoelectromagnetic world analogue to the electromagnetic physics. A counterpart model of the MagnetoHydroDynamics that could help us to understand the possible new physics in tightly bounded… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  14. arXiv:2404.12753  [pdf, other]

    cs.CL cs.AI

    AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation

    Authors: Wenhao Huang, Chenghao Peng, Zhixu Li, Jiaqing Liang, Yanghua Xiao, Liqian Wen, Zulong Chen

    Abstract: Web automation is a significant technique that accomplishes complicated web tasks by automating common web actions, enhancing operational efficiency, and reducing the need for manual intervention. Traditional methods, such as wrappers, suffer from limited adaptability and scalability when faced with a new website. On the other hand, generative agents empowered by large language models (LLMs) exhib… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 18 pages, 5 figures

  15. arXiv:2404.12322  [pdf, other]

    cs.CV cs.AI

    Generalizable Face Landmarking Guided by Conditional Face Warping

    Authors: Jiayi Liang, Haotian Liu, Hongteng Xu, Dixin Luo

    Abstract: As a significant step for human face modeling, editing, and generation, face landmarking aims at extracting facial keypoints from images. A generalizable face landmarker is required in practice because real-world facial images, e.g., the avatars in animations and games, are often stylized in various ways. However, achieving generalizable face landmarking is challenging due to the diversity of faci… ▽ More

    Submitted 21 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: Accepted in CVPR 2024

  16. arXiv:2404.12138  [pdf, other]

    cs.AI

    Character is Destiny: Can Large Language Models Simulate Persona-Driven Decisions in Role-Playing?

    Authors: Rui Xu, Xintao Wang, Jiangjie Chen, Siyu Yuan, Xinfeng Yuan, Jiaqing Liang, Zulong Chen, Xiaoqing Dong, Yanghua Xiao

    Abstract: Can Large Language Models substitute humans in making important decisions? Recent research has unveiled the potential of LLMs to role-play assigned personas, mimicking their knowledge and linguistic habits. However, imitative decision-making requires a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we investi… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

  17. arXiv:2404.10315  [pdf, other]

    cs.CL

    Enhancing Confidence Expression in Large Language Models Through Learning from Past Experience

    Authors: Haixia Han, Tingyun Li, Shisong Chen, Jie Shi, Chengyu Du, Yanghua Xiao, Jiaqing Liang, Xin Lin

    Abstract: Large Language Models (LLMs) have exhibited remarkable performance across various downstream tasks, but they may generate inaccurate or false information with a confident tone. One of the possible solutions is to empower the LLM confidence expression capability, in which the confidence expressed can be well-aligned with the true probability of the generated answer being correct. However, leveragin… ▽ More

    Submitted 16 April, 2024; originally announced April 2024.

  18. arXiv:2404.09593  [pdf, other]

    cs.CL

    Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction

    Authors: Zepeng Ding, Wenhao Huang, Jiaqing Liang, Deqing Yang, Yanghua Xiao

    Abstract: Relation triple extraction, which outputs a set of triples from long sentences, plays a vital role in knowledge acquisition. Large language models can accurately extract triples from simple sentences through few-shot learning or fine-tuning when given appropriate instructions. However, they often miss out when extracting from complex sentences. In this paper, we design an evaluation-filtering fram… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted at LREC-COLING 2024 main conference

  19. arXiv:2404.09145  [pdf, other]

    cs.CL cs.AI

    ToNER: Type-oriented Named Entity Recognition with Generative Language Model

    Authors: Guochao Jiang, Ziqin Luo, Yuchen Shi, Dixuan Wang, Jiaqing Liang, Deqing Yang

    Abstract: In recent years, the fine-tuned generative models have been proven more powerful than the previous tagging-based or span-based models on named entity recognition (NER) task. It has also been found that the information related to entities, such as entity types, can prompt a model to achieve NER better. However, it is not easy to determine the entity types indeed existing in the given sentence in ad… ▽ More

    Submitted 11 June, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: Accepted by LREC-COLING 2024

  20. arXiv:2404.09120  [pdf]

    cond-mat.mes-hall

    Interfacial reaction boosts thermal conductance of room-temperature integrated semiconductor interfaces stable up to 1100 C

    Authors: Zhe Cheng, Xiaoyang Ji, Zifeng Huang, Yutaka Ohno, Koji Inoue, Yasusyohi Nagai, Yoshiki Sakaida, Hiroki Uratani, Naoteru Shigekawa, Jianbo Liang

    Abstract: Overheating has emerged as a primary challenge constraining the reliability and performance of next-generation high-performance electronics, such as chiplets and (ultra)wide bandgap electronics. Advanced heterogeneous integration not only constitutes a pivotal technique for fabricating these electronics but also offers potential solutions for thermal management. This study presents the integration… ▽ More

    Submitted 13 April, 2024; originally announced April 2024.

  21. arXiv:2404.08707  [pdf, other]

    cs.LG cs.AI cs.CL

    Large Language Model Can Continue Evolving From Mistakes

    Authors: Haokun Zhao, Haixia Han, Jie Shi, Chengyu Du, Jiaqing Liang, Yanghua Xiao

    Abstract: As world knowledge evolves and new task paradigms emerge, Continual Learning (CL) is crucial for keeping Large Language Models (LLMs) up-to-date and addressing their shortcomings. In practical applications, LLMs often require both continual instruction tuning (CIT) and continual pre-training (CPT) to adapt to new task paradigms and acquire necessary knowledge for task-solving. However, it remains… ▽ More

    Submitted 17 June, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  22. arXiv:2404.06970  [pdf, other]

    cs.CL

    Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning

    Authors: Peipei Liu, Gaosheng Wang, Ying Tong, Jian Liang, Zhenquan Ding, Hongsong Zhu

    Abstract: Few-shot named entity recognition can identify new types of named entities based on a few labeled examples. Previous methods employing token-level or span-level metric learning suffer from the computational burden and a large number of negative sample spans. In this paper, we propose the Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning (MsFNER), which splits the… ▽ More

    Submitted 10 April, 2024; originally announced April 2024.

  23. arXiv:2404.05609  [pdf, other]

    math.OC eess.SY

    Feedback Stability Under Mixed Gain and Phase Uncertainty

    Authors: Jiajin Liang, Di Zhao, Li Qiu

    Abstract: In this study, we investigate the robust feedback stability problem for multiple-input-multiple-output linear time-invariant systems involving sectored-disk uncertainty, namely, dynamic uncertainty subject to simultaneous gain and phase constraints. This problem is thereby called a sectored-disk problem. Employing a frequency-wise analysis approach, we derive a fundamental static matrix problem th… ▽ More

    Submitted 8 April, 2024; originally announced April 2024.

  24. arXiv:2404.04293  [pdf, other]

    cs.CL cs.AI

    Reason from Fallacy: Enhancing Large Language Models' Logical Reasoning through Logical Fallacy Understanding

    Authors: Yanda Li, Dixuan Wang, Jiaqing Liang, Guochao Jiang, Qianyu He, Yanghua Xiao, Deqing Yang

    Abstract: Large Language Models (LLMs) have demonstrated good performance in many reasoning tasks, but they still struggle with some complicated reasoning tasks including logical reasoning. One non-negligible reason for LLMs' suboptimal performance on logical reasoning is their overlooking of understanding logical fallacies correctly. To evaluate LLMs' capability of logical fallacy understanding (LFU), we p… ▽ More

    Submitted 4 April, 2024; originally announced April 2024.

  25. arXiv:2404.03845  [pdf]

    cs.CR cs.HC

    Buck You: Designing Easy-to-Onboard Blockchain Applications with Zero-Knowledge Login and Sponsored Transactions on Sui

    Authors: Eason Chen, Zimo Xiao, Justa Liang, Damien Chen, Pierce Hung, Kostas Kryptos Chalkias

    Abstract: In this paper, we developed a blockchain application to demonstrate the functionality of Sui's recent innovations: Zero Knowledge Login and Sponsored Transactions. Zero Knowledge Login allows users to create and access their blockchain wallets just with their OAuth accounts (e.g., Google, Facebook, Twitch), while Sponsored Transactions eliminate the need for users to prepare transaction fees, as t… ▽ More

    Submitted 4 April, 2024; originally announced April 2024.

  26. arXiv:2404.02885  [pdf, other]

    cs.CV

    PoCo: Point Context Cluster for RGBD Indoor Place Recognition

    Authors: Jing Liang, Zhuo Deng, Zheming Zhou, Omid Ghasemalizadeh, Dinesh Manocha, Min Sun, Cheng-Hao Kuo, Arnie Sen

    Abstract: We present a novel end-to-end algorithm (PoCo) for the indoor RGB-D place recognition task, aimed at identifying the most likely match for a given query frame within a reference database. The task presents inherent challenges attributed to the constrained field of view and limited range of perception sensors. We propose a new network architecture, which generalizes the recent Context of Clusters (… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

  27. arXiv:2404.02663  [pdf]

    eess.SP cs.IT

    Ground-to-UAV sub-Terahertz channel measurement and modeling

    Authors: Da Li, Peian Li, Jiabiao Zhao, Jianjian Liang, Jiacheng Liu, Guohao Liu, Yuanshuai Lei, Wenbo Liu, Jianqin Deng, Fuyong Liu, Jianjun Ma

    Abstract: Unmanned Aerial Vehicle (UAV) assisted terahertz (THz) wireless communications have been expected to play a vital role in the next generation of wireless networks. UAVs can serve as either repeaters or data collectors within the communication link, thereby potentially augmenting the efficacy of communication systems. Despite their promise, the channel analysis and modeling specific to THz wireless… ▽ More

    Submitted 28 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: Submitted to Optics Express

  28. arXiv:2404.02239  [pdf, ps, other]

    math.OC cs.LG

    Proximal Oracles for Optimization and Sampling

    Authors: Jiaming Liang, Yongxin Chen

    Abstract: We consider convex optimization with non-smooth objective function and log-concave sampling with non-smooth potential (negative log density). In particular, we study two specific settings where the convex objective/potential function is either semi-smooth or in composite form as the finite sum of semi-smooth components. To overcome the challenges caused by non-smoothness, our algorithms employ two… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 25 pages. arXiv admin note: text overlap with arXiv:2202.13975

  29. arXiv:2404.01564  [pdf, other]

    hep-lat hep-ex hep-ph

    The radiative decay of scalar glueball from lattice QCD

    Authors: Jintao Zou, Long-Cheng Gui, Ying Chen, Jian Liang, Xiangyu Jiang, Wen Qin

    Abstract: We perform the first lattice QCD study on the radiative decay of the scalar glueball to the vector meson $φ$ in the quenched approximation. The calculations are carried out on three gauge ensembles with different lattice spacings, which enable us to do the continuum extrapolation. We first revisit the radiative $J/ψ$ decay into the scalar glueball $G$ and obtain the partial decay width… ▽ More

    Submitted 4 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 12 pages, 10 figures

  30. arXiv:2404.00210  [pdf, other]

    cs.RO

    Socially Aware Robot Navigation through Scoring Using Vision-Language Models

    Authors: Daeun Song, Jing Liang, Amirreza Payandeh, Xuesu Xiao, Dinesh Manocha

    Abstract: We propose VLM-Social-Nav, a novel Vision-Language Model (VLM) based navigation approach to compute a robot's trajectory in human-centered environments. Our goal is to make real-time decisions on robot actions that are socially compliant with human expectations. We utilize a perception model to detect important social entities and prompt a VLM to generate guidance for socially compliant robot beha… ▽ More

    Submitted 29 March, 2024; originally announced April 2024.

  31. arXiv:2403.20254  [pdf, other]

    cs.CV

    Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions

    Authors: Runhao Zeng, Xiaoyong Chen, Jiaming Liang, Huisi Wu, Guangzhong Cao, Yong Guo

    Abstract: Temporal action detection (TAD) aims to locate action positions and recognize action categories in long-term untrimmed videos. Although many methods have achieved promising results, their robustness has not been thoroughly studied. In practice, we observe that temporal information in videos can be occasionally corrupted, such as missing or blurred frames. Interestingly, existing methods often incu… ▽ More

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  32. arXiv:2403.18638  [pdf, other]

    eess.AS

    Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection

    Authors: Jinhua Liang, Ines Nolasco, Burooj Ghani, Huy Phan, Emmanouil Benetos, Dan Stowell

    Abstract: Detecting the presence of animal vocalisations in nature is essential to study animal populations and their behaviors. A recent development in the field is the introduction of the task known as few-shot bioacoustic sound event detection, which aims to train a versatile animal sound detector using only a small set of audio samples. Previous efforts in this area have utilized different architectures… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

  33. arXiv:2403.18105  [pdf, other]

    cs.CL cs.AI

    Large Language Models for Education: A Survey and Outlook

    Authors: Shen Wang, Tianlong Xu, Hang Li, Chaoli Zhang, Joleen Liang, Jiliang Tang, Philip S. Yu, Qingsong Wen

    Abstract: The advent of Large Language Models (LLMs) has brought in a new era of possibilities in the realm of education. This survey paper summarizes the various technologies of LLMs in educational settings from multifaceted perspectives, encompassing student and teacher assistance, adaptive learning, and commercial tools. We systematically review the technological advancements in each perspective, organiz… ▽ More

    Submitted 1 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  34. arXiv:2403.16536  [pdf, other]

    cs.CV

    VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting

    Authors: Yujin Tang, Peijie Dong, Zhenheng Tang, Xiaowen Chu, Junwei Liang

    Abstract: Combining CNNs or ViTs, with RNNs for spatiotemporal forecasting, has yielded unparalleled results in predicting temporal and spatial dynamics. However, modeling extensive global information remains a formidable challenge; CNNs are limited by their narrow receptive fields, and ViTs struggle with the intensive computational demands of their attention mechanisms. The emergence of recent Mamba-based… ▽ More

    Submitted 29 June, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: CVPR2024 Precognition Workshop

  35. arXiv:2403.16428  [pdf, other]

    cs.CV

    Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

    Authors: Zicong Fan, Takehiko Ohkawa, Linlin Yang, Nie Lin, Zhishan Zhou, Shihao Zhou, Jiajun Liang, Zhong Gao, Xuanyang Zhang, Xue Zhang, Fei Li, Liu Zheng, Feng Lu, Karim Abou Zeid, Bastian Leibe, Jeongwan On, Seungryul Baek, Aditya Prakash, Saurabh Gupta, Kun He, Yoichi Sato, Otmar Hilliges, Hyung Jin Chang, Angela Yao

    Abstract: We interact with the world with our hands and see it through our own (egocentric) perspective. A holistic 3D understanding of such interactions from egocentric views is important for tasks in robotics, AR/VR, action recognition and motion generation. Accurately reconstructing such interactions in 3D is challenging due to heavy occlusion, viewpoint bias, camera distortion, and motion blur from the… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

  36. arXiv:2403.16396  [pdf, other]

    cs.CL

    Is There a One-Model-Fits-All Approach to Information Extraction? Revisiting Task Definition Biases

    Authors: Wenhao Huang, Qianyu He, Zhixu Li, Jiaqing Liang, Yanghua Xiao

    Abstract: Definition bias is a negative phenomenon that can mislead models. Definition bias in information extraction appears not only across datasets from different domains but also within datasets sharing the same domain. We identify two types of definition bias in IE: bias among information extraction datasets and bias between information extraction datasets and instruction tuning datasets. To systematic… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: 15 pages, 4 figures

  37. arXiv:2403.16257  [pdf, other]

    cs.CV

    Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning

    Authors: Siyuan Liang, Kuanrong Liu, Jiajun Gong, Jiawei Liang, Yuan Xun, Ee-Chien Chang, Xiaochun Cao

    Abstract: Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features using the complementary strengths of various data modalities. However, the open nature of such systems inadvertently increases the possibility of backdoor attacks. These attacks subtly embed malicious behaviors within the model during training, which can be activated by specific triggers in the in… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: 6 pages, 2 figures

  38. arXiv:2403.16242  [pdf, other]

    cs.CV

    Adversarially Masked Video Consistency for Unsupervised Domain Adaptation

    Authors: Xiaoyu Zhu, Junwei Liang, Po-Yao Huang, Alex Hauptmann

    Abstract: We study the problem of unsupervised domain adaptation for egocentric videos. We propose a transformer-based model to learn class-discriminative and domain-invariant feature representations. It consists of two novel designs. The first module is called Generative Adversarial Domain Alignment Network with the aim of learning domain-invariant representations. It simultaneously learns a mask generator… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

  39. arXiv:2403.16225  [pdf, other]

    eess.SY

    Bi-Level Control of Weaving Sections in Mixed Traffic Environments with Connected and Automated Vehicles

    Authors: Longhao Yan, Jinhao Liang, Kaidi Yang

    Abstract: Connected and automated vehicles (CAVs) can be beneficial for improving the operation of highway bottlenecks such as weaving sections. This paper proposes a bi-level control approach based on an upper-level deep reinforcement learning controller and a lower-level model predictive controller to coordinate the lane-changings of a mixed fleet of CAVs and human-driven vehicles (HVs) in weaving section… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: 12 pages, 8 figures

  40. arXiv:2403.16195  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Ideal spin-polarized Weyl-half-semimetal with a single pair of Weyl points in half-Heusler compounds XCrTe (X=K, Rb)

    Authors: Hongshuang Liu, Jin Cao, Zeying Zhang, Jiashuo Liang, Liying Wang, Shengyuan A. Yang

    Abstract: Realizing ideal Weyl semimetal state with a single pair of Weyl points has been a long-sought goal in the field of topological semimetals. Here, we reveal such a state in the Cr-based half-Heusler compounds XCrTe (X=K, Rb). We show that these materials have a half metal ground state, with Fermi level crossing only one spin channel. Importantly, the Fermi surface is clean, consisting of the minimal… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

  41. arXiv:2403.14791  [pdf, other]

    cs.CY cs.AI

    Particip-AI: A Democratic Surveying Framework for Anticipating Future AI Use Cases, Harms and Benefits

    Authors: Jimin Mun, Liwei Jiang, Jenny Liang, Inyoung Cheong, Nicole DeCario, Yejin Choi, Tadayoshi Kohno, Maarten Sap

    Abstract: General purpose AI, such as ChatGPT, seems to have lowered the barriers for the public to use AI and harness its power. However, the governance and development of AI still remain in the hands of a few, and the pace of development is accelerating without proper assessment of risks. As a first step towards democratic governance and risk assessment of AI, we introduce Particip-AI, a framework to gath… ▽ More

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 35 pages, 4 figures, 23 tables

  42. arXiv:2403.14749  [pdf, other]

    astro-ph.GA astro-ph.CO

    Connection between galaxy morphology and dark-matter halo structure I: a running threshold for thin discs and size predictors from the dark sector

    Authors: Jinning Liang, Fangzhou Jiang, Houjun Mo, Andrew Benson, Avishai Dekel, Noa Tavron, Philip F. Hopkins, Luis C. Ho

    Abstract: We present a series of studies on the connection between galaxy morphology and the structure of host dark-matter (DM) haloes using cosmological simulations. In this work, we introduce a new kinematic decomposition scheme that features physical identification of morphological components, enabling robust separation of thin and thick discs; and measure a wide range of halo properties, including their… ▽ More

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 20 pages, 17 figures, submitted to MNRAS

  43. arXiv:2403.14689  [pdf, other]

    cs.CY cs.AI cs.LG

    Developing and Deploying Industry Standards for Artificial Intelligence in Education (AIED): Challenges, Strategies, and Future Directions

    Authors: Richard Tong, Haoyang Li, Joleen Liang, Qingsong Wen

    Abstract: The adoption of Artificial Intelligence in Education (AIED) holds the promise of revolutionizing educational practices by offering personalized learning experiences, automating administrative and pedagogical tasks, and reducing the cost of content creation. However, the lack of standardized practices in the development and deployment of AIED solutions has led to fragmented ecosystems, which presen…

    Submitted 25 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 12 pages

  44. arXiv:2403.14376  [pdf, other]

    cs.CV

    InfNeRF: Towards Infinite Scale NeRF Rendering with O(log n) Space Complexity

    Authors: Jiabin Liang, Lanqing Zhang, Zhuoran Zhao, Xiangyu Xu

    Abstract: The conventional mesh-based Level of Detail (LoD) technique, exemplified by applications such as Google Earth and many game engines, exhibits the capability to holistically represent a large scene, even the Earth, and achieves rendering with a space complexity of O(log n). This constrained data requirement not only enhances rendering efficiency but also facilitates dynamic data fetching, thereby en…

    Submitted 21 March, 2024; originally announced March 2024.

  45. arXiv:2403.14301  [pdf, other]

    physics.optics physics.app-ph

    Picotesla-sensitivity microcavity optomechanical magnetometry

    Authors: Zhi-Gang Hu, Yi-Meng Gao, Jian-Fei Liu, Hao Yang, Min Wang, Yuechen Lei, Xin Zhou, Jincheng Li, Xuening Cao, Jinjing Liang, Chao-Qun Hu, Zhilin Li, Yong-Chang Lau, Jian-Wang Cai, Bei-Bei Li

    Abstract: Cavity optomechanical systems have enabled precision sensing of magnetic fields, by leveraging the optical resonance-enhanced readout and mechanical resonance-enhanced response. Previous studies have successfully achieved scalable and reproducible microcavity optomechanical magnetometry (MCOM) by incorporating Terfenol-D thin films into high-quality ($Q$) factor whispering gallery mode (WGM) micro…

    Submitted 21 March, 2024; originally announced March 2024.

  46. arXiv:2403.12284  [pdf, other]

    math.ST q-bio.QM stat.AP stat.ME

    The Wreaths of KHAN: Uniform Graph Feature Selection with False Discovery Rate Control

    Authors: Jiajun Liang, Yue Liu, Doudou Zhou, Sinian Zhang, Junwei Lu

    Abstract: Graphical models find numerous applications in biology, chemistry, sociology, neuroscience, etc. While substantial progress has been made in graph estimation, it remains largely unexplored how to select significant graph signals with uncertainty assessment, especially those graph features related to topological structures including cycles (i.e., wreaths), cliques, hubs, etc. These features play a…

    Submitted 18 March, 2024; originally announced March 2024.

  47. arXiv:2403.11650  [pdf, other]

    cs.CV

    Prioritized Semantic Learning for Zero-shot Instance Navigation

    Authors: Xander Sun, Louis Lau, Hoyard Zhi, Ronghe Qiu, Junwei Liang

    Abstract: We study zero-shot instance navigation, in which the agent navigates to a specific object without using object annotations for training. Previous object navigation approaches apply the image-goal navigation (ImageNav) task (go to the location of an image) for pretraining, and transfer the agent to achieve object goals using a vision-language model. However, these approaches lead to issues of seman…

    Submitted 18 March, 2024; originally announced March 2024.

  48. arXiv:2403.10854  [pdf, other]

    cs.CV

    A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment

    Authors: Tianhe Wu, Kede Ma, Jie Liang, Yujiu Yang, Lei Zhang

    Abstract: While Multimodal Large Language Models (MLLMs) have experienced significant advancement in visual understanding and reasoning, their potential to serve as powerful, flexible, interpretable, and text-driven models for Image Quality Assessment (IQA) remains largely unexplored. In this paper, we conduct a comprehensive and systematic study of prompting MLLMs for IQA. Specifically, we first investiga…

    Submitted 16 March, 2024; originally announced March 2024.

  49. arXiv:2403.10261  [pdf, other]

    cs.CV

    Learning Spatiotemporal Inconsistency via Thumbnail Layout for Face Deepfake Detection

    Authors: Yuting Xu, Jian Liang, Lijun Sheng, Xiao-Yu Zhang

    Abstract: The deepfake threats to society and cybersecurity have provoked significant public apprehension, driving intensified efforts within the realm of deepfake video detection. Current video-level methods are mostly based on 3D CNNs, resulting in high computational demands, although they have achieved good performance. This paper introduces an elegantly simple yet effective strategy named Thumbnail Layout (…

    Submitted 20 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted by IJCV

  50. arXiv:2403.10228  [pdf, other]

    cs.CV cs.AI cs.CL

    HawkEye: Training Video-Text LLMs for Grounding Text in Videos

    Authors: Yueqian Wang, Xiaojun Meng, Jianxin Liang, Yuxuan Wang, Qun Liu, Dongyan Zhao

    Abstract: Video-text Large Language Models (video-text LLMs) have shown remarkable performance in answering questions and holding conversations on simple videos. However, they perform almost the same as random on grounding text queries in long and complicated videos, having little ability to understand and reason about temporal information, which is the most fundamental difference between videos and images.…

    Submitted 15 March, 2024; originally announced March 2024.