Showing 1–50 of 171 results for author: Lv, Z

  1. arXiv:2407.02382  [pdf, other]

    cs.CV cs.LG cs.RO

    Light-SLAM: A Robust Deep-Learning Visual SLAM System Based on LightGlue under Challenging Lighting Conditions

    Authors: Zhiqi Zhao, Chang Wu, Xiaotong Kong, Zejie Lv, Xiaoqi Du, Qiyan Li

    Abstract: Simultaneous Localization and Mapping (SLAM) has become a critical technology for intelligent transportation systems and autonomous robots and is widely used in autonomous driving. However, traditional manual feature-based methods in challenging lighting environments make it difficult to ensure robustness and accuracy. Some deep learning-based methods show potential but still have significant draw… ▽ More

    Submitted 10 May, 2024; originally announced July 2024.

  2. arXiv:2406.16382  [pdf, other]

    cs.CL

    UNO Arena for Evaluating Sequential Decision-Making Capability of Large Language Models

    Authors: Zhanyue Qin, Haochuan Wang, Deyuan Liu, Ziyang Song, Cunhang Fan, Zhao Lv, Jinlin Wu, Zhen Lei, Zhiying Tu, Dianhui Chu, Xiaoyan Yu, Dianbo Sui

    Abstract: Sequential decision-making refers to algorithms that take into account the dynamics of the environment, where early decisions affect subsequent decisions. With large language models (LLMs) demonstrating powerful capabilities between tasks, we can't help but ask: Can Current LLMs Effectively Make Sequential Decisions? In order to answer this question, we propose the UNO Arena based on the card game… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  3. arXiv:2406.16330  [pdf, other]

    cs.CL cs.AI

    Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging

    Authors: Deyuan Liu, Zhanyue Qin, Hairu Wang, Zhao Yang, Zecheng Wang, Fangying Rong, Qingbin Liu, Yanchao Hao, Xi Chen, Cunhang Fan, Zhao Lv, Zhiying Tu, Dianhui Chu, Bo Li, Dianbo Sui

    Abstract: While large language models (LLMs) excel in many domains, their complexity and scale challenge deployment in resource-limited environments. Current compression techniques, such as parameter pruning, often fail to effectively utilize the knowledge from pruned parameters. To address these challenges, we propose Manifold-Based Knowledge Alignment and Layer Merging Compression (MKA), a novel approach… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  4. arXiv:2406.14283  [pdf, other]

    cs.AI

    Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

    Authors: Chaojie Wang, Yanchen Deng, Zhiyi Lv, Zeng Liang, Jujie He, Shuicheng Yan, An Bo

    Abstract: Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. However, the auto-regressive generation process makes LLMs prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. In this paper, by casting multi-step reasoning of LLMs as a heuristic search problem, we aim to alleviate the pathology by introducing… ▽ More

    Submitted 27 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  5. arXiv:2406.11816  [pdf, other]

    cs.CV

    VideoLLM-online: Online Video Large Language Model for Streaming Video

    Authors: Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou

    Abstract: Recent Large Language Models have been enhanced with vision capabilities, enabling them to comprehend images, videos, and interleaved vision-language content. However, the learning methods of these large multimodal models typically treat videos as predetermined clips, making them less effective and efficient at handling streaming video inputs. In this paper, we propose a novel Learning-In-Video-St… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: CVPR 2024. This arxiv version is upgraded with Llama-3

  6. arXiv:2406.11364  [pdf, other]

    cs.SD eess.AS

    AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection

    Authors: Anbai Jiang, Bing Han, Zhiqiang Lv, Yufeng Deng, Wei-Qiang Zhang, Xie Chen, Yanmin Qian, Jia Liu, Pingyi Fan

    Abstract: Large pre-trained models have demonstrated dominant performances in multiple areas, where the consistency between pre-training and fine-tuning is the key to success. However, few works reported satisfactory results of pre-trained models for the machine anomalous sound detection (ASD) task. This may be caused by the inconsistency of the pre-trained model and the inductive bias of machine audio, res… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  7. arXiv:2406.09664  [pdf, other]

    cs.SD eess.AS

    Frequency-mix Knowledge Distillation for Fake Speech Detection

    Authors: Cunhang Fan, Shunbo Dong, Jun Xue, Yujie Chen, Jiangyan Yi, Zhao Lv

    Abstract: In the telephony scenarios, the fake speech detection (FSD) task to combat speech spoofing attacks is challenging. Data augmentation (DA) methods are considered effective means to address the FSD task in telephony scenarios, typically divided into time domain and frequency domain stages. While each has its advantages, both can result in information loss. To tackle this issue, we propose a novel DA… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  8. arXiv:2406.08804  [pdf, other]

    cs.DC cs.AI cs.IR

    DIET: Customized Slimming for Incompatible Networks in Sequential Recommendation

    Authors: Kairui Fu, Shengyu Zhang, Zheqi Lv, Jingyuan Chen, Jiwei Li

    Abstract: Due to the continuously improving capabilities of mobile edges, recommender systems start to deploy models on edges to alleviate network congestion caused by frequent mobile requests. Several studies have leveraged the proximity of edge-side to real-time data, fine-tuning them to create edge-specific models. Despite their significant progress, these methods require substantial on-edge computationa… ▽ More

    Submitted 15 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024

  9. arXiv:2406.08580  [pdf, other]

    physics.chem-ph

    Anomalous Enhancement of the Electrocatalytic Hydrogen Evolution Reaction in AuPt Nanoclusters

    Authors: Jiahui Kang, Jan Kloppenburg, Jiali Sheng, Zhenyu Xu, Kristoffer Meinander, Hua Jiang, Zhong-Peng Lv, Esko I. Kauppinen, Qiang Zhang, Xi Chen, Olli Ikkala, Miguel A. Caro, Bo Peng

    Abstract: Energy- and resource-efficient electrocatalytic water splitting is of paramount importance to enable sustainable hydrogen production. The best bulk catalyst for the hydrogen evolution reaction (HER), i.e., platinum, is one of the scarcest elements on Earth. The use of raw material for HER can be dramatically reduced by utilizing nanoclusters. In addition, nanoalloying can further improve the perfo… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  10. arXiv:2406.01601  [pdf, other]

    cs.DC cs.AI cs.LG

    Backpropogation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration

    Authors: Wei Ji, Li Li, Zheqi Lv, Wenqiao Zhang, Mengze Li, Zhen Wan, Wenqiang Lei, Roger Zimmermann

    Abstract: In our increasingly interconnected world, where intelligent devices continually amass copious personalized multi-modal data, a pressing need arises to deliver high-quality, personalized device-aware services. However, this endeavor presents a multifaceted challenge to prevailing artificial intelligence (AI) systems primarily rooted in the cloud. As these systems grapple with shifting data distribu… ▽ More

    Submitted 21 May, 2024; originally announced June 2024.

  11. arXiv:2405.02194  [pdf, other]

    physics.atom-ph physics.optics

    Coherent XUV super continuum emission from atomic bound states

    Authors: Jing Zhao, Xiaowei Wang, Li Wang, Jiacan Wang, Yalei Zhu, Fan Xiao, Wenkai Tao, Zhigang Zheng, Haizhong Wu, Xu Sun, Yue Lang, Congsen Meng, Dongwen Zhang, Zhihui Lv, Jinlei Liu, Zengxiu Zhao

    Abstract: Coherent supercontinuum radiation in the extreme-ultraviolet (XUV) range is indispensable for synthesizing attosecond light pulses and for exploring transient atomic structures. Here, we report the striking observations of coherent XUV supercontinuum (XSC) extended from below to far above the ionization threshold, which exhibits completely different temporal and spatial properties comparing to the… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

  12. arXiv:2404.15007  [pdf, ps, other]

    cond-mat.mtrl-sci

    Single-Spin Waved-Brim Flat-Top Hat in the Band Edge of GdIH Monolayer

    Authors: Ningning Jia, Zhao Yang, Jiangtao Cai, Zhiheng Lv, Yongting Shi, Tielei Song, Xin Cui, Zhifeng Liu

    Abstract: Exotic electronic bands, such as flat bands, linear crossing bands, spontaneously valley- or spin-polarized bands, in two-dimensional materials have been the hot topics in condensed matter physics. Herein, we first propose a general dispersion model for possible hat-like electronic bands, and then identify an intriguing single-spin \emph{waved-brim flat-top hat} in the valence band edge of a stabl… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

  13. arXiv:2404.13558  [pdf, other]

    cs.CV

    LASER: Tuning-Free LLM-Driven Attention Control for Efficient Text-conditioned Image-to-Animation

    Authors: Haoyu Zheng, Wenqiao Zhang, Yaoke Wang, Hao Zhou, Jiang Liu, Juncheng Li, Zheqi Lv, Siliang Tang, Yueting Zhuang

    Abstract: Revolutionary advancements in text-to-image models have unlocked new dimensions for sophisticated content creation, e.g., text-conditioned image editing, allowing us to edit the diverse images that convey highly complex visual concepts according to the textual guidance. Despite being promising, existing methods focus on texture- or non-rigid-based visual manipulation, which struggles to produce th… ▽ More

    Submitted 23 April, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: 10 pages, 7 figures

  14. arXiv:2404.03572  [pdf, other]

    cs.CV cs.CG

    Terrain Point Cloud Inpainting via Signal Decomposition

    Authors: Yizhou Xie, Xiangning Xie, Yuran Wang, Yanci Zhang, Zejun Lv

    Abstract: The rapid development of 3D acquisition technology has made it possible to obtain point clouds of real-world terrains. However, due to limitations in sensor acquisition technology or specific requirements, point clouds often contain defects such as holes with missing data. Inpainting algorithms are widely used to patch these holes. However, existing traditional inpainting algorithms rely on precis… ▽ More

    Submitted 4 April, 2024; originally announced April 2024.

  15. arXiv:2404.01006  [pdf]

    physics.app-ph physics.chem-ph

    Transforming the Synthesis of Carbon Nanotubes with Machine Learning Models and Automation

    Authors: Yue Li, Shurui Wang, Zhou Lv, Zhaoji Wang, Yunbiao Zhao, Ying Xie, Yang Xu, Liu Qian, Yaodong Yang, Ziqiang Zhao, Jin Zhang

    Abstract: Carbon-based nanomaterials (CBNs) are showing significant potential in various fields, such as electronics, energy, and mechanics. However, their practical applications face synthesis challenges stemming from the complexities of structural control, large-area uniformity, and high yield. Current research methodologies fall short in addressing the multi-variable, coupled interactions inherent to CBN… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  16. arXiv:2403.18118  [pdf, other]

    cs.CV

    EgoLifter: Open-world 3D Segmentation for Egocentric Perception

    Authors: Qiao Gu, Zhaoyang Lv, Duncan Frost, Simon Green, Julian Straub, Chris Sweeney

    Abstract: In this paper we present EgoLifter, a novel system that can automatically segment scenes captured from egocentric sensors into a complete decomposition of individual 3D objects. The system is specifically designed for egocentric data where scenes contain hundreds of objects captured from natural (non-scanning) motion. EgoLifter adopts 3D Gaussians as the underlying representation of 3D scenes and… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Preprint. Project page: https://egolifter.github.io/

  17. arXiv:2403.13447  [pdf, other]

    cs.AI cs.CL cs.CV

    HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models

    Authors: Wenqiao Zhang, Tianwei Lin, Jiang Liu, Fangxun Shu, Haoyuan Li, Lei Zhang, He Wanggui, Hao Zhou, Zheqi Lv, Hao Jiang, Juncheng Li, Siliang Tang, Yueting Zhuang

    Abstract: Recent advancements indicate that scaling up Multimodal Large Language Models (MLLMs) effectively enhances performance on downstream multimodal tasks. The prevailing MLLM paradigm, \emph{e.g.}, LLaVA, transforms visual features into text-like tokens using a \emph{static} vision-language mapper, thereby enabling \emph{static} LLMs to develop the capability to comprehend visual information through v… ▽ More

    Submitted 20 March, 2024; originally announced March 2024.

  18. arXiv:2403.12549  [pdf, other]

    math.CO

    Treewidth of generalized Hamming graph, bipartite Kneser graph and generalized Petersen graph

    Authors: Yichen Wang, Mengyu Cao, Zequn Lv, Mei Lu

    Abstract: Let $t,q$ and $n$ be positive integers. Write $[q] = \{1,2,\ldots,q\}$. The generalized Hamming graph $H(t,q,n)$ is the graph whose vertex set is the cartesian product of $n$ copies of $[q]$$(q\ge 2)$, where two vertices are adjacent if their Hamming distance is at most $t$. In particular, $H(1,q,n)$ is the well-known Hamming graph and $H(1,2,n)$ is the hypercube. In 2006, Chandran and Kavitha des… ▽ More

    Submitted 19 March, 2024; originally announced March 2024.

  19. arXiv:2403.07030  [pdf, other]

    cs.LG cs.CV

    AuG-KD: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation

    Authors: Zihao Tang, Zheqi Lv, Shengyu Zhang, Yifan Zhou, Xinyu Duan, Fei Wu, Kun Kuang

    Abstract: Due to privacy or patent concerns, a growing number of large models are released without granting access to their training data, making transferring their knowledge inefficient and problematic. In response, Data-Free Knowledge Distillation (DFKD) methods have emerged as direct solutions. However, simply adopting models derived from DFKD for real-world applications suffers significant performance d… ▽ More

    Submitted 17 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: Accepted to ICLR 2024

  20. arXiv:2403.01852  [pdf, other]

    cs.CV

    PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis

    Authors: Zhengyao Lv, Yuxiang Wei, Wangmeng Zuo, Kwan-Yee K. Wong

    Abstract: Recent advancements in large-scale pre-trained text-to-image models have led to remarkable progress in semantic image synthesis. Nevertheless, synthesizing high-quality images with consistent semantics and layout remains a challenge. In this paper, we propose the adaPtive LAyout-semantiC fusion modulE (PLACE) that harnesses pre-trained models to alleviate the aforementioned issues. Specifically, w… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

  21. arXiv:2403.01813  [pdf, other]

    cs.CV

    A Simple Baseline for Efficient Hand Mesh Reconstruction

    Authors: Zhishan Zhou, Shihao Zhou, Zhi Lv, Minqiang Zou, Yao Tang, Jiajun Liang

    Abstract: 3D hand pose estimation has found broad application in areas such as gesture recognition and human-machine interaction tasks. As performance improves, the complexity of the systems also increases, which can limit the comparative analysis and practical implementation of these methods. In this paper, we propose a simple yet effective baseline that not only surpasses state-of-the-art (SOTA) methods b… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

  22. arXiv:2402.13349  [pdf, other]

    cs.CV cs.AI cs.HC

    Aria Everyday Activities Dataset

    Authors: Zhaoyang Lv, Nicholas Charron, Pierre Moulon, Alexander Gamino, Cheng Peng, Chris Sweeney, Edward Miller, Huixuan Tang, Jeff Meissner, Jing Dong, Kiran Somasundaram, Luis Pesqueira, Mark Schwesinger, Omkar Parkhi, Qiao Gu, Renzo De Nardi, Shangyi Cheng, Steve Saarinen, Vijay Baiyya, Yuyang Zou, Richard Newcombe, Jakob Julian Engel, Xiaqing Pan, Carl Ren

    Abstract: We present Aria Everyday Activities (AEA) Dataset, an egocentric multimodal open dataset recorded using Project Aria glasses. AEA contains 143 daily activity sequences recorded by multiple wearers in five geographically diverse indoor locations. Each of the recording contains multimodal sensor data recorded through the Project Aria glasses. In addition, AEA provides machine perception data includi… ▽ More

    Submitted 21 February, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Dataset website: https://www.projectaria.com/datasets/aea/

  23. arXiv:2402.12408  [pdf, other]

    cs.LG cs.AI cs.CL

    ModelGPT: Unleashing LLM's Capabilities for Tailored Model Generation

    Authors: Zihao Tang, Zheqi Lv, Shengyu Zhang, Fei Wu, Kun Kuang

    Abstract: The rapid advancement of Large Language Models (LLMs) has revolutionized various sectors by automating routine tasks, marking a step toward the realization of Artificial General Intelligence (AGI). However, they still struggle to accommodate the diverse and specific needs of users and simplify the utilization of AI models for the average user. In response, we propose ModelGPT, a novel framework de… ▽ More

    Submitted 18 February, 2024; originally announced February 2024.

  24. arXiv:2402.10294  [pdf, other]

    cs.HC cs.AI cs.CL cs.MM

    LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing

    Authors: Bryan Wang, Yuliang Li, Zhaoyang Lv, Haijun Xia, Yan Xu, Raj Sodhi

    Abstract: Video creation has become increasingly popular, yet the expertise and effort required for editing often pose barriers to beginners. In this paper, we explore the integration of large language models (LLMs) into the video editing workflow to reduce these barriers. Our design vision is embodied in LAVE, a novel system that provides LLM-powered agent assistance and language-augmented editing features… ▽ More

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: Paper accepted to the ACM Conference on Intelligent User Interfaces (ACM IUI) 2024

  25. Progressive Distillation Based on Masked Generation Feature Method for Knowledge Graph Completion

    Authors: Cunhang Fan, Yujie Chen, Jun Xue, Yonghui Kong, Jianhua Tao, Zhao Lv

    Abstract: In recent years, knowledge graph completion (KGC) models based on pre-trained language model (PLM) have shown promising results. However, the large number of parameters and high computational cost of PLM models pose challenges for their application in downstream tasks. This paper proposes a progressive distillation method based on masked generation features for KGC task, aiming to significantly re… ▽ More

    Submitted 10 June, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI2024

    Journal ref: (2024) Vol. 38 No. 8: AAAI-24 Technical Tracks 8, Proceedings of the AAAI Conference on Artificial Intelligence, 38(8), 8380-8388

  26. arXiv:2401.11712  [pdf, other]

    cs.NE

    A First Step Towards Runtime Analysis of Evolutionary Neural Architecture Search

    Authors: Zeqiong Lv, Chao Qian, Yanan Sun

    Abstract: Evolutionary neural architecture search (ENAS) employs evolutionary algorithms to find high-performing neural architectures automatically, and has achieved great success. However, compared to the empirical success, its rigorous theoretical analysis has yet to be touched. This work goes preliminary steps toward the mathematical runtime analysis of ENAS. In particular, we define a binary classificat… ▽ More

    Submitted 8 April, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

  27. arXiv:2401.06337   

    cs.NE

    An ontology alignment method with user intervention using compact differential evolution with adaptive parameter control

    Authors: Zhaoming Lv

    Abstract: User interaction is one of the most effective ways to improve the ontology alignment quality. However, this approach faces the challenge of how users can participate effectively in the matching process. To address this challenge, this paper proposes an interactive ontology alignment approach using a compact differential evolution algorithm with adaptive parameter control (IOACDE). In this metho… ▽ More

    Submitted 18 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: This paper needs to be revised

  28. arXiv:2312.16700  [pdf, other]

    astro-ph.GA astro-ph.CO

    Understanding the Universal Dust Attenuation Scaling Relation of Star-Forming Galaxies

    Authors: J. Qin, X. Z. Zheng, S. Wuyts, Z. Lv, M. Qiao, J. -S. Huang, F. S. Liu, A. Katsianis, V. Gonzalez, F. Bian, H. Xu, Z. Pan, W. Liu, Q. -H. Tan, F. X. An, D. D. Shi, Y. Zhang, R. Wen, S. Liu, C. Yang

    Abstract: Star-forming galaxies (SFGs) adhere to a surprisingly tight scaling relation of dust attenuation parameterized by the infrared excess (IRX=$L_{\rm IR}/L_{\rm UV}$), being jointly determined by the star formation rate (SFR), galaxy size ($R_{\rm e}$), metallicity ($Z$/Z$_\odot$) and axial ratio ($b/a$). We examine how these galaxy parameters determine the effective dust attenuation and give rise to… ▽ More

    Submitted 30 January, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

    Comments: 20 pages, 10 figures, published in MNRAS (2024, Volume 528, Issue 1, pp. 658-675); a Python package IRX_TAU_TOT is available at https://github.com/LvZF/irx_tau_tot/ to calculate the total dust optical depth of a galaxy with given metallicity and best-fitting geometry parameters

    Journal ref: MNRAS, 528, 658 (2024)

  29. arXiv:2312.12475  [pdf, other]

    cs.LG cs.AI

    Learning to Reweight for Graph Neural Network

    Authors: Zhengyu Chen, Teng Xiao, Kun Kuang, Zheqi Lv, Min Zhang, Jinluan Yang, Chengqiang Lu, Hongxia Yang, Fei Wu

    Abstract: Graph Neural Networks (GNNs) show promising results for graph tasks. However, existing GNNs' generalization ability will degrade when there exist distribution shifts between testing and training graph data. The cardinal impetus underlying the severe degeneration is that the GNNs are architected predicated upon the I.I.D assumptions. In such a setting, GNNs are inclined to leverage imperceptible st… ▽ More

    Submitted 19 December, 2023; originally announced December 2023.

  30. arXiv:2312.03421  [pdf, other]

    astro-ph.IM

    Basic Survey Scheduling for the Wide Field Survey Telescope (WFST)

    Authors: Yan-Peng Chen, Ji-an Jiang, Wen-Tao Luo, Xian Zhong Zheng, Min Fang, Chao Yang, Yuan-Yu Hong, Zong-Fei Lv

    Abstract: Aiming at improving the survey efficiency of the Wide Field Survey Telescope, we have developed a basic scheduling strategy that takes into account the telescope characteristics, observing conditions, and weather conditions at the Lenghu site. The sky area is divided into rectangular regions, referred to as `tiles', with a size of 2.577 deg * 2.634 deg slightly smaller than the focal area of the m… ▽ More

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 14 pages, 7 figures, 1 table. Accepted for publication in Research in Astronomy and Astrophysics

  31. arXiv:2311.12905  [pdf, other]

    cs.AI cs.LG

    Revisiting the Domain Shift and Sample Uncertainty in Multi-source Active Domain Transfer

    Authors: Wenqiao Zhang, Zheqi Lv, Hao Zhou, Jia-Wei Liu, Juncheng Li, Mengze Li, Siliang Tang, Yueting Zhuang

    Abstract: Active Domain Adaptation (ADA) aims to maximally boost model adaptation in a new target domain by actively selecting a limited number of target data to annotate. This setting neglects the more practical scenario where training data are collected from multiple sources. This motivates us to target a new and challenging setting of knowledge transfer that extends ADA from a single source domain to mult… ▽ More

    Submitted 21 November, 2023; originally announced November 2023.

    Comments: arXiv admin note: text overlap with arXiv:2302.13824 by other authors

  32. arXiv:2311.06463  [pdf, other]

    physics.optics

    Self-suppressed quantum diffusion and fundamental noise limit of soliton microcombs

    Authors: Xing Jin, Zhe Lv, Qihuang Gong, Qi-Fan Yang

    Abstract: Quantum diffusion of soliton microcombs has long been recognized as their fundamental noise limit. Here we surpass such limit by utilizing dispersive wave dynamics in multimode microresonators. Through the recoil force provided by these dispersive waves, the quantum diffusion can be suppressed to a much lower level that forms the ultimate fundamental noise limit of soliton microcombs. Our findings… ▽ More

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: 8 pages, 5 figures

  33. arXiv:2310.15767  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    Unpaired MRI Super Resolution with Contrastive Learning

    Authors: Hao Li, Quanwei Liu, Jianan Liu, Xiling Liu, Yanni Dong, Tao Huang, Zhihan Lv

    Abstract: Magnetic resonance imaging (MRI) is crucial for enhancing diagnostic accuracy in clinical settings. However, the inherent long scan time of MRI restricts its widespread applicability. Deep learning-based image super-resolution (SR) methods exhibit promise in improving MRI resolution without additional cost. Due to lacking of aligned high-resolution (HR) and low-resolution (LR) MRI image pairs, uns… ▽ More

    Submitted 16 February, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

  34. arXiv:2310.15194  [pdf]

    q-bio.NC cs.HC eess.SP q-bio.QM

    How do the resting EEG preprocessing states affect the outcomes of postprocessing?

    Authors: Shiang Hu, Jie Ruan, Juan Hou, Pedro Antonio Valdes-Sosa, Zhao Lv

    Abstract: Plenty of artifact removal tools and pipelines have been developed to correct the EEG recordings and discover the values below the waveforms. Without visual inspection from the experts, it is susceptible to derive improper preprocessing states, like the insufficient preprocessed EEG (IPE), and the excessive preprocessed EEG (EPE). However, little is known about the impacts of IPE or EPE on the pos… ▽ More

    Submitted 12 December, 2023; v1 submitted 22 October, 2023; originally announced October 2023.

  35. arXiv:2310.11994  [pdf]

    cs.HC eess.SP q-bio.NC

    Spectral homogeneity cross frequencies can be a quality metric for the large-scale resting EEG preprocessing

    Authors: Shiang Hu, Jie Ruan, Nicolas Langer, Jorge Bosch-Bayard, Zhao Lv, Dezhong Yao, Pedro Antonio Valdes-Sosa

    Abstract: The brain projects require the collection of massive electrophysiological data, aiming to the longitudinal, sectional, or populational neuroscience studies. Quality metrics automatically label the data after centralized preprocessing. However, although the waveforms-based metrics are partially useful, they may be unreliable by neglecting the spectral profiles. Here, we detected the phenomenon of p… ▽ More

    Submitted 4 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  36. Dual-Branch Knowledge Distillation for Noise-Robust Synthetic Speech Detection

    Authors: Cunhang Fan, Mingming Ding, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Zhao Lv

    Abstract: Most research in synthetic speech detection (SSD) focuses on improving performance on standard noise-free datasets. However, in actual situations, noise interference is usually present, causing significant performance degradation in SSD systems. To improve noise robustness, this paper proposes a dual-branch knowledge distillation synthetic speech detection (DKDSSD) method. Specifically, a parallel… ▽ More

    Submitted 16 April, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

  37. arXiv:2310.06858  [pdf, ps, other]

    cs.NI cs.AI

    Design of JiuTian Intelligent Network Simulation Platform

    Authors: Lei Zhao, Miaomiao Zhang, Guangyu Li, Zhuowen Guan, Sijia Liu, Zhaobin Xiao, Yuting Cao, Zhe Lv, Yan** Liang

    Abstract: This paper introduced the JiuTian Intelligent Network Simulation Platform, which can provide wireless communication simulation data services for the Open Innovation Platform. The platform contains a series of scalable simulator functionalities, offering open services that enable users to use reinforcement learning algorithms for model training and inference based on simulation environments and dat… ▽ More

    Submitted 28 September, 2023; originally announced October 2023.

  38. arXiv:2310.04769  [pdf]

    cs.CV

    1st Place Solution of Egocentric 3D Hand Pose Estimation Challenge 2023 Technical Report:A Concise Pipeline for Egocentric Hand Pose Reconstruction

    Authors: Zhishan Zhou, Zhi Lv, Shihao Zhou, Minqiang Zou, Tong Wu, Mochen Yu, Yao Tang, Jiajun Liang

    Abstract: This report introduces our work on the Egocentric 3D Hand Pose Estimation workshop. Using AssemblyHands, this challenge focuses on egocentric 3D hand pose estimation from a single-view image. In the competition, we adopt ViT-based backbones and a simple regressor for 3D keypoint prediction, which provides strong model baselines. We noticed that hand-object occlusions and self-occlusions lead to perfo… ▽ More

    Submitted 9 October, 2023; v1 submitted 7 October, 2023; originally announced October 2023.

  39. Granularity at Scale: Estimating Neighborhood Socioeconomic Indicators from High-Resolution Orthographic Imagery and Hybrid Learning

    Authors: Ethan Brewer, Giovani Valdrighi, Parikshit Solunke, Joao Rulff, Yurii Piadyk, Zhonghui Lv, Jorge Poco, Claudio Silva

    Abstract: Many areas of the world are without basic information on the socioeconomic well-being of the residing population due to limitations in existing data collection methods. Overhead images obtained remotely, such as from satellite or aircraft, can help serve as windows into the state of life on the ground and help "fill in the gaps" where community information is sparse, with estimates at smaller geog… ▽ More

    Submitted 18 February, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: Updated after acceptance to IEEE J-STARS

    Journal ref: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 5668-5679, 2024

  40. arXiv:2309.07581  [pdf, ps, other]

    cs.AR

    A Survey of Graph Pre-processing Methods: From Algorithmic to Hardware Perspectives

    Authors: Zhengyang Lv, Mingyu Yan, Xin Liu, Mengyao Dong, Xiaochun Ye, Dongrui Fan, Ninghui Sun

    Abstract: Graph-related applications have experienced significant growth in academia and industry, driven by the powerful representation capabilities of graph. However, efficiently executing these applications faces various challenges, such as load imbalance, random memory access, etc. To address these challenges, researchers have proposed various acceleration systems, including software frameworks and hard… ▽ More

    Submitted 14 September, 2023; originally announced September 2023.

  41. arXiv:2309.07147  [pdf, other]

    eess.SP cs.HC cs.LG cs.MM cs.SD eess.AS

    DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection

    Authors: Cunhang Fan, Hongyu Zhang, Wei Huang, Jun Xue, Jianhua Tao, Jiangyan Yi, Zhao Lv, Xiaopei Wu

    Abstract: Auditory Attention Detection (AAD) aims to detect the target speaker from brain signals in a multi-speaker environment. Although EEG-based AAD methods have shown promising results in recent years, current approaches primarily rely on traditional convolutional neural networks designed for processing Euclidean data like images. This makes it challenging to handle EEG signals, which possess non-Euclidean…

    Submitted 7 September, 2023; originally announced September 2023.

  42. arXiv:2309.06067  [pdf, ps, other]

    eess.IV cs.CV physics.med-ph

    Implicit Neural Representation for MRI Parallel Imaging Reconstruction

    Authors: Hao Li, Yusheng Zhou, Jianan Liu, Xiling Liu, Tao Huang, Zhihan Lv, Weidong Cai

    Abstract: Magnetic resonance imaging (MRI) usually faces lengthy acquisition times, prompting the exploration of strategies such as parallel imaging (PI) to alleviate this problem by periodically skipping specific K-space lines and subsequently reconstructing high-quality images from the undersampled K-space. Implicit neural representation (INR) has recently emerged as a promising deep learning technique, c…

    Submitted 10 April, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

  43. Probing the Galactic halo with RR Lyrae stars -- IV. On the Oosterhoff dichotomy of RR Lyrae stars

    Authors: Shan Zhang, Gaochao Liu, Yang Huang, Zongfei Lv, Sarah Ann Bird, Bingqiu Chen, Huawei Zhang, Timothy C. Beers, Xinyi Li, Haijun Tian, Peng Zhang

    Abstract: We use 3653 (2661 RRab, 992 RRc) RR Lyrae stars (RRLs) with 7D (3D position, 3D velocity, and metallicity) information selected from SDSS, LAMOST, and Gaia EDR3, and divide the sample into two Oosterhoff groups (Oo I and Oo II) according to their amplitude-period behaviour in the Bailey Diagram. We present a comparative study of these two groups based on chemistry, kinematics, and dynamics. We fin…

    Submitted 12 September, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

  44. arXiv:2309.01519  [pdf, other]

    cs.SE cs.LG

    Hawkeye: Change-targeted Testing for Android Apps based on Deep Reinforcement Learning

    Authors: Chao Peng, Zhengwei Lv, Jiarong Fu, Jiayuan Liang, Zhao Zhang, Ajitha Rajan, ** Yang

    Abstract: Android Apps are frequently updated to keep up with changing user, hardware, and business demands. Ensuring the correctness of App updates through extensive testing is crucial to avoid potential bugs reaching the end user. Existing Android testing tools generate GUI events focussing on improving the test coverage of the entire App rather than prioritising updates and their impacted elements. Recent…

    Submitted 4 September, 2023; originally announced September 2023.

  45. arXiv:2308.13561  [pdf, other]

    cs.HC cs.CV

    Project Aria: A New Tool for Egocentric Multi-Modal AI Research

    Authors: Jakob Engel, Kiran Somasundaram, Michael Goesele, Albert Sun, Alexander Gamino, Andrew Turner, Arjang Talattof, Arnie Yuan, Bilal Souti, Brighid Meredith, Cheng Peng, Chris Sweeney, Cole Wilson, Dan Barnes, Daniel DeTone, David Caruso, Derek Valleroy, Dinesh Ginjupalli, Duncan Frost, Edward Miller, Elias Mueggler, Evgeniy Oleinik, Fan Zhang, Guruprasad Somasundaram, Gustavo Solaira , et al. (49 additional authors not shown)

    Abstract: Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception. These future devices will need to be all-day wearable in a socially acceptable form-factor to support always available, context-aware and personalized AI applications. Our team at Meta Reality Labs Research built the Aria device, an egocentric, mul…

    Submitted 1 October, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

  46. arXiv:2308.09944  [pdf, other]

    cs.SD eess.AS

    Spatial Reconstructed Local Attention Res2Net with F0 Subband for Fake Speech Detection

    Authors: Cunhang Fan, Jun Xue, Jianhua Tao, Jiangyan Yi, Chenglong Wang, Chengshi Zheng, Zhao Lv

    Abstract: The rhythm of synthetic speech is usually too smooth, which causes the fundamental frequency (F0) of synthetic speech to differ significantly from that of real speech. The F0 feature is therefore expected to contain discriminative information for the fake speech detection (FSD) task. In this paper, we propose a novel F0 subband for FSD. In addition, to effectively model the F0 subband so a…

    Submitted 19 August, 2023; originally announced August 2023.

  47. arXiv:2308.05648  [pdf, other]

    cs.CV

    Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization

    Authors: Zezhong Lv, Bing Su, Ji-Rong Wen

    Abstract: Video moment localization aims to retrieve the target segment of an untrimmed video according to the natural language query. Weakly supervised methods have gained attention recently, as the precise temporal location of the target segment is not always available. However, one of the greatest challenges encountered by the weakly supervised method lies in the mismatch between the video and language i…

    Submitted 14 October, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  48. arXiv:2308.03585  [pdf, ps, other]

    math.CO

    Hilton-Milner theorem for bounded multisets

    Authors: Jiaqi Liao, Zequn Lv, Mengyu Cao, Mei Lu

    Abstract: Let $ k, n \in \mathbb{N}^+ $ and $ m \in \mathbb{N}^+ \cup \{\infty \} $. A $ k $-multiset in $ [n]_m $ is a $ k $-set whose elements are integers from $ \{1, 2, \ldots, n\} $, and each element is allowed to have at most $ m $ repetitions. A family of $ k $-multisets in $ [n]_m $ is said to be intersecting if every pair of $ k $-multisets from the family have non-empty intersection. In this paper…

    Submitted 22 May, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: 13 pages

    MSC Class: 05D05; 05C35; 05A15

  49. arXiv:2308.00537  [pdf, other]

    eess.SY cs.AI cs.LG

    Graph Embedding Dynamic Feature-based Supervised Contrastive Learning of Transient Stability for Changing Power Grid Topologies

    Authors: Zijian Lv, Xin Chen, Zijian Feng

    Abstract: Accurate online transient stability prediction is critical for ensuring power system stability when facing disturbances. However, traditional transient stability analysis relies on time-domain simulations and cannot be quickly adapted to power grid topology changes. In order to vectorize high-dimensional power grid topological structure information into low-dimensional node-based graph embedding s…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: This work has been submitted to the IEEE Transactions on Power Systems for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  50. arXiv:2306.15389  [pdf, other]

    cs.SD cs.LG eess.AS

    Multi-perspective Information Fusion Res2Net with RandomSpecmix for Fake Speech Detection

    Authors: Shunbo Dong, Jun Xue, Cunhang Fan, Kang Zhu, Yujie Chen, Zhao Lv

    Abstract: In this paper, we propose the multi-perspective information fusion (MPIF) Res2Net with random Specmix for fake speech detection (FSD). The main purpose of this system is to improve the model's ability to learn precise forgery information for the FSD task in low-quality scenarios. The task of random Specmix, a data augmentation method, is to improve the generalization ability of the model and enhance the mode…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: Accepted by DADA2023