
Showing 51–100 of 2,649 results for author: Chen, P

  1. arXiv:2405.16890  [pdf, other]

    cs.CV

    PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance

    Authors: Haohan Weng, Yikai Wang, Tong Zhang, C. L. Philip Chen, Jun Zhu

    Abstract: Generating compact and sharply detailed 3D meshes poses a significant challenge for current 3D generative models. Different from extracting dense meshes from neural representation, some recent works try to model the native mesh distribution (i.e., a set of triangles), which generates more compact results as humans crafted. However, due to the complexity and variety of mesh topology, these methods…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Project website: https://whaohan.github.io/pivotmesh

  2. arXiv:2405.16833  [pdf, other]

    cs.LG

    Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models

    Authors: Chia-Yi Hsu, Yu-Lin Tsai, Chih-Hsun Lin, Pin-Yu Chen, Chia-Mu Yu, Chun-Ying Huang

    Abstract: While large language models (LLMs) such as Llama-2 or GPT-4 have shown impressive zero-shot performance, fine-tuning is still necessary to enhance their performance for customized datasets, domain-specific tasks, or other private needs. However, fine-tuning all parameters of LLMs requires significant hardware resources, which can be impractical for typical users. Therefore, parameter-efficient fin…

    Submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2405.16677  [pdf, other]

    eess.AS cs.CL cs.SD

    Crossmodal ASR Error Correction with Discrete Speech Units

    Authors: Yuanchao Li, Pinzhen Chen, Peter Bell, Catherine Lai

    Abstract: ASR remains unsatisfactory in scenarios where the speaking style diverges from that used to train ASR systems, resulting in erroneous transcripts. To address this, ASR Error Correction (AEC), a post-ASR processing approach, is required. In this work, we tackle an understudied issue: the Low-Resource Out-of-Domain (LROOD) problem, by investigating crossmodal AEC on very limited downstream data with…

    Submitted 26 May, 2024; originally announced May 2024.

  4. arXiv:2405.16646  [pdf, other]

    cs.LG

    A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts

    Authors: Mohammed Nowaz Rabbani Chowdhury, Meng Wang, Kaoutar El Maghraoui, Naigang Wang, Pin-Yu Chen, Christopher Carothers

    Abstract: The sparsely gated mixture of experts (MoE) architecture sends different inputs to different subnetworks, i.e., experts, through trainable routers. MoE reduces the training computation significantly for large models, but its deployment can still be memory- or computation-expensive for some downstream tasks. Model pruning is a popular approach to reduce inference computation, but its application in…

    Submitted 30 May, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

    Journal ref: The 41st International Conference on Machine Learning, ICML 2024

  5. arXiv:2405.15920  [pdf, other]

    cs.LG stat.ML

    SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning

    Authors: Shuai Zhang, Heshan Devaka Fernando, Miao Liu, Keerthiram Murugesan, Songtao Lu, Pin-Yu Chen, Tianyi Chen, Meng Wang

    Abstract: This paper studies the transfer reinforcement learning (RL) problem where multiple RL problems have different reward functions but share the same underlying transition dynamics. In this setting, the Q-function of each RL problem (task) can be decomposed into a successor feature (SF) and a reward mapping: the former characterizes the transition dynamics, and the latter characterizes the task-specif…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.16173

  6. High-field magnetoelectric coupling and successive magnetic transitions in Mn-doped polar antiferromagnet Ni3TeO6

    Authors: J. H. Zhang, L. Lin, C. Dong, Y. T. Chang, J. F. Wang, C. L. Lu, P. Z. Chen, W. J. Zhai, G. Z. Zhou, L. Huang, Y. S. Tang, S. H. Zheng, M. F. Liu, X. H. Zhou, Z. B. Yan, J. -M. Liu

    Abstract: Among the 3d transition metal ions doped polar Ni3TeO6, Mn-doped Ni3TeO6 has stimulated great interest due to its high magnetic ordering temperature and complex magnetic phases, but the mechanism of magnetoelectric (ME) coupling is far from understood. Herein we report our systematic investigation of the chemical control of magnetism, metamagnetic transition, and ME properties of Ni3-xMnxTeO6 sing…

    Submitted 29 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 30 pages with 8 figures

    Journal ref: Phys. Rev. B 109, 184112 (2024)

  7. arXiv:2405.14696  [pdf, other]

    cs.CL cs.AI cs.DB

    A Declarative System for Optimizing AI Workloads

    Authors: Chunwei Liu, Matthew Russo, Michael Cafarella, Lei Cao, Peter Baille Chen, Zui Chen, Michael Franklin, Tim Kraska, Samuel Madden, Gerardo Vitagliano

    Abstract: A long-standing goal of data management systems has been to build systems which can compute quantitative insights over large corpora of unstructured data in a cost-effective manner. Until recently, it was difficult and expensive to extract facts from company documents, data from scientific papers, or metrics from image and video corpora. Today's models can accomplish these tasks with high accuracy…

    Submitted 29 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 29 pages, 9 figures

    ACM Class: H.2.3; I.2.5

  8. arXiv:2405.14455  [pdf, other]

    cs.CV

    TIGER: Text-Instructed 3D Gaussian Retrieval and Coherent Editing

    Authors: Teng Xu, Jiamin Chen, Peng Chen, Youjia Zhang, Junqing Yu, Wei Yang

    Abstract: Editing objects within a scene is a critical functionality required across a broad spectrum of applications in computer vision and graphics. As 3D Gaussian Splatting (3DGS) emerges as a frontier in scene representation, the effective modification of 3D Gaussian scenes has become increasingly vital. This process entails accurately retrieving the target objects and subsequently performing modification…

    Submitted 1 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  9. arXiv:2405.14161  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models

    Authors: Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Chengwei Qin, Pin-Yu Chen, Eng Siong Chng, Chao Zhang

    Abstract: We propose an unsupervised adaptation framework, Self-TAught Recognizer (STAR), which leverages unlabeled data to enhance the robustness of automatic speech recognition (ASR) systems in diverse target domains, such as noise and accents. STAR is developed for prevalent speech foundation models based on Transformer-related architecture with auto-regressive decoding (e.g., Whisper, Canary). Specifica…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 23 pages, Preprint

  10. arXiv:2405.13860  [pdf, other]

    cs.CV

    MAGIC: Map-Guided Few-Shot Audio-Visual Acoustics Modeling

    Authors: Diwei Huang, Kunyang Lin, Peihao Chen, Qing Du, Mingkui Tan

    Abstract: Few-shot audio-visual acoustics modeling seeks to synthesize the room impulse response in arbitrary locations with few-shot observations. To sufficiently exploit the provided few-shot data for accurate acoustic modeling, we present a *map-guided* framework by constructing acoustic-related visual semantic feature maps of the scenes. Visual features preserve semantic details related to sound and map…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 17 pages, 12 pages for main paper, 5 pages for supplementary

  11. arXiv:2405.13326  [pdf, other]

    cs.CL

    Mosaic IT: Enhancing Instruction Tuning with Data Mosaics

    Authors: Ming Li, Pei Chen, Chenguang Wang, Hongyu Zhao, Yijun Liang, Yupeng Hou, Fuxiao Liu, Tianyi Zhou

    Abstract: Finetuning large language models with a variety of instruction-response pairs has enhanced their capability to understand and follow instructions. Current instruction tuning primarily relies on teacher models or human intervention to generate and refine the instructions and responses, which are costly, non-sustainable, and may lack diversity. In this paper, we introduce Mosaic Instruction Tuning (…

    Submitted 22 May, 2024; originally announced May 2024.

  12. arXiv:2405.11733  [pdf, other]

    quant-ph

    Simulating a Chern Insulator with C = $\pm$2 on Synthetic Floquet Lattice

    Authors: Lingxiao Lei, Weichen Wang, Guangyao Huang, Shun Hu, Xi Cao, Xinfang Zhang, Mingtang Deng, **xing Chen

    Abstract: The synthetic Floquet lattice, generated by multiple strong drives with mutually incommensurate frequencies, provides a powerful platform for the quantum simulation of topological phenomena. In this study, we propose a 4-band tight-binding model of the Chern insulator with a Chern number C = $\pm$2 by coupling two layers of the half-BHZ lattice and subsequently mapping it onto the Floquet lattice…

    Submitted 19 May, 2024; originally announced May 2024.

  13. arXiv:2405.09148  [pdf, ps, other]

    cs.CV

    A Hierarchically Feature Reconstructed Autoencoder for Unsupervised Anomaly Detection

    Authors: Honghui Chen, **** Chen, Huan Mao, Mengxi Jiang

    Abstract: Anomaly detection and localization without any manual annotations and prior knowledge is a challenging task under the setting of unsupervised learning. The existing works achieve excellent performance in the anomaly detection, but with complex networks or cumbersome pipelines. To address this issue, this paper explores a simple but effective architecture in the anomaly detection. It consists of a…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 12 pages, 4 figures

    MSC Class: 68T01 ACM Class: I.2.10

  14. arXiv:2405.09125  [pdf, other]

    cs.CV cs.AI

    HAAP: Vision-context Hierarchical Attention Autoregressive with Adaptive Permutation for Scene Text Recognition

    Authors: Honghui Chen, Yuhang Qiu, Jiabao Wang, **** Chen, Nam Ling

    Abstract: Internal Language Model (LM)-based methods use permutation language modeling (PLM) to solve the error correction caused by conditional independence in external LM-based methods. However, random permutations of human interference cause fit oscillations in the model training, and Iterative Refinement (IR) operation to improve multimodal information decoupling also introduces additional overhead. To…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 12 pages, 10 figures

    MSC Class: 68T01 ACM Class: I.2.10

  15. arXiv:2405.09061  [pdf, other]

    cs.LG

    Improving Transformers using Faithful Positional Encoding

    Authors: Tsuyoshi Idé, Jokin Labaien, Pin-Yu Chen

    Abstract: We propose a new positional encoding method for a neural network architecture called the Transformer. Unlike the standard sinusoidal positional encoding, our approach is based on solid mathematical grounds and has a guarantee of not losing information about the positional order of the input sequence. We show that the new encoding approach systematically improves the prediction performance in the t…

    Submitted 16 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2305.17149

  16. arXiv:2405.08816  [pdf, other]

    cs.CV cs.RO

    The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition

    Authors: Lingdong Kong, Shaoyuan Xie, Hanjiang Hu, Yaru Niu, Wei Tsang Ooi, Benoit R. Cottereau, Lai Xing Ng, Yuexin Ma, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu, Weichao Qiu, Wei Zhang, Xu Cao, Hao Lu, Ying-Cong Chen, Caixin Kang, Xinning Zhou, Chengyang Ying, Wentao Shang, Xingxing Wei, Yinpeng Dong, Bo Yang, Shengyin Jiang , et al. (66 additional authors not shown)

    Abstract: In the realm of autonomous driving, robust perception under out-of-distribution conditions is paramount for the safe deployment of vehicles. Challenges such as adverse weather, sensor malfunctions, and environmental unpredictability can severely impact the performance of autonomous systems. The 2024 RoboDrive Challenge was crafted to propel the development of driving perception technologies that c…

    Submitted 29 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: ICRA 2024; 32 pages, 24 figures, 5 tables; Code at https://robodrive-24.github.io/

  17. arXiv:2405.08399  [pdf]

    cond-mat.mtrl-sci

    Exploring material compositions for synthesis using oxidation states

    Authors: Maung Thway, Andy Paul Chen, Haiwen Dai, Jose Recatala-Gomez, Siyu Isaac Parker Tian, Ruiming Zhu, Wenhao Zhai, Fengxia Wei, D. V. Maheshwar Repaka, Tonio Buonassisi, Pieremanuele Canepa, Kedar Hippalgaonkar

    Abstract: Recent advances in machine learning techniques have made it possible to use high-throughput screening to identify novel materials with specific properties. However, the large number of potential candidates produced by these techniques can make it difficult to select the most promising ones. In this study, we develop the oxidation state probability (OSP) method which evaluates ternary compounds bas…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 11 pages, 3 figures

  18. arXiv:2405.08098  [pdf, other]

    nucl-th

    Production cross sections of superheavy elements: insights from the dinuclear system model with high-quality microscopic nuclear masses

    Authors: Peng-Hui Chen, Chang Geng, Fei Niu, Zu-Xing Yang, Xiang-Hua Zeng, Zhao-Qing Feng

    Abstract: To accurately predict the synthesis cross-sections of superheavy elements, identifying the optimal projectile-target combinations and the evaporation channels at specific collision energies, we have attempted to utilize high-quality microscopic nuclear masses (HQMNM) within the dinuclear system (DNS) model, which are obtained by fitting experimental data with the Skyrme energy density functional t…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 8 pages, 5 figures

  19. arXiv:2405.07870  [pdf]

    cs.SE

    Mapping the Invisible: A Framework for Tracking COVID-19 Spread Among College Students with Google Location Data

    Authors: Pra**dra Sankar Krishnan, Chai Phing Chen, Gamal Alkawsi, Sieh Kiong Tiong, Luiz Fernando Capretz

    Abstract: The COVID-19 pandemic and the implementation of social distancing policies have rapidly changed people's visiting patterns, as reflected in mobility data that tracks mobility traffic using location trackers on cell phones. However, the frequency and duration of concurrent occupancy at specific locations govern the transmission rather than the number of customers visiting. Therefore, understanding…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 8 pages

    Journal ref: Latin American Workshop on Data Fusion (LAFUSION 2023), November/2023, pp 1-8, Rio de Janeiro, Brazil

  20. arXiv:2405.07744  [pdf, other]

    cs.SE

    MoCo: Fuzzing Deep Learning Libraries via Assembling Code

    Authors: Pin Ji, Yang Feng, Duo Wu, Lingyue Yan, Pengling Chen, Jia Liu, Zhihong Zhao

    Abstract: The rapidly developing deep learning (DL) techniques have been applied in software systems with various application scenarios. However, they could also pose new safety threats with potentially serious consequences, especially in safety-critical domains. DL libraries serve as the underlying foundation for DL systems, and bugs in them can have unpredictable impacts that directly affect the behaviors…

    Submitted 13 May, 2024; originally announced May 2024.

  21. arXiv:2405.07281  [pdf, ps, other]

    eess.SP

    Movable Antennas Aided Multicast MISO Communication Systems

    Authors: Zhenqiao Cheng, Nanxi Li, Ruizhe Long, Jianchi Zhu, Chongjun Ouyang, Peng Chen

    Abstract: A novel multicast communication system with movable antennas (MAs) is proposed, where the antenna position optimization is exploited to enhance the transmission rate. Specifically, an MA-assisted two-user multicast multiple-input single-output system is considered. The joint optimization of the transmit beamforming vector and transmit MA positions is studied by modeling the motion of the MA element…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: 5 pages

  22. arXiv:2405.07167  [pdf, other]

    cs.CV

    3D Hand Mesh Recovery from Monocular RGB in Camera Space

    Authors: Haonan Li, Patrick P. K. Chen, Yitong Zhou

    Abstract: With the rapid advancement of technologies such as virtual reality, augmented reality, and gesture control, users expect interactions with computer interfaces to be more natural and intuitive. Existing visual algorithms often struggle to accomplish advanced human-computer interaction tasks, necessitating accurate and reliable absolute spatial prediction methods. Moreover, dealing with complex scen…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: 21 pages, 7 figures

  23. arXiv:2405.02607  [pdf, ps, other]

    math.CA

    On pointwise convergence of cone multipliers

    Authors: Peng Chen, Danqing He, Xiaochun Li, Lixin Yan

    Abstract: For $p\ge 2$, and $λ>\max\{n|\tfrac 1p-\tfrac 12|-\tfrac12, 0\}$, we prove the pointwise convergence of cone multipliers, i.e. $$ \lim_{t\to\infty}T_t^λ(f)\to f \text{ a.e.},$$ where $f\in L^p(\mathbb R^n)$ satisfies $supp\ \widehat f\subset\{ξ\in\mathbb R^n:\ 1<|ξ_n|<2\}$. Our main tools are weighted estimates for maximal cone operators, which are consequences of trace inequalities for cones.

    Submitted 4 May, 2024; originally announced May 2024.

  24. arXiv:2405.02517  [pdf, other]

    cs.CL

    Mothman at SemEval-2024 Task 9: An Iterative System for Chain-of-Thought Prompt Optimization

    Authors: Alvin Po-Chun Chen, Ray Groshan, Sean von Bayern

    Abstract: Extensive research exists on the performance of large language models on logic-based tasks, whereas relatively little has been done on their ability to generate creative solutions on lateral thinking tasks. The BrainTeaser shared task tests lateral thinking and uses adversarial datasets to prevent memorization, resulting in poor performance for out-of-the-box models. We propose a system for iterat…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 13 pages, 2 figures, to be published in Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

  25. arXiv:2405.01306  [pdf, other]

    cs.LG

    Graph is all you need? Lightweight data-agnostic neural architecture search without training

    Authors: Zhenhan Huang, Tejaswini Pedapati, Pin-Yu Chen, Chunhen Jiang, Jianxi Gao

    Abstract: Neural architecture search (NAS) enables the automatic design of neural network models. However, training the candidates generated by the search algorithm for performance evaluation incurs considerable computational overhead. Our method, dubbed nasgraph, remarkably reduces the computational costs by converting neural architectures to graphs and using the average degree, a graph measure, as the pro…

    Submitted 2 May, 2024; originally announced May 2024.

  26. arXiv:2404.19446  [pdf, other]

    nucl-th

    Exploring the potential of synthesizing unknown superheavy isotopes via cold-fusion reactions based on the dinuclear system model

    Authors: Hao Wu, Peng-Hui Chen, Fei Niu, Zu-Xing Yang, Xiang-Hua Zeng, Zhao-Qing Feng

    Abstract: To assess the potential of cold-fusion for synthesizing superheavy nuclei (SHN) with proton numbers 104-113, we systematically calculated 145 naturally occurring projectile-target combinations within the DNS model. Reactions predominantly show maximum cross-sections in the 1n to 2n channels, peaking near the Coulomb barrier with a sum of barrier and Q-value within 30 MeV. The maximum cross-section…

    Submitted 1 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 10 pages, 6 figures

  27. arXiv:2404.18406  [pdf, ps, other]

    cs.IT eess.SP

    Movable Antenna-Enhanced Wireless Powered Mobile Edge Computing Systems

    Authors: Pengcheng Chen, Yuxuan Yang, Bin Lyu, Zhen Yang, Abbas Jamalipour

    Abstract: In this paper, we propose a movable antenna (MA) enhanced scheme for wireless powered mobile edge computing (WP-MEC) system, where the hybrid access point (HAP) equipped with multiple MAs first emits wireless energy to charge wireless devices (WDs), and then receives the offloaded tasks from the WDs for edge computing. The MAs deployed at the HAP enhance the spatial degrees of freedom (DoFs) by fl…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 13 pages, 10 figures. Submitted for possible publication

  28. arXiv:2404.18036  [pdf]

    astro-ph.IM

    Current laboratory performance of starlight suppression systems, and potential pathways to desired Habitable Worlds Observatory exoplanet science capabilities

    Authors: Bertrand Mennesson, Ruslan Belikov, Emiel Por, Eugene Serabyn, Garreth Ruane, A. J. Eldorado Riggs, Dan Sirbu, Laurent Pueyo, Remi Soummer, Jeremy Kasdin, Stuart Shaklan, Byoung-Joon Seo, Christopher Stark, Eric Cady, Pin Chen, Brendan Crill, Kevin Fogarty, Alexandra Greenbaum, Olivier Guyon, Roser Juanola-Parramon, Brian Kern, John Krist, Bruce Macintosh, David Marx, Dimitri Mawet , et al. (12 additional authors not shown)

    Abstract: We summarize the current best polychromatic (10 to 20 % bandwidth) contrast performance demonstrated in the laboratory by different starlight suppression approaches and systems designed to directly characterize exoplanets around nearby stars. We present results obtained by internal coronagraph and external starshade experimental testbeds using entrance apertures equivalent to off-axis or on-axis t…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 63 pages, 28 pages, submitted to JATIS

  29. arXiv:2404.17936  [pdf, other]

    cs.CV

    FDCE-Net: Underwater Image Enhancement with Embedding Frequency and Dual Color Encoder

    Authors: Zheng Cheng, Guodong Fan, **gchun Zhou, Min Gan, C. L. Philip Chen

    Abstract: Underwater images often suffer from various issues such as low brightness, color shift, blurred details, and noise due to light absorption and scattering caused by water and suspended particles. Previous underwater image enhancement (UIE) methods have primarily focused on spatial domain enhancement, neglecting the frequency domain information inherent in the images. However, the degradation factor…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 16 pages, 13 figures

  30. arXiv:2404.17729  [pdf, other]

    cs.CL

    CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving

    Authors: Pei Chen, Boran Han, Shuai Zhang

    Abstract: Large Language Models (LLMs) have shown great ability in solving traditional natural language tasks and elementary reasoning tasks with appropriate prompting techniques. However, their ability is still limited in solving complicated science problems. In this work, we aim to push the upper bound of the reasoning capability of LLMs by proposing a collaborative multi-agent, multi-reasoning-path (CoMM…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024

  31. arXiv:2404.17400  [pdf, other]

    cs.CV cs.AI eess.IV

    Spatial-frequency Dual-Domain Feature Fusion Network for Low-Light Remote Sensing Image Enhancement

    Authors: Zishu Yao, Guodong Fan, **fu Fan, Min Gan, C. L. Philip Chen

    Abstract: Low-light remote sensing images generally feature high resolution and high spatial complexity, with continuously distributed surface features in space. This continuity in scenes leads to extensive long-range correlations in spatial domains within remote sensing images. Convolutional Neural Networks, which rely on local correlations for long-distance modeling, struggle to establish long-range corre…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 14 pages

  32. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng , et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  33. arXiv:2404.16302  [pdf, other]

    cs.CV cs.MM cs.RO eess.IV

    CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions

    Authors: Haoyuan Li, Qi Hu, You Yao, Kailun Yang, Peng Chen

    Abstract: Cross-modality images that integrate visible-infrared spectra cues can provide richer complementary information for object detection. Despite this, existing visible-infrared object detection methods severely degrade in severe weather conditions. This failure stems from the pronounced sensitivity of visible images to environmental perturbations, such as rain, haze, and snow, which frequently cause…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: The dataset and source code will be made publicly available at https://github.com/lhy-zjut/CFMW

  34. arXiv:2404.16227  [pdf, other]

    quant-ph

    Optimal entanglement generation in optomechanical systems via Krotov control of covariance matrix dynamics

    Authors: Peng-Ju Chen, Da-Wei Luo, Ting Yu

    Abstract: We investigated the optimal control of a continuous variable system, focusing on entanglement generation in an optomechanical system without utilizing Fock basis cutoffs. Using the Krotov algorithm to optimize the dynamics of the covariance matrix, we illustrated how to design a control objective function to manipulate the dynamics of the system to generate a desirable target state. We showed that…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 10 pages, 5 figures

  35. arXiv:2404.16115  [pdf, other]

    cs.CL cs.AI cs.LG

    Online Personalizing White-box LLMs Generation with Neural Bandits

    Authors: Zekai Chen, Weeden Daniel, Po-yu Chen, Francois Buet-Golfouse

    Abstract: The advent of personalized content generation by LLMs presents a novel challenge: how to efficiently adapt text to meet individual preferences without the unsustainable demand of creating a unique model for each user. This study introduces an innovative online method that employs neural bandit algorithms to dynamically optimize soft instruction embeddings based on user feedback, enhancing the pers…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 7 pages

  36. arXiv:2404.16033  [pdf, other]

    cs.CV cs.CL

    Cantor: Inspiring Multimodal Chain-of-Thought of MLLM

    Authors: Timin Gao, Peixian Chen, Mengdan Zhang, Chaoyou Fu, Yunhang Shen, Yan Zhang, Shengchuan Zhang, Xiawu Zheng, Xing Sun, Liujuan Cao, Rongrong Ji

    Abstract: With the advent of large language models (LLMs) enhanced by the chain-of-thought (CoT) methodology, visual reasoning problem is usually decomposed into manageable sub-tasks and tackled sequentially with various external tools. However, such a paradigm faces the challenge of the potential "determining hallucinations" in decision-making due to insufficient visual information and the limitation of low-…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: The project page is available at https://ggg0919.github.io/cantor/

  37. arXiv:2404.15881  [pdf, other]

    cs.CV cs.AI

    Steal Now and Attack Later: Evaluating Robustness of Object Detection against Black-box Adversarial Attacks

    Authors: Erh-Chung Chen, Pin-Yu Chen, I-Hsin Chung, Che-Rung Lee

    Abstract: Latency attacks against object detection represent a variant of adversarial attacks that aim to inflate the inference time by generating additional ghost objects in a target image. However, generating ghost objects in the black-box scenario remains a challenge since information about these unqualified objects remains opaque. In this study, we demonstrate the feasibility of generating ghost objects…

    Submitted 24 April, 2024; originally announced April 2024.

  38. arXiv:2404.15450  [pdf, other]

    gr-qc

    A possible origin of the $α$-vacuum as the initial state of the Universe

    Authors: Pisin Chen, Kuan-Nan Lin, Wei-Chen Lin, Dong-han Yeom

    Abstract: We investigate the cosmological observables using the Euclidean path integral approach. Specifically, we study both the no-boundary compact instantons scenario and the Euclidean wormholes scenario that can induce the creation of two universes from nothing. It is known that perturbations associated with the no-boundary scenario can only be consistent with the Bunch-Davies vacuum. Here we demonstrat…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 25 pages, 8 figures

  39. arXiv:2404.14122  [pdf, other]

    cs.CL

    Fine-Tuning Large Language Models to Translate: Will a Touch of Noisy Data in Misaligned Languages Suffice?

    Authors: Dawei Zhu, Pinzhen Chen, Miaoran Zhang, Barry Haddow, Xiaoyu Shen, Dietrich Klakow

    Abstract: Traditionally, success in multilingual machine translation can be attributed to three key factors in training data: large volume, diverse translation directions, and high quality. In the current practice of fine-tuning large language models (LLMs) for translation, we revisit the importance of all these factors. We find that LLMs display strong translation capability after being fine-tuned on as fe…

    Submitted 22 April, 2024; originally announced April 2024.

  40. arXiv:2404.13859  [pdf, other]

    cs.CV cs.AI

    Unveiling and Mitigating Generalized Biases of DNNs through the Intrinsic Dimensions of Perceptual Manifolds

    Authors: Yanbiao Ma, Licheng Jiao, Fang Liu, Lingling Li, Wen** Ma, Shuyuan Yang, Xu Liu, Puhua Chen

    Abstract: Building fair deep neural networks (DNNs) is a crucial step towards achieving trustworthy artificial intelligence. Delving into deeper factors that affect the fairness of DNNs is paramount and serves as the foundation for mitigating model biases. However, current methods are limited in accurately predicting DNN biases, relying solely on the number of training samples and lacking more precise measu…

    Submitted 17 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 8 pages, 6 figures, Submitted to TPAMI

  41. arXiv:2404.13301  [pdf, other]

    math.OC

    Sequential subspace methods on Stiefel manifold optimization problems

    Authors: Pengwen Chen, Chung-Kuan Cheng, Chester Holtz

    Abstract: We study the minimization of a quadratic over Stiefel manifolds (the set of all orthogonal $r$-frames in $\mathbb{R}^n$), which has applications in high-dimensional semi-supervised classification tasks. To reduce the computational complexity, sequential subspace methods (SSM) are employed to convert the high-dimensional minimization problems to low-dimensional ones. In this paper, we are interested in attai…

    Submitted 20 April, 2024; originally announced April 2024.

  42. arXiv:2404.13031  [pdf, other]

    astro-ph.SR astro-ph.EP astro-ph.GA

    OGLE-2015-BLG-0845L: A low-mass M dwarf from the microlensing parallax and xallarap effects

    Authors: Zhecheng Hu, Wei Zhu, Andrew Gould, Andrzej Udalski, Takahiro Sumi, ** Chen, Sebastiano Calchi Novati, Jennifer C. Yee, Charles A. Beichman, Geoffery Bryden, Sean Carey, Michael Fausnaugh, B. Scott Gaudi, Calen B. Henderson, Yossi Shvartzvald, Benjamin Wibking, Przemek Mróz, Jan Skowron, Radoslaw Poleski, Michaeł K. Szymański, Igor Soszynśki, Paweł Pietrukowicz, Szymon Kozłowski, Krzysztof Ulaczyk, Krzysztof A. Rybicki, et al. (29 additional authors not shown)

    Abstract: We present the analysis of the microlensing event OGLE-2015-BLG-0845, which was affected by both the microlensing parallax and xallarap effects. The former was detected via the simultaneous observations from the ground and Spitzer, and the latter was caused by the orbital motion of the source star in a relatively close binary. The combination of these two effects led to a direct mass measurement o…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 13 pages, 10 figures. Submitted to Monthly Notices of the Royal Astronomical Society

  43. arXiv:2404.12649  [pdf, ps, other]

    quant-ph

    Qubit-assisted quantum metrology

    Authors: Peng Chen, Jun **g

    Abstract: We propose a quantum metrology protocol based on a two-step joint evolution of the probe system and an ancillary qubit and a single-shot projective measurement. With an optimized initialization of the ancillary qubit, the quantum Fisher information (QFI) about the phase parameter encoded in the probe system is found to be determined by the expectation value of the square of a time-optimized phase…

    Submitted 19 April, 2024; originally announced April 2024.

  44. arXiv:2404.12398  [pdf, other]

    cs.LG

    Incremental Self-training for Semi-supervised Learning

    Authors: Jifeng Guo, Zhulin Liu, Tong Zhang, C. L. Philip Chen

    Abstract: Semi-supervised learning provides a solution to reduce the dependency of machine learning on labeled data. As one of the efficient semi-supervised techniques, self-training (ST) has received increasing attention. Several advancements have emerged to address challenges associated with noisy pseudo-labels. Previous works on self-training acknowledge the importance of unlabeled data but have not delv…

    Submitted 14 April, 2024; originally announced April 2024.

  45. arXiv:2404.10004  [pdf]

    cs.LG physics.soc-ph stat.AP

    A Strategy Transfer and Decision Support Approach for Epidemic Control in Experience Shortage Scenarios

    Authors: X. Xiao, P. Chen, X. Cao, K. Liu, L. Deng, D. Zhao, Z. Chen, Q. Deng, F. Yu, H. Zhang

    Abstract: Epidemic outbreaks can cause critical health concerns and severe global economic crises. For countries or regions with new infectious disease outbreaks, it is essential to generate preventive strategies by learning lessons from others with similar risk profiles. A Strategy Transfer and Decision Support Approach (STDSA) is proposed based on the profile similarity evaluation. There are four steps in…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 20 pages, 9 figures

  46. arXiv:2404.09889  [pdf, other]

    cs.IR cs.AI cs.CL

    Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval

    Authors: Peter Baile Chen, Yi Zhang, Dan Roth

    Abstract: Retrieving relevant tables containing the necessary information to accurately answer a given question over tables is critical to open-domain question-answering (QA) systems. Previous methods assume the answer to such a question can be found either in a single table or multiple tables identified through question decomposition or rewriting. However, neither of these approaches is sufficient, as many…

    Submitted 5 June, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: ACL 2024 camera ready

  47. arXiv:2404.09707  [pdf, other]

    cs.CV cs.AI cs.LG

    Adaptive Patching for High-resolution Image Segmentation with Transformers

    Authors: Enzhi Zhang, Isaac Lyngaas, Peng Chen, Xiao Wang, Jun Igarashi, Yuankai Huo, Mohamed Wahib, Masaharu Munetomo

    Abstract: Attention-based models are proliferating in the space of image analytics, including segmentation. The standard method of feeding images to transformer encoders is to divide the images into patches and then feed the patches to the model as a linear sequence of tokens. For high-resolution images, e.g. microscopic pathology images, the quadratic compute and memory cost prohibits the use of an attenti…

    Submitted 15 April, 2024; originally announced April 2024.

  48. arXiv:2404.09624  [pdf, other]

    cs.CV

    AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception

    Authors: Yipo Huang, Xiangfei Sheng, Zhichao Yang, Quan Yuan, Zhichao Duan, Pengfei Chen, Leida Li, Weisi Lin, Guangming Shi

    Abstract: The highly abstract nature of image aesthetics perception (IAP) poses a significant challenge for current multimodal large language models (MLLMs). The lack of human-annotated multi-modality aesthetic data further exacerbates this dilemma, resulting in MLLMs falling short of aesthetics perception capabilities. To address the above challenge, we first introduce a comprehensively annotated Aesthetic M…

    Submitted 18 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  49. arXiv:2404.09446  [pdf, other]

    gr-qc

    The final burst of the moving mirror is unrelated to the partner mode of analog Hawking radiation

    Authors: Yuki Osawa, Kuan-Nan Lin, Yasusada Nambu, Masahiro Hotta, Pisin Chen

    Abstract: Flying mirrors with appropriate trajectories have been recognized as an analog system that mimics black hole Hawking evaporation and have been widely investigated. It has recently been suggested that the partner mode of the analog Hawking radiation emitted from a moving mirror would manifest itself through a final burst when the mirror executes a sudden stop. Here we argue the opposite via the par…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 18 pages, 6 figures

  50. arXiv:2404.09432  [pdf, other]

    cs.CV cs.AI cs.LG

    The 8th AI City Challenge

    Authors: Shuo Wang, David C. Anastasiu, Zheng Tang, Ming-Ching Chang, Yue Yao, Liang Zheng, Mohammed Shaiqur Rahman, Meenakshi S. Arya, Anuj Sharma, Pranamesh Chakraborty, Sanjita Prajapati, Quan Kong, Norimasa Kobori, Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Fady Alnajjar, Ganzorig Batnasan, **-Yang Chen, Jun-Wei Hsieh, Xunlei Wu, Sameer Satish Pusegaonkar, Yizhou Wang, Sujit Biswas, Rama Chellappa

    Abstract: The eighth AI City Challenge highlighted the convergence of computer vision and artificial intelligence in areas like retail, warehouse settings, and Intelligent Traffic Systems (ITS), presenting significant research opportunities. The 2024 edition featured five tracks, attracting unprecedented interest from 726 teams in 47 countries and regions. Track 1 dealt with multi-target multi-camera (MTMC)…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Summary of the 8th AI City Challenge Workshop in conjunction with CVPR 2024