Skip to main content

Showing 1–50 of 702 results for author: Tan, Z

.
  1. arXiv:2407.00390  [pdf, other

    cs.CL

    Advancing Process Verification for Large Language Models via Tree-Based Preference Learning

    Authors: Mingqian He, Yongliang Shen, Wenqi Zhang, Zeqi Tan, Weiming Lu

    Abstract: Large Language Models (LLMs) have demonstrated remarkable potential in handling complex reasoning tasks by generating step-by-step rationales.Some methods have proven effective in boosting accuracy by introducing extra verifiers to assess these paths. However, existing verifiers, typically trained on binary-labeled reasoning paths, fail to fully utilize the relative merits of intermediate steps, t… ▽ More

    Submitted 29 June, 2024; originally announced July 2024.

  2. arXiv:2406.19417  [pdf, other

    cs.CR cs.AI

    "Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models

    Authors: Zhen Tan, Chengshuai Zhao, Raha Moraffah, Yifan Li, Song Wang, Jundong Li, Tianlong Chen, Huan Liu

    Abstract: Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs) by integrating external knowledge bases, improving their performance in applications like fact-checking and information searching. In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases by injecting deceptive content into the retrieval database, intentionall… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Preprint

  3. arXiv:2406.18981  [pdf, other

    cond-mat.str-el cond-mat.quant-gas cond-mat.stat-mech hep-lat quant-ph

    Exact Fisher zeros and thermofield dynamics across a quantum critical point

    Authors: Yang Liu, Songtai Lv, Yuchen Meng, Zefan Tan, Erhai Zhao, Haiyuan Zou

    Abstract: By setting the inverse temperature $β$ loose to occupy the complex plane, Michael E. Fisher showed that the zeros of the complex partition function $Z$, if approaching the real $β$ axis, reveal a thermodynamic phase transition. More recently, Fisher zeros have been used to mark the dynamical phase transition in quench dynamics. The success of Fisher zeros however seems limited, and it is unclear h… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 6+4 pages; 3+4 figures

  4. arXiv:2406.17992  [pdf, other

    cs.CL cs.AI

    Catching Chameleons: Detecting Evolving Disinformation Generated using Large Language Models

    Authors: Bohan Jiang, Chengshuai Zhao, Zhen Tan, Huan Liu

    Abstract: Despite recent advancements in detecting disinformation generated by large language models (LLMs), current efforts overlook the ever-evolving nature of this disinformation. In this work, we investigate a challenging yet practical research problem of detecting evolving LLM-generated disinformation. Disinformation evolves constantly through the rapid development of LLMs and their variants. As a cons… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures

  5. arXiv:2406.17841  [pdf, other

    quant-ph cs.AI

    Probing many-body Bell correlation depth with superconducting qubits

    Authors: Ke Wang, Weikang Li, Shibo Xu, Mengyao Hu, Jiachen Chen, Yaozu Wu, Chuanyu Zhang, Feitong **, Xuhao Zhu, Yu Gao, Ziqi Tan, Aosai Zhang, Ning Wang, Yiren Zou, Tingting Li, Fanhao Shen, Jiarun Zhong, Zehang Bao, Zitian Zhu, Zixuan Song, **feng Deng, Hang Dong, Xu Zhang, Pengfei Zhang, Wenjie Jiang , et al. (10 additional authors not shown)

    Abstract: Quantum nonlocality describes a stronger form of quantum correlation than that of entanglement. It refutes Einstein's belief of local realism and is among the most distinctive and enigmatic features of quantum mechanics. It is a crucial resource for achieving quantum advantages in a variety of practical applications, ranging from cryptography and certified random number generation via self-testing… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 11 pages,6 figures + 14 pages, 6 figures

  6. arXiv:2406.16562  [pdf, other

    cs.CV cs.CL

    EVALALIGN: Supervised Fine-Tuning Multimodal LLMs with Human-Aligned Data for Evaluating Text-to-Image Models

    Authors: Zhiyu Tan, Xiaomeng Yang, Luozheng Qin, Meng** Yang, Cheng Zhang, Hao Li

    Abstract: The recent advancements in text-to-image generative models have been remarkable. Yet, the field suffers from a lack of evaluation metrics that accurately reflect the performance of these models, particularly lacking fine-grained metrics that can guide the optimization of the models. In this paper, we propose EvalAlign, a metric characterized by its accuracy, stability, and fine granularity. Our ap… ▽ More

    Submitted 26 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: Github Repository: https://github.com/SAIS-FUXI/EvalAlign

  7. arXiv:2406.16260  [pdf, other

    cs.CV cs.AI

    Video-Infinity: Distributed Long Video Generation

    Authors: Zhenxiong Tan, Xingyi Yang, Songhua Liu, Xinchao Wang

    Abstract: Diffusion models have recently achieved remarkable results for video generation. Despite the encouraging performances, the generated videos are typically constrained to a small number of frames, resulting in clips lasting merely a few seconds. The primary challenges in producing longer videos include the substantial memory requirements and the extended processing time required on a single GPU. A s… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

  8. arXiv:2406.16003  [pdf

    physics.optics

    Unidirectional Chiral Emission via Twisted Bi-layer Metasurfaces

    Authors: Dmitrii Gromyko, Shu An, Sergey Gorelik, Jiahui Xu, Li Jun Lim, Henry Yit Loong Lee, Febiana Tjiptoharsono, Zhi-Kuang Tan, Cheng-Wei Qiu, Zhaogang Dong, Lin Wu

    Abstract: Controlling and channelling light emissions from unpolarized quantum dots into specific directions with chiral polarization remains a key challenge in modern photonics. Stacked metasurface designs offer a potential compact solution for chirality and directionality engineering. However, experimental observations of directional chiral radiation from resonant metasurfaces with quantum emitters remain… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 16 pages, 4 figures

  9. arXiv:2406.15992  [pdf, other

    cs.CL

    Can LLM Graph Reasoning Generalize beyond Pattern Memorization?

    Authors: Yizhuo Zhang, Heng Wang, Shangbin Feng, Zhaoxuan Tan, Xiaochuang Han, Tianxing He, Yulia Tsvetkov

    Abstract: Large language models (LLMs) demonstrate great potential for problems with implicit graphical structures, while recent works seek to enhance the graph reasoning capabilities of LLMs through specialized instruction tuning. The resulting 'graph LLMs' are evaluated with in-distribution settings only, thus it remains underexplored whether LLMs are learning generalizable graph reasoning skills or merel… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 16 pages, 6 figures, Code and data will be publicly available at https://github.com/MatthewYZhang/NLGift

    ACM Class: I.2.7

  10. arXiv:2406.13906  [pdf, other

    stat.ME stat.ML

    Semi-supervised Regression Analysis with Model Misspecification and High-dimensional Data

    Authors: Ye Tian, Peng Wu, Zhiqiang Tan

    Abstract: The accessibility of vast volumes of unlabeled data has sparked growing interest in semi-supervised learning (SSL) and covariate shift transfer learning (CSTL). In this paper, we present an inference framework for estimating regression coefficients in conditional mean models within both SSL and CSTL settings, while allowing for the misspecification of conditional mean models. We develop an augment… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  11. FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering

    Authors: Tianchi Cai, Zhiwen Tan, Xierui Song, Tao Sun, Jiyan Jiang, Yunqi Xu, Yinger Zhang, **jie Gu

    Abstract: Retrieval Augmented Generation (RAG) has become prevalent in question-answering (QA) tasks due to its ability of utilizing search engine to enhance the quality of long-form question-answering (LFQA). Despite the emergence of various open source methods and web-enhanced commercial systems such as Bing Chat, two critical problems remain unsolved, i.e., the lack of factuality and clear logic in the g… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Report number: 30th

    Journal ref: KDD 2024

  12. arXiv:2406.12867  [pdf, ps, other

    math.QA

    Classification of quasi-affine Generalized Dynkin Diagrams with Rank $3$ and Rank $2$

    Authors: Zhengtang Tan, Shouchuan Zhang

    Abstract: All quasi-affine connected Generalized Dynkin Diagram with rank $= 3$ and $2$ are found. All quasi-affine Nichols (Lie braided) algebras with rank $ 3$ and $2$ are also found.

    Submitted 9 April, 2024; originally announced June 2024.

    Comments: 338 pages

    MSC Class: 16W30; 16G10

  13. arXiv:2406.12313  [pdf

    cs.DB

    A framework for develo** a knowledge management platform

    Authors: Marie Lisandra Zepeda Mendoza, Sonali Agarwal, James A. Blackshaw, Vanesa Bol, Audrey Fazzi, Filippo Fiorini, Amy Louise Foreman, Nancy George, Brett R. Johnson, Brian Martin, Dave McComb, Euphemia Mutasa-Gottgens, Helen Parkinson, Martin Romacker, Rolf Russell, Valérien Ségard, Shawn Zheng Kai Tan, Wei Kheng Teh, F. P. Winstanley, Benedict Wong, Adrian M. Smith

    Abstract: Knowledge management (KM) involves collecting, organizing, storing, and disseminating information to improve decision-making, innovation, and performance. Implementing KM at scale has become essential for organizations to effectively leverage vast accessible data. This paper is a compilation of concepts that emerged from KM workshops hosted by EMBL-EBI, attended by SMEs and industry. We provide gu… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 18 pages, 1 figure

  14. arXiv:2406.10471  [pdf, other

    cs.CL

    Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts

    Authors: Zhaoxuan Tan, Zheyuan Liu, Meng Jiang

    Abstract: Personalized large language models (LLMs) aim to tailor interactions, content, and recommendations to individual user preferences. While parameter-efficient fine-tuning (PEFT) methods excel in performance and generalization, they are costly and limit communal benefits when used individually. To this end, we introduce Personalized Pieces (Per-Pcs), a framework that allows users to safely share and… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  15. arXiv:2406.09899  [pdf, other

    cs.LG cs.AI

    Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem

    Authors: Zhentao Tan, Yadong Mu

    Abstract: Recently various optimization problems, such as Mixed Integer Linear Programming Problems (MILPs), have undergone comprehensive investigation, leveraging the capabilities of machine learning. This work focuses on learning-based solutions for efficiently solving the Quadratic Assignment Problem (QAPs), which stands as a formidable challenge in combinatorial optimization. While many instances of sim… ▽ More

    Submitted 19 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024

  16. arXiv:2406.08785  [pdf, other

    cs.CV

    BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection

    Authors: Wenjie Wang, Yehao Lu, Guangcong Zheng, Shuigen Zhan, Xiaoqing Ye, Zichang Tan, **gdong Wang, Gaoang Wang, Xi Li

    Abstract: Vision-based roadside 3D object detection has attracted rising attention in autonomous driving domain, since it encompasses inherent advantages in reducing blind spots and expanding perception range. While previous work mainly focuses on accurately estimating depth or height for 2D-to-3D map**, ignoring the position approximation error in the voxel pooling process. Inspired by this insight, we p… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  17. arXiv:2406.06911  [pdf, other

    cs.CV cs.AI

    AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

    Authors: Zigeng Chen, Xinyin Ma, Gongfan Fang, Zhenxiong Tan, Xinchao Wang

    Abstract: Diffusion models have garnered significant interest from the community for their great generative ability across various applications. However, their typical multi-step sequential-denoising nature gives rise to high cumulative latency, thereby precluding the possibilities of parallel computation. To address this, we introduce AsyncDiff, a universal and plug-and-play acceleration scheme that enable… ▽ More

    Submitted 27 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Work in progress. Project Page: https://czg1225.github.io/asyncdiff_page/

  18. arXiv:2406.06904  [pdf, other

    cs.RO cs.HC

    Person Transfer in the Field: Examining Real World Sequential Human-Robot Interaction Between Two Robots

    Authors: Xiang Zhi Tan, Elizabeth J. Carter, Aaron Steinfeld

    Abstract: With more robots being deployed in the world, users will likely interact with multiple robots sequentially when receiving services. In this paper, we describe an exploratory field study in which unsuspecting participants experienced a ``person transfer'' -- a scenario in which they first interacted with one stationary robot before another mobile robot joined to complete the interaction. In our 7-h… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to RO-MAN 2024

  19. arXiv:2406.06295  [pdf, other

    cs.SD eess.AS

    Zero-Shot Audio Captioning Using Soft and Hard Prompts

    Authors: Yiming Zhang, Xuenan Xu, Ruoyi Du, Haohe Liu, Yuan Dong, Zheng-Hua Tan, Wenwu Wang, Zhanyu Ma

    Abstract: In traditional audio captioning methods, a model is usually trained in a fully supervised manner using a human-annotated dataset containing audio-text pairs and then evaluated on the test sets from the same dataset. Such methods have two limitations. First, these methods are often data-hungry and require time-consuming and expensive human annotations to obtain audio-text pairs. Second, these model… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing

  20. arXiv:2406.06160  [pdf, other

    eess.AS

    The Effect of Training Dataset Size on Discriminative and Diffusion-Based Speech Enhancement Systems

    Authors: Philippe Gonzalez, Zheng-Hua Tan, Jan Østergaard, Jesper Jensen, Tommy Sonne Alstrøm, Tobias May

    Abstract: The performance of deep neural network-based speech enhancement systems typically increases with the training dataset size. However, studies that investigated the effect of training dataset size on speech enhancement performance did not consider recent approaches, such as diffusion-based generative models. Diffusion models are typically trained with massive datasets for image generation tasks, but… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  21. arXiv:2406.06140  [pdf, other

    cs.CL cs.LG

    Can I understand what I create? Self-Knowledge Evaluation of Large Language Models

    Authors: Zhiquan Tan, Lai Wei, **dong Wang, Xing Xie, Weiran Huang

    Abstract: Large language models (LLMs) have achieved remarkable progress in linguistic tasks, necessitating robust evaluation frameworks to understand their capabilities and limitations. Inspired by Feynman's principle of understanding through creation, we introduce a self-knowledge evaluation framework that is easy to implement, evaluating models on their ability to comprehend and respond to self-generated… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  22. arXiv:2406.05270  [pdf

    physics.med-ph cs.CV cs.LG eess.IV

    fastMRI Breast: A publicly available radial k-space dataset of breast dynamic contrast-enhanced MRI

    Authors: Eddy Solomon, Patricia M. Johnson, Zhengguo Tan, Radhika Tibrewala, Yvonne W. Lui, Florian Knoll, Linda Moy, Sungheon Gene Kim, Laura Heacock

    Abstract: This data curation work introduces the first large-scale dataset of radial k-space and DICOM data for breast DCE-MRI acquired in diagnostic breast MRI exams. Our dataset includes case-level labels indicating patient age, menopause status, lesion status (negative, benign, and malignant), and lesion type for each case. The public availability of this dataset and accompanying reconstruction code will… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  23. arXiv:2406.03999  [pdf, other

    cs.LG cs.CV

    Unveiling the Dynamics of Information Interplay in Supervised Learning

    Authors: Kun Song, Zhiquan Tan, Bochao Zou, Huimin Ma, Weiran Huang

    Abstract: In this paper, we use matrix information theory as an analytical tool to analyze the dynamics of the information interplay between data representations and classification head vectors in the supervised learning process. Specifically, inspired by the theory of Neural Collapse, we introduce matrix mutual information ratio (MIR) and matrix entropy difference ratio (HDR) to assess the interactions of… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024

  24. arXiv:2406.03799  [pdf

    cs.CV cs.AI

    Enhanced Semantic Segmentation Pipeline for WeatherProof Dataset Challenge

    Authors: Nan Zhang, Xidan Zhang, Jianing Wei, Fangjun Wang, Zhiming Tan

    Abstract: This report describes the winning solution to the WeatherProof Dataset Challenge (CVPR 2024 UG2+ Track 3). Details regarding the challenge are available at https://cvpr2024ug2challenge.github.io/track3.html. We propose an enhanced semantic segmentation pipeline for this challenge. Firstly, we improve semantic segmentation models, using backbone pretrained with Depth Anything to improve UperNet mod… ▽ More

    Submitted 6 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  25. arXiv:2406.03777  [pdf, other

    cs.LG cs.AI

    Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices

    Authors: Ruiyang Qin, Dancheng Liu, Zheyu Yan, Zhaoxuan Tan, Zixuan Pan, Zhenge Jia, Meng Jiang, Ahmed Abbasi, **jun Xiong, Yiyu Shi

    Abstract: The scaling laws have become the de facto guidelines for designing large language models (LLMs), but they were studied under the assumption of unlimited computing resources for both training and inference. As LLMs are increasingly used as personalized intelligent assistants, their customization (i.e., learning through fine-tuning) and deployment onto resource-constrained edge devices will become m… ▽ More

    Submitted 13 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Benckmarking paper

  26. arXiv:2406.02178  [pdf, other

    cs.SD cs.AI eess.AS

    Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations

    Authors: Sarthak Yadav, Zheng-Hua Tan

    Abstract: Despite its widespread adoption as the prominent neural architecture, the Transformer has spurred several independent lines of work to address its limitations. One such approach is selective state space models, which have demonstrated promising results for language modelling. However, their feasibility for learning self-supervised, general-purpose audio representations is yet to be investigated. T… ▽ More

    Submitted 7 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted at INTERSPEECH 2024

  27. arXiv:2406.01196  [pdf, other

    cs.CV cs.AI

    3D WholeBody Pose Estimation based on Semantic Graph Attention Network and Distance Information

    Authors: Sihan Wen, Xiantan Zhu, Zhiming Tan

    Abstract: In recent years, a plethora of diverse methods have been proposed for 3D pose estimation. Among these, self-attention mechanisms and graph convolutions have both been proven to be effective and practical methods. Recognizing the strengths of those two techniques, we have developed a novel Semantic Graph Attention Network which can benefit from the ability of self-attention to capture global contex… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  28. arXiv:2406.00777  [pdf, other

    cs.CV cs.AI

    Diffusion Features to Bridge Domain Gap for Semantic Segmentation

    Authors: Yuxiang Ji, Boyong He, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu

    Abstract: Pre-trained diffusion models have demonstrated remarkable proficiency in synthesizing images across a wide range of scenarios with customizable prompts, indicating their effective capacity to capture universal features. Motivated by this, our study delves into the utilization of the implicit knowledge embedded within diffusion models to address challenges in cross-domain semantic segmentation. Thi… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

  29. arXiv:2406.00323  [pdf, other

    cs.IR cs.MM

    BeFA: A General Behavior-driven Feature Adapter for Multimedia Recommendation

    Authors: Qile Fan, Penghang Yu, Zhiyi Tan, Bing-Kun Bao, Guanming Lu

    Abstract: Multimedia recommender systems focus on utilizing behavioral information and content information to model user preferences. Typically, it employs pre-trained feature encoders to extract content features, then fuses them with behavioral features. However, pre-trained feature encoders often extract features from the entire content simultaneously, including excessive preference-irrelevant details. We… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

  30. arXiv:2405.20754  [pdf, ps, other

    math.AP

    Non-uniqueness of weak solutions to 2D generalized Navier-Stokes equations

    Authors: Xinliang Li, Zhong Tan

    Abstract: We study the non-uniqueness of weak solutions for the two-dimensional hyper-dissipative Navier-Stokes equations in the super-critical spaces $L_{t}^γL_{x}^{p}$ when $α\in[1,\frac{3}{2})$, and obtain the conclusion that the non-uniqueness of the weak solutions at the endpoint $(γ,p)=(\infty, \frac{2}{2α-1})$ is sharp in view of the generalized Ladyženskaja-Prodi-Serrin condition by using a differen… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 30 pages

    MSC Class: 35A02; 35Q30; 76D05

  31. arXiv:2405.20711  [pdf, other

    cs.CV

    Revisiting Mutual Information Maximization for Generalized Category Discovery

    Authors: Zhaorui Tan, Chengrui Zhang, Xi Yang, Jie Sun, Kaizhu Huang

    Abstract: Generalized category discovery presents a challenge in a realistic scenario, which requires the model's generalization ability to recognize unlabeled samples from known and unknown categories. This paper revisits the challenge of generalized category discovery through the lens of information maximization (InfoMax) with a probabilistic parametric classifier. Our findings reveal that ensuring indepe… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: Preprint version

  32. arXiv:2405.18756  [pdf, other

    cs.LG cs.AI cs.CV stat.AP stat.ML

    Provable Contrastive Continual Learning

    Authors: Yichen Wen, Zhiquan Tan, Kaipeng Zheng, Chuanlong Xie, Weiran Huang

    Abstract: Continual learning requires learning incremental tasks with dynamic data distributions. So far, it has been observed that employing a combination of contrastive loss and distillation loss for training in continual learning yields strong performance. To the best of our knowledge, however, this contrastive continual learning framework lacks convincing theoretical explanations. In this work, we fill… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  33. arXiv:2405.14278  [pdf, other

    cs.CV

    SCMix: Stochastic Compound Mixing for Open Compound Domain Adaptation in Semantic Segmentation

    Authors: Kai Yao, Zhaorui Tan, Zixian Su, Xi Yang, Jie Sun, Kaizhu Huang

    Abstract: Open compound domain adaptation (OCDA) aims to transfer knowledge from a labeled source domain to a mix of unlabeled homogeneous compound target domains while generalizing to open unseen domains. Existing OCDA methods solve the intra-domain gaps by a divide-and-conquer strategy, which divides the problem into several individual and parallel domain adaptation (DA) tasks. Such approaches often conta… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  34. arXiv:2405.14092  [pdf, other

    cs.CL

    Large Language Models Can Self-Correct with Minimal Effort

    Authors: Zhenyu Wu, Qingkai Zeng, Zhihan Zhang, Zhaoxuan Tan, Chao Shen, Meng Jiang

    Abstract: Intrinsic self-correct was a method that instructed large language models (LLMs) to verify and correct their responses without external feedback. Unfortunately, the study concluded that the LLMs could not self-correct reasoning yet. We find that a simple yet effective verification method can unleash inherent capabilities of the LLMs. That is to mask a key condition in the question, add the current… ▽ More

    Submitted 23 June, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: Work in Progress

  35. arXiv:2405.12914  [pdf, other

    cs.CV

    An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation

    Authors: Zhiyu Tan, Meng** Yang, Luozheng Qin, Hao Yang, Ye Qian, Qiang Zhou, Cheng Zhang, Hao Li

    Abstract: One critical prerequisite for faithful text-to-image generation is the accurate understanding of text inputs. Existing methods leverage the text encoder of the CLIP model to represent input prompts. However, the pre-trained CLIP model can merely encode English with a maximum token length of 77. Moreover, the model capacity of the text encoder from CLIP is relatively limited compared to Large Langu… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Technical report. Project page: https://github.com/llm-conditioned-diffusion/llm-conditioned-diffusion

  36. arXiv:2405.11405  [pdf, ps, other

    cs.IT

    On the Rate-Distortion Function for Sampled Cyclostationary Gaussian Processes with Memory: Extended Version with Proofs

    Authors: Zikun Tan, Ron Dabora, H. Vincent Poor

    Abstract: In this work we study the rate-distortion function (RDF) for lossy compression of asynchronously-sampled continuous-time (CT) wide-sense cyclostationary (WSCS) Gaussian processes with memory. As the case of synchronous sampling, i.e., when the sampling interval is commensurate with the period of the cyclostationary statistics, has already been studied, we focus on discrete-time (DT) processes obta… ▽ More

    Submitted 23 May, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

    Comments: 11 pages, 0 figures, accepted by the 2024 IEEE International Symposium on Information Theory (ISIT 2024)

  37. arXiv:2405.07027  [pdf, other

    cs.CV cs.AI cs.RO

    TD-NeRF: Novel Truncated Depth Prior for Joint Camera Pose and Neural Radiance Field Optimization

    Authors: Zhen Tan, Zongtan Zhou, Yangbing Ge, Zi Wang, Xieyuanli Chen, Dewen Hu

    Abstract: The reliance on accurate camera poses is a significant barrier to the widespread deployment of Neural Radiance Fields (NeRF) models for 3D reconstruction and SLAM tasks. The existing method introduces monocular depth priors to jointly optimize the camera poses and NeRF, which fails to fully exploit the depth priors and neglects the impact of their inherent noise. In this paper, we propose Truncate… ▽ More

    Submitted 11 May, 2024; originally announced May 2024.

  38. arXiv:2405.06332  [pdf, ps, other

    math.OC

    Solving maximally comonotone inclusion problems via an implicit Newton-like inertial dynamical system and its discretization

    Authors: Z. Z. Tan, R. Hu, Y. P. Fang

    Abstract: This paper deals with an implicit Newton-like inertial dynamical system governed by a maximally comonotone inclusion problem in a Hilbert space. Under suitable conditions, we establish not only pointwise estimates and integral estimates for the velocity and the value of the associated Yosida regularization operator along the trajectory of the system, but also the weak convergence of the trajectory… ▽ More

    Submitted 10 May, 2024; originally announced May 2024.

  39. arXiv:2405.06041  [pdf

    cond-mat.mes-hall cond-mat.mtrl-sci

    Gate Tunable Asymmetric Ozone Adsorption on Graphene

    Authors: Zhen Qi, Wanlei Li, Jun Cheng, Zhongxin Guo, Chenglong Li, Shang Wang, Zuoquan Tan, Zhiting Gao, Yongchao Wang, Zichen Lian, Shanshan Chen, Yonglin He, Zhiyong Wang, Yapei Wang, **song Zhang, Yayu Wang, Peng Cai

    Abstract: Molecular adsorption is pivotal in device fabrication and material synthesis for quantum technology. However, elucidating the behavior of physisorption poses technical challenges. Here graphene with ultrahigh sensitivity was utilized to detect ozone adsorption at cryogenic temperatures. Significant hole do** observed in graphene indicates a strong interaction between ozone and graphene. Interest… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

  40. arXiv:2405.05413  [pdf

    cs.DB

    Digital Evolution: Novo Nordisk's Shift to Ontology-Based Data Management

    Authors: Shawn Zheng Kai Tan, Shounak Baksi, Thomas Gade Bjerregaard, Preethi Elangovan, Thrishna Kuttikattu Gopalakrishnan, Darko Hric, Joffrey Joumaa, Beidi Li, Kashif Rabbani, Santhosh Kannan Venkatesan, Joshua Daniel Valdez, Saritha Vettikunnel Kuriakose

    Abstract: Biomedical data is growing exponentially, and managing it is increasingly challenging. While Findable, Accessible, Interoperable and Reusable (FAIR) data principles provide guidance, their adoption has proven difficult, especially in larger enterprises like pharmaceutical companies. In this manuscript, we describe how we leverage an Ontology-Based Data Management (OBDM) strategy for digital transf… ▽ More

    Submitted 10 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: 14 pages, 2 figures

  41. arXiv:2405.04819  [pdf, other

    cs.CL cs.AI

    DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature

    Authors: Dawei Li, Shu Yang, Zhen Tan, Jae Young Baik, Sukwon Yun, Joseph Lee, Aaron Chacko, Bojian Hou, Duy Duong-Tran, Ying Ding, Huan Liu, Li Shen, Tianlong Chen

    Abstract: Recent advancements in large language models (LLMs) have achieved promising performances across various applications. Nonetheless, the ongoing challenge of integrating long-tail knowledge continues to impede the seamless adoption of LLMs in specialized domains. In this work, we introduce DALK, a.k.a. Dynamic Co-Augmentation of LLMs and KG, to address this limitation and demonstrate its ability on… ▽ More

    Submitted 12 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: Under Review; Incorrect author name revised

  42. arXiv:2405.03723  [pdf, other

    cs.LG stat.ME stat.ML

    Generative adversarial learning with optimal input dimension and its adaptive generator architecture

    Authors: Zhiyao Tan, Ling Zhou, Huazhen Lin

    Abstract: We investigate the impact of the input dimension on the generalization error in generative adversarial networks (GANs). In particular, we first provide both theoretical and practical evidence to validate the existence of an optimal input dimension (OID) that minimizes the generalization error. Then, to identify the OID, we introduce a novel framework called generalized GANs (G-GANs), which include… ▽ More

    Submitted 5 May, 2024; originally announced May 2024.

  43. arXiv:2405.00974  [pdf, other

    math.ST

    On Ridge Estimation in High-dimensional Rotationally Sparse Linear Regression

    Authors: Libin Liang, Zhiqiang Tan

    Abstract: Recently, deep neural networks have been found to nearly interpolate training data but still generalize well in various applications. To help understand such a phenomenon, it has been of interest to analyze the ridge estimator and its interpolation limit in high-dimensional regression models. For this motivation, we study the ridge estimator in a rotationally sparse setting of high-dimensional lin… ▽ More

    Submitted 1 May, 2024; originally announced May 2024.

  44. Misaka: Interactive Swarm Testbed for Smart Grid Distributed Algorithm Test and Evaluation

    Authors: Tingliang Zhang, Haiwang Zhong, Zhenfei Tan, Xinfei Yan

    Abstract: In this paper, we present Misaka, a visualized swarm testbed for smart grid algorithm evaluation, also an extendable open-source open-hardware platform for develo** tabletop tangible swarm interfaces. The platform consists of a collection of custom-designed 3 omni-directional wheels robots each 10 cm in diameter, high accuracy localization through a microdot pattern overlaid on top of the activi… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

    Journal ref: 2020 IEEE/IAS Industrial and Commercial Power System Asia (I&CPS Asia)

  45. arXiv:2404.16831  [pdf, other

    cs.CV

    The Third Monocular Depth Estimation Challenge

    Authors: Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, Yi** Bao, Xiao Liu, Dohyeong Kim, **seong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, **qiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora , et al. (16 additional authors not shown)

    Abstract: This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 su… ▽ More

    Submitted 27 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: To appear in CVPRW2024

  46. arXiv:2404.16339  [pdf, other

    cs.CV cs.AI

    Training-Free Unsupervised Prompt for Vision-Language Models

    Authors: Sifan Long, Linbin Wang, Zhen Zhao, Zichang Tan, Yiming Wu, Shengsheng Wang, **gdong Wang

    Abstract: Prompt learning has become the most effective paradigm for adapting large pre-trained vision-language models (VLMs) to downstream tasks. Recently, unsupervised prompt tuning methods, such as UPL and POUF, directly leverage pseudo-labels as supervisory information to fine-tune additional adaptation modules on unlabeled data. However, inaccurate pseudo labels easily misguide the tuning process and r… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  47. arXiv:2404.15878  [pdf, other

    quant-ph physics.flu-dyn

    Simulating unsteady fluid flows on a superconducting quantum processor

    Authors: Zhaoyuan Meng, Jiarun Zhong, Shibo Xu, Ke Wang, Jiachen Chen, Feitong **, Xuhao Zhu, Yu Gao, Yaozu Wu, Chuanyu Zhang, Ning Wang, Yiren Zou, Aosai Zhang, Zhengyi Cui, Fanhao Shen, Zehang Bao, Zitian Zhu, Ziqi Tan, Tingting Li, Pengfei Zhang, Shiying Xiong, Hekang Li, Qiujiang Guo, Zhen Wang, Chao Song , et al. (2 additional authors not shown)

    Abstract: Recent advancements of intermediate-scale quantum processors have triggered tremendous interest in the exploration of practical quantum advantage. The simulation of fluid dynamics, a highly challenging problem in classical physics but vital for practical applications, emerges as a good candidate for showing quantum utility. Here, we report an experiment on the digital simulation of unsteady flows,… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

  48. arXiv:2404.15381  [pdf, other

    cs.LG cs.AI

    Advances and Open Challenges in Federated Learning with Foundation Models

    Authors: Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Anran Li, Yulan Gao, Alysa Ziying Tan, Bo Zhao, Xiaoxiao Li, Zengxiang Li, Qiang Yang

    Abstract: The integration of Foundation Models (FMs) with Federated Learning (FL) presents a transformative paradigm in Artificial Intelligence (AI), offering enhanced capabilities while addressing concerns of privacy, data decentralization, and computational efficiency. This paper provides a comprehensive survey of the emerging field of Federated Foundation Models (FedFM), elucidating their synergistic rel… ▽ More

    Submitted 29 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Survey of Federated Foundation Models (FedFM)

  49. arXiv:2404.14453  [pdf, other

    cs.CL cs.AI cs.DB

    EPI-SQL: Enhancing Text-to-SQL Translation with Error-Prevention Instructions

    Authors: Xi** Liu, Zhao Tan

    Abstract: The conversion of natural language queries into SQL queries, known as Text-to-SQL, is a critical yet challenging task. This paper introduces EPI-SQL, a novel methodological framework leveraging Large Language Models (LLMs) to enhance the performance of Text-to-SQL tasks. EPI-SQL operates through a four-step process. Initially, the method involves gathering instances from the Spider dataset on whic… ▽ More

    Submitted 20 April, 2024; originally announced April 2024.

  50. arXiv:2404.14135  [pdf, other

    cs.CV

    Text in the Dark: Extremely Low-Light Text Image Enhancement

    Authors: Che-Tsung Lin, Chun Chet Ng, Zhi Qin Tan, Wan Jun Nah, Xinyu Wang, Jie Long Kew, Pohao Hsu, Shang Hong Lai, Chee Seng Chan, Christopher Zach

    Abstract: Extremely low-light text images are common in natural scenes, making scene text detection and recognition challenging. One solution is to enhance these images using low-light image enhancement methods before text extraction. However, previous methods often do not try to particularly address the significance of low-level features, which are crucial for optimal performance on downstream scene text t… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: The first two authors contributed equally to this work