
Showing 51–100 of 2,292 results for author: Yang, M

  1. arXiv:2405.17285  [pdf, ps, other]

    math.AP

    Asymptotic behavior of solutions for a critical heat equation with nonlocal reaction

    Authors: Jian Zhang, Jacques Giacomoni, Vicentiu Radulescu, Minbo Yang

    Abstract: In this paper, we consider the following nonlocal parabolic equation \begin{equation*} u_{t}-Δu=\left( \int_Ω\frac{|u(y,t)|^{2^{\ast}_μ}}{|x-y|^μ}dy\right) |u|^{2^{\ast}_μ-2}u,\ \text{in}\ Ω\times(0,\infty), \end{equation*} where $Ω$ is a bounded domain in $\mathbb{R}^{N}$, $0<μ<N$ and $2^{\ast}_μ=(2N-μ)/(N-2)$ denotes the critical exponent in the sense of the Hardy-Littlewood-Sobolev ineq… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    MSC Class: 35A01; 35B33; 35B40; 35B44; 35K05; 58J35

  2. arXiv:2405.17176  [pdf, other]

    cs.GR cs.AI

    DreamMat: High-quality PBR Material Generation with Geometry- and Light-aware Diffusion Models

    Authors: Yuqing Zhang, Yuan Liu, Zhiyu Xie, Lei Yang, Zhongyuan Liu, Mengzhou Yang, Runze Zhang, Qilong Kou, Cheng Lin, Wenping Wang, Xiaogang Jin

    Abstract: … 2D diffusion model, which often contains unwanted baked-in shading effects and results in unrealistic rendering effects in the downstream applications. Generating Physically Based Rendering (PBR) materials instead of just RGB textures would be a promising solution. However, directly distilling the PBR material parameters from 2D diffusion models still suffers from incorrect material decomposition,… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted to SIGGRAPH 2024

  3. arXiv:2405.16588  [pdf, other]

    cs.AI cs.GT cs.HC

    Attaining Human's Desirable Outcomes in Human-AI Interaction via Structural Causal Games

    Authors: Anjie Liu, Jianhong Wang, Haoxuan Li, Xu Chen, Jun Wang, Samuel Kaski, Mengyue Yang

    Abstract: In human-AI interaction, a prominent goal is to attain human's desirable outcome with the assistance of AI agents, which can be ideally delineated as a problem of seeking the optimal Nash Equilibrium that matches the human's desirable outcome. However, reaching the outcome is usually challenging due to the existence of multiple Nash Equilibria that are related to the assisting task but do not corr… ▽ More

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 38 pages, 5 figures

  4. arXiv:2405.16433  [pdf, other]

    cs.CL cs.AI cs.CY

    CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling

    Authors: Chenhao Zhang, Renhao Li, Minghuan Tan, Min Yang, Jingwei Zhu, Di Yang, Jiahao Zhao, Guancheng Ye, Chengming Li, Xiping Hu

    Abstract: Using large language models (LLMs) to assist psychological counseling is a significant but challenging task at present. Attempts have been made on improving empathetic conversations or acting as effective assistants in the treatment with LLMs. However, the existing datasets lack consulting knowledge, resulting in LLMs lacking professional consulting competence. Moreover, how to automatically evalu… ▽ More

    Submitted 10 June, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

    Comments: Accepted to Findings of ACL 2024

  5. arXiv:2405.16041  [pdf, other]

    cs.LG cs.AI

    Explainable Molecular Property Prediction: Aligning Chemical Concepts with Predictions via Language Models

    Authors: Zhenzhong Wang, Zehui Lin, Wanyu Lin, Ming Yang, Minggang Zeng, Kay Chen Tan

    Abstract: Providing explainable molecule property predictions is critical for many scientific domains, such as drug discovery and material science. Though transformer-based language models have shown great potential in accurate molecular property prediction, they neither provide chemically meaningful explanations nor faithfully reveal the molecular structure-property relationships. In this work, we develop… ▽ More

    Submitted 31 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  6. arXiv:2405.15304  [pdf, other]

    cs.LG cs.CV

    Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient

    Authors: Yongliang Wu, Shiji Zhou, Mingzhuo Yang, Lianzhe Wang, Wenbo Zhu, Heng Chang, Xiao Zhou, Xu Yang

    Abstract: Current text-to-image diffusion models have achieved groundbreaking results in image generation tasks. However, the unavoidable inclusion of sensitive information during pre-training introduces significant risks such as copyright infringement and privacy violations in the generated images. Machine Unlearning (MU) provides an effective way to the sensitive concepts captured by the model, has been sh… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

  7. arXiv:2405.15232  [pdf, other]

    cs.CV cs.CL

    DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception

    Authors: Run Luo, Yunshui Li, Longze Chen, Wanwei He, Ting-En Lin, Ziqiang Liu, Lei Zhang, Zikai Song, Xiaobo Xia, Tongliang Liu, Min Yang, Binyuan Hui

    Abstract: The development of large language models (LLMs) has significantly advanced the emergence of large multimodal models (LMMs). While LMMs have achieved tremendous success by promoting the synergy between multimodal comprehension and creation, they often face challenges when confronted with out-of-distribution data. This is primarily due to their reliance on image encoders trained to encode images int… ▽ More

    Submitted 3 July, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 25 pages. arXiv admin note: text overlap with arXiv:2401.10208 by other authors

  8. arXiv:2405.14343  [pdf, other]

    cs.CV

    Efficient Visual State Space Model for Image Deblurring

    Authors: Lingshun Kong, Jiangxin Dong, Ming-Hsuan Yang, Jinshan Pan

    Abstract: Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration. ViTs typically yield superior results in image restoration compared to CNNs due to their ability to capture long-range dependencies and input-dependent characteristics. However, the computational complexity of Transformer-based models grows quadratically with the image reso… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  9. arXiv:2405.12944  [pdf, other]

    cs.CV

    AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection

    Authors: Zizhao Chen, Yeqiang Qian, Xiaoxiao Yang, Chunxiang Wang, Ming Yang

    Abstract: Multispectral pedestrian detection has been shown to be effective in improving performance within complex illumination scenarios. However, prevalent double-stream networks in multispectral detection employ two separate feature extraction branches for multi-modal data, leading to nearly double the inference time compared to single-stream networks utilizing only one feature extraction branch. This i… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

  10. arXiv:2405.12914  [pdf, other]

    cs.CV

    An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation

    Authors: Zhiyu Tan, Mengping Yang, Luozheng Qin, Hao Yang, Ye Qian, Qiang Zhou, Cheng Zhang, Hao Li

    Abstract: One critical prerequisite for faithful text-to-image generation is the accurate understanding of text inputs. Existing methods leverage the text encoder of the CLIP model to represent input prompts. However, the pre-trained CLIP model can merely encode English with a maximum token length of 77. Moreover, the model capacity of the text encoder from CLIP is relatively limited compared to Large Langu… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Technical report. Project page: https://github.com/llm-conditioned-diffusion/llm-conditioned-diffusion

  11. arXiv:2405.12590  [pdf, other]

    cs.LG cs.DC

    Maverick-Aware Shapley Valuation for Client Selection in Federated Learning

    Authors: Mengwei Yang, Ismat Jarin, Baturalp Buyukates, Salman Avestimehr, Athina Markopoulou

    Abstract: Federated Learning (FL) allows clients to train a model collaboratively without sharing their private data. One key challenge in practical FL systems is data heterogeneity, particularly in handling clients with rare data, also referred to as Mavericks. These clients own one or more data classes exclusively, and the model performance becomes poor without their participation. Thus, utilizing Maveric… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

  12. arXiv:2405.11826  [pdf, other]

    astro-ph.IM hep-ex physics.ins-det

    Data quality control system and long-term performance monitor of the LHAASO-KM2A

    Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen , et al. (263 additional authors not shown)

    Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To… ▽ More

    Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: 15 pages, 9 figures

  13. arXiv:2405.10589  [pdf, other]

    cs.CV cs.AI eess.IV

    Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance

    Authors: I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu, Ming-Hsuan Yang, Sy-Yen Kuo

    Abstract: Crowd counting and localization have become increasingly important in computer vision due to their wide-ranging applications. While point-based strategies have been widely used in crowd counting methods, they face a significant challenge, i.e., the lack of an effective learning strategy to guide the matching process. This deficiency leads to instability in matching point proposals to target points… ▽ More

    Submitted 17 May, 2024; originally announced May 2024.

  14. arXiv:2405.10212  [pdf, other]

    cs.CL

    CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations

    Authors: Jiahao Zhao, Jingwei Zhu, Minghuan Tan, Min Yang, Di Yang, Chenhao Zhang, Guancheng Ye, Chengming Li, Xiping Hu

    Abstract: In this paper, we introduce a novel psychological benchmark, CPsyExam, constructed from questions sourced from Chinese language examinations. CPsyExam is designed to prioritize psychological knowledge and case analysis separately, recognizing the significance of applying psychological knowledge to real-world scenarios. From the pool of 22k questions, we utilize 4k to create the benchmark that offe… ▽ More

    Submitted 18 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  15. arXiv:2405.09120  [pdf]

    physics.geo-ph

    Martian seismic anisotropy underneath Elysium Planitia revealed by direct S wave splitting

    Authors: Jing Shi, Cunrui Han, Tao Wang, Chao Qi, Han Chen, Zhihan Yu, Jiaqi Geng, Minghan Yang, Xu Wang, Ling Chen, Hejiu Hui

    Abstract: Seismic anisotropy, arising from the crystallographic or lattice-preferred orientation of anisotropic minerals or the shape-preferred orientation of melts or cracks, can establish a critical link between Mars's past evolution and its current state. So far, although seismic anisotropy in Mars has been proposed due to different velocities of vertically and horizontally polarized shear waves in the M… ▽ More

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Manuscript has been submitted to Earth and Planetary Science Letters; 9 figures; 33 pages

  16. arXiv:2405.09090  [pdf, other]

    cs.CR

    Towards Next-Generation Steganalysis: LLMs Unleash the Power of Detecting Steganography

    Authors: Minhao Bai, Jinshuai Yang, Kaiyi Pang, Huili Wang, Yongfeng Huang

    Abstract: Linguistic steganography provides convenient implementation to hide messages, particularly with the emergence of AI generation technology. The potential abuse of this technology raises security concerns within societies, calling for powerful linguistic steganalysis to detect carrier containing steganographic messages. Existing methods are limited to finding distribution differences between stegano… ▽ More

    Submitted 15 May, 2024; originally announced May 2024.

  17. arXiv:2405.07784  [pdf, other]

    cs.CV

    Generating Human Motion in 3D Scenes from Text Descriptions

    Authors: Zhi Cen, Huaijin Pi, Sida Peng, Zehong Shen, Minghui Yang, Shuai Zhu, Hujun Bao, Xiaowei Zhou

    Abstract: Generating human motions from textual descriptions has gained growing research interest due to its wide range of applications. However, only a few works consider human-scene interactions together with text conditions, which is crucial for visual and physical realism. This paper focuses on the task of generating human motions in 3D indoor scenes given text descriptions of the human-scene interactio… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: Project page: https://zju3dv.github.io/text_scene_motion

  18. arXiv:2405.07691  [pdf, other]

    astro-ph.HE

    Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

    Abstract: The first source catalog of Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma ray source, 1LHAASO J1219+2915. In this paper, a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) i… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 11 pages, 5 figures

  19. MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels

    Authors: Qi Chen, Xiubo Geng, Corby Rosset, Carolyn Buractaon, Jingwen Lu, Tao Shen, Kun Zhou, Chenyan Xiong, Yeyun Gong, Paul Bennett, Nick Craswell, Xing Xie, Fan Yang, Bryan Tower, Nikhil Rao, Anlei Dong, Wenqi Jiang, Zheng Liu, Mingqin Li, Chuanjie Liu, Zengzhong Li, Rangan Majumder, Jennifer Neville, Andy Oakley, Knut Magne Risvik, et al. (6 additional authors not shown)

    Abstract: Recent breakthroughs in large models have highlighted the critical significance of data scale, labels and modals. In this paper, we introduce MS MARCO Web Search, the first large-scale information-rich web dataset, featuring millions of real clicked query-document labels. This dataset closely mimics real-world web document and query distribution, provides rich information for various kinds of down… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 10 pages, 6 figures, for associated dataset, see http://github.com/microsoft/MS-MARCO-Web-Search

  20. arXiv:2405.07442  [pdf]

    cs.SD cs.AI eess.AS q-bio.QM

    Rene: A Pre-trained Multi-modal Architecture for Auscultation of Respiratory Diseases

    Authors: Pengfei Zhang, Zhihang Zheng, Shichen Zhang, Minghao Yang, Shaojun Tang

    Abstract: Compared with invasive examinations that require tissue sampling, respiratory sound testing is a non-invasive examination method that is safer and easier for patients to accept. In this study, we introduce Rene, a pioneering large-scale model tailored for respiratory sound recognition. Rene has been rigorously fine-tuned with an extensive dataset featuring a broad array of respiratory audio sample… ▽ More

    Submitted 6 June, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

  21. arXiv:2405.07391  [pdf, other]

    cs.RO cs.AI cs.LG

    AnyRotate: Gravity-Invariant In-Hand Object Rotation with Sim-to-Real Touch

    Authors: Max Yang, Chenghua Lu, Alex Church, Yijiong Lin, Chris Ford, Haoran Li, Efi Psomopoulou, David A. W. Barton, Nathan F. Lepora

    Abstract: Human hands are capable of in-hand manipulation in the presence of different hand motions. For a robot hand, harnessing rich tactile information to achieve this level of dexterity still remains a significant challenge. In this paper, we present AnyRotate, a system for gravity-invariant multi-axis in-hand object rotation using dense featured sim-to-real touch. We tackle this problem by training a d… ▽ More

    Submitted 11 June, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

    Comments: Project website can be found at https://maxyang27896.github.io/anyrotate/

  22. arXiv:2405.06674  [pdf, other]

    cs.CL cs.AI

    Open-SQL Framework: Enhancing Text-to-SQL on Open-source Large Language Models

    Authors: Xiaojun Chen, Tianle Wang, Tianhao Qiu, Jianbin Qin, Min Yang

    Abstract: Despite the success of large language models (LLMs) in Text-to-SQL tasks, open-source LLMs encounter challenges in contextual understanding and response coherence. To tackle these issues, we present Open-SQL, a systematic methodology tailored for Text-to-SQL with open-source LLMs. Our contributions include a comprehensive evaluation of open-source LLMs in Text-to-SQL tasks, the \openprompt strategy f… ▽ More

    Submitted 4 May, 2024; originally announced May 2024.

  23. arXiv:2405.05587  [pdf, other]

    cs.CV cs.LG

    Navigate Beyond Shortcuts: Debiased Learning through the Lens of Neural Collapse

    Authors: Yining Wang, Junjie Sun, Chenyue Wang, Mi Zhang, Min Yang

    Abstract: Recent studies have noted an intriguing phenomenon termed Neural Collapse, that is, when the neural networks establish the right correlation between feature spaces and the training targets, their last-layer features, together with the classifier weights, will collapse into a stable and symmetric structure. In this paper, we extend the investigation of Neural Collapse to the biased datasets with im… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: CVPR 2024 Highlight

  24. arXiv:2405.04115  [pdf, other]

    cs.CR

    A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning

    Authors: Xiaoyang Xu, Mengda Yang, Wenzhe Yi, Ziang Li, Juan Wang, Hongxin Hu, Yong Zhuang, Yaxin Liu

    Abstract: Split Learning (SL) is a distributed learning framework renowned for its privacy-preserving features and minimal computational requirements. Previous research consistently highlights the potential privacy breaches in SL systems by server adversaries reconstructing training data. However, these studies often rely on strong assumptions or compromise system utility to enhance attack performance. This… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  25. arXiv:2405.02897  [pdf, other]

    cs.RO

    DexiTac: Soft Dexterous Tactile Gripping

    Authors: Chenghua Lu, Kailuan Tang, Max Yang, Tianqi Yue, Nathan F. Lepora

    Abstract: Grasping objects, whether they are flat, round, or narrow and whether they have regular or irregular shapes, introduces difficulties in determining the ideal grasping posture, even for the most state-of-the-art grippers. In this article, we presented a reconfigurable pneumatic gripper with fingers that could be set in various configurations, such as hooking, supporting, closuring, and pinching. Each… ▽ More

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 11 pages, 12 figures

  26. arXiv:2405.02008  [pdf, other]

    cs.CV

    DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model

    Authors: Peijin Jia, Tuopu Wen, Ziang Luo, Mengmeng Yang, Kun Jiang, Zhiquan Lei, Xuewei Tang, Ziyuan Liu, Le Cui, Kehua Sheng, Bo Zhang, Diange Yang

    Abstract: Constructing high-definition (HD) maps is a crucial requirement for enabling autonomous driving. In recent years, several map segmentation algorithms have been developed to address this need, leveraging advancements in Bird's-Eye View (BEV) perception. However, existing models still encounter challenges in producing realistic and consistent semantic map layouts. One prominent issue is the limited… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

  27. arXiv:2405.01356  [pdf, other]

    cs.CV

    Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance

    Authors: Kelvin C. K. Chan, Yang Zhao, Xuhui Jia, Ming-Hsuan Yang, Huisheng Wang

    Abstract: In subject-driven text-to-image synthesis, the synthesis process tends to be heavily influenced by the reference images provided by users, often overlooking crucial attributes detailed in the text prompt. In this work, we propose Subject-Agnostic Guidance (SAG), a simple yet effective solution to remedy the problem. We show that through constructing a subject-agnostic condition and applying our pr… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  28. arXiv:2405.01200  [pdf, other]

    eess.SY cs.LG

    Learning-to-solve unit commitment based on few-shot physics-guided spatial-temporal graph convolution network

    Authors: Mei Yang, Gao Qiu, Junyong Liu, Kai Liu

    Abstract: This letter proposes a few-shot physics-guided spatial temporal graph convolutional network (FPG-STGCN) to solve unit commitment (UC) quickly. First, STGCN is tailored to parameterize UC. Then, a few-shot physics-guided learning scheme is proposed. It exploits a few typical UC solutions yielded by a commercial optimizer to escape from local minima, and leverages the augmented Lagrangian method for cons… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

  29. SwipeGANSpace: Swipe-to-Compare Image Generation via Efficient Latent Space Exploration

    Authors: Yuto Nakashima, Mingzhe Yang, Yukino Baba

    Abstract: Generating preferred images using generative adversarial networks (GANs) is challenging owing to the high-dimensional nature of latent space. In this study, we propose a novel approach that uses simple user-swipe interactions to generate preferred images for users. To effectively explore the latent space with only swipe interactions, we apply principal component analysis to the latent space of the… ▽ More

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 11 pages, 13 figures

  30. arXiv:2404.18437  [pdf, ps, other]

    cs.IT

    A family of self-orthogonal divisible codes with locality 2

    Authors: Ziling Heng, Mengjie Yang, Yang Ming

    Abstract: Linear codes are widely studied due to their applications in communication, cryptography, quantum codes, distributed storage and many other fields. In this paper, we use the trace and norm functions over finite fields to construct a family of linear codes. The weight distributions of the codes are determined in three cases via Gaussian sums. The codes are shown to be self-orthogonal divisible code… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 25 pages

  31. arXiv:2404.18038  [pdf, other]

    quant-ph

    Multi-Stage Watermarking for Quantum Circuits

    Authors: Min Yang, Xiaolong Guo, Lei Jiang

    Abstract: Quantum computing represents a burgeoning computational paradigm that significantly advances the resolution of contemporary intricate problems across various domains, including cryptography, chemistry, and machine learning. Quantum circuits tailored to address specific problems have emerged as critical intellectual properties (IPs) for quantum computing companies, attributing to the escalating com… ▽ More

    Submitted 27 April, 2024; originally announced April 2024.

  32. arXiv:2404.17898  [pdf, ps, other]

    math.AP

    A two-phase problem with degenerate operator in Orlicz-Sobolev spaces

    Authors: Pedro F. Silva Pontes, Minbo Yang

    Abstract: In this paper we are interested in the study of a two-phase problem equipped with the $Φ$-Laplacian operator $$ Δ_Φu \coloneqq \mbox{div} \left(φ(|\nabla u|)\dfrac{\nabla u}{|\nabla u|}\right), $$ where $Φ(s)=e^{s^2}-1$ and $φ=Φ'$. We obtain the existence, boundedness, and Log-Lipschitz regularity of the minimizers of the energy functional associated to the two-phase problem. Furthermore,… ▽ More

    Submitted 27 April, 2024; originally announced April 2024.

  33. arXiv:2404.17894  [pdf, ps, other]

    cs.CV

    Unpaired Multi-view Clustering via Reliable View Guidance

    Authors: Like Xin, Wanqi Yang, Lei Wang, Ming Yang

    Abstract: This paper focuses on unpaired multi-view clustering (UMC), a challenging problem where paired observed samples are unavailable across multiple views. The goal is to perform effective joint clustering using the unpaired observed samples in all views. In incomplete multi-view clustering, existing methods typically rely on sample pairing between views to capture their complementary. However, that is… ▽ More

    Submitted 27 April, 2024; originally announced April 2024.

  34. arXiv:2404.17736  [pdf, other]

    eess.SP cs.CV cs.IT eess.IV

    Diffusion-Aided Joint Source Channel Coding For High Realism Wireless Image Transmission

    Authors: Mingyu Yang, Bowen Liu, Boyang Wang, Hun-Seok Kim

    Abstract: Deep learning-based joint source-channel coding (deep JSCC) has been demonstrated as an effective approach for wireless image transmission. Nevertheless, current research has concentrated on minimizing a standard distortion metric such as Mean Squared Error (MSE), which does not necessarily improve the perceptual quality. To address this issue, we propose DiffJSCC, a novel framework that leverages… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

  35. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng , et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte… ▽ More

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  36. arXiv:2404.16322  [pdf, other]

    cs.DB

    Bridging Speed and Accuracy to Approximate $K$-Nearest Neighbor Search

    Authors: Mingyu Yang, Jiabao Jin, Xiangyu Wang, Zhitao Shen, Wei Jia, Wentao Li, Wei Wang

    Abstract: Approximate K-Nearest Neighbor (AKNN) search in high-dimensional spaces is a critical yet challenging problem. The efficiency of AKNN search largely depends on the computation of distances, a process that significantly affects the runtime. To improve computational efficiency, existing work often opts for estimating approximate distances rather than computing exact distances, at the cost of reduced… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 13 pages

  37. arXiv:2404.15237  [pdf]

    cond-mat.mtrl-sci

    Insights into the defect-driven heterogeneous structural evolution of Ni-rich layered cathode in lithium-ion batteries

    Authors: Zhongyuan Huang, Ziwei Chen, Maolin Yang, Mihai Chu, Zenan Li, Sihao Deng, Lunhua He, Lei Jin, Rafal E. Dunin-Borkowski, Rui Wang, Jun Wang, Tingting Yang, Yinguo Xiao

    Abstract: Recently, considerable efforts have been made on research and improvement for Ni-rich lithium-ion batteries to meet the demand from vehicles and grid-level large-scale energy storage. Development of next-generation high-performance lithium-ion batteries requires a comprehensive understanding on the underlying electrochemical mechanisms associated with its structural evolution. In this work, advanc… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 29 pages and 5 figures for manuscript; 30 pages, 14 figures and 4 tables for supplementary information

  38. arXiv:2404.14852  [pdf, other]

    cs.CV

    Ultrasound Nodule Segmentation Using Asymmetric Learning with Simple Clinical Annotation

    Authors: Xingyue Zhao, Zhongyu Li, Xiangde Luo, Peiqi Li, Peng Huang, Jianwei Zhu, Yang Liu, Jihua Zhu, Meng Yang, Shi Chang, Jun Dong

    Abstract: Recent advances in deep learning have greatly facilitated the automated segmentation of ultrasound images, which is essential for nodule morphological analysis. Nevertheless, most existing methods depend on extensive and precise annotations by domain experts, which are labor-intensive and time-consuming. In this study, we suggest using simple aspect ratio annotations directly from ultrasound clini… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted by TCSVT

  39. arXiv:2404.14368  [pdf, other]

    cs.CV cs.AI cs.CL

    Graphic Design with Large Multimodal Model

    Authors: Yutao Cheng, Zhao Zhang, Maoke Yang, Hui Nie, Chunyuan Li, Xinglong Wu, Jie Shao

    Abstract: In the field of graphic design, automating the integration of design elements into a cohesive multi-layered artwork not only boosts productivity but also paves the way for the democratization of graphic design. One existing practice is Graphic Layout Generation (GLG), which aims to layout sequential design elements. It has been constrained by the necessity for a predefined correct sequence of laye… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  40. arXiv:2404.14066  [pdf, other]

    cs.CV cs.IR

    SHE-Net: Syntax-Hierarchy-Enhanced Text-Video Retrieval

    Authors: Xuzheng Yu, Chen Jiang, Xingning Dong, Tian Gan, Ming Yang, Qingpei Guo

    Abstract: The user base of short video apps has experienced unprecedented growth in recent years, resulting in a significant demand for video content analysis. In particular, text-video retrieval, which aims to find the top matching videos given text descriptions from a vast video corpus, is an essential function, the primary challenge of which is to bridge the modality gap. Nevertheless, most existing appr… ▽ More

    Submitted 6 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  41. arXiv:2404.14021  [pdf, ps, other]

    physics.chem-ph

    Physics-Informed Neural Networks and Beyond: Enforcing Physical Constraints in Quantum Dissipative Dynamics

    Authors: Arif Ullah, Yu Huang, Ming Yang, Pavlo O. Dral

    Abstract: Neural networks (NNs) accelerate simulations of quantum dissipative dynamics. Ensuring that these simulations adhere to fundamental physical laws is crucial, but has been largely ignored in the state-of-the-art NN approaches. We show that this may lead to implausible results measured by violation of the trace conservation. To recover the correct physical behavior, we develop physics-informed NNs t… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Two figures

  42. arXiv:2404.13783  [pdf, other]

    physics.gen-ph

    Spin Theory Based on the Extended Least Action Principle and Information Metrics: Quantization, Entanglement, and Bell Test With Time Delay

    Authors: Jianhao M. Yang

    Abstract: A theory of electron spin is developed here based on the extended least action principle and assumptions of intrinsic angular momentum of an electron with random orientations. By incorporating appropriate relative entropy for the random orientations of intrinsic angular momentum in the extended least action principle, the theory recovers the quantum formulation of electron spin. The two-level quan…

    Submitted 1 July, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: 24 pages, 7 figures. This article is related to arXiv:2302.14619 and arXiv:2310.02274

  43. arXiv:2404.13153  [pdf, other]

    eess.IV cs.CV

    Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring

    Authors: Chengxu Liu, Xuan Wang, Xiangyu Xu, Ruhao Tian, Shuai Li, Xueming Qian, Ming-Hsuan Yang

    Abstract: Eliminating image blur produced by various kinds of motion has been a challenging problem. Dominant approaches rely heavily on model capacity to remove blurring by reconstructing residual from blurry observation in feature space. These practices not only prevent the capture of spatially variable motion in the real world but also ignore the tailored handling of various motions in image space. In th…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  44. arXiv:2404.12009  [pdf, ps, other]

    math.AP

    Single-peak and multi-peak solutions for Hamiltonian elliptic systems in dimension two

    Authors: Hui Zhang, Minbo Yang, Jianjun Zhang, Xuexiu Zhong

    Abstract: This paper is concerned with the Hamiltonian elliptic system in dimension two \begin{equation*}\aligned \left\{ \begin{array}{lll} -ε^2Δu+V(x)u=g(v)\ & \text{in}\quad \mathbb{R}^2,\\ -ε^2Δv+V(x)v=f(u)\ & \text{in}\quad \mathbb{R}^2, \end{array}\right.\endaligned \end{equation*} where $V\in C(\mathbb{R}^2)$ has local minimum points, and $f,g\in C^1(\mathbb{R})$ are assumed to be of exponential growt…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2205.15474

  45. arXiv:2404.11836  [pdf, other]

    eess.SP

    AI-Empowered RIS-Assisted Networks: CV-Enabled RIS Selection and DNN-Enabled Transmission

    Authors: Conggang Hu, Yang Lu, Hongyang Du, Mi Yang, Bo Ai, Dusit Niyato

    Abstract: This paper investigates artificial intelligence (AI) empowered schemes for reconfigurable intelligent surface (RIS) assisted networks from the perspective of fast implementation. We formulate a weighted sum-rate maximization problem for a multi-RIS-assisted network. To avoid the huge channel estimation overhead of activating all RISs, we propose a computer vision (CV) enabled RIS selection scheme ba…

    Submitted 17 April, 2024; originally announced April 2024.

  46. arXiv:2404.11576  [pdf, other]

    cs.CV

    State-space Decomposition Model for Video Prediction Considering Long-term Motion Trend

    Authors: Fei Cui, Jiaojiao Fang, Xiaojiang Wu, Zelong Lai, Mengke Yang, Menghan Jia, Guizhong Liu

    Abstract: Stochastic video prediction enables the consideration of uncertainty in future motion, thereby providing a better reflection of the dynamic nature of the environment. Stochastic video prediction methods based on image auto-regressive recurrent models need to feed their predictions back into the latent space. Conversely, the state-space models, which decouple frame synthesis and temporal prediction…

    Submitted 17 April, 2024; originally announced April 2024.

  47. arXiv:2404.11475  [pdf, other]

    cs.CV cs.AI

    AdaIR: Exploiting Underlying Similarities of Image Restoration Tasks with Adapters

    Authors: Hao-Wei Chen, Yu-Syuan Xu, Kelvin C. K. Chan, Hsien-Kai Kuo, Chun-Yi Lee, Ming-Hsuan Yang

    Abstract: Existing image restoration approaches typically employ extensive networks specifically trained for designated degradations. Despite being effective, such methods inevitably entail considerable storage costs and computational overheads due to the reliance on task-specific networks. In this work, we go beyond this well-established framework and exploit the inherent commonalities among image restorat…

    Submitted 17 April, 2024; originally announced April 2024.

  48. arXiv:2404.11343  [pdf, other]

    cs.IR cs.AI

    Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System

    Authors: Sein Kim, Hongseok Kang, Seungyoon Choi, Donghyun Kim, Minchul Yang, Chanyoung Park

    Abstract: Collaborative filtering recommender systems (CF-RecSys) have shown successive results in enhancing the user experience on social media and e-commerce platforms. However, as CF-RecSys struggles under cold scenarios with sparse user-item interactions, recent strategies have focused on leveraging modality information of user/items (e.g., text or images) based on pre-trained modality encoders and Larg…

    Submitted 1 June, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: KDD 2024

  49. arXiv:2404.11313  [pdf, other]

    eess.IV cs.AI

    NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

    Authors: Xin Li, Kun Yuan, Ya**g Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, et al. (43 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from a popular short-form video platform, i.e., the Kuaishou/Kwai Platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge

  50. arXiv:2404.11175  [pdf, other]

    quant-ph

    Active Quantum Distillation

    Authors: Muchun Yang, D. L. Zhou

    Abstract: Quantum distillation is a modern technology to decrease the von Neumann entropy of a subsystem by coherent system dynamics. Here we propose an active quantum distillation protocol, in which a bang-bang theme is applied to actively control the coherent dynamics of our system in order to obtain a subsystem with the von Neumann entropy as low as possible. For a bipartite Bosonic system, we derive the…

    Submitted 21 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: 13 pages, 6 figures