Showing 151–200 of 1,131 results for author: Zhou, M

  1. arXiv:2308.01095  [pdf, other]

    cs.CV

    AutoPoster: A Highly Automatic and Content-aware Design System for Advertising Poster Generation

    Authors: Jinpeng Lin, Min Zhou, Ye Ma, Yifan Gao, Chenxi Fei, Yangjian Chen, Zhang Yu, Tiezheng Ge

    Abstract: Advertising posters, a form of information presentation, combine visual and linguistic modalities. Creating a poster involves multiple steps and necessitates design experience and creativity. This paper introduces AutoPoster, a highly automatic and content-aware system for generating advertising posters. With only product images and titles as inputs, AutoPoster can automatically produce posters of… ▽ More

    Submitted 23 August, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: Accepted for ACM MM 2023

  2. arXiv:2308.00499  [pdf, ps, other]

    cs.IT

    Stochastic Geometry Based Modeling and Analysis on Network NOMA in Downlink CoMP Systems

    Authors: Yanshi Sun, Zhiguo Ding, Xuchu Dai, Momiao Zhou, Zhizhong Ding

    Abstract: This paper investigates the performance of network non-orthogonal multiple access (N-NOMA) in a downlink coordinated multi-point (CoMP) system. In the considered N-NOMA scheme, multiple base stations (BSs) cooperatively serve a CoMP user, meanwhile, each BS serves additional NOMA users by occupying the same resource block allocated to the CoMP user. The locations of the BSs and users are modeled b… ▽ More

    Submitted 1 August, 2023; originally announced August 2023.

  3. arXiv:2307.14901  [pdf, other]

    cs.CV cs.AI

    Text-guided Foundation Model Adaptation for Pathological Image Classification

    Authors: Yunkun Zhang, Jin Gao, Mu Zhou, Xiaosong Wang, Yu Qiao, Shaoting Zhang, Dequan Wang

    Abstract: The recent surge of foundation models in computer vision and natural language processing opens up perspectives in utilizing multi-modal clinical data to train large models with strong generalizability. Yet pathological image datasets often lack biomedical text annotation and enrichment. Guiding data-efficient image diagnosis from the use of biomedical text knowledge becomes a substantial interest.… ▽ More

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: Accepted to MICCAI2023

  4. EFLNet: Enhancing Feature Learning for Infrared Small Target Detection

    Authors: Bo Yang, Xinyu Zhang, Jian Zhang, Jun Luo, Mingliang Zhou, Yangjun Pi

    Abstract: Single-frame infrared small target detection is considered to be a challenging task, due to the extreme imbalance between target and background, bounding box regression is extremely sensitive to infrared small target, and target information is easy to lose in the high-level semantic layer. In this article, we propose an enhancing feature learning network (EFLNet) to address these problems. First,… ▽ More

    Submitted 27 February, 2024; v1 submitted 27 July, 2023; originally announced July 2023.

    Journal ref: IEEE Transactions on Geoscience and Remote Sensing 19 February 2024

  5. arXiv:2307.13693  [pdf, other]

    cs.CL

    Evaluating Large Language Models for Radiology Natural Language Processing

    Authors: Zhengliang Liu, Tianyang Zhong, Yiwei Li, Yutong Zhang, Yi Pan, Zihao Zhao, Peixin Dong, Chao Cao, Yuxiao Liu, Peng Shu, Yaonai Wei, Zihao Wu, Chong Ma, Jiaqi Wang, Sheng Wang, Mengyue Zhou, Zuowei Jiang, Chunlin Li, Jason Holmes, Shaochen Xu, Lu Zhang, Haixing Dai, Kai Zhang, Lin Zhao, Yuanhao Chen , et al. (20 additional authors not shown)

    Abstract: The rise of large language models (LLMs) has marked a pivotal shift in the field of natural language processing (NLP). LLMs have revolutionized a multitude of domains, and they have made a significant impact in the medical field. Large language models are now more abundant than ever, and many of these models exhibit bilingual capabilities, proficient in both English and Chinese. However, a compreh… ▽ More

    Submitted 27 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

  6. arXiv:2307.12755  [pdf, other]

    astro-ph.IM astro-ph.HE gr-qc

    Testing General Relativity with Black Hole X-Ray Data and ABHModels

    Authors: Cosimo Bambi, Askar B. Abdikamalov, Honghui Liu, Shafqat Riaz, Swarnim Shashank, Menglei Zhou

    Abstract: The past 10 years have seen tremendous progress in our capability of testing General Relativity in the strong field regime with black hole observations. 10 years ago, the theory of General Relativity was almost completely unexplored in the strong field regime. Today, we have gravitational wave data of the coalescence of stellar-mass black holes, radio images of the supermassive black holes SgrA… ▽ More

    Submitted 23 April, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: 31 pages, 5 figures. Talk given at the Frascati Workshop 2023 "Multifrequency Behaviour of High Energy Cosmic Sources - XIV" (Palermo, Italy, 12-17 June 2023). v2: refereed version

    Journal ref: PoS MULTIF2023, 016 (2024)

  7. arXiv:2307.11952  [pdf, other]

    cs.CV cs.AI

    Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction

    Authors: Kexin Ding, Mu Zhou, Dimitris N. Metaxas, Shaoting Zhang

    Abstract: Survival outcome assessment is challenging and inherently associated with multiple clinical factors (e.g., imaging and genomics biomarkers) in cancer. Enabling multimodal analytics promises to reveal novel predictive patterns of patient outcomes. In this study, we propose a multimodal transformer (PathOmics) integrating pathology and genomics insights into colon-related cancer survival prediction.… ▽ More

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: Accepted to MICCAI2023 (Top14%)

  8. Learning and Evaluating Human Preferences for Conversational Head Generation

    Authors: Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei

    Abstract: A reliable and comprehensive evaluation metric that aligns with manual preference assessments is crucial for conversational head video synthesis methods development. Existing quantitative evaluations often fail to capture the full complexity of human preference, as they only consider limited evaluation dimensions. Qualitative evaluations and user studies offer a solution but are time-consuming and… ▽ More

    Submitted 2 August, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted by ACM Multimedia 2023

  9. arXiv:2307.09066  [pdf, other]

    cs.CV

    PatchCT: Aligning Patch Set and Label Set with Conditional Transport for Multi-Label Image Classification

    Authors: Miaoge Li, Dongsheng Wang, Xinyang Liu, Zequn Zeng, Ruiying Lu, Bo Chen, Mingyuan Zhou

    Abstract: Multi-label image classification is a prediction task that aims to identify more than one label from a given image. This paper considers the semantic consistency of the latent space between the visual patch and linguistic label domains and introduces the conditional transport (CT) theory to bridge the acknowledged gap. While recent cross-modal attention-based studies have attempted to align such t… ▽ More

    Submitted 18 August, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

    Comments: accepted by ICCV23

  10. Spin measurement of 4U 1543-47 with Insight-HXMT and NICER from its 2021 outburst: A test of accretion disk models at high luminosities

    Authors: E. S. Yorgancioglu, Q. C. Bu, A. Santangelo, L. Tao, S. W. Davis, A. Vahdat, L. D. Kong, S. Piraino, M. Zhou, S. N. Zhang

    Abstract: 4U 1543--47 is one of a handful of known black hole candidates located in the Milky Way Galaxy, and has undergone a very bright outburst in 2021, reaching a total of $\sim$9 Crab, as observed by the Monitor of All-sky Image (MAXI), and exceeding twice its Eddington luminosity. The unprecedented bright outburst of 4U 1543--47 provides a unique opportunity to test the behavior of accretion disk mode… ▽ More

    Submitted 21 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

    Comments: 10 pages, 6 figures

    Journal ref: A&A 677, A79 (2023)

  11. arXiv:2307.05633  [pdf, other]

    cs.LG

    Transaction Fraud Detection via an Adaptive Graph Neural Network

    Authors: Yue Tian, Guanjun Liu, Jiacun Wang, Mengchu Zhou

    Abstract: Many machine learning methods have been proposed to achieve accurate transaction fraud detection, which is essential to the financial security of individuals and banks. However, most existing methods leverage original features only or require manual feature engineering. They lack the ability to learn discriminative representations from transaction data. Moreover, criminals often commit fraud by im… ▽ More

    Submitted 11 July, 2023; originally announced July 2023.

  12. arXiv:2307.04858  [pdf, other]

    cs.HC cs.CV q-bio.NC

    AmadeusGPT: a natural language interface for interactive animal behavioral analysis

    Authors: Shaokai Ye, Jessy Lauer, Mu Zhou, Alexander Mathis, Mackenzie W. Mathis

    Abstract: The process of quantifying and analyzing animal behavior involves translating the naturally occurring descriptive language of their actions into machine-readable code. Yet, codifying behavior analysis is often challenging without deep understanding of animal behavior and technical machine learning knowledge. To limit this gap, we introduce AmadeusGPT: a natural language interface that turns natura… ▽ More

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: demo available https://github.com/AdaptiveMotorControlLab/AmadeusGPT

    Journal ref: Published in Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS) 2023

  13. arXiv:2307.03983  [pdf, ps, other]

    cs.IT

    Hybrid Successive Interference Cancellation and Power Adaptation: a Win-Win Strategy for Robust Uplink NOMA Transmission

    Authors: Yanshi Sun, Wei Cao, Momiao Zhou, Zhiguo Ding

    Abstract: The aim of this paper is to reveal the importance of hybrid successive interference cancellation (SIC) and power adaptation (PA) for improving transmission robustness of uplink non-orthogonal multiple access (NOMA). Particularly, a cognitive radio inspired uplink NOMA communication scenario is considered, where one primary user is allocated one dedicated resource block, while M secondary users com… ▽ More

    Submitted 8 July, 2023; originally announced July 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2307.01517

  14. arXiv:2307.03705  [pdf, other]

    cs.RO cs.AI

    Intelligent Robotic Sonographer: Mutual Information-based Disentangled Reward Learning from Few Demonstrations

    Authors: Zhongliang Jiang, Yuan Bi, Mingchuan Zhou, Ying Hu, Michael Burke, Nassir Navab

    Abstract: Ultrasound (US) imaging is widely used for biometric measurement and diagnosis of internal organs due to the advantages of being real-time and radiation-free. However, due to inter-operator variations, resulting images highly depend on the experience of sonographers. This work proposes an intelligent robotic sonographer to autonomously "explore" target anatomies and navigate a US probe to a releva… ▽ More

    Submitted 29 November, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

  15. arXiv:2307.02090  [pdf, other]

    cs.CV

    Interactive Conversational Head Generation

    Authors: Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao

    Abstract: We introduce a new conversation head generation benchmark for synthesizing behaviors of a single interlocutor in a face-to-face conversation. The capability to automatically synthesize interlocutors which can participate in long and multi-turn conversations is vital and offers benefits for various applications, including digital humans, virtual agents, and social robots. While existing research pri… ▽ More

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: arXiv admin note: text overlap with arXiv:2112.13548

  16. arXiv:2307.01517  [pdf, ps, other]

    cs.IT

    New Designs of Robust Uplink NOMA in Cognitive Radio Inspired Communications

    Authors: Yanshi Sun, Wei Cao, Momiao Zhou, Zhiguo Ding

    Abstract: This paper considers a cognitive radio inspired uplink communication scenario, where one primary user is allocated with one dedicated resource block, while $M$ secondary users compete with each other to opportunistically access the primary user's channel. Two new designs of NOMA schemes, namely hybrid successive interference cancellation with power adaptation (HSIC-PA) and fixed successive interfe… ▽ More

    Submitted 4 July, 2023; originally announced July 2023.

  17. arXiv:2307.00479  [pdf, other]

    eess.IV cs.CV

    Domain Transfer Through Image-to-Image Translation for Uncertainty-Aware Prostate Cancer Classification

    Authors: Meng Zhou, Amoon Jamzad, Jason Izard, Alexandre Menard, Robert Siemens, Parvin Mousavi

    Abstract: Prostate Cancer (PCa) is a prevalent disease among men, and multi-parametric MRIs offer a non-invasive method for its detection. While MRI-based deep learning solutions have shown promise in supporting PCa diagnosis, acquiring sufficient training data, particularly in local clinics remains challenging. One potential solution is to take advantage of publicly available datasets to pre-train deep mod… ▽ More

    Submitted 3 June, 2024; v1 submitted 2 July, 2023; originally announced July 2023.

    Comments: Preprint. In Submission

  18. arXiv:2306.16307  [pdf, other]

    cs.SE

    Characterizing Deep Learning Package Supply Chains in PyPI: Domains, Clusters, and Disengagement

    Authors: Kai Gao, Runzhi He, Bing Xie, Minghui Zhou

    Abstract: Deep learning (DL) package supply chains (SCs) are critical for DL frameworks to remain competitive. However, vital knowledge on the nature of DL package SCs is still lacking. In this paper, we explore the domains, clusters, and disengagement of packages in two representative PyPI DL package SCs to bridge this knowledge gap. We analyze the metadata of nearly six million PyPI package distributions… ▽ More

    Submitted 20 December, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: Preprint of paper accepted by ACM Transactions on Software Engineering and Methodology (TOSEM)

  19. arXiv:2306.14274  [pdf, other]

    eess.IV cs.CV

    MEPNet: A Model-Driven Equivariant Proximal Network for Joint Sparse-View Reconstruction and Metal Artifact Reduction in CT Images

    Authors: Hong Wang, Minghao Zhou, Dong Wei, Yuexiang Li, Yefeng Zheng

    Abstract: Sparse-view computed tomography (CT) has been adopted as an important technique for speeding up data acquisition and decreasing radiation dose. However, due to the lack of sufficient projection data, the reconstructed CT images often present severe artifacts, which will be further amplified when patients carry metallic implants. For this joint sparse-view reconstruction and metal artifact reductio… ▽ More

    Submitted 25 June, 2023; originally announced June 2023.

    Comments: MICCAI 2023

  20. Explainable Recommendation with Personalized Review Retrieval and Aspect Learning

    Authors: Hao Cheng, Shuo Wang, Wensheng Lu, Wei Zhang, Mingyang Zhou, Kezhong Lu, Hao Liao

    Abstract: Explainable recommendation is a technique that combines prediction and generation tasks to produce more persuasive results. Among these tasks, textual generation demands large amounts of data to achieve satisfactory accuracy. However, historical user reviews of items are often insufficient, making it challenging to ensure the precision of generated explanation text. To address this issue, we propo… ▽ More

    Submitted 21 June, 2023; originally announced June 2023.

    Journal ref: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2023

  21. Visual-Aware Text-to-Speech

    Authors: Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei

    Abstract: Dynamically synthesizing talking speech that actively responds to a listening head is critical during the face-to-face interaction. For example, the speaker could take advantage of the listener's facial expression to adjust the tones, stressed syllables, or pauses. In this work, we present a new visual-aware text-to-speech (VA-TTS) task to synthesize speech conditioned on both textual inputs and s… ▽ More

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: accepted as oral and top 3% paper by ICASSP 2023

    Journal ref: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023, 1-5

  22. Democratizing Chatbot Debugging: A Computational Framework for Evaluating and Explaining Inappropriate Chatbot Responses

    Authors: Xu Han, Michelle Zhou, Yichen Wang, Wenxi Chen, Tom Yeh

    Abstract: Evaluating and understanding the inappropriateness of chatbot behaviors can be challenging, particularly for chatbot designers without technical backgrounds. To democratize the debugging process of chatbot misbehaviors for non-technical designers, we propose a framework that leverages dialogue act (DA) modeling to automate the evaluation and explanation of chatbot response inappropriateness. The f… ▽ More

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: 7 pages, 4 figures, accepted to CUI 2023 poster track

  23. arXiv:2306.09118  [pdf, other]

    cs.LG cs.AI

    Hyperbolic Representation Learning: Revisiting and Advancing

    Authors: Menglin Yang, Min Zhou, Rex Ying, Yankai Chen, Irwin King

    Abstract: The non-Euclidean geometry of hyperbolic spaces has recently garnered considerable attention in the realm of representation learning. Current endeavors in hyperbolic representation largely presuppose that the underlying hierarchies can be automatically inferred and preserved through the adaptive optimization process. This assumption, however, is questionable and requires further validation. In thi… ▽ More

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  24. arXiv:2306.07879  [pdf, other]

    cs.CV q-bio.QM

    Rethinking pose estimation in crowds: overcoming the detection information-bottleneck and ambiguity

    Authors: Mu Zhou, Lucas Stoffl, Mackenzie Weygandt Mathis, Alexander Mathis

    Abstract: Frequent interactions between individuals are a fundamental challenge for pose estimation algorithms. Current pipelines either use an object detector together with a pose estimator (top-down approach), or localize all body parts first and then link them to predict the pose of individuals (bottom-up). Yet, when individuals closely interact, top-down methods are ill-defined due to overlapping indivi… ▽ More

    Submitted 30 September, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: Published at ICCV 2023; Code at https://github.com/amathislab/BUCTD Video at https://www.youtube.com/watch?v=BHZnA-CZeZY

    Journal ref: ICCV Link: https://openaccess.thecvf.com/content/ICCV2023/papers/Zhou_Rethinking_Pose_Estimation_in_Crowds_Overcoming_the_Detection_Information_Bottleneck_ICCV_2023_paper.pdf

  25. ModelObfuscator: Obfuscating Model Information to Protect Deployed ML-based Systems

    Authors: Mingyi Zhou, Xiang Gao, Jing Wu, John Grundy, Xiao Chen, Chunyang Chen, Li Li

    Abstract: More and more edge devices and mobile apps are leveraging deep learning (DL) capabilities. Deploying such models on devices -- referred to as on-device models -- rather than as remote cloud-hosted services, has gained popularity because it avoids transmitting user data off of the device and achieves high response time. However, on-device models can be easily attacked, as they can be accessed by un… ▽ More

    Submitted 29 February, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Published In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA23)

    Journal ref: In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2023), 2023, 1005-1017

  26. arXiv:2306.03498  [pdf, other]

    math.AP

    Boundary regularity of uniformly rotating vortex patches and an unstable elliptic free boundary problem

    Authors: Yuchen Wang, Guanghui Zhang, Maolin Zhou

    Abstract: In this paper, we consider a sign-changing free boundary problem that comes from the boundary regularity of rotating vortex patches of the two-dimensional incompressible Euler equations. The complete classification of singular points has been obtained through establishing a new Weiss-type monotonicity formula. Upon these results, we prove that only $90^\circ$ corner type of singularity could happe… ▽ More

    Submitted 17 March, 2024; v1 submitted 6 June, 2023; originally announced June 2023.

  27. arXiv:2306.02416  [pdf, other]

    cs.CV

    Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation

    Authors: Yunhe Gao, Zhuowei Li, Di Liu, Mu Zhou, Shaoting Zhang, Dimitris N. Metaxas

    Abstract: A major focus of clinical imaging workflow is disease diagnosis and management, leading to medical imaging datasets strongly tied to specific clinical objectives. This scenario has led to the prevailing practice of developing task-specific segmentation models, without gaining insights from widespread imaging cohorts. Inspired by the training program of medical radiology residents, we propose a shi… ▽ More

    Submitted 6 April, 2024; v1 submitted 4 June, 2023; originally announced June 2023.

    Comments: Accepted by CVPR 2024

  28. arXiv:2306.00398  [pdf, other]

    cs.CL

    Preference-grounded Token-level Guidance for Language Model Fine-tuning

    Authors: Shentao Yang, Shujian Zhang, Congying Xia, Yihao Feng, Caiming Xiong, Mingyuan Zhou

    Abstract: Aligning language models (LMs) with preferences is an important problem in natural language generation. A key challenge is that preferences are typically provided at the *sequence level* while LM training and generation both occur at the *token level*. There is, therefore, a *granularity mismatch* between the preference and the LM training losses, which may complicate the learning problem. In this… ▽ More

    Submitted 9 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  29. arXiv:2305.18641  [pdf, other]

    cs.CL cs.CV

    Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs

    Authors: Mingyang Zhou, Yi R. Fung, Long Chen, Christopher Thomas, Heng Ji, Shih-Fu Chang

    Abstract: Building cross-model intelligence that can understand charts and communicate the salient information hidden behind them is an appealing challenge in the vision and language (V+L) community. The capability to uncover the underlined table data of chart figures is a critical key to automatic chart understanding. We introduce ChartT5, a V+L model that learns how to interpret table information from char… ▽ More

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: Accepted by Findings of ACL 2023

  30. arXiv:2305.18375  [pdf, other]

    cs.LG stat.ME stat.ML

    Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling

    Authors: Tianqi Chen, Mingyuan Zhou

    Abstract: Learning to denoise has emerged as a prominent paradigm to design state-of-the-art deep generative models for natural images. How to use it to model the distributions of both continuous real-valued data and categorical data has been well studied in recently proposed diffusion models. However, it is found in this paper to have limited ability in modeling some other types of data, such as count and… ▽ More

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: ICML 2023

  31. arXiv:2305.17030  [pdf, other]

    astro-ph.HE hep-ph

    The First LHAASO Catalog of Gamma-Ray Sources

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

    Abstract: We present the first catalog of very-high energy and ultra-high energy gamma-ray sources detected by the Large High Altitude Air Shower Observatory (LHAASO). The catalog was compiled using 508 days of data collected by the Water Cherenkov Detector Array (WCDA) from March 2021 to September 2022 and 933 days of data recorded by the Kilometer Squared Array (KM2A) from January 2020 to September 2022.… ▽ More

    Submitted 27 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 40 pages, 13 figures, 4 tables

    Journal ref: The Astrophysical Journal Supplement Series, 271 (2024) 25

  32. arXiv:2305.16310  [pdf, other]

    cs.CV

    Securing Deep Generative Models with Universal Adversarial Signature

    Authors: Yu Zeng, Mo Zhou, Yuan Xue, Vishal M. Patel

    Abstract: Recent advances in deep generative models have led to the development of methods capable of synthesizing high-quality, realistic images. These models pose threats to society due to their potential misuse. Prior research attempted to mitigate these threats by detecting generated images, but the varying traces left by different generative models make it challenging to create a universal detector cap… ▽ More

    Submitted 25 May, 2023; originally announced May 2023.

  33. arXiv:2305.15066  [pdf, other]

    cs.AI cs.CL

    GPT4Graph: Can Large Language Models Understand Graph Structured Data ? An Empirical Evaluation and Benchmarking

    Authors: Jiayan Guo, Lun Du, Hengyu Liu, Mengyu Zhou, Xinyi He, Shi Han

    Abstract: Large language models (LLM) like ChatGPT have become indispensable to artificial general intelligence (AGI), demonstrating excellent performance in various natural language processing tasks. In the real world, graph data is ubiquitous and an essential part of AGI and prevails in domains like social network analysis, bioinformatics and recommender systems. The training corpus of large language mode… ▽ More

    Submitted 11 July, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  34. arXiv:2305.14674  [pdf, other]

    cs.CV

    T1: Scaling Diffusion Probabilistic Fields to High-Resolution on Unified Visual Modalities

    Authors: Kangfu Mei, Mo Zhou, Vishal M. Patel

    Abstract: Diffusion Probabilistic Field (DPF) models the distribution of continuous functions defined over metric spaces. While DPF shows great potential for unifying data generation of various modalities including images, videos, and 3D geometry, it does not scale to a higher data resolution. This can be attributed to the "scaling property", where it is difficult for the model to capture local structures… ▽ More

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: for project page, see https://t1-diffusion-model.github.io

  35. arXiv:2305.13062  [pdf, other]

    cs.CL cs.AI cs.IR

    Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study

    Authors: Yuan Sui, Mengyu Zhou, Mingjie Zhou, Shi Han, Dongmei Zhang

    Abstract: Large language models (LLMs) are becoming attractive as few-shot reasoners to solve Natural Language (NL)-related tasks. However, there is still much to learn about how well LLMs understand structured data, such as tables. Although tables can be used as input to LLMs with serialization, there is a lack of comprehensive studies that examine whether LLMs can truly comprehend such data. In this paper… ▽ More

    Submitted 17 February, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: This paper has been accepted as a full paper at WSDM 2024. The code will be released at https://github.com/microsoft/TableProvider

  36. arXiv:2305.09967  [pdf, other]

    cs.CV cs.LG

    Variable Length Embeddings

    Authors: Johnathan Chiu, Andi Gu, Matt Zhou

    Abstract: In this work, we introduce a novel deep learning architecture, Variable Length Embeddings (VLEs), an autoregressive model that can produce a latent representation composed of an arbitrary number of tokens. As a proof of concept, we demonstrate the capabilities of VLEs on tasks that involve reconstruction and image decomposition. We evaluate our experiments on a mix of the iNaturalist and ImageNet… ▽ More

    Submitted 17 May, 2023; originally announced May 2023.

  37. arXiv:2305.09135  [pdf, ps, other]

    math.AG

    Frobenius splitting of moduli spaces of parabolic bundles

    Authors: Xiaotao Sun, Mingshuo Zhou

    Abstract: Let $C$ be a nonsingular projective curve over an algebraically closed field of characteristic $p>0$ and $I\subset C$ be a finite set. If $\mathcal{U}_{C,\,ω}$ denotes the moduli space of semistable parabolic bundles of rank $r$ and degree $d$ on $C$ with parabolic structures determined by $ω=(k,\{\vec n(x),\vec a(x)\}_{x\in I})$, we prove that $\mathcal{U}_{C,\,ω}$ is \textit{$F$-split} for gener… ▽ More

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: 42 pages

    MSC Class: Algebraic Geometry; 14H60; 14D20

  38. Quantum Neural Network for Quantum Neural Computing

    Authors: Min-Gang Zhou, Zhi-Ping Liu, Hua-Lei Yin, Chen-Long Li, Tong-Kai Xu, Zeng-Bing Chen

    Abstract: Neural networks have achieved impressive breakthroughs in both industry and academia. How to effectively develop neural networks on quantum computing devices is a challenging open problem. Here, we propose a new quantum neural network model for quantum neural computing using (classically-controlled) single-qubit operations and measurements on real-world quantum systems with naturally occurring env… ▽ More

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: 10 pages, 6 figures

    Journal ref: Research 6, 0134 (2023)

  39. arXiv:2305.07835  [pdf, other]

    cs.IT

    Multi-Scenario Broadband Channel Measurement and Modeling for Sub-6 GHz RIS-Assisted Wireless Communication Systems

    Authors: Jian Sang, Mingyong Zhou, Jifeng Lan, Boning Gao, Wankai Tang, Xiao Li, Shi Jin, Ertugrul Basar, Cen Li, Qiang Cheng, Tie Jun Cui

    Abstract: Reconfigurable intelligent surface (RIS)-empowered communication, has been considered widely as one of the revolutionary technologies for next generation networks. However, due to the novel propagation characteristics of RISs, underlying RIS channel modeling and measurement research is still in its infancy and not fully investigated. In this paper, we conduct multi-scenario broadband channel measu… ▽ More

    Submitted 13 May, 2023; originally announced May 2023.

  40. arXiv:2305.07774  [pdf, other]

    cs.CV eess.IV

    PanFlowNet: A Flow-Based Deep Network for Pan-sharpening

    Authors: Gang Yang, Xiangyong Cao, Wenzhe Xiao, Man Zhou, Aiping Liu, Xun Chen, Deyu Meng

    Abstract: Pan-sharpening aims to generate a high-resolution multispectral (HRMS) image by integrating the spectral information of a low-resolution multispectral (LRMS) image with the texture details of a high-resolution panchromatic (PAN) image. It essentially inherits the ill-posed nature of the super-resolution (SR) task that diverse HRMS images can degrade into an LRMS image. However, existing deep learn… ▽ More

    Submitted 16 May, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

  41. Measurement of ultra-high-energy diffuse gamma-ray emission of the Galactic plane from 10 TeV to 1 PeV with LHAASO-KM2A

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

    Abstract: The diffuse Galactic $γ$-ray emission, mainly produced via interactions between cosmic rays and the interstellar medium and/or radiation field, is a very important probe of the distribution, propagation, and interaction of cosmic rays in the Milky Way. In this work we report the measurements of diffuse $γ$-rays from the Galactic plane between 10 TeV and 1 PeV energies, with the square kilometer ar…

    Submitted 19 August, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

    Comments: 12 pages, 8 figures, 5 tables; accepted for publication in Physical Review Letters; source mask file provided as ancillary file

    Journal ref: Phys. Rev. Lett. 131, 151001 (2023)

  42. arXiv:2305.02838  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci

    Superconductivity in Mo$_4$Ga$_{20}$As with Endohedral Gallium Clusters

    Authors: Bin-Bin Ruan, Le-Wei Chen, Yun-Qing Shi, Jun-Kun Yi, Qing-Song Yang, Meng-Hu Zhou, Ming-Wei Ma, Gen-Fu Chen, Zhi-An Ren

    Abstract: We report the discovery and detailed investigation of superconductivity in Mo$_4$Ga$_{20}$As. Mo$_4$Ga$_{20}$As crystallizes in the space group of $I4/m$ (No. 87), with lattice parameters $a$ = 12.86352 Å and $c$ = 5.30031 Å. The resistivity, magnetization, and specific heat data reveal Mo$_4$Ga$_{20}$As to be a type-II superconductor with $T_c$ = 5.6 K. The upper and lower critical fields are esti…

    Submitted 4 May, 2023; originally announced May 2023.

    Journal ref: Journal of Physics: Condensed Matter 2023 35, 214002

  43. arXiv:2305.02499  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG stat.ML

    AutoML-GPT: Automatic Machine Learning with GPT

    Authors: Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou

    Abstract: AI tasks encompass a wide range of domains and fields. While numerous AI models have been designed for specific tasks and applications, they often require considerable human efforts in finding the right model architecture, optimization algorithm, and hyperparameters. Recent advances in large language models (LLMs) like ChatGPT show remarkable capabilities in various aspects of reasoning, comprehen…

    Submitted 3 May, 2023; originally announced May 2023.

  44. arXiv:2305.01115  [pdf, other]

    cs.CV

    In-Context Learning Unlocked for Diffusion Models

    Authors: Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou

    Abstract: We present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models. Given a pair of task-specific example images, such as depth from/to image and scribble from/to image, and a text guidance, our model automatically understands the underlying task and performs the same task on a new query image following the text guidance. To achieve this, we propose a vi…

    Submitted 18 October, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

  45. arXiv:2305.00562  [pdf, other]

    cs.CV cs.LG

    Class-Balancing Diffusion Models

    Authors: Yiming Qin, Huangjie Zheng, Jiangchao Yao, Mingyuan Zhou, Ya Zhang

    Abstract: Diffusion-based models have shown the merits of generating high-quality visual data while preserving better diversity in recent studies. However, such observation is only justified with curated data distribution, where the data samples are nicely pre-processed to be uniformly distributed in terms of their labels. In practice, a long-tailed data distribution appears more common and how diffusion mo…

    Submitted 14 June, 2023; v1 submitted 30 April, 2023; originally announced May 2023.

    Comments: Accepted by CVPR2023

  46. arXiv:2305.00350  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    POUF: Prompt-oriented unsupervised fine-tuning for large pre-trained models

    Authors: Korawat Tanwisuth, Shujian Zhang, Huangjie Zheng, Pengcheng He, Mingyuan Zhou

    Abstract: Through prompting, large-scale pre-trained models have become more expressive and powerful, gaining significant attention in recent years. Though these big models have zero-shot capabilities, in general, labeled data are still required to adapt them to downstream tasks. To overcome this critical limitation, we propose an unsupervised fine-tuning framework to directly fine-tune the model or prompt…

    Submitted 29 April, 2023; originally announced May 2023.

    Comments: ICML 2023; PyTorch code is available at https://github.com/korawat-tanwisuth/POUF

  47. Popularity Ratio Maximization: Surpassing Competitors through Influence Propagation

    Authors: Hao Liao, Sheng Bi, Jiao Wu, Wei Zhang, Mingyang Zhou, Rui Mao, Wei Chen

    Abstract: In this paper, we present an algorithmic study on how to surpass competitors in popularity by strategic promotions in social networks. We first propose a novel model, in which we integrate the Preferential Attachment (PA) model for popularity growth with the Independent Cascade (IC) model for influence propagation in social networks called PA-IC model. In PA-IC, a popular item and a novice item gr…

    Submitted 28 April, 2023; originally announced April 2023.

    Comments: 22 pages, 8 figures, to appear in SIGMOD 2023

  48. arXiv:2304.12526  [pdf, other]

    cs.CV cs.LG

    Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models

    Authors: Zhendong Wang, Yifan Jiang, Huangjie Zheng, Peihao Wang, Pengcheng He, Zhangyang Wang, Weizhu Chen, Mingyuan Zhou

    Abstract: Diffusion models are powerful, but they require a lot of time and data to train. We propose Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training time costs while improving data efficiency, which thus helps democratize diffusion model training to broader users. At the core of our innovations is a new conditional score function at the patch level, where the…

    Submitted 18 October, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

  49. arXiv:2304.04968  [pdf, other]

    cs.CV cs.GR cs.LG

    Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond

    Authors: Mohammadreza Armandpour, Ali Sadeghian, Huangjie Zheng, Amir Sadeghian, Mingyuan Zhou

    Abstract: Although text-to-image diffusion models have made significant strides in generating images from text, they are sometimes more inclined to generate images like the data on which the model was trained rather than the provided text. This limitation has hindered their usage in both 2D and 3D applications. To address this problem, we explored the use of negative prompts but found that the current imple…

    Submitted 26 April, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: Our project page is available at https://Perp-Neg.github.io/

  50. arXiv:2304.04484  [pdf, other]

    cs.IT eess.SP

    Quasi-Synchronous Random Access for Massive MIMO-Based LEO Satellite Constellations

    Authors: Keke Ying, Zhen Gao, Sheng Chen, Mingyu Zhou, Dezhi Zheng, Symeon Chatzinotas, Björn Ottersten, H. Vincent Poor

    Abstract: Low earth orbit (LEO) satellite constellation-enabled communication networks are expected to be an important part of many Internet of Things (IoT) deployments due to their unique advantage of providing seamless global coverage. In this paper, we investigate the random access problem in massive multiple-input multiple-output-based LEO satellite systems, where the multi-satellite cooperative process…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: 38 pages, 16 figures. This paper has been accepted by IEEE JSAC SI on 3GPP Technologies: 5G-Advanced and Beyond. Copyright may be transferred without notice, after which this version may no longer be accessible