Skip to main content

Showing 1–50 of 265 results for author: Gao, M

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.00488  [pdf, other

    cs.CL cs.AI

    PFME: A Modular Approach for Fine-grained Hallucination Detection and Editing of Large Language Models

    Authors: Kunquan Deng, Zeyu Huang, Chen Li, Chenghua Lin, Min Gao, Wenge Rong

    Abstract: Large Language Models (LLMs) excel in fluency but risk producing inaccurate content, called "hallucinations." This paper outlines a standardized process for categorizing fine-grained hallucination types and proposes an innovative framework--the Progressive Fine-grained Model Editor (PFME)--specifically designed to detect and correct fine-grained hallucinations in LLMs. PFME consists of two collabo… ▽ More

    Submitted 29 June, 2024; originally announced July 2024.

  2. arXiv:2406.18365  [pdf, other

    cs.CL

    Themis: Towards Flexible and Interpretable NLG Evaluation

    Authors: Xinyu Hu, Li Lin, Mingqi Gao, Xunjian Yin, Xiaojun Wan

    Abstract: The evaluation of natural language generation (NLG) tasks is a significant and longstanding research issue. With the recent emergence of powerful large language models (LLMs), some studies have turned to LLM-based automatic evaluation methods, which demonstrate great potential to become a new evaluation paradigm following traditional string-based and model-based metrics. However, despite the impro… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.17005  [pdf, other

    cs.CV

    PVUW 2024 Challenge on Complex Video Understanding: Methods and Results

    Authors: Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, Yaowei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Cheng**g Wu, Ting Liu, Luoqi Liu, Xinyu Liu, **g Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, **gnan Luo , et al. (12 additional authors not shown)

    Abstract: Pixel-level Video Understanding in the Wild Challenge (PVUW) focus on complex video understanding. In this CVPR 2024 workshop, we add two new tracks, Complex Video Object Segmentation Track based on MOSE dataset and Motion Expression guided Video Segmentation track based on MeViS dataset. In the two new tracks, we provide additional videos and annotations that feature challenging elements, such as… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: MOSE Challenge: https://henghuiding.github.io/MOSE/ChallengeCVPR2024, MeViS Challenge: https://henghuiding.github.io/MeViS/ChallengeCVPR2024

  4. arXiv:2406.14673  [pdf, other

    cs.CL

    Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell

    Authors: Taiming Lu, Muhan Gao, Kuai Yu, Adam Byerly, Daniel Khashabi

    Abstract: Large Language Models (LLMs) exhibit positional bias, struggling to utilize information from the middle or end of long contexts. Our study explores LLMs' long-context reasoning by probing their hidden representations. We find that while LLMs encode the position of target information, they often fail to leverage this in generating accurate responses. This reveals a disconnect between information re… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  5. arXiv:2406.12066  [pdf, other

    cs.CL

    Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks

    Authors: Jack Gallifant, Shan Chen, Pedro Moreira, Nikolaj Munch, Mingye Gao, Jackson Pond, Leo Anthony Celi, Hugo Aerts, Thomas Hartvigsen, Danielle Bitterman

    Abstract: Medical knowledge is context-dependent and requires consistent reasoning across various natural language expressions of semantically equivalent phrases. This is particularly crucial for drug names, where patients often use brand names like Advil or Tylenol instead of their generic equivalents. To study this, we create a new robustness dataset, RABBITS, to evaluate performance differences on medica… ▽ More

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: submitted for review, total 15 pages

  6. arXiv:2406.10304  [pdf, other

    cs.CL

    Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design

    Authors: Ming Gao, Hang Chen, Jun Du, Xin Xu, Hongxiao Guo, Hui Bu, Jianxing Yang, Ming Li, Chin-Hui Lee

    Abstract: Smart home technology has gained widespread adoption, facilitating effortless control of devices through voice commands. However, individuals with dysarthria, a motor speech disorder, face challenges due to the variability of their speech. This paper addresses the wake-up word spotting (WWS) task for dysarthric individuals, aiming to integrate them into real-world applications. To support this, we… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: to be published in Interspeech 2024

  7. arXiv:2406.09406  [pdf, other

    cs.CV cs.AI cs.LG

    4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

    Authors: Roman Bachmann, Oğuzhan Fatih Kar, David Mizrahi, Ali Garjani, Mingfei Gao, David Griffiths, Jiaming Hu, Afshin Dehghan, Amir Zamir

    Abstract: Current multimodal and multitask foundation models like 4M or UnifiedIO show promising results, but in practice their out-of-the-box abilities to accept diverse inputs and perform diverse tasks are limited by the (usually rather small) number of modalities and tasks they are trained on. In this paper, we expand upon the capabilities of them by training a single model on tens of highly diverse moda… ▽ More

    Submitted 14 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Project page at 4m.epfl.ch

  8. arXiv:2406.07967  [pdf, other

    cs.CL cs.LG

    Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling

    Authors: Jie Ruan, Xiao Pu, Mingqi Gao, Xiaojun Wan, Yuesheng Zhu

    Abstract: Human evaluation is viewed as a reliable evaluation method for NLG which is expensive and time-consuming. To save labor and costs, researchers usually perform human evaluation on a small subset of data sampled from the whole dataset in practice. However, different selection subsets will lead to different rankings of the systems. To give a more correct inter-system ranking and make the gold standar… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: With Appendix

  9. arXiv:2406.07043  [pdf, other

    cs.CV

    1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation

    Authors: Mingqi Gao, **gnan Luo, **yu Yang, Jungong Han, Feng Zheng

    Abstract: Motion Expression guided Video Segmentation (MeViS), as an emerging task, poses many new challenges to the field of referring video object segmentation (RVOS). In this technical report, we investigated and validated the effectiveness of static-dominant data and frame sampling on this challenging setting. Our solution achieves a J&F score of 0.5447 in the competition phase and ranks 1st in the MeVi… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  10. arXiv:2406.04983  [pdf, other

    cs.CV

    CityCraft: A Real Crafter for 3D City Generation

    Authors: Jie Deng, Wenhao Chai, Junsheng Huang, Zhonghan Zhao, Qixuan Huang, Mingyan Gao, Jianshu Guo, Shengyu Hao, Wenhao Hu, Jenq-Neng Hwang, Xi Li, Gaoang Wang

    Abstract: City scene generation has gained significant attention in autonomous driving, smart city development, and traffic simulation. It helps enhance infrastructure planning and monitoring solutions. Existing methods have employed a two-stage process involving city layout generation, typically using Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), or Transformers, followed by neur… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 20 pages, 9 figures

  11. arXiv:2406.01022  [pdf, other

    cs.CR cs.IR

    Poisoning Attacks and Defenses in Recommender Systems: A Survey

    Authors: Zongwei Wang, Junliang Yu, Min Gao, Wei Yuan, Guanhua Ye, Shazia Sadiq, Hongzhi Yin

    Abstract: Modern recommender systems (RS) have profoundly enhanced user experience across digital platforms, yet they face significant threats from poisoning attacks. These attacks, aimed at manipulating recommendation outputs for unethical gains, exploit vulnerabilities in RS through injecting malicious data or intervening model training. This survey presents a unique perspective by examining these threats… ▽ More

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 22 pages, 8 figures

  12. arXiv:2405.18320  [pdf, other

    cs.CV cs.AI cs.CL

    Self-Supervised Learning Based Handwriting Verification

    Authors: Mihir Chauhan, Mohammad Abuzar Shaikh, Bina Ramamurthy, Mingchen Gao, Siwei Lyu, Sargur Srihari

    Abstract: We present SSL-HV: Self-Supervised Learning approaches applied to the task of Handwriting Verification. This task involves determining whether a given pair of handwritten images originate from the same or different writer distribution. We have compared the performance of multiple generative, contrastive SSL approaches against handcrafted feature extractors and supervised learning on CEDAR AND data… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 14 pages, 6 figures, 2 tables

  13. arXiv:2405.15414  [pdf, other

    cs.AI

    Luban: Building Open-Ended Creative Agents via Autonomous Embodied Verification

    Authors: Yuxuan Guo, Shaohui Peng, Jiaming Guo, Di Huang, Xishan Zhang, Rui Zhang, Yifan Hao, Ling Li, Zikang Tian, Mingju Gao, Yutai Li, Yiming Gan, Shuai Liang, Zihao Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen

    Abstract: Building open agents has always been the ultimate goal in AI research, and creative agents are the more enticing. Existing LLM agents excel at long-horizon tasks with well-defined goals (e.g., `mine diamonds' in Minecraft). However, they encounter difficulties on creative tasks with open goals and abstract criteria due to the inability to bridge the gap between them, thus lacking feedback for self… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

  14. arXiv:2405.15245  [pdf, other

    cs.LG cs.AI

    Cooperative Backdoor Attack in Decentralized Reinforcement Learning with Theoretical Guarantee

    Authors: Mengtong Gao, Yifei Zou, Zuyuan Zhang, Xiuzhen Cheng, Dongxiao Yu

    Abstract: The safety of decentralized reinforcement learning (RL) is a challenging problem since malicious agents can share their poisoned policies with benign agents. The paper investigates a cooperative backdoor attack in a decentralized reinforcement learning scenario. Differing from the existing methods that hide a whole backdoor attack behind their shared policies, our method decomposes the backdoor be… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

  15. arXiv:2405.14496  [pdf, other

    cs.LG

    Hybrid Global Causal Discovery with Local Search

    Authors: Sujai Hiremath, Jacqueline R. M. A. Maasch, Mengxiao Gao, Promit Ghosal, Kyra Gan

    Abstract: Learning the unique directed acyclic graph corresponding to an unknown causal model is a challenging task. Methods based on functional causal models can identify a unique graph, but either suffer from the curse of dimensionality or impose strong parametric assumptions. To address these challenges, we propose a novel hybrid approach for global causal discovery in observational data that leverages l… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  16. arXiv:2405.13019  [pdf, other

    cs.CL cs.AI

    A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models

    Authors: Mahsa Khoshnoodi, Vinija Jain, Mingye Gao, Malavika Srikanth, Aman Chadha

    Abstract: Despite the crucial importance of accelerating text generation in large language models (LLMs) for efficiently producing content, the sequential nature of this process often leads to high inference latency, posing challenges for real-time applications. Various techniques have been proposed and developed to address these challenges and improve efficiency. This paper presents a comprehensive survey… ▽ More

    Submitted 24 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  17. arXiv:2405.12850  [pdf, other

    cs.CV

    Weakly supervised alignment and registration of MR-CT for cervical cancer radiotherapy

    Authors: Jjahao Zhang, Yin Gu, Deyu Sun, Yuhua Gao, Ming Gao, Ming Cui, Teng Zhang, He Ma

    Abstract: Cervical cancer is one of the leading causes of death in women, and brachytherapy is currently the primary treatment method. However, it is important to precisely define the extent of paracervical tissue invasion to improve cancer diagnosis and treatment options. The fusion of the information characteristics of both computed tomography (CT) and magnetic resonance imaging(MRI) modalities may be use… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

  18. arXiv:2405.06932  [pdf, ps, other

    cs.CL cs.AI

    Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training

    Authors: Junqin Huang, Zhongjie Hu, Zihao **g, Mengya Gao, Yichao Wu

    Abstract: In this report, we introduce Piccolo2, an embedding model that surpasses other models in the comprehensive evaluation over 6 tasks on CMTEB benchmark, setting a new state-of-the-art. Piccolo2 primarily leverages an efficient multi-task hybrid loss training approach, effectively harnessing textual data and labels from diverse downstream tasks. In addition, Piccolo2 scales up the embedding dimension… ▽ More

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: tech report

  19. arXiv:2405.05506  [pdf, other

    cs.CL

    Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias

    Authors: Shan Chen, Jack Gallifant, Mingye Gao, Pedro Moreira, Nikolaj Munch, Ajay Muthukkumar, Arvind Rajan, Jaya Kolluri, Amelia Fiske, Janna Hastings, Hugo Aerts, Brian Anthony, Leo Anthony Celi, William G. La Cava, Danielle S. Bitterman

    Abstract: Large language models (LLMs) are increasingly essential in processing natural languages, yet their application is frequently compromised by biases and inaccuracies originating in their training data. In this study, we introduce Cross-Care, the first benchmark framework dedicated to assessing biases and real world knowledge in LLMs, specifically focusing on the representation of disease prevalence… ▽ More

    Submitted 24 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: Submitted for review, data visualization tool available at: www.crosscare.net

  20. arXiv:2405.03435  [pdf, other

    cond-mat.dis-nn cs.AI cs.LG

    A method for quantifying the generalization capabilities of generative models for solving Ising models

    Authors: Qunlong Ma, Zhi Ma, Ming Gao

    Abstract: For Ising models with complex energy landscapes, whether the ground state can be found by neural networks depends heavily on the Hamming distance between the training datasets and the ground state. Despite the fact that various recently proposed generative models have shown good performance in solving Ising models, there is no adequate discussion on how to quantify their generalization capabilitie… ▽ More

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 10 pages, 7 figures

    Journal ref: Mach. Learn.: Sci. Technol. 5 (2024) 025011

  21. arXiv:2404.14219  [pdf, other

    cs.CL cs.AI

    Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

    Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra , et al. (90 additional authors not shown)

    Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset… ▽ More

    Submitted 23 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 19 pages

  22. arXiv:2404.13946  [pdf, other

    cs.LG

    Dual Model Replacement:invisible Multi-target Backdoor Attack based on Federal Learning

    Authors: Rong Wang, Guichen Zhou, Mingjun Gao, Yunpeng Xiao

    Abstract: In recent years, the neural network backdoor hidden in the parameters of the federated learning model has been proved to have great security risks. Considering the characteristics of trigger generation, data poisoning and model training in backdoor attack, this paper designs a backdoor attack method based on federated learning. Firstly, aiming at the concealment of the backdoor trigger, a TrojanGa… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  23. arXiv:2404.13941  [pdf, other

    eess.SY cs.AI cs.LG

    Autoencoder-assisted Feature Ensemble Net for Incipient Faults

    Authors: Mingxuan Gao, Min Wang, Maoyin Chen

    Abstract: Deep learning has shown the great power in the field of fault detection. However, for incipient faults with tiny amplitude, the detection performance of the current deep learning networks (DLNs) is not satisfactory. Even if prior information about the faults is utilized, DLNs can't successfully detect faults 3, 9 and 15 in Tennessee Eastman process (TEP). These faults are notoriously difficult to… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  24. arXiv:2404.12209  [pdf, other

    cs.CV

    Partial-to-Partial Shape Matching with Geometric Consistency

    Authors: Viktoria Ehm, Maolin Gao, Paul Roetzer, Marvin Eisenberger, Daniel Cremers, Florian Bernard

    Abstract: Finding correspondences between 3D shapes is an important and long-standing problem in computer vision, graphics and beyond. A prominent challenge are partial-to-partial shape matching settings, which occur when the shapes to match are only observed incompletely (e.g. from 3D scanning). Although partial-to-partial matching is a highly relevant setting in practice, it is rarely explored. Our work b… ▽ More

    Submitted 10 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  25. arXiv:2404.11129  [pdf, other

    cs.CV

    Fact :Teaching MLLMs with Faithful, Concise and Transferable Rationales

    Authors: Minghe Gao, Shuang Chen, Liang Pang, Yuan Yao, Jisheng Dang, Wenqiao Zhang, Juncheng Li, Siliang Tang, Yueting Zhuang, Tat-Seng Chua

    Abstract: The remarkable performance of Multimodal Large Language Models (MLLMs) has unequivocally demonstrated their proficient understanding capabilities in handling a wide array of visual tasks. Nevertheless, the opaque nature of their black-box reasoning processes persists as an enigma, rendering them uninterpretable and struggling with hallucination. Their ability to execute intricate compositional rea… ▽ More

    Submitted 17 April, 2024; originally announced April 2024.

  26. arXiv:2404.09226  [pdf, other

    eess.IV cs.CV cs.LG

    Breast Cancer Image Classification Method Based on Deep Transfer Learning

    Authors: Weimin Wang, Min Gao, Mingxuan Xiao, Xu Yan, Yufeng Li

    Abstract: To address the issues of limited samples, time-consuming feature design, and low accuracy in detection and classification of breast cancer pathological images, a breast cancer image classification model algorithm combining deep learning and transfer learning is proposed. This algorithm is based on the DenseNet structure of deep neural networks, and constructs a network model by introducing attenti… ▽ More

    Submitted 14 April, 2024; originally announced April 2024.

  27. arXiv:2404.08713  [pdf, other

    eess.IV cs.LG q-bio.QM

    Survival Prediction Across Diverse Cancer Types Using Neural Networks

    Authors: Xu Yan, Weimin Wang, MingXuan Xiao, Yufeng Li, Min Gao

    Abstract: Gastric cancer and Colon adenocarcinoma represent widespread and challenging malignancies with high mortality rates and complex treatment landscapes. In response to the critical need for accurate prognosis in cancer patients, the medical community has embraced the 5-year survival rate as a vital metric for estimating patient outcomes. This study introduces a pioneering approach to enhance survival… ▽ More

    Submitted 11 April, 2024; originally announced April 2024.

  28. arXiv:2404.08279  [pdf, other

    eess.IV cs.CV cs.LG

    Convolutional neural network classification of cancer cytopathology images: taking breast cancer as an example

    Authors: MingXuan Xiao, Yufeng Li, Xu Yan, Min Gao, Weimin Wang

    Abstract: Breast cancer is a relatively common cancer among gynecological cancers. Its diagnosis often relies on the pathology of cells in the lesion. The pathological diagnosis of breast cancer not only requires professionals and time, but also sometimes involves subjective judgment. To address the challenges of dependence on pathologists expertise and the time-consuming nature of achieving accurate breast… ▽ More

    Submitted 12 April, 2024; originally announced April 2024.

  29. arXiv:2404.07471  [pdf, other

    cs.SE cs.AI cs.CL

    Structure-aware Fine-tuning for Code Pre-trained Models

    Authors: Jiayi Wu, Renyu Zhu, Nuo Chen, Qiushi Sun, Xiang Li, Ming Gao

    Abstract: Over the past few years, we have witnessed remarkable advancements in Code Pre-trained Models (CodePTMs). These models achieved excellent representation capabilities by designing structure-based pre-training tasks for code. However, how to enhance the absorption of structural knowledge when fine-tuning CodePTMs still remains a significant challenge. To fill this gap, in this paper, we present Stru… ▽ More

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Accepted by COLING 2024

  30. arXiv:2404.06225  [pdf, other

    cond-mat.stat-mech cond-mat.dis-nn cs.LG

    Message Passing Variational Autoregressive Network for Solving Intractable Ising Models

    Authors: Qunlong Ma, Zhi Ma, **long Xu, Hairui Zhang, Ming Gao

    Abstract: Many deep neural networks have been used to solve Ising models, including autoregressive neural networks, convolutional neural networks, recurrent neural networks, and graph neural networks. Learning a probability distribution of energy configuration or finding the ground states of a disordered, fully connected Ising model is essential for statistical mechanics and NP-hard problems. Despite tremen… ▽ More

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 18 pages, 14 figures

  31. arXiv:2404.03999  [pdf, other

    cs.CV

    Finsler-Laplace-Beltrami Operators with Application to Shape Analysis

    Authors: Simon Weber, Thomas Dagès, Maolin Gao, Daniel Cremers

    Abstract: The Laplace-Beltrami operator (LBO) emerges from studying manifolds equipped with a Riemannian metric. It is often called the Swiss army knife of geometry processing as it allows to capture intrinsic shape information and gives rise to heat diffusion, geodesic distances, and a multitude of shape descriptors. It also plays a central role in geometric deep learning. In this work, we explore Finsler… ▽ More

    Submitted 5 April, 2024; originally announced April 2024.

  32. arXiv:2403.18196  [pdf, ps, other

    cs.LG cs.AI cs.CV cs.CY

    Looking Beyond What You See: An Empirical Analysis on Subgroup Intersectional Fairness for Multi-label Chest X-ray Classification Using Social Determinants of Racial Health Inequities

    Authors: Dana Moukheiber, Saurabh Mahindre, Lama Moukheiber, Mira Moukheiber, Mingchen Gao

    Abstract: There has been significant progress in implementing deep learning models in disease diagnosis using chest X- rays. Despite these advancements, inherent biases in these models can lead to disparities in prediction accuracy across protected groups. In this study, we propose a framework to achieve accurate diagnostic outcomes and ensure fairness across intersectional groups in high-dimensional chest… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: ICCV CVAMD 2023

  33. arXiv:2403.14638  [pdf, other

    cs.CY cs.LG

    Personalized Programming Guidance based on Deep Programming Learning Style Capturing

    Authors: Yingfan Liu, Renyu Zhu, Ming Gao

    Abstract: With the rapid development of big data and AI technology, programming is in high demand and has become an essential skill for students. Meanwhile, researchers also focus on boosting the online judging system's guidance ability to reduce students' dropout rates. Previous studies mainly targeted at enhancing learner engagement on online platforms by providing personalized recommendations. However, t… ▽ More

    Submitted 20 February, 2024; originally announced March 2024.

    Comments: 18th International Conference on Computer Science & Education

  34. arXiv:2403.14023  [pdf

    cs.CR

    A system capable of verifiably and privately screening global DNA synthesis

    Authors: Carsten Baum, Jens Berlips, Walther Chen, Hongrui Cui, Ivan Damgard, Jiangbin Dong, Kevin M. Esvelt, Mingyu Gao, Dana Gretton, Leonard Foner, Martin Kysel, Kaiyi Zhang, Juanru Li, Xiang Li, Omer Paneth, Ronald L. Rivest, Francesca Sage-Ling, Adi Shamir, Yue Shen, Meicen Sun, Vinod Vaikuntanathan, Lynn Van Hauwe, Theia Vogel, Benjamin Weinstein-Raun, Yun Wang , et al. (5 additional authors not shown)

    Abstract: Printing custom DNA sequences is essential to scientific and biomedical research, but the technology can be used to manufacture plagues as well as cures. Just as ink printers recognize and reject attempts to counterfeit money, DNA synthesizers and assemblers should deny unauthorized requests to make viral DNA that could be used to ignite a pandemic. There are three complications. First, we don't n… ▽ More

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Main text 10 pages, 4 figures. 5 supplementary figures. Total 21 pages. Direct correspondence to: Ivan B. Damgard ([email protected]), Andrew C. Yao ([email protected]), Kevin M. Esvelt ([email protected])

  35. arXiv:2403.09638  [pdf, other

    cs.CV

    SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior

    Authors: Huan-ang Gao, Mingju Gao, Jiaju Li, Wenyi Li, Rong Zhi, Hao Tang, Hao Zhao

    Abstract: Semantic image synthesis (SIS) shows good promises for sensor simulation. However, current best practices in this field, based on GANs, have not yet reached the desired level of quality. As latent diffusion models make significant strides in image generation, we are prompted to evaluate ControlNet, a notable method for its dense control capabilities. Our investigation uncovered two primary issues… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Project Page: https://air-discover.github.io/SCP-Diff/

  36. arXiv:2402.16343  [pdf, other

    cs.AR

    Trimma: Trimming Metadata Storage and Latency for Hybrid Memory Systems

    Authors: Yiwei Li, Boyu Tian, Mingyu Gao

    Abstract: Hybrid main memory systems combine both performance and capacity advantages from heterogeneous memory technologies. With larger capacities, higher associativities, and finer granularities, hybrid memory systems currently exhibit significant metadata storage and lookup overheads for flexibly remap** data blocks between the two memory tiers. To alleviate the inefficiencies of existing designs, we… ▽ More

    Submitted 26 February, 2024; originally announced February 2024.

  37. arXiv:2402.14316  [pdf, other

    cs.CV

    Place Anything into Any Video

    Authors: Ziling Liu, **yu Yang, Mingqi Gao, Feng Zheng

    Abstract: Controllable video editing has demonstrated remarkable potential across diverse applications, particularly in scenarios where capturing or re-capturing real-world videos is either impractical or costly. This paper introduces a novel and efficient system named Place-Anything, which facilitates the insertion of any object into any video solely based on a picture or text description of the target obj… ▽ More

    Submitted 22 February, 2024; originally announced February 2024.

  38. arXiv:2402.12913  [pdf, other

    cs.CL

    OPDAI at SemEval-2024 Task 6: Small LLMs can Accelerate Hallucination Detection with Weakly Supervised Data

    Authors: Chengcheng Wei, Ze Chen, Songtan Fang, Jiarong He, Max Gao

    Abstract: This paper mainly describes a unified system for hallucination detection of LLMs, which wins the second prize in the model-agnostic track of the SemEval-2024 Task 6, and also achieves considerable results in the model-aware track. This task aims to detect hallucination with LLMs for three different text-generation tasks without labeled training data. We utilize prompt engineering and few-shot lear… ▽ More

    Submitted 20 February, 2024; originally announced February 2024.

  39. arXiv:2402.12055  [pdf, other

    cs.CL

    Are LLM-based Evaluators Confusing NLG Quality Criteria?

    Authors: Xinyu Hu, Mingqi Gao, Sen Hu, Yang Zhang, Yicheng Chen, Teng Xu, Xiaojun Wan

    Abstract: Some prior work has shown that LLMs perform well in NLG evaluation for different tasks. However, we discover that LLMs seem to confuse different evaluation criteria, which reduces their reliability. For further verification, we first consider avoiding issues of inconsistent conceptualization and vague expression in existing NLG quality criteria themselves. So we summarize a clear hierarchical clas… ▽ More

    Submitted 28 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024

  40. arXiv:2402.08785  [pdf, other

    cs.CL

    InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment

    Authors: Jianing Wang, Junda Wu, Yupeng Hou, Yao Liu, Ming Gao, Julian McAuley

    Abstract: Do current large language models (LLMs) better solve graph reasoning and generation tasks with parameter updates? In this paper, we propose InstructGraph, a framework that empowers LLMs with the abilities of graph reasoning and generation by instruction tuning and preference alignment. Specifically, we first propose a structured format verbalizer to unify all graph data into a universal code-like… ▽ More

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 19 pages

  41. arXiv:2402.06380  [pdf, other

    cs.LG stat.ML

    Optimal estimation of Gaussian (poly)trees

    Authors: Yuhao Wang, Ming Gao, Wai Ming Tai, Bryon Aragam, Arnab Bhattacharyya

    Abstract: We develop optimal algorithms for learning undirected Gaussian trees and directed Gaussian polytrees from data. We consider both problems of distribution learning (i.e. in KL distance) and structure learning (i.e. exact recovery). The first approach is based on the Chow-Liu algorithm, and learns an optimal tree-structured distribution efficiently. The second approach is a modification of the PC al… ▽ More

    Submitted 9 February, 2024; originally announced February 2024.

  42. arXiv:2402.03588  [pdf, other

    cs.LG cs.AI

    Continual Domain Adversarial Adaptation via Double-Head Discriminators

    Authors: Yan Shen, Zhanghexuan Ji, Chunwei Ma, Mingchen Gao

    Abstract: Domain adversarial adaptation in a continual setting poses a significant challenge due to the limitations on accessing previous source domain data. Despite extensive research in continual learning, the task of adversarial adaptation cannot be effectively accomplished using only a small number of stored source domain data, which is a standard setting in memory replay approaches. This limitation ari… ▽ More

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: AISTATS 2024

  43. arXiv:2402.03379  [pdf, other

    cs.IR cs.AI cs.LG

    Entire Chain Uplift Modeling with Context-Enhanced Learning for Intelligent Marketing

    Authors: Yinqiu Huang, Shuli Wang, Min Gao, Xue Wei, Changhao Li, Chuan Luo, Yinhua Zhu, Xiong Xiao, Yi Luo

    Abstract: Uplift modeling, vital in online marketing, seeks to accurately measure the impact of various strategies, such as coupons or discounts, on different users by predicting the Individual Treatment Effect (ITE). In an e-commerce setting, user behavior follows a defined sequential chain, including impression, click, and conversion. Marketing strategies exert varied uplift effects at each stage within t… ▽ More

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW2024

  44. arXiv:2402.01696  [pdf, other

    cs.CL cs.LG

    HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification

    Authors: Vidit Jain, Mukund Rungta, Yuchen Zhuang, Yue Yu, Zeyu Wang, Mu Gao, Jeffrey Skolnick, Chao Zhang

    Abstract: Hierarchical text classification (HTC) is a complex subtask under multi-label text classification, characterized by a hierarchical label taxonomy and data imbalance. The best-performing models aim to learn a static representation by combining document and hierarchical label information. However, the relevance of document sections can vary based on the hierarchy level, necessitating a dynamic docum… ▽ More

    Submitted 21 February, 2024; v1 submitted 23 January, 2024; originally announced February 2024.

  45. arXiv:2402.01383  [pdf, other

    cs.CL

    LLM-based NLG Evaluation: Current Status and Challenges

    Authors: Mingqi Gao, Xinyu Hu, Jie Ruan, Xiao Pu, Xiaojun Wan

    Abstract: Evaluating natural language generation (NLG) is a vital but challenging problem in artificial intelligence. Traditional evaluation metrics mainly capturing content (e.g. n-gram) overlap between system outputs and references are far from satisfactory, and large language models (LLMs) such as ChatGPT have demonstrated great potential in NLG evaluation in recent years. Various automatic evaluation me… ▽ More

    Submitted 26 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  46. arXiv:2401.16441  [pdf, other

    cs.LG cs.AI cs.CL

    FaKnow: A Unified Library for Fake News Detection

    Authors: Yiyuan Zhu, Yongjun Li, Jialiang Wang, Ming Gao, Jiali Wei

    Abstract: Over the past years, a large number of fake news detection algorithms based on deep learning have emerged. However, they are often developed under different frameworks, each mandating distinct utilization methodologies, consequently hindering reproducibility. Additionally, a substantial amount of redundancy characterizes the code development of such fake news detection models. To address these con… ▽ More

    Submitted 27 January, 2024; originally announced January 2024.

  47. arXiv:2401.12004  [pdf

    eess.IV cs.LG eess.SP

    NLCG-Net: A Model-Based Zero-Shot Learning Framework for Undersampled Quantitative MRI Reconstruction

    Authors: Xinrui Jiang, Yohan Jun, Jae** Cho, Mengze Gao, Xingwang Yong, Berkin Bilgic

    Abstract: Typical quantitative MRI (qMRI) methods estimate parameter maps after image reconstructing, which is prone to biases and error propagation. We propose a Nonlinear Conjugate Gradient (NLCG) optimizer for model-based T2/T1 estimation, which incorporates U-Net regularization trained in a scan-specific manner. This end-to-end method directly estimates qMRI maps from undersampled k-space data using mon… ▽ More

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 8 pages, 5 figures, submitted to International Society for Magnetic Resonance in Medicine 2024

  48. arXiv:2401.08696  [pdf, other

    cs.AR cs.AI cs.LG

    Hierarchical Source-to-Post-Route QoR Prediction in High-Level Synthesis with GNNs

    Authors: Mingzhe Gao, Jieru Zhao, Zhe Lin, Minyi Guo

    Abstract: High-level synthesis (HLS) notably speeds up the hardware design process by avoiding RTL programming. However, the turnaround time of HLS increases significantly when post-route quality of results (QoR) are considered during optimization. To tackle this issue, we propose a hierarchical post-route QoR prediction approach for FPGA HLS, which features: (1) a modeling flow that directly estimates late… ▽ More

    Submitted 14 January, 2024; originally announced January 2024.

    Comments: Accepted for publication at DATE 2024

  49. arXiv:2401.08228  [pdf, other

    cs.IR

    MCRPL: A Pretrain, Prompt & Fine-tune Paradigm for Non-overlap** Many-to-one Cross-domain Recommendation

    Authors: Hao Liu, Lei Guo, Lei Zhu, Yongqiang Jiang, Min Gao, Hongzhi Yin

    Abstract: Cross-domain Recommendation (CR) is the task that tends to improve the recommendations in the sparse target domain by leveraging the information from other rich domains. Existing methods of cross-domain recommendation mainly focus on overlap** scenarios by assuming users are totally or partially overlapped, which are taken as bridges to connect different domains. However, this assumption does no… ▽ More

    Submitted 16 January, 2024; originally announced January 2024.

  50. arXiv:2401.07518  [pdf, other

    cs.CL cs.AI

    Survey of Natural Language Processing for Education: Taxonomy, Systematic Review, and Future Trends

    Authors: Yunshi Lan, Xinyuan Li, Hanyue Du, Xuesong Lu, Ming Gao, Weining Qian, Aoying Zhou

    Abstract: Natural Language Processing (NLP) aims to analyze text or speech via techniques in the computer science field. It serves the applications in domains of healthcare, commerce, education and so on. Particularly, NLP has been widely applied to the education domain and its applications have enormous potential to help teaching and learning. In this survey, we review recent advances in NLP with the focus… ▽ More

    Submitted 15 March, 2024; v1 submitted 15 January, 2024; originally announced January 2024.