
Showing 1–43 of 43 results for author: Cong, X

  1. arXiv:2406.10167

    cs.CV

    4DRecons: 4D Neural Implicit Deformable Objects Reconstruction from a single RGB-D Camera with Geometrical and Topological Regularizations

    Authors: Xiaoyan Cong, Haitao Yang, Liyan Chen, Kaifeng Zhang, Li Yi, Chandrajit Bajaj, Qixing Huang

    Abstract: This paper presents a novel approach 4DRecons that takes a single camera RGB-D sequence of a dynamic subject as input and outputs a complete textured deforming 3D model over time. 4DRecons encodes the output as a 4D neural implicit surface and presents an optimization procedure that combines a data term and two regularization terms. The data term fits the 4D implicit surface to the input partial o…

    Submitted 14 June, 2024; originally announced June 2024.

  2. arXiv:2405.19684

    cs.CV

    A Comprehensive Survey on Underwater Image Enhancement Based on Deep Learning

    Authors: Xiaofeng Cong, Yu Zhao, Jie Gui, Junming Hou, Dacheng Tao

    Abstract: Underwater image enhancement (UIE) presents a significant challenge within computer vision research. Despite the development of numerous UIE algorithms, a thorough and systematic review is still absent. To foster future advancements, we provide a detailed overview of the UIE task from several perspectives. Firstly, we introduce the physical models, data construction processes, evaluation metrics,…

    Submitted 25 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: A survey on the underwater image enhancement task

  3. arXiv:2404.13830

    cs.CV

    A Comprehensive Survey and Taxonomy on Point Cloud Registration Based on Deep Learning

    Authors: Yu-Xin Zhang, Jie Gui, Xiaofeng Cong, Xin Gong, Wenbing Tao

    Abstract: Point cloud registration (PCR) involves determining a rigid transformation that aligns one point cloud to another. Despite the plethora of outstanding deep learning (DL)-based registration methods proposed, comprehensive and systematic studies on DL-based PCR techniques are still lacking. In this paper, we present a comprehensive survey and taxonomy of recently proposed PCR methods. Firstly, we co…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: This paper is accepted by IJCAI 2024

  4. arXiv:2404.12804

    cs.CV eess.IV

    Linearly-evolved Transformer for Pan-sharpening

    Authors: Junming Hou, Zihan Cao, Naishan Zheng, Xuan Li, Xiaoyu Chen, Xinyang Liu, Xiaofeng Cong, Man Zhou, Danfeng Hong

    Abstract: Vision transformer family has dominated the satellite pan-sharpening field driven by the global-wise spatial information modeling mechanism from the core self-attention ingredient. The standard modeling rules within these promising pan-sharpening methods are to roughly stack the transformer variants in a cascaded manner. Despite the remarkable advancement, their success may be at the huge cost of…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 10 pages

  5. arXiv:2404.08364

    cs.DC

    FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework

    Authors: Junyi Mei, Shixuan Sun, Chao Li, Cheng Xu, Cheng Chen, Yibo Liu, Jing Wang, Cheng Zhao, Xiaofeng Hou, Minyi Guo, Bingsheng He, Xiaoliang Cong

    Abstract: Dynamic graph random walk (DGRW) emerges as a practical tool for capturing structural relations within a graph. Effectively executing DGRW on GPU presents certain challenges. First, existing sampling methods demand a pre-processing buffer, causing substantial space complexity. Moreover, the power-law distribution of graph vertex degrees introduces workload imbalance issues, rendering DGRW embarras…

    Submitted 26 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

  6. arXiv:2404.05661

    cs.CV

    Automatic Controllable Colorization via Imagination

    Authors: Xiaoyan Cong, Yue Wu, Qifeng Chen, Chenyang Lei

    Abstract: We propose a framework for automatic colorization that allows for iterative editing and modifications. The core of our framework lies in an imagination module: by understanding the content within a grayscale image, we utilize a pre-trained image generation model to generate multiple images that contain the same content. These images serve as references for coloring, mimicking the process of human…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: CVPR 2024. Project page: https://xy-cong.github.io/imagine-colorization

  7. arXiv:2403.18548

    cs.CV

    A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness Constraint

    Authors: Xiaofeng Cong, Jie Gui, Jing Zhang, Junming Hou, Hao Shen

    Abstract: Existing research based on deep learning has extensively explored the problem of daytime image dehazing. However, few studies have considered the characteristics of nighttime hazy scenes. There are two distinctions between nighttime and daytime haze. First, there may be multiple active colored light sources with lower illumination intensity in nighttime scenes, which may cause haze, glow and noise…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: This paper is accepted by CVPR2024

  8. arXiv:2402.16667

    cs.CL cs.AI

    RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation

    Authors: Qinyu Luo, Yining Ye, Shihao Liang, Zhong Zhang, Yujia Qin, Yaxi Lu, Yesai Wu, Xin Cong, Yankai Lin, Yingli Zhang, Xiaoyin Che, Zhiyuan Liu, Maosong Sun

    Abstract: Generative models have demonstrated considerable potential in software engineering, particularly in tasks such as code generation and debugging. However, their utilization in the domain of code documentation generation remains underexplored. To this end, we introduce RepoAgent, a large language model powered open-source framework aimed at proactively generating, maintaining, and updating code docu…

    Submitted 26 February, 2024; originally announced February 2024.

    ACM Class: I.2.7; F.2.2

  9. arXiv:2402.11453

    cs.CL

    MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization

    Authors: Zhiyu Yang, Zihan Zhou, Shuo Wang, Xin Cong, Xu Han, Yukun Yan, Zhenghao Liu, Zhixing Tan, Pengyuan Liu, Dong Yu, Zhiyuan Liu, Xiaodong Shi, Maosong Sun

    Abstract: Scientific data visualization plays a crucial role in research by enabling the direct display of complex information and assisting researchers in identifying implicit patterns. Despite its importance, the use of Large Language Models (LLMs) for scientific data visualization remains rather unexplored. In this study, we introduce MatPlotAgent, an efficient model-agnostic LLM agent framework designed…

    Submitted 19 March, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: Work in Progress

  10. arXiv:2402.09205

    cs.CL cs.AI cs.HC

    Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents

    Authors: Cheng Qian, Bingxiang He, Zhong Zhuang, Jia Deng, Yujia Qin, Xin Cong, Zhong Zhang, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun

    Abstract: Current language model-driven agents often lack mechanisms for effective user participation, which is crucial given the vagueness commonly found in user instructions. Although adept at devising strategies and performing tasks, these agents struggle with seeking clarification and grasping precise user intentions. To bridge this gap, we introduce Intention-in-Interaction (IN3), a novel benchmark des…

    Submitted 15 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: 26 pages, 5 tables, 6 figures

  11. arXiv:2402.03009

    cs.CL cs.AI

    UniMem: Towards a Unified View of Long-Context Large Language Models

    Authors: Junjie Fang, Likai Tang, Hongzhe Bi, Yujia Qin, Si Sun, Zhenyu Li, Haolun Li, Yongjian Li, Xin Cong, Yukun Yan, Xiaodong Shi, Sen Song, Yankai Lin, Zhiyuan Liu, Maosong Sun

    Abstract: Long-context processing is a critical ability that constrains the applicability of large language models. Although there exist various methods devoted to enhancing the long-context processing ability of large language models (LLMs), they are developed in an isolated manner and lack systematic analysis and integration of their strengths, hindering further developments. In this paper, we introduce U…

    Submitted 5 February, 2024; originally announced February 2024.

  12. arXiv:2401.13996

    cs.CL cs.AI

    Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution

    Authors: Cheng Qian, Shihao Liang, Yujia Qin, Yining Ye, Xin Cong, Yankai Lin, Yesai Wu, Zhiyuan Liu, Maosong Sun

    Abstract: This paper introduces Investigate-Consolidate-Exploit (ICE), a novel strategy for enhancing the adaptability and flexibility of AI agents through inter-task self-evolution. Unlike existing methods focused on intra-task learning, ICE promotes the transfer of knowledge between tasks for genuine self-evolution, similar to human experience learning. The strategy dynamically investigates planning and e…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 18 pages, 5 figures

  13. arXiv:2401.07358

    cs.CV

    Harnessing Machine Learning for Discerning AI-Generated Synthetic Images

    Authors: Yuyang Wang, Yizhi Hao, Amando Xu Cong

    Abstract: In the realm of digital media, the advent of AI-generated synthetic images has introduced significant challenges in distinguishing between real and fabricated visual content. These images, often indistinguishable from authentic ones, pose a threat to the credibility of digital media, with potential implications for disinformation and fraud. Our research addresses this challenge by employing machin…

    Submitted 23 May, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

  14. arXiv:2401.04621

    cs.SE cs.AI cs.CL

    DebugBench: Evaluating Debugging Capability of Large Language Models

    Authors: Runchu Tian, Yining Ye, Yujia Qin, Xin Cong, Yankai Lin, Yinxu Pan, Yesai Wu, Haotian Hui, Weichuan Liu, Zhiyuan Liu, Maosong Sun

    Abstract: Large Language Models (LLMs) have demonstrated exceptional coding capability. However, as another critical component of programming proficiency, the debugging capability of LLMs remains relatively unexplored. Previous evaluations of LLMs' debugging ability are significantly limited by the risk of data leakage, the scale of the dataset, and the variety of tested bugs. To overcome these deficiencies…

    Submitted 6 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted as Findings of ACL 2024

  15. arXiv:2312.17294

    cs.SE cs.AI cs.IR

    GitAgent: Facilitating Autonomous Agent with GitHub by Tool Extension

    Authors: Bohan Lyu, Xin Cong, Heyang Yu, Pan Yang, Yujia Qin, Yining Ye, Yaxi Lu, Zhong Zhang, Yukun Yan, Yankai Lin, Zhiyuan Liu, Maosong Sun

    Abstract: While Large Language Models (LLMs) like ChatGPT and GPT-4 have demonstrated exceptional proficiency in natural language processing, their efficacy in addressing complex, multifaceted tasks remains limited. A growing area of research focuses on LLM-based agents equipped with external tools capable of performing diverse tasks. However, existing LLM-based agents only support a limited set of tools wh…

    Submitted 28 December, 2023; originally announced December 2023.

  16. arXiv:2312.17025

    cs.CL cs.AI cs.LG cs.SE

    Experiential Co-Learning of Software-Developing Agents

    Authors: Chen Qian, Yufan Dang, Jiahao Li, Wei Liu, Zihao Xie, Yifei Wang, Weize Chen, Cheng Yang, Xin Cong, Xiaoyin Che, Zhiyuan Liu, Maosong Sun

    Abstract: Recent advancements in large language models (LLMs) have brought significant changes to various domains, especially through LLM-driven autonomous agents. A representative scenario is in software development, where LLM agents demonstrate efficient collaboration, task division, and assurance of software quality, markedly reducing the need for manual involvement. However, these agents frequently perf…

    Submitted 5 June, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Accepted to ACL 2024, https://github.com/OpenBMB/ChatDev

  17. arXiv:2311.10751

    cs.RO cs.AI cs.CL

    ProAgent: From Robotic Process Automation to Agentic Process Automation

    Authors: Yining Ye, Xin Cong, Shizuo Tian, Jiannan Cao, Hao Wang, Yujia Qin, Yaxi Lu, Heyang Yu, Huadong Wang, Yankai Lin, Zhiyuan Liu, Maosong Sun

    Abstract: From ancient water wheels to robotic process automation (RPA), automation technology has evolved throughout history to liberate human beings from arduous tasks. Yet, RPA struggles with tasks needing human-like intelligence, especially in elaborate design of workflow construction and dynamic decision-making in workflow execution. As Large Language Models (LLMs) have emerged human-like intelligence,…

    Submitted 23 November, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: Work in progress

  18. arXiv:2310.00310

    cs.CV

    An easy zero-shot learning combination: Texture Sensitive Semantic Segmentation IceHrNet and Advanced Style Transfer Learning Strategy

    Authors: Zhiyong Yang, Yuelong Zhu, Xiaoqin Zeng, Jun Zong, Xiuheng Liu, Ran Tao, Xiaofei Cong, Yufeng Yu

    Abstract: We proposed an easy method of Zero-Shot semantic segmentation by using style transfer. In this case, we successfully used a medical imaging dataset (Blood Cell Imagery) to train a model for river ice semantic segmentation. First, we built a river ice semantic segmentation dataset IPC_RI_SEG using a fixed camera and covering the entire ice melting process of the river. Second, a high-resolution tex…

    Submitted 30 September, 2023; originally announced October 2023.

    Comments: 12 pages, 11 figures, submitted to Journal of Hydrology

  19. arXiv:2308.12519

    cs.CL

    Rational Decision-Making Agent with Internalized Utility Judgment

    Authors: Yining Ye, Xin Cong, Shizuo Tian, Yujia Qin, Chong Liu, Yankai Lin, Zhiyuan Liu, Maosong Sun

    Abstract: Large language models (LLMs) have demonstrated remarkable advancements and have attracted significant efforts to develop LLMs into agents capable of executing intricate multi-step decision-making tasks beyond traditional NLP applications. Existing approaches to LLM-based decision-making predominantly build upon the manually-designed external performance metrics to guide the decision-making process…

    Submitted 17 January, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: Received 8,6,6,6 scores on ICLR 2024

  20. arXiv:2308.10848

    cs.CL

    AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

    Authors: Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chi-Min Chan, Heyang Yu, Yaxi Lu, Yi-Hsin Hung, Chen Qian, Yujia Qin, Xin Cong, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou

    Abstract: Autonomous agents empowered by Large Language Models (LLMs) have undergone significant improvements, enabling them to generalize across a broad spectrum of tasks. However, in real-world scenarios, cooperation among individuals is often required to enhance the efficiency and effectiveness of task accomplishment. Hence, inspired by human group dynamics, we propose a multi-agent framework AgentVerse…

    Submitted 23 October, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: Under review. Code at https://github.com/OpenBMB/AgentVerse/

  21. arXiv:2308.02103

    cs.CL

    Prompt2Gaussia: Uncertain Prompt-learning for Script Event Prediction

    Authors: Shiyao Cui, Xin Cong, Jiawei Sheng, Xuebin Wang, Tingwen Liu, Jinqiao Shi

    Abstract: Script Event Prediction (SEP) aims to predict the subsequent event for a given event chain from a candidate list. Prior research has achieved great success by integrating external knowledge to enhance the semantics, but it is laborious to acquire the appropriate knowledge resources and retrieve the script-related knowledge. In this paper, we regard public pre-trained language models as knowledge…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: 16 pages

  22. arXiv:2308.01857

    cs.AR

    iEDA: An Open-Source Intelligent Physical Implementation Toolkit and Library

    Authors: Xingquan Li, Simin Tao, Zengrong Huang, Shijian Chen, Zhisheng Zeng, Liwei Ni, Zhipeng Huang, Chunan Zhuang, Hongxi Wu, Weiguo Li, Xueyan Zhao, He Liu, Shuaiying Long, Wei He, Bojun Liu, Sifeng Gan, Zihao Yu, Tong Liu, Yuchi Miao, Zhiyuan Yan, Hao Wang, Jie Zhao, Yifan Li, Ruizhi Liu, Xiaoze Lin, et al. (31 additional authors not shown)

    Abstract: Open-source EDA shows promising potential in unleashing EDA innovation and lowering the cost of chip design. This paper presents an open-source EDA project, iEDA, aiming for building a basic infrastructure for EDA technology evolution and closing the industrial-academic gap in the EDA area. iEDA now covers the whole flow of physical design (including Floorplan, Placement, CTS, Routing, Timing Opti…

    Submitted 3 August, 2023; originally announced August 2023.

  23. arXiv:2307.16789

    cs.AI cs.CL cs.LG

    ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs

    Authors: Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, Sihan Zhao, Lauren Hong, Runchu Tian, Ruobing Xie, Jie Zhou, Mark Gerstein, Dahai Li, Zhiyuan Liu, Maosong Sun

    Abstract: Despite the advancements of open-source large language models (LLMs), e.g., LLaMA, they remain significantly limited in tool-use capabilities, i.e., using external tools (APIs) to fulfill human instructions. The reason is that current instruction tuning largely focuses on basic language tasks but ignores the tool-use domain. This is in contrast to the excellent tool-use capabilities of state-of-th…

    Submitted 3 October, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

  24. arXiv:2307.15504

    cs.CL cs.AI

    Exploring Format Consistency for Instruction Tuning

    Authors: Shihao Liang, Runchu Tian, Kunlun Zhu, Yujia Qin, Huadong Wang, Xin Cong, Zhiyuan Liu, Xiaojiang Liu, Maosong Sun

    Abstract: Instruction tuning has emerged as a promising approach to enhancing large language models in following human instructions. It is shown that increasing the diversity and number of instructions in the training data can consistently enhance generalization performance, which facilitates a recent endeavor to collect various instructions and integrate existing instruction tuning datasets into larger col…

    Submitted 8 January, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

  25. arXiv:2307.07924

    cs.SE cs.CL cs.MA

    ChatDev: Communicative Agents for Software Development

    Authors: Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, Juyuan Xu, Dahai Li, Zhiyuan Liu, Maosong Sun

    Abstract: Software development is a complex task that necessitates cooperation among multiple members with diverse skills. Numerous studies used deep learning to improve specific phases in a waterfall model, such as design, coding, and testing. However, the deep learning model in each phase requires unique designs, leading to technical inconsistencies across various phases, which results in a fragmented and…

    Submitted 5 June, 2024; v1 submitted 15 July, 2023; originally announced July 2023.

    Comments: Accepted to ACL 2024; https://github.com/OpenBMB/ChatDev

  26. arXiv:2306.05675

    cs.CV

    Illumination Controllable Dehazing Network based on Unsupervised Retinex Embedding

    Authors: Jie Gui, Xiaofeng Cong, Lei He, Yuan Yan Tang, James Tin-Yau Kwok

    Abstract: On the one hand, the dehazing task is an illposedness problem, which means that no unique solution exists. On the other hand, the dehazing task should take into account the subjective factor, which is to give the user selectable dehazed images rather than a single result. Therefore, this paper proposes a multi-output dehazing network by introducing illumination controllable ability, called IC-Deha…

    Submitted 9 June, 2023; originally announced June 2023.

  27. arXiv:2305.04469

    cs.GR

    HACK: Learning a Parametric Head and Neck Model for High-fidelity Animation

    Authors: Longwen Zhang, Zijun Zhao, Xinzhou Cong, Qixuan Zhang, Shuqi Gu, Yuchong Gao, Rui Zheng, Wei Yang, Lan Xu, Jingyi Yu

    Abstract: Significant advancements have been made in developing parametric models for digital humans, with various approaches concentrating on parts such as the human body, hand, or face. Nevertheless, connectors such as the neck have been overlooked in these models, with rich anatomical priors often unutilized. In this paper, we introduce HACK (Head-And-neCK), a novel parametric model for constructing the…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: Find HACK model on https://github.com/ZoneLikeWonderland/HACK-Model

  28. arXiv:2304.09322

    eess.IV cs.CV cs.LG

    Multi-Modality Multi-Scale Cardiovascular Disease Subtypes Classification Using Raman Image and Medical History

    Authors: Bo Yu, Hechang Chen, Chengyou Jia, Hongren Zhou, Lele Cong, Xiankai Li, Jianhui Zhuang, Xianling Cong

    Abstract: Raman spectroscopy (RS) has been widely used for disease diagnosis, e.g., cardiovascular disease (CVD), owing to its efficiency and component-specific testing capabilities. A series of popular deep learning methods have recently been introduced to learn nuance features from RS for binary classifications and achieved outstanding performance than conventional machine learning methods. However, these…

    Submitted 18 April, 2023; originally announced April 2023.

    Journal ref: [J]. Expert Systems with Applications, 2023: 119965

  29. Data and Knowledge Co-driving for Cancer Subtype Classification on Multi-Scale Histopathological Slides

    Authors: Bo Yu, Hechang Chen, Yunke Zhang, Lele Cong, Shuchao Pang, Hongren Zhou, Ziye Wang, Xianling Cong

    Abstract: Artificial intelligence-enabled histopathological data analysis has become a valuable assistant to the pathologist. However, existing models lack representation and inference abilities compared with those of pathologists, especially in cancer subtype diagnosis, which is unconvincing in clinical practice. For instance, pathologists typically observe the lesions of a slide from global to local, and…

    Submitted 18 April, 2023; originally announced April 2023.

    Journal ref: [J]. Knowledge-Based Systems, 2023, 260: 110168

  30. arXiv:2304.08354

    cs.CL cs.AI cs.LG

    Tool Learning with Foundation Models

    Authors: Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, et al. (16 additional authors not shown)

    Abstract: Humans possess an extraordinary ability to create and utilize tools, allowing them to overcome physical limitations and explore new frontiers. With the advent of foundation models, AI systems have the potential to be equally adept in tool use as humans. This paradigm, i.e., tool learning with foundation models, combines the strengths of specialized tools and foundation models to achieve enhanced a…

    Submitted 15 June, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

  31. Contrastive Cross-Domain Sequential Recommendation

    Authors: Jiangxia Cao, Xin Cong, Jiawei Sheng, Tingwen Liu, Bin Wang

    Abstract: Cross-Domain Sequential Recommendation (CDSR) aims to predict future interactions based on user's historical sequential interactions from multiple domains. Generally, a key challenge of CDSR is how to mine precise cross-domain user preference based on the intra-sequence and inter-sequence item interactions. Existing works first learn single-domain user preference only with intra-sequence item inte…

    Submitted 7 April, 2023; originally announced April 2023.

    Comments: This paper has been accepted by CIKM 2022

  32. Enhancing Multimodal Entity and Relation Extraction with Variational Information Bottleneck

    Authors: Shiyao Cui, Jiangxia Cao, Xin Cong, Jiawei Sheng, Quangang Li, Tingwen Liu, Jinqiao Shi

    Abstract: This paper studies the multimodal named entity recognition (MNER) and multimodal relation extraction (MRE), which are important for multimedia social platform analysis. The core of MNER and MRE lies in incorporating evident visual information to enhance textual semantics, where two issues inherently demand investigations. The first issue is modality-noise, where the task-irrelevant information in…

    Submitted 5 April, 2023; originally announced April 2023.

    Journal ref: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2024

  33. arXiv:2303.17255

    cs.CV cs.CR eess.IV

    Fooling the Image Dehazing Models by First Order Gradient

    Authors: Jie Gui, Xiaofeng Cong, Chengwei Peng, Yuan Yan Tang, James Tin-Yau Kwok

    Abstract: The research on the single image dehazing task has been widely explored. However, as far as we know, no comprehensive study has been conducted on the robustness of the well-trained dehazing models. Therefore, there is no evidence that the dehazing networks can resist malicious attacks. In this paper, we focus on designing a group of attack methods based on first order gradient to verify the robust…

    Submitted 15 February, 2024; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: This paper is accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

  34. arXiv:2301.11621

    cs.CL

    Event Causality Extraction with Event Argument Correlations

    Authors: Shiyao Cui, Jiawei Sheng, Xin Cong, QuanGang Li, Tingwen Liu, Jinqiao Shi

    Abstract: Event Causality Identification (ECI), which aims to detect whether a causality relation exists between two given textual events, is an important task for event causality understanding. However, the ECI task ignores crucial event structure and cause-effect causality component information, making it struggle for downstream applications. In this paper, we explore a novel task, namely Event Causality…

    Submitted 27 January, 2023; originally announced January 2023.

    Comments: Accepted to COLING2022

  35. arXiv:2203.16863

    cs.IR cs.SI

    Cross-Domain Recommendation to Cold-Start Users via Variational Information Bottleneck

    Authors: Jiangxia Cao, Jiawei Sheng, Xin Cong, Tingwen Liu, Bin Wang

    Abstract: Recommender systems have been widely deployed in many real-world applications, but usually suffer from the long-standing user cold-start problem. As a promising way, Cross-Domain Recommendation (CDR) has attracted a surge of interest, which aims to transfer the user preferences observed in the source domain to make recommendations in the target domain. Previous CDR approaches mostly achieve the go…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: This paper has been accepted by ICDE 2022

  36. arXiv:2202.03092

    cs.CL

    Document-Level Event Extraction via Human-Like Reading Process

    Authors: Shiyao Cui, Xin Cong, Bowen Yu, Tingwen Liu, Yucheng Wang, Jinqiao Shi

    Abstract: Document-level Event Extraction (DEE) is particularly tricky due to the two challenges it poses: scattering-arguments and multi-events. The first challenge means that arguments of one event record could reside in different sentences in the document, while the second one reflects one document may simultaneously contain multiple such event records. Motivated by humans' reading cognitive to extract i…

    Submitted 7 February, 2022; originally announced February 2022.

    Comments: To appear in ICASSP2022

  37. arXiv:2111.00884

    cs.CL

    Enhanced Language Representation with Label Knowledge for Span Extraction

    Authors: Pan Yang, Xin Cong, Zhenyun Sun, Xingwu Liu

    Abstract: Span extraction, aiming to extract text spans (such as words or phrases) from plain texts, is a fundamental process in Information Extraction. Recent works introduce the label knowledge to enhance the text representation by formalizing the span extraction task into a question answering problem (QA Formalization), which achieves state-of-the-art performance. However, QA Formalization does not fully…

    Submitted 1 November, 2021; originally announced November 2021.

    Comments: Accepted to the main conference of EMNLP 2021 (long paper)

  38. arXiv:2107.03573

    cs.SI cs.AI

    Deep Structural Point Process for Learning Temporal Interaction Networks

    Authors: Jiangxia Cao, Xixun Lin, Xin Cong, Shu Guo, Hengzhu Tang, Tingwen Liu, Bin Wang

    Abstract: This work investigates the problem of learning temporal interaction networks. A temporal interaction network consists of a series of chronological interactions between users and items. Previous methods tackle this problem by using different variants of recurrent neural networks to model sequential interactions, which fail to consider the structural information of temporal interaction networks and…

    Submitted 7 July, 2021; originally announced July 2021.

    Comments: Accepted by ECML/PKDD 2021, 16 pages, 2 figures

  39. arXiv:2106.03323

    cs.CV cs.LG

    A Comprehensive Survey and Taxonomy on Single Image Dehazing Based on Deep Learning

    Authors: Jie Gui, Xiaofeng Cong, Yuan Cao, Wenqi Ren, Jun Zhang, Jing Zhang, Jiuxin Cao, Dacheng Tao

    Abstract: With the development of convolutional neural networks, hundreds of deep learning based dehazing methods have been proposed. In this paper, we provide a comprehensive survey on supervised, semi-supervised, and unsupervised single image dehazing. We first discuss the physical model, datasets, network modules, loss functions, and evaluation metrics that are commonly used. Then, the main contributions…

    Submitted 20 December, 2022; v1 submitted 6 June, 2021; originally announced June 2021.

    Comments: This paper is accepted by ACM Computing Surveys

  40. arXiv:2012.02353

    cs.CL

    Few-Shot Event Detection with Prototypical Amortized Conditional Random Field

    Authors: Xin Cong, Shiyao Cui, Bowen Yu, Tingwen Liu, Yubin Wang, Bin Wang

    Abstract: Event detection tends to struggle when it needs to recognize novel event types with a few samples. The previous work attempts to solve this problem in the identify-then-classify manner but ignores the trigger discrepancy between event types, thus suffering from the error propagation. In this paper, we present a novel unified model which converts the task to a few-shot tagging problem with a double…

    Submitted 24 May, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: Accepted at ACL 2021

  41. Label Enhanced Event Detection with Heterogeneous Graph Attention Networks

    Authors: Shiyao Cui, Bowen Yu, Xin Cong, Tingwen Liu, Quangang Li, Jinqiao Shi

    Abstract: Event Detection (ED) aims to recognize instances of specified types of event triggers in text. Different from English ED, Chinese ED suffers from the problem of word-trigger mismatch due to the uncertain word boundaries. Existing approaches injecting word information into character-level models have achieved promising progress to alleviate this problem, but they are limited by two issues. First, t…

    Submitted 3 December, 2020; originally announced December 2020.

    Journal ref: Journal of Computer Science and Technology 2023

  42. arXiv:2009.12072

    cs.CV

    AIM 2020 Challenge on Real Image Super-Resolution: Methods and Results

    Authors: Pengxu Wei, Hannan Lu, Radu Timofte, Liang Lin, Wangmeng Zuo, Zhihong Pan, Baopu Li, Teng Xi, Yanwen Fan, Gang Zhang, Jingtuo Liu, Junyu Han, Errui Ding, Tangxin Xie, Liang Cao, Yan Zou, Yi Shen, Jialiang Zhang, Yu Jia, Kaihua Cheng, Chenhuan Wu, Yue Lin, Cen Liu, Yunbo Peng, Xueyi Zou, et al. (51 additional authors not shown)

    Abstract: This paper introduces the real image Super-Resolution (SR) challenge that was part of the Advances in Image Manipulation (AIM) workshop, held in conjunction with ECCV 2020. This challenge involves three tracks to super-resolve an input image for $\times$2, $\times$3 and $\times$4 scaling factors, respectively. The goal is to attract more attention to realistic image degradation for the SR task, wh…

    Submitted 25 September, 2020; originally announced September 2020.

    Journal ref: European Conference on Computer Vision Workshops, 2020

  43. arXiv:2006.12816

    cs.CL

    Inductive Unsupervised Domain Adaptation for Few-Shot Classification via Clustering

    Authors: Xin Cong, Bowen Yu, Tingwen Liu, Shiyao Cui, Hengzhu Tang, Bin Wang

    Abstract: Few-shot classification tends to struggle when it needs to adapt to diverse domains. Due to the non-overlapping label space between domains, the performance of conventional domain adaptation is limited. Previous work tackles the problem in a transductive manner, by assuming access to the full set of test data, which is too restrictive for many real-world applications. In this paper, we set out to…

    Submitted 23 June, 2020; originally announced June 2020.

    Comments: Accepted by ECML-PKDD 2020