Showing 1–50 of 71 results for author: Kong, C

  1. arXiv:2406.16896  [pdf, other]

    eess.SP cs.LG

    f-GAN: A frequency-domain-constrained generative adversarial network for PPG to ECG synthesis

    Authors: Nathan C. L. Kong, Dae Lee, Huyen Do, Dae Hoon Park, Cong Xu, Hongda Mao, Jonathan Chung

    Abstract: Electrocardiograms (ECGs) and photoplethysmograms (PPGs) are generally used to monitor an individual's cardiovascular health. In clinical settings, ECGs and fingertip PPGs are the main signals used for assessing cardiovascular health, but the equipment necessary for their collection precludes their use in daily monitoring. Although PPGs obtained from wrist-worn devices are susceptible to noise due…

    Submitted 15 May, 2024; originally announced June 2024.

  2. arXiv:2406.11890  [pdf, other]

    cs.LG cs.AI cs.CL

    Unraveling the Mechanics of Learning-Based Demonstration Selection for In-Context Learning

    Authors: Hui Liu, Wenya Wang, Hao Sun, Chris Xing Tian, Chenqi Kong, Xin Dong, Haoliang Li

    Abstract: Large Language Models (LLMs) have demonstrated impressive in-context learning (ICL) capabilities from few-shot demonstration exemplars. While recent learning-based demonstration selection methods have proven beneficial to ICL by choosing more useful exemplars, their underlying mechanisms are opaque, hindering efforts to address limitations such as high training costs and poor generalization across…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2406.09330  [pdf, other]

    cs.CL

    Learning from Natural Language Explanations for Generalizable Entity Matching

    Authors: Somin Wadhwa, Adit Krishnan, Runhui Wang, Byron C. Wallace, Chris Kong

    Abstract: Entity matching is the task of linking records from different sources that refer to the same real-world entity. Past work has primarily treated entity linking as a standard supervised learning problem. However, supervised entity matching models often do not generalize well to new data, and collecting exhaustive labeled training data is often cost prohibitive. Further, recent efforts have adopted L…

    Submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2404.08452  [pdf, other]

    cs.CV

    MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection

    Authors: Chenqi Kong, Anwei Luo, Peijun Bao, Yi Yu, Haoliang Li, Zengwei Zheng, Shiqi Wang, Alex C. Kot

    Abstract: Deepfakes have recently raised significant trust issues and security concerns among the public. Compared to CNN face forgery detectors, ViT-based methods take advantage of the expressivity of transformers, achieving superior detection performance. However, these approaches still exhibit the following limitations: (1) Fully fine-tuning ViT-based models from ImageNet weights demands substantial comp…

    Submitted 7 June, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

  5. arXiv:2404.01984  [pdf, other]

    cs.CV

    Fashion Style Editing with Generative Human Prior

    Authors: Chaerin Kong, Seungyong Lee, Soohyeok Im, Wonsuk Yang

    Abstract: Image editing has been a long-standing challenge in the research community with its far-reaching impact on numerous applications. Recently, text-driven methods started to deliver promising results in domains like human faces, but their applications to more complex domains have been relatively limited. In this work, we explore the task of fashion style editing, where we aim to manipulate the fashio…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 5 pages

  6. arXiv:2403.01509  [pdf, other]

    cs.CL

    Fantastic Semantics and Where to Find Them: Investigating Which Layers of Generative LLMs Reflect Lexical Semantics

    Authors: Zhu Liu, Cunliang Kong, Ying Liu, Maosong Sun

    Abstract: Large language models have achieved remarkable success in general language understanding tasks. However, as a family of generative methods with the objective of next token prediction, the semantic evolution with the depth of these models is not fully explored, unlike their predecessors, such as BERT-like architectures. In this paper, we specifically investigate the bottom-up evolution of lexical…

    Submitted 9 June, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: Accepted to Findings of ACL 2024

  7. arXiv:2402.18140  [pdf, other]

    cs.CV

    OccTransformer: Improving BEVFormer for 3D camera-only occupancy prediction

    Authors: Jian Liu, Sipeng Zhang, Chuixin Kong, Wenyuan Zhang, Yuhang Wu, Yikang Ding, Borun Xu, Ruibo Ming, Donglai Wei, Xianming Liu

    Abstract: This technical report presents our solution, "occTransformer" for the 3D occupancy prediction track in the autonomous driving challenge at CVPR 2023. Our method builds upon the strong baseline BEVFormer and improves its performance through several simple yet effective techniques. Firstly, we employed data augmentation to increase the diversity of the training data and improve the model's generaliz…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: Innovation Award in the 3D Occupancy Prediction Challenge (CVPR23)

  8. arXiv:2402.16311  [pdf, other]

    cs.CL cs.AI

    Cross-domain Chinese Sentence Pattern Parsing

    Authors: Jingsi Yu, Cunliang Kong, Liner Yang, Meishan Zhang, Lin Zhu, Yujie Wang, Haozhe Lin, Maosong Sun, Erhong Yang

    Abstract: Sentence Pattern Structure (SPS) parsing is a syntactic analysis method primarily employed in language teaching. Existing SPS parsers rely heavily on textbook corpora for training, lacking cross-domain capability. To overcome this constraint, this paper proposes an innovative approach leveraging large language models (LLMs) within a self-training framework. Partial syntactic rules from a source doma…

    Submitted 7 April, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

  9. arXiv:2402.15604  [pdf, other]

    cs.RO eess.SY

    Goal-Reaching Trajectory Design Near Danger with Piecewise Affine Reach-avoid Computation

    Authors: Long Kiu Chung, Wonsuhk Jung, Chuizheng Kong, Shreyas Kousik

    Abstract: Autonomous mobile robots must maintain safety, but should not sacrifice performance, leading to the classical reach-avoid problem: find a trajectory that is guaranteed to reach a goal and avoid obstacles. This paper addresses the near danger case, also known as a narrow gap, where the agent starts near the goal, but must navigate through tight obstacles that block its path. The proposed method bui…

    Submitted 28 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: The first two authors contributed equally to the work. This work has been submitted for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  10. arXiv:2402.13740  [pdf, other]

    cs.CL

    From Text to CQL: Bridging Natural Language and Corpus Search Engine

    Authors: Luming Lu, Jiyuan An, Yujie Wang, Liner Yang, Cunliang Kong, Zhenghao Liu, Shuo Wang, Haozhe Lin, Mingwei Fang, Yaping Huang, Erhong Yang

    Abstract: Natural Language Processing (NLP) technologies have revolutionized the way we interact with information systems, with a significant focus on converting natural language queries into formal query languages such as SQL. However, less emphasis has been placed on the Corpus Query Language (CQL), a critical tool for linguistic research and detailed analysis within text corpora. The manual construction…

    Submitted 21 February, 2024; originally announced February 2024.

  11. arXiv:2402.13524  [pdf, other]

    cs.CL

    OMGEval: An Open Multilingual Generative Evaluation Benchmark for Large Language Models

    Authors: Yang Liu, Meng Xu, Shuo Wang, Liner Yang, Haoyu Wang, Zhenghao Liu, Cunliang Kong, Yun Chen, Yang Liu, Maosong Sun, Erhong Yang

    Abstract: Modern large language models (LLMs) should generally benefit individuals from various cultural backgrounds around the world. However, most recent advanced generative evaluation benchmarks tailored for LLMs mainly focus on English. To this end, we introduce OMGEval, the first Open-source Multilingual Generative test set that can assess the capability of LLMs in different languages. For each language,…

    Submitted 20 February, 2024; originally announced February 2024.

  12. arXiv:2402.00667  [pdf, other]

    cs.CL

    Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning

    Authors: Jitao Sang, Yuhang Wang, Jing Zhang, Yanxu Zhu, Chao Kong, Junhong Ye, Shuyu Wei, Jinlin Xiao

    Abstract: This paper presents a follow-up study to OpenAI's recent superalignment work on Weak-to-Strong Generalization (W2SG). Superalignment focuses on ensuring that high-level AI systems remain consistent with human values and intentions when dealing with complex, high-risk tasks. The W2SG framework has opened new possibilities for empirical research in this evolving field. Our study simulates two phases…

    Submitted 1 February, 2024; originally announced February 2024.

  13. arXiv:2401.11934  [pdf, ps, other]

    cs.PF

    Systematic Performance Evaluation Framework for LEO Mega-Constellation Satellite Networks

    Authors: Yu Wang, Chuili Kong, Xian Meng, Hejia Luo, Ke-Xin Li, Jun Wang

    Abstract: Low Earth orbit (LEO) mega-constellation satellite networks have shown great potential to extend the coverage capability of conventional terrestrial networks. How to systematically define, quantify, and assess the technical performance of LEO mega-constellation satellite networks remains an open issue. In this paper, we propose a comprehensive key performance indicator (KPI) framework for mega-con…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 6 pages, 8 figures, accepted by IEEE ICC2024

  14. arXiv:2401.06133  [pdf]

    cs.CV

    The possibility of making $138,000 from shredded banknote pieces using computer vision

    Authors: Chung To Kong

    Abstract: Every country must dispose of old banknotes. At the Hong Kong Monetary Authority visitor center, visitors can buy a paperweight souvenir full of shredded banknotes. Even though the shredded banknotes are small, by using computer vision, it is possible to reconstruct the whole banknote like a jigsaw puzzle. Each paperweight souvenir costs …

    Submitted 16 November, 2023; originally announced January 2024.

  15. arXiv:2401.00020  [pdf, other]

    cs.AI cs.DB cs.IR

    ShennongAlpha: an AI-driven sharing and collaboration platform for intelligent curation, acquisition, and translation of natural medicinal material knowledge

    Authors: Zijie Yang, Yongjing Yin, Chaojun Kong, Tiange Chi, Wufan Tao, Yue Zhang, Tian Xu

    Abstract: Natural Medicinal Materials (NMMs) have a long history of global clinical applications and a wealth of records and knowledge. Although NMMs are a major source for drug discovery and clinical application, the utilization and sharing of NMM knowledge face crucial challenges, including the standardized description of critical information, efficient curation and acquisition, and language barriers. To…

    Submitted 16 May, 2024; v1 submitted 27 December, 2023; originally announced January 2024.

    Comments: 53 pages, 6 figures, 10 supplementary figures, 2 supplementary tables

  16. arXiv:2311.16421  [pdf, other]

    cs.CL cs.CY

    CDEval: A Benchmark for Measuring the Cultural Dimensions of Large Language Models

    Authors: Yuhang Wang, Yanxu Zhu, Chao Kong, Shuyu Wei, Xiaoyuan Yi, Xing Xie, Jitao Sang

    Abstract: As the scaling of Large Language Models (LLMs) has dramatically enhanced their capabilities, there has been a growing focus on the alignment problem to ensure their responsible and ethical use. While existing alignment efforts predominantly concentrate on universal values such as the HHH principle, the aspect of culture, which is inherently pluralistic and diverse, has not received adequate attent…

    Submitted 20 June, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: Accepted by the Cross-Cultural Considerations in NLP Workshop @ ACL 2024

  17. arXiv:2311.09774  [pdf, other]

    cs.CL cs.AI cs.LG

    HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs

    Authors: Junying Chen, Xidong Wang, Anningzhe Gao, Feng Jiang, Shunian Chen, Hongbo Zhang, Dingjie Song, Wenya Xie, Chuyi Kong, Jianquan Li, Xiang Wan, Haizhou Li, Benyou Wang

    Abstract: Adapting a language model into a specific domain, a.k.a. 'domain adaption', is a common practice when specialized knowledge, e.g. medicine, is not encapsulated in a general language model like Llama2. The challenge lies in the heterogeneity of data across the two training stages, as it varies in languages, genres, or formats. To tackle this and simplify the learning protocol, we propose to transfor…

    Submitted 16 November, 2023; originally announced November 2023.

  18. arXiv:2310.00234  [pdf, other]

    cs.CR cs.CV eess.IV

    Pixel-Inconsistency Modeling for Image Manipulation Localization

    Authors: Chenqi Kong, Anwei Luo, Shiqi Wang, Haoliang Li, Anderson Rocha, Alex C. Kot

    Abstract: Digital image forensics plays a crucial role in image authentication and manipulation localization. Despite the progress powered by deep neural networks, existing forgery localization methodologies exhibit limitations when deployed to unseen datasets and perturbed images (i.e., lack of generalization and robustness to real-world applications). To circumvent these problems and aid image integrity,…

    Submitted 29 September, 2023; originally announced October 2023.

  19. arXiv:2309.11092  [pdf, other]

    cs.CV cs.MM

    Forgery-aware Adaptive Vision Transformer for Face Forgery Detection

    Authors: Anwei Luo, Rizhao Cai, Chenqi Kong, Xiangui Kang, Jiwu Huang, Alex C. Kot

    Abstract: With the advancement in face manipulation technologies, the importance of face forgery detection in protecting authentication integrity becomes increasingly evident. Previous Vision Transformer (ViT)-based detectors have demonstrated subpar performance in cross-database evaluations, primarily because fully fine-tuning with limited Deepfake data often leads to forgetting pre-trained knowledge and o…

    Submitted 20 September, 2023; originally announced September 2023.

  20. arXiv:2309.04038  [pdf, other]

    cs.CV

    S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing with Statistical Tokens

    Authors: Rizhao Cai, Zitong Yu, Chenqi Kong, Haoliang Li, Changsheng Chen, Yongjian Hu, Alex Kot

    Abstract: Face Anti-Spoofing (FAS) aims to detect malicious attempts to invade a face recognition system by presenting spoofed faces. State-of-the-art FAS techniques predominantly rely on deep learning models but their cross-domain generalization capabilities are often hindered by the domain shift problem, which arises due to different distributions between training and testing data. In this study, we devel…

    Submitted 19 June, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted by IEEE Transactions on Information Forensics and Security (June 2024)

  21. arXiv:2308.11534  [pdf, other]

    cs.CL cs.AI

    PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User Simulator

    Authors: Chuyi Kong, Yaxin Fan, Xiang Wan, Feng Jiang, Benyou Wang

    Abstract: The unparalleled performance of closed-sourced ChatGPT has sparked efforts towards its democratization, with notable strides made by leveraging real user and ChatGPT dialogues, as evidenced by Vicuna. However, due to challenges in gathering dialogues involving human participation, current endeavors like Baize and UltraChat rely on ChatGPT conducting roleplay to simulate humans based on instruction…

    Submitted 27 May, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted to ACL 2024 (main conference)

  22. arXiv:2308.11199  [pdf, other]

    cs.CV cs.AI cs.LG

    ConcatPlexer: Additional Dim1 Batching for Faster ViTs

    Authors: Donghoon Han, Seunghyeon Seo, Donghyeon Jeon, Jiho Jang, Chaerin Kong, Nojun Kwak

    Abstract: Transformers have demonstrated tremendous success not only in the natural language processing (NLP) domain but also the field of computer vision, igniting various creative approaches and applications. Yet, the superior performance and modeling flexibility of transformers came with a severe increase in computation costs, and hence several works have proposed methods to reduce this burden. Inspired…

    Submitted 31 January, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

  23. arXiv:2306.06362  [pdf, other]

    cs.CV cs.AI cs.LG

    Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception

    Authors: Xiaqing Pan, Nicholas Charron, Yongqian Yang, Scott Peters, Thomas Whelan, Chen Kong, Omkar Parkhi, Richard Newcombe, Carl Yuheng Ren

    Abstract: We introduce the Aria Digital Twin (ADT) - an egocentric dataset captured using Aria glasses with extensive object, environment, and human level ground truth. This ADT release contains 200 sequences of real-world activities conducted by Aria wearers in two real indoor scenes with 398 object instances (324 stationary and 74 dynamic). Each sequence consists of: a) raw data of two monochrome camera s…

    Submitted 13 June, 2023; v1 submitted 10 June, 2023; originally announced June 2023.

  24. arXiv:2306.03378  [pdf, other]

    cs.IR

    Towards Alleviating the Object Bias in Prompt Tuning-based Factual Knowledge Extraction

    Authors: Yuhang Wang, Dongyuan Lu, Chao Kong, Jitao Sang

    Abstract: Many works employed prompt tuning methods to automatically optimize prompt queries and extract the factual knowledge stored in Pretrained Language Models. In this paper, we observe that the optimized prompts, including discrete prompts and continuous prompts, exhibit undesirable object bias. To handle this problem, we propose a novel prompt tuning method called MeCoD, consisting of three modules:…

    Submitted 9 June, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Findings

  25. arXiv:2305.04001  [pdf, other]

    cs.CV cs.AI

    AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion

    Authors: Seungwoo Lee, Chaerin Kong, Donghyeon Jeon, Nojun Kwak

    Abstract: Recent advances in diffusion models have showcased promising results in the text-to-video (T2V) synthesis task. However, as these T2V models solely employ text as the guidance, they tend to struggle in modeling detailed temporal dynamics. In this paper, we introduce a novel T2V framework that additionally employs audio signals to control the temporal dynamics, empowering an off-the-shelf T2I diffus…

    Submitted 23 May, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

    Comments: CVPR2023 Workshop on AI for Content Creation. Project Page: https://lifrary.github.io/AADiff/

  26. arXiv:2304.12489  [pdf, other]

    cs.CV cs.CR

    Beyond the Prior Forgery Knowledge: Mining Critical Clues for General Face Forgery Detection

    Authors: Anwei Luo, Chenqi Kong, Jiwu Huang, Yongjian Hu, Xiangui Kang, Alex C. Kot

    Abstract: Face forgery detection is essential in combating malicious digital face attacks. Previous methods mainly rely on prior expert knowledge to capture specific forgery clues, such as noise patterns, blending boundaries, and frequency artifacts. However, these methods tend to get trapped in local optima, resulting in limited robustness and generalization capability. To address these issues, we propose…

    Submitted 24 April, 2023; originally announced April 2023.

  27. arXiv:2303.03282  [pdf, other]

    cs.RO eess.SY

    Learning Object Manipulation With Under-Actuated Impulse Generator Arrays

    Authors: Chuizheng Kong, William Yerazunis, Daniel Nikovski

    Abstract: For more than half a century, vibratory bowl feeders have been the standard in automated assembly for singulation, orientation, and manipulation of small parts. Unfortunately, these feeders are expensive, noisy, and highly specialized on a single part design basis. We consider an alternative device and learning control method for singulation, orientation, and manipulation by means of seven fixed-p…

    Submitted 6 March, 2023; originally announced March 2023.

    Comments: Accepted at the 2023 American Control Conference

  28. arXiv:2303.00917  [pdf, other]

    cs.CV

    Enhancing General Face Forgery Detection via Vision Transformer with Low-Rank Adaptation

    Authors: Chenqi Kong, Haoliang Li, Shiqi Wang

    Abstract: Nowadays, forgery faces pose pressing security concerns over fake news, fraud, impersonation, etc. Despite the demonstrated success in intra-domain face forgery detection, existing detection methods lack generalization capability and tend to suffer from dramatic performance drops when deployed to unforeseen domains. To mitigate this issue, this paper designs a more general fake face detection mode…

    Submitted 27 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  29. arXiv:2302.10305  [pdf, other]

    cs.CV cs.AI cs.LG

    Analyzing Multimodal Objectives Through the Lens of Generative Diffusion Guidance

    Authors: Chaerin Kong, Nojun Kwak

    Abstract: Recent years have witnessed astonishing advances in the field of multimodal representation learning, with contrastive learning being the cornerstone for major breakthroughs. Latest works delivered further improvements by incorporating different objectives such as masked modeling and captioning into the frameworks, but our understanding on how these objectives facilitate learning remains vastly inc…

    Submitted 10 February, 2023; originally announced February 2023.

    Comments: 6 pages

  30. arXiv:2301.12831  [pdf, other]

    cs.MM cs.CV

    M3FAS: An Accurate and Robust MultiModal Mobile Face Anti-Spoofing System

    Authors: Chenqi Kong, Kexin Zheng, Yibing Liu, Shiqi Wang, Anderson Rocha, Haoliang Li

    Abstract: Face presentation attacks (FPA), also known as face spoofing, have brought increasing concerns to the public through various malicious applications, such as financial fraud and privacy leakage. Therefore, safeguarding face recognition systems against FPA is of utmost importance. Although existing learning-based face anti-spoofing (FAS) models can achieve outstanding detection performance, they lac…

    Submitted 21 March, 2024; v1 submitted 30 January, 2023; originally announced January 2023.

  31. arXiv:2211.16786  [pdf, ps, other]

    cs.CV cs.AI

    Two-branch Multi-scale Deep Neural Network for Generalized Document Recapture Attack Detection

    Authors: Jiaxing Li, Chenqi Kong, Shiqi Wang, Haoliang Li

    Abstract: The image recapture attack is an effective image manipulation method to erase certain forensic traces, and when targeting on personal document images, it poses a great threat to the security of e-commerce and other web applications. Considering the current learning-based methods suffer from serious overfitting problem, in this paper, we propose a novel two-branch deep neural network by mining bett…

    Submitted 30 November, 2022; originally announced November 2022.

    Comments: 5 pages, 4 figures, 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, under review

  32. arXiv:2211.14540  [pdf, other]

    cs.CL

    Lexical Complexity Controlled Sentence Generation

    Authors: Jinran Nie, Liner Yang, Yun Chen, Cunliang Kong, Junhui Zhu, Erhong Yang

    Abstract: Text generation rarely considers the control of lexical complexity, which limits its more comprehensive practical application. We introduce a novel task of lexical complexity controlled sentence generation, which aims at keywords to sentence generation with desired complexity levels. It has enormous potential in domains such as grade reading, language teaching and acquisition. The challenge of thi…

    Submitted 26 November, 2022; originally announced November 2022.

  33. arXiv:2211.11153  [pdf, other]

    cs.LG cs.CL cs.CV

    Unifying Vision-Language Representation Space with Single-tower Transformer

    Authors: Jiho Jang, Chaerin Kong, Donghyeon Jeon, Seonhoon Kim, Nojun Kwak

    Abstract: Contrastive learning is a form of distance learning that aims to learn invariant features from two related representations. In this paper, we explore the bold hypothesis that an image and its caption can be simply regarded as two different views of the underlying mutual information, and train a model to learn a unified vision-language representation space that encodes both modalities at once in a…

    Submitted 20 November, 2022; originally announced November 2022.

    Comments: AAAI 2023, 11 pages

  34. arXiv:2210.05872  [pdf, other]

    cs.CV

    Leveraging Off-the-shelf Diffusion Model for Multi-attribute Fashion Image Manipulation

    Authors: Chaerin Kong, DongHyeon Jeon, Ohjoon Kwon, Nojun Kwak

    Abstract: Fashion attribute editing is a task that aims to convert the semantic attributes of a given fashion image while preserving the irrelevant regions. Previous works typically employ conditional GANs where the generator explicitly learns the target attributes and directly executes the conversion. These approaches, however, are neither scalable nor generic as they operate only with few limited attribute…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted to WACV 2023

  35. arXiv:2210.04127  [pdf, other]

    cs.CV

    Towards Efficient Neural Scene Graphs by Learning Consistency Fields

    Authors: Yeji Song, Chaerin Kong, Seoyoung Lee, Nojun Kwak, Joonseok Lee

    Abstract: Neural Radiance Fields (NeRF) achieves photo-realistic image rendering from novel views, and the Neural Scene Graphs (NSG) extends it to dynamic scenes (video) with multiple objects. Nevertheless, computationally heavy ray marching for every image frame becomes a huge burden. In this paper, taking advantage of significant redundancy across adjacent frames in videos, we propose…

    Submitted 8 October, 2022; originally announced October 2022.

    Comments: BMVC 2022, 22 pages

  36. arXiv:2209.14692  [pdf, other]

    cs.CV cs.CR

    Digital and Physical Face Attacks: Reviewing and One Step Further

    Authors: Chenqi Kong, Shiqi Wang, Haoliang Li

    Abstract: With the rapid progress over the past five years, face authentication has become the most pervasive biometric recognition method. Thanks to the high-accuracy recognition performance and user-friendly usage, automatic face recognition (AFR) has exploded into a plethora of practical applications over device unlocking, checking-in, and financial payment. In spite of the tremendous success of face aut…

    Submitted 29 September, 2022; originally announced September 2022.

  37. arXiv:2209.14614  [pdf, other]

    cs.CL

    COMPILING: A Benchmark Dataset for Chinese Complexity Controllable Definition Generation

    Authors: Jiaxin Yuan, Cunliang Kong, Chenhui Xie, Liner Yang, Erhong Yang

    Abstract: The definition generation task aims to generate a word's definition within a specific context automatically. However, owing to the lack of datasets for different complexities, the definitions produced by models tend to keep the same complexity level. This paper proposes a novel task of generating definitions for a word with controllable complexity levels. Correspondingly, we introduce COMPILING, a…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: Accepted by CCL 2022

  38. arXiv:2207.14491  [pdf, other]

    cs.CV

    Conservative Generator, Progressive Discriminator: Coordination of Adversaries in Few-shot Incremental Image Synthesis

    Authors: Chaerin Kong, Nojun Kwak

    Abstract: The capacity to learn incrementally from an online stream of data is an envied trait of human learners, as deep neural networks typically suffer from catastrophic forgetting and stability-plasticity dilemma. Several works have previously explored incremental few-shot learning, a task with greater challenges due to data constraint, mostly in classification setting with mild success. In this work, w…

    Submitted 29 July, 2022; originally announced July 2022.

    Comments: 4 pages

  39. arXiv:2205.10721  [pdf, other]

    cs.NI cs.IT

    System-Level Evaluation of Beam Hopping in NR-Based LEO Satellite Communication System

    Authors: Jingwei Zhang, Dali Qin, Chuili Kong, Feiran Zhao, Rong Li, Jun Wang, Ye Wang

    Abstract: Satellite communication by leveraging the use of low earth orbit (LEO) satellites is expected to play an essential role in future communication systems through providing ubiquitous and continuous wireless connectivity. This thus has motivated the work in the 3rd generation partnership project (3GPP) to ensure the operation of fifth generation (5G) New Radio (NR) protocols for non-terrestrial netwo…

    Submitted 15 October, 2022; v1 submitted 21 May, 2022; originally announced May 2022.

    Comments: 6 pages, 13 figures

  40. arXiv:2204.11087  [pdf, other]

    cs.CL

    LitMind Dictionary: An Open-Source Online Dictionary

    Authors: Cunliang Kong, Xuezhi Fang, Liner Yang, Yun Chen, Erhong Yang

    Abstract: Dictionaries can help language learners to learn vocabulary by providing definitions of words. Since traditional dictionaries present word senses as discrete items in predefined inventories, they fall short of flexibility, which is required in providing specific meanings of words in particular contexts. In this paper, we introduce the LitMind Dictionary (https://dictionary.litmind.ink), an open-so…

    Submitted 23 April, 2022; originally announced April 2022.

  41. arXiv:2204.07701  [pdf, other]

    cs.CL

    BLCU-ICALL at SemEval-2022 Task 1: Cross-Attention Multitasking Framework for Definition Modeling

    Authors: Cunliang Kong, Yujie Wang, Ruining Chong, Liner Yang, Hengyuan Zhang, Erhong Yang, Yaping Huang

    Abstract: This paper describes the BLCU-ICALL system used in the SemEval-2022 Task 1 Comparing Dictionaries and Word Embeddings, the Definition Modeling subtrack, achieving 1st on Italian, 2nd on Spanish and Russian, and 3rd on English and French. We propose a transformer-based multitasking framework to explore the task. The framework integrates multiple embedding architectures through the cross-attention m…

    Submitted 15 April, 2022; originally announced April 2022.

  42. arXiv:2203.12926  [pdf, other]

    cs.CL

    Multitasking Framework for Unsupervised Simple Definition Generation

    Authors: Cunliang Kong, Yun Chen, Hengyuan Zhang, Liner Yang, Erhong Yang

    Abstract: The definition generation task can help language learners by providing explanations for unfamiliar words. This task has attracted much attention in recent years. We propose a novel task of Simple Definition Generation (SDG) to help language learners and low literacy readers. A significant challenge of this task is the lack of learner's dictionaries in many languages, and therefore the lack of data…

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: Accepted by ACL 2022 (main conference)

  43. arXiv:2203.07677  [pdf, other]

    eess.IV cs.CV

    Unpaired Deep Image Dehazing Using Contrastive Disentanglement Learning

    Authors: Xiang Chen, Zhentao Fan, Pengpeng Li, Longgang Dai, Caihua Kong, Zhuoran Zheng, Yufeng Huang, Yufeng Li

    Abstract: We offer a practical unpaired learning based image dehazing network from an unpaired set of clear and hazy images. This paper provides a new perspective to treat image dehazing as a two-class separated factor disentanglement task, i.e., the task-relevant factor of clear image reconstruction and the task-irrelevant factor of haze-relevant distribution. To achieve the disentanglement of these two-cla…

    Submitted 12 July, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

  44. arXiv:2202.08466  [pdf, other]

    cs.GT cs.CR

    Insightful Mining Equilibria

    Authors: Mengqian Zhang, Yuhao Li, Jichen Li, Chaozhe Kong, Xiaotie Deng

    Abstract: The selfish mining attack, arguably the most famous game-theoretic attack in blockchain, indicates that the Bitcoin protocol is not incentive-compatible. Most subsequent works mainly focus on strengthening the selfish mining strategy, thus enabling a single strategic agent more likely to deviate. In sharp contrast, little attention has been paid to the resistant behavior against the selfish mining…

    Submitted 17 February, 2022; originally announced February 2022.

  45. arXiv:2201.12114  [pdf, other]

    cs.LG cs.CL cs.CV

    Rethinking Attention-Model Explainability through Faithfulness Violation Test

    Authors: Yibing Liu, Haoliang Li, Yangyang Guo, Chenqi Kong, Jing Li, Shiqi Wang

    Abstract: Attention mechanisms are dominating the explainability of deep models. They produce probability distributions over the input, which are widely deemed as feature-importance indicators. However, in this paper, we find one critical limitation in attention explanations: weakness in identifying the polarity of feature impact. This would be somehow misleading -- features with higher attention weights ma…

    Submitted 5 July, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

    Comments: Accepted to ICML 2022

  46. arXiv:2112.15043  [pdf, other]

    cs.CL

    YACLC: A Chinese Learner Corpus with Multidimensional Annotation

    Authors: Yingying Wang, Cunliang Kong, Liner Yang, Yijun Wang, Xiaorong Lu, Renfen Hu, Shan He, Zhenghao Liu, Yun Chen, Erhong Yang, Maosong Sun

    Abstract: Learner corpus collects language data produced by L2 learners, that is second or foreign-language learners. This resource is of great relevance for second language acquisition research, foreign-language teaching, and automatic grammatical error correction. However, there is little focus on learner corpus for Chinese as Foreign Language (CFL) learners. Therefore, we propose to construct a large-sca…

    Submitted 30 December, 2021; originally announced December 2021.

    Comments: 4 pages, 3 figures

  47. arXiv:2111.12958  [pdf, other]

    cs.CV

    Self-Distilled Self-Supervised Representation Learning

    Authors: Jiho Jang, Seonhoon Kim, Kiyoon Yoo, Chaerin Kong, Jangho Kim, Nojun Kwak

    Abstract: State-of-the-art frameworks in self-supervised learning have recently shown that fully utilizing transformer-based models can lead to performance boost compared to conventional CNN models. Striving to maximize the mutual information of two views of an image, existing works apply a contrastive loss to the final representations. Motivated by self-distillation in the supervised regime, we further exp…

    Submitted 23 November, 2022; v1 submitted 25 November, 2021; originally announced November 2021.

    Comments: WACV 23, 11 pages

  48. arXiv:2111.11672  [pdf, other]

    cs.CV

    Few-shot Image Generation with Mixup-based Distance Learning

    Authors: Chaerin Kong, Jeesoo Kim, Donghoon Han, Nojun Kwak

    Abstract: Producing diverse and realistic images with generative models such as GANs typically requires large scale training with vast amount of images. GANs trained with limited data can easily memorize few training samples and display undesirable properties like "stairlike" latent space where interpolation in the latent space yields discontinuous transitions in the output space. In this work, we consider…

    Submitted 7 July, 2022; v1 submitted 23 November, 2021; originally announced November 2021.

    Comments: ECCV 2022, 27 pages

  49. arXiv:2109.02973  [pdf, other]

    cs.CV

    Unpaired Deep Image Deraining Using Dual Contrastive Learning

    Authors: Xiang Chen, Jinshan Pan, Kui Jiang, Yufeng Li, Yufeng Huang, Caihua Kong, Longgang Dai, Zhentao Fan

    Abstract: Learning single image deraining (SID) networks from an unpaired set of clean and rainy images is practical and valuable as acquiring paired real-world data is almost infeasible. However, without the paired data as the supervision, learning a SID network is challenging. Moreover, simply using existing unpaired learning methods (e.g., unpaired adversarial learning and cycle-consistency constraints)…

    Submitted 24 March, 2022; v1 submitted 7 September, 2021; originally announced September 2021.

    Comments: Accepted by CVPR 2022

  50. No-Reference Image Quality Assessment by Hallucinating Pristine Features

    Authors: Baoliang Chen, Lingyu Zhu, Chenqi Kong, Hanwei Zhu, Shiqi Wang, Zhu Li

    Abstract: In this paper, we propose a no-reference (NR) image quality assessment (IQA) method via feature level pseudo-reference (PR) hallucination. The proposed quality assessment framework is grounded on the prior models of natural image statistical behaviors and rooted in the view that the perceptually meaningful features could be well exploited to characterize the visual quality. Herein, the PR features…

    Submitted 2 September, 2022; v1 submitted 9 August, 2021; originally announced August 2021.