Showing 1–50 of 63 results for author: Fan, K

Searching in archive cs.
  1. arXiv:2406.11432  [pdf, other]

    cs.CV cs.AI

    AnyTrans: Translate AnyText in the Image with Large Scale Models

    Authors: Zhipeng Qian, Pei Zhang, Baosong Yang, Kai Fan, Yiwei Ma, Derek F. Wong, Xiaoshuai Sun, Rongrong Ji

    Abstract: This paper introduces AnyTrans, an all-encompassing framework for the task-Translate AnyText in the Image (TATI), which includes multilingual text translation and text fusion within images. Our framework leverages the strengths of large-scale models, such as Large Language Models (LLMs) and text-guided diffusion models, to incorporate contextual cues from both textual and visual elements during tr…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2406.10858  [pdf, other]

    cs.CL cs.AI

    Step-level Value Preference Optimization for Mathematical Reasoning

    Authors: Guoxin Chen, Minpeng Liao, Chengxi Li, Kai Fan

    Abstract: Direct Preference Optimization (DPO) using an implicit reward model has proven to be an effective alternative to reinforcement learning from human feedback (RLHF) for fine-tuning preference aligned large language models (LLMs). However, the overall preference annotations of responses do not fully capture the fine-grained quality of model outputs in complex multi-step reasoning tasks, such as mathe…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Ongoing Work

  3. arXiv:2406.09196  [pdf, other]

    cs.CV cs.LG

    Adaptive Slot Attention: Object Discovery with Dynamic Slot Number

    Authors: Ke Fan, Zechen Bai, Tianjun Xiao, Tong He, Max Horn, Yanwei Fu, Francesco Locatello, Zheng Zhang

    Abstract: Object-centric learning (OCL) extracts the representation of objects with slots, offering an exceptional blend of flexibility and interpretability for abstracting low-level perceptual features. A widely adopted method within OCL is slot attention, which utilizes attention mechanisms to iteratively refine slot representations. However, a major drawback of most object-centric models, including slot…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: CVPR 2024

  4. arXiv:2405.18802  [pdf, other]

    cs.CR cs.AI

    Enhancing Security and Privacy in Federated Learning using Update Digests and Voting-Based Defense

    Authors: Wenjie Li, Kai Fan, Jingyuan Zhang, Hui Li, Wei Yang Bryan Lim, Qiang Yang

    Abstract: Federated Learning (FL) is a promising privacy-preserving machine learning paradigm that allows data owners to collaboratively train models while keeping their data localized. Despite its potential, FL faces challenges related to the trustworthiness of both clients and servers, especially in the presence of curious or malicious adversaries. In this paper, we introduce a novel framework named \unde…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 14 pages

  5. arXiv:2405.15763  [pdf, other]

    cs.CV

    FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis

    Authors: Ke Fan, Junshu Tang, Weijian Cao, Ran Yi, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Lizhuang Ma

    Abstract: Text-to-motion synthesis is a crucial task in computer vision. Existing methods are limited in their universality, as they are tailored for single-person or two-person scenarios and cannot be applied to generate motions for more individuals. To achieve number-free motion synthesis, this paper reconsiders motion generation and proposes to unify the single and multi-person motion by the conditi…

    Submitted 24 May, 2024; originally announced May 2024.

  6. arXiv:2405.04974  [pdf, other]

    cs.CV cs.AI

    Discrepancy-based Diffusion Models for Lesion Detection in Brain MRI

    Authors: Keqiang Fan, Xiaohao Cai, Mahesan Niranjan

    Abstract: Diffusion probabilistic models (DPMs) have exhibited significant effectiveness in computer vision tasks, particularly in image generation. However, their notable performance heavily relies on labelled datasets, which limits their application in medical images due to the associated high-cost annotations. Current DPM-related methods for lesion detection in medical imaging, which can be categorized i…

    Submitted 8 May, 2024; originally announced May 2024.

  7. arXiv:2405.03553  [pdf, other]

    cs.CL cs.AI

    AlphaMath Almost Zero: process Supervision without process

    Authors: Guoxin Chen, Minpeng Liao, Chengxi Li, Kai Fan

    Abstract: Recent advancements in large language models (LLMs) have substantially enhanced their mathematical reasoning abilities. However, these models still struggle with complex problems that require multiple reasoning steps, frequently leading to logical or numerical errors. While numerical mistakes can be largely addressed by integrating a code interpreter, identifying logical errors within intermediate…

    Submitted 23 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: update to the latest results

  8. arXiv:2405.02756  [pdf, other]

    cs.AR

    Efficient Open Modification Spectral Library Searching in High-Dimensional Space with Multi-Level-Cell Memory

    Authors: Keming Fan, Wei-Chen Chen, Sumukh Pinge, H. -S. Philip Wong, Tajana Rosing

    Abstract: Open Modification Search (OMS) is a promising algorithm for mass spectrometry analysis that enables the discovery of modified peptides. However, OMS encounters challenges as it exponentially extends the search scope. Existing OMS accelerators either have limited parallelism or struggle to scale effectively with growing data volumes. In this work, we introduce an OMS accelerator utilizing multi-lev…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: Accepted by DAC'24

  9. arXiv:2404.13925  [pdf, other]

    cs.CL

    MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit

    Authors: Boning Zhang, Chengxi Li, Kai Fan

    Abstract: Large language models (LLMs) have been explored in a variety of reasoning tasks including solving of mathematical problems. Each math dataset typically includes its own specially designed evaluation script, which, while suitable for its intended use, lacks generalizability across different datasets. Consequently, updates and adaptations to these evaluation tools tend to occur without being systema…

    Submitted 22 April, 2024; originally announced April 2024.

  10. arXiv:2404.05606  [pdf, other]

    cs.CV

    Learning Topology Uniformed Face Mesh by Volume Rendering for Multi-view Reconstruction

    Authors: Yating Wang, Ran Yi, Ke Fan, Jinkun Hao, Jiangbo Lu, Lizhuang Ma

    Abstract: Face meshes in consistent topology serve as the foundation for many face-related applications, such as 3DMM constrained face reconstruction and expression retargeting. Traditional methods commonly acquire topology uniformed face meshes by two separate steps: multi-view stereo (MVS) to reconstruct shapes followed by non-rigid registration to align topology, but struggle with handling noise and non…

    Submitted 8 April, 2024; originally announced April 2024.

  11. arXiv:2404.03518  [pdf, other]

    cs.CV

    SDPose: Tokenized Pose Estimation via Circulation-Guide Self-Distillation

    Authors: Sichen Chen, Yingyi Zhang, Siming Huang, Ran Yi, Ke Fan, Ruixin Zhang, Peixian Chen, Jun Wang, Shouhong Ding, Lizhuang Ma

    Abstract: Recently, transformer-based methods have achieved state-of-the-art prediction quality on human pose estimation (HPE). Nonetheless, most of these top-performing transformer-based models are too computation-consuming and storage-demanding to deploy on edge computing platforms. Those transformer-based models that require fewer resources are prone to under-fitting due to their smaller scale and thus pe…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  12. arXiv:2404.02656  [pdf, other]

    cs.CV cs.AI

    Non-negative Subspace Feature Representation for Few-shot Learning in Medical Imaging

    Authors: Keqiang Fan, Xiaohao Cai, Mahesan Niranjan

    Abstract: Unlike typical visual scene recognition domains, in which massive datasets are accessible to deep neural networks, medical image interpretations are often obstructed by the paucity of data. In this paper, we investigate the effectiveness of data-based few-shot learning in medical imaging by exploring different data attribute representations in a low-dimensional space. We introduce different types…

    Submitted 4 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

  13. arXiv:2404.00252  [pdf, other]

    eess.IV cs.CV

    Learned Scanpaths Aid Blind Panoramic Video Quality Assessment

    Authors: Kanglong Fan, Wen Wen, Mu Li, Yifan Peng, Kede Ma

    Abstract: Panoramic videos have the advantage of providing an immersive and interactive viewing experience. Nevertheless, their spherical nature gives rise to various and uncertain user viewing behaviors, which poses significant challenges for panoramic video quality assessment (PVQA). In this work, we propose an end-to-end optimized, blind PVQA method with explicit modeling of user viewing patterns through…

    Submitted 15 May, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  14. arXiv:2403.16897  [pdf, other]

    cs.CV

    Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text

    Authors: Junshu Tang, Yanhong Zeng, Ke Fan, Xuheng Wang, Bo Dai, Kai Chen, Lizhuang Ma

    Abstract: Creating and animating 3D biped cartoon characters is crucial and valuable in various applications. Compared with geometry, the diverse texture design plays an important role in making 3D biped cartoon characters vivid and charming. Therefore, we focus on automatic texture design for cartoon characters based on input instructions. This is challenging for domain-specific requirements and a lack of…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Project page: https://make-it-vivid.github.io/

  15. arXiv:2401.16861  [pdf, other]

    cs.CV

    Repositioning the Subject within Image

    Authors: Yikai Wang, Chenjie Cao, Ke Fan, Qiaole Dong, Yifan Li, Xiangyang Xue, Yanwei Fu

    Abstract: Current image manipulation primarily centers on static manipulation, such as replacing specific regions within an image or altering its overall style. In this paper, we introduce an innovative dynamic manipulation task, subject repositioning. This task involves relocating a user-specified subject to a desired position while preserving the image's fidelity. Our research reveals that the fundamental…

    Submitted 17 March, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: Project page: https://yikai-wang.github.io/seele/. Dataset: https://github.com/Yikai-Wang/ReS. Arxiv version uses small size images for fast preview. Full size PDF is available at project page

  16. arXiv:2401.08190  [pdf, other]

    cs.CL

    MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline

    Authors: Minpeng Liao, Wei Luo, Chengxi Li, Jing Wu, Kai Fan

    Abstract: Large language models (LLMs) have seen considerable advancements in natural language understanding tasks, yet there remains a gap to bridge before attaining true artificial general intelligence, especially concerning shortcomings in mathematical reasoning capabilities. We postulate that the inherent nature of LLM training, which focuses on predicting probabilities of next token, presents challenge…

    Submitted 21 February, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

  17. arXiv:2401.01466  [pdf, other]

    cs.RO

    Human Leading or Following Preferences: Effects on Human Perception of the Robot and the Human-Robot Collaboration

    Authors: Ali Noormohammadi-Asl, Kevin Fan, Stephen L. Smith, Kerstin Dautenhahn

    Abstract: Achieving effective and seamless human-robot collaboration requires two key outcomes: enhanced team performance and fostering a positive human perception of both the robot and the collaboration. This paper investigates the capability of the proposed task planning framework to realize these objectives by integrating human leading/following preference and performance into its task allocation and sch…

    Submitted 2 January, 2024; originally announced January 2024.

  18. arXiv:2310.14853  [pdf, other]

    cs.CL

    Adaptive Policy with Wait-$k$ Model for Simultaneous Translation

    Authors: Libo Zhao, Kai Fan, Wei Luo, Jing Wu, Shushu Wang, Ziqian Zeng, Zhongqiang Huang

    Abstract: Simultaneous machine translation (SiMT) requires a robust read/write policy in conjunction with a high-quality translation model. Traditional methods rely on either a fixed wait-$k$ policy coupled with a standalone wait-$k$ translation model, or an adaptive policy jointly trained with the translation model. In this study, we propose a more flexible approach by decoupling the adaptive policy model…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 main conference. 17 pages, 12 figures, 5 tables

  19. arXiv:2309.13248  [pdf, other]

    cs.CV

    Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-centric Representation

    Authors: Ke Fan, Jingshi Lei, Xuelin Qian, Miaopeng Yu, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

    Abstract: Video amodal segmentation is a particularly challenging task in computer vision, which requires deducing the full shape of an object from its visible parts. Recently, some studies have achieved promising performance by using motion flow to integrate information across frames under a self-supervised setting. However, motion flow has a clear limitation by the two factors of moving cameras and…

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted by ICCV 2023

  20. arXiv:2309.09858  [pdf, other]

    cs.CV

    Unsupervised Open-Vocabulary Object Localization in Videos

    Authors: Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He

    Abstract: In this paper, we show that recent advances in video representation learning and pre-trained vision-language models allow for substantial improvements in self-supervised video object localization. We propose a method that first localizes objects in videos via an object-centric approach with slot attention and then assigns text to the obtained slots. The latter is achieved by an unsupervised way to…

    Submitted 26 June, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: Accepted by ICCV 2023; Presented at CVPR 2024 Workshop CORR; Project Page: https://github.com/amazon-science/object-centric-vol

  21. arXiv:2309.07693  [pdf, other]

    cs.RO

    Towards Safer Robot-Assisted Surgery: A Markerless Augmented Reality Framework

    Authors: Ziyang Chen, Laura Cruciani, Ke Fan, Matteo Fontana, Elena Lievore, Ottavio De Cobelli, Gennaro Musi, Giancarlo Ferrigno, Elena De Momi

    Abstract: Robot-assisted surgery is rapidly developing in the medical field, and the integration of augmented reality shows the potential of improving the surgeons' operation performance by providing more visual information. In this paper, we proposed a markerless augmented reality framework to enhance safety by avoiding intra-operative bleeding which is a high risk caused by the collision between the surgi…

    Submitted 14 September, 2023; originally announced September 2023.

  22. arXiv:2308.05633  [pdf, other]

    cs.CV cs.CL cs.LG

    IIHT: Medical Report Generation with Image-to-Indicator Hierarchical Transformer

    Authors: Keqiang Fan, Xiaohao Cai, Mahesan Niranjan

    Abstract: Automated medical report generation has become increasingly important in medical analysis. It can produce computer-aided diagnosis descriptions and thus significantly alleviate the doctors' work. Inspired by the huge success of neural machine translation and image captioning, various deep learning methods have been proposed for medical report generation. However, due to the inherent properties of…

    Submitted 10 August, 2023; originally announced August 2023.

  23. arXiv:2307.09263  [pdf, other]

    cs.DC cs.LG

    Mobility-Aware Joint User Scheduling and Resource Allocation for Low Latency Federated Learning

    Authors: Kecheng Fan, Wen Chen, Jun Li, Xiumei Deng, Xuefeng Han, Ming Ding

    Abstract: As an efficient distributed machine learning approach, Federated learning (FL) can obtain a shared model by iterative local model training at the user side and global model aggregating at the central server side, thereby protecting privacy of users. Mobile users in FL systems typically communicate with base stations (BSs) via wireless channels, where training performance could be degraded due to u…

    Submitted 18 July, 2023; originally announced July 2023.

  24. arXiv:2307.06099  [pdf, other]

    cs.CV

    RFENet: Towards Reciprocal Feature Evolution for Glass Segmentation

    Authors: Ke Fan, Changan Wang, Yabiao Wang, Chengjie Wang, Ran Yi, Lizhuang Ma

    Abstract: Glass-like objects are widespread in daily life but remain intractable to be segmented for most existing methods. The transparent property makes it difficult to be distinguished from background, while the tiny separation boundary further impedes the acquisition of their exact contour. In this paper, by revealing the key co-evolution demand of semantic and boundary learning, we propose a Selective…

    Submitted 12 July, 2023; originally announced July 2023.

    Comments: Accepted by 2023 International Joint Conference on Artificial Intelligence (IJCAI2023)

  25. arXiv:2307.04427  [pdf, other]

    astro-ph.HE astro-ph.GA cs.LG

    Observation of high-energy neutrinos from the Galactic plane

    Authors: R. Abbasi, M. Ackermann, J. Adams, J. A. Aguilar, M. Ahlers, M. Ahrens, J. M. Alameddine, A. A. Alves Jr., N. M. Amin, K. Andeen, T. Anderson, G. Anton, C. Argüelles, Y. Ashida, S. Athanasiadou, S. Axani, X. Bai, A. Balagopal V., S. W. Barwick, V. Basu, S. Baur, R. Bay, J. J. Beatty, K. -H. Becker, J. Becker Tjus, et al. (364 additional authors not shown)

    Abstract: The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrin…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: Submitted on May 12th, 2022; Accepted on May 4th, 2023

    Journal ref: Science 380, 6652, 1338-1343 (2023)

  26. arXiv:2305.18978  [pdf, other]

    cs.AI cs.LG physics.optics

    IDToolkit: A Toolkit for Benchmarking and Developing Inverse Design Algorithms in Nanophotonics

    Authors: Jia-Qi Yang, Yucheng Xu, Jia-Lei Shen, Kebin Fan, De-Chuan Zhan, Yang Yang

    Abstract: Aiding humans with scientific designs is one of the most exciting applications of artificial intelligence (AI) and machine learning (ML), due to their potential for the discovery of new drugs, design of new materials and chemical compounds, etc. However, scientific design typically requires complex domain knowledge that is not familiar to AI researchers. Further, scientific studies involve professional skills…

    Submitted 31 May, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: KDD'23

  27. arXiv:2305.17698  [pdf, other]

    cs.CL

    Neural Machine Translation with Dynamic Graph Convolutional Decoder

    Authors: Lei Li, Kai Fan, Lingyu Yang, Hongjia Li, Chun Yuan

    Abstract: Existing wisdom demonstrates the significance of syntactic knowledge for the improvement of neural machine translation models. However, most previous works merely focus on leveraging the source syntax in the well-known encoder-decoder framework. In sharp contrast, this paper proposes an end-to-end translation architecture from the (graph & sequence) structural inputs to the (graph & sequence) ou…

    Submitted 28 May, 2023; originally announced May 2023.

  28. arXiv:2305.04206  [pdf, other]

    cs.CV cs.AI

    RATs-NAS: Redirection of Adjacent Trails on GCN for Neural Architecture Search

    Authors: Yu-Ming Zhang, Jun-Wei Hsieh, Chun-Chieh Lee, Kuo-Chin Fan

    Abstract: Various hand-designed CNN architectures have been developed, such as VGG, ResNet, DenseNet, etc., and achieve State-of-the-Art (SoTA) levels on different tasks. Neural Architecture Search (NAS) now focuses on automatically finding the best CNN architecture to handle the above tasks. However, the verification of a searched architecture is very time-consuming and makes predictor-based methods become…

    Submitted 8 May, 2023; v1 submitted 7 May, 2023; originally announced May 2023.

  29. arXiv:2305.02536  [pdf, other]

    cs.CV

    Scanpath Prediction in Panoramic Videos via Expected Code Length Minimization

    Authors: Mu Li, Kanglong Fan, Kede Ma

    Abstract: Predicting human scanpaths when exploring panoramic videos is a challenging task due to the spherical geometry and the multimodality of the input, and the inherent uncertainty and diversity of the output. Most previous methods fail to give a complete treatment of these characteristics, and thus are prone to errors. In this paper, we present a simple new criterion for scanpath prediction based on p…

    Submitted 4 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

  30. arXiv:2303.15705  [pdf, other]

    cs.CL cs.SD eess.AS

    Translate the Beauty in Songs: Jointly Learning to Align Melody and Translate Lyrics

    Authors: Chengxi Li, Kai Fan, Jiajun Bu, Boxing Chen, Zhongqiang Huang, Zhi Yu

    Abstract: Song translation requires both translation of lyrics and alignment of music notes so that the resulting verse can be sung to the accompanying melody, which is a challenging problem that has attracted some interests in different aspects of the translation process. In this paper, we propose Lyrics-Melody Translation with Adaptive Grouping (LTAG), a holistic solution to automatic song translation by…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: 13 pages

  31. arXiv:2303.07914  [pdf, other]

    cs.CL

    Adapting Offline Speech Translation Models for Streaming with Future-Aware Distillation and Inference

    Authors: Biao Fu, Minpeng Liao, Kai Fan, Zhongqiang Huang, Boxing Chen, Yidong Chen, Xiaodong Shi

    Abstract: A popular approach to streaming speech translation is to employ a single offline model with a wait-k policy to support different latency requirements, which is simpler than training multiple online models with different latency constraints. However, there is a mismatch problem in using a model trained with complete utterances for streaming inference with partial input. We demonstrate that speech r…

    Submitted 26 October, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: Accepted to EMNLP 2023 main conference

  32. arXiv:2301.08563  [pdf]

    cs.HC cs.AI cs.DC cs.GT cs.LG

    A Semi-supervised Sensing Rate Learning based CMAB Scheme to Combat COVID-19 by Trustful Data Collection in the Crowd

    Authors: Jianheng Tang, Kejia Fan, Wenxuan Xie, Luomin Zeng, Feijiang Han, Guosheng Huang, Tian Wang, Anfeng Liu, Shaobo Zhang

    Abstract: The recruitment of trustworthy and high-quality workers is an important research issue for MCS. Previous studies either assume that the qualities of workers are known in advance, or assume that the platform knows the qualities of workers once it receives their collected data. In reality, to reduce costs and thus maximize revenue, many strategic workers do not perform their sensing tasks honestly a…

    Submitted 22 June, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: 18 pages, 14 figures

    Journal ref: Computer Communications, 2023, 206: 85-100

  33. arXiv:2211.13865  [pdf, other]

    cs.CL cs.AI

    Competency-Aware Neural Machine Translation: Can Machine Translation Know its Own Translation Quality?

    Authors: Pei Zhang, Baosong Yang, Haoran Wei, Dayiheng Liu, Kai Fan, Luo Si, Jun Xie

    Abstract: Neural machine translation (NMT) is often criticized for failures that happen without awareness. The lack of competency awareness makes NMT untrustworthy. This is in sharp contrast to human translators who give feedback or conduct further investigations whenever they are in doubt about predictions. To fill this gap, we propose a novel competency-aware NMT by extending conventional NMT with a self-…

    Submitted 24 November, 2022; originally announced November 2022.

    Comments: accepted to EMNLP 2022

  34. arXiv:2210.00546  [pdf, other]

    cs.CV cs.LG

    Siamese-NAS: Using Trained Samples Efficiently to Find Lightweight Neural Architecture by Prior Knowledge

    Authors: Yu-Ming Zhang, Jun-Wei Hsieh, Chun-Chieh Lee, Kuo-Chin Fan

    Abstract: In the past decade, many convolutional neural network architectures were designed by hand, such as Vgg16, ResNet, DenseNet, etc. They all achieved state-of-the-art levels on different tasks in their time. However, this approach still relies on human intuition and experience, and it also consumes much time in trial and error. Neural Architecture Search (NAS) focuses on this issue. In recent…

    Submitted 2 October, 2022; originally announced October 2022.

  35. arXiv:2209.03042  [pdf, other]

    hep-ex astro-ph.IM cs.LG physics.data-an physics.ins-det

    Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube

    Authors: R. Abbasi, M. Ackermann, J. Adams, N. Aggarwal, J. A. Aguilar, M. Ahlers, M. Ahrens, J. M. Alameddine, A. A. Alves Jr., N. M. Amin, K. Andeen, T. Anderson, G. Anton, C. Argüelles, Y. Ashida, S. Athanasiadou, S. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, V. Basu, R. Bay, J. J. Beatty, K. -H. Becker, et al. (359 additional authors not shown)

    Abstract: IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challen…

    Submitted 11 October, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: Prepared for submission to JINST

  36. arXiv:2208.12928  [pdf, other]

    cs.DB cs.LG

    A scalable pipeline for COVID-19: the case study of Germany, Czechia and Poland

    Authors: Wildan Abdussalam, Adam Mertel, Kai Fan, Lennart Schüler, Weronika Schlechte-Wełnicz, Justin M. Calabrese

    Abstract: Throughout the coronavirus disease 2019 (COVID-19) pandemic, decision makers have relied on forecasting models to determine and implement non-pharmaceutical interventions (NPI). In building the forecasting models, continuously updated datasets from various stakeholders including developers, analysts, and testers are required to provide precise predictions. Here we report the design of a scalable p…

    Submitted 27 August, 2022; originally announced August 2022.

    Comments: DECo@VLDB22: workshop paper accepted

  37. arXiv:2208.08265  [pdf, other]

    stat.ML cs.LG

    Semi-Supervised Anomaly Detection Based on Quadratic Multiform Separation

    Authors: Ko-Hui Michael Fan, Chih-Chung Chang, Kuang-Hsiao-Yin Kongguoluo

    Abstract: In this paper we propose a novel method for semi-supervised anomaly detection (SSAD). Our classifier is named QMS22 as its inception was dated 2022 upon the framework of quadratic multiform separation (QMS), a recently introduced classification model. QMS22 tackles SSAD by solving a multi-class classification problem involving both the training set and the test set of the original problem. The cla…

    Submitted 17 August, 2022; originally announced August 2022.

  38. arXiv:2208.04683  [pdf, other]

    cs.CY cs.AI cs.LG stat.AP

    Applying data technologies to combat AMR: current status, challenges, and opportunities on the way forward

    Authors: Leonid Chindelevitch, Elita Jauneikaite, Nicole E. Wheeler, Kasim Allel, Bede Yaw Ansiri-Asafoakaa, Wireko A. Awuah, Denis C. Bauer, Stephan Beisken, Kara Fan, Gary Grant, Michael Graz, Yara Khalaf, Veranja Liyanapathirana, Carlos Montefusco-Pereira, Lawrence Mugisha, Atharv Naik, Sylvia Nanono, Anthony Nguyen, Timothy Rawson, Kessendri Reddy, Juliana M. Ruzante, Anneke Schmider, Roman Stocker, Leonhardt Unruh, Daniel Waruingi, et al. (2 additional authors not shown)

    Abstract: Antimicrobial resistance (AMR) is a growing public health threat, estimated to cause over 10 million deaths per year and cost the global economy 100 trillion USD by 2050 under status quo projections. These losses would mainly result from an increase in the morbidity and mortality from treatment failure, AMR infections during medical procedures, and a loss of quality of life attributed to AMR. Nume…

    Submitted 11 August, 2022; v1 submitted 5 July, 2022; originally announced August 2022.

    Comments: 65 pages, 3 figures

    ACM Class: I.2.1; J.3

  39. arXiv:2207.08210  [pdf, other]

    cs.CV cs.AI cs.LG

    A Simple Test-Time Method for Out-of-Distribution Detection

    Authors: Ke Fan, Yikai Wang, Qian Yu, Da Li, Yanwei Fu

    Abstract: Neural networks are known to produce over-confident predictions on input images, even when these images are out-of-distribution (OOD) samples. This limits the applications of neural network models in real-world scenarios, where OOD samples exist. Many existing approaches identify the OOD instances via exploiting various cues, such as finding irregular patterns in the feature space, logits space, g…

    Submitted 17 July, 2022; originally announced July 2022.

  40. arXiv:2204.06175  [pdf, other]

    cs.CL

    Efficient Cluster-Based k-Nearest-Neighbor Machine Translation

    Authors: Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong

    Abstract: k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT). It aims to alleviate the performance degradation of advanced MT systems in translating out-of-domain sentences by coordinating with an additional token-level feature-based retrieval module constructed from in-domain data. Previous studie…

    Submitted 3 May, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

    Comments: 8 pages, 6 figures, Accepted by ACL 2022 main conference

  41. arXiv:2203.02445  [pdf, other]

    cs.CV

    SFPN: Synthetic FPN for Object Detection

    Authors: Yu-Ming Zhang, Jun-Wei Hsieh, Chun-Chieh Lee, Kuo-Chin Fan

    Abstract: FPN (Feature Pyramid Network) has become a basic component of most SoTA one stage object detectors. Many previous studies have repeatedly proved that FPN can capture better multi-scale feature maps to more precisely describe objects if they are with different sizes. However, for most backbones such as VGG, ResNet, or DenseNet, the feature maps at each layer are downsized to their quarters due to the…

    Submitted 4 March, 2022; originally announced March 2022.

  42. arXiv:2202.09784  [pdf, other]

    cs.LG cs.AI cs.CV stat.ME

    Clustering by the Probability Distributions from Extreme Value Theory

    Authors: Sixiao Zheng, Ke Fan, Yanxi Hou, Jianfeng Feng, Yanwei Fu

    Abstract: Clustering is an essential task to unsupervised learning. It tries to automatically separate instances into coherent subsets. As one of the most well-known clustering algorithms, k-means assigns sample points at the boundary to a unique cluster, while it does not utilize the information of sample distribution or density. Comparably, it would potentially be more beneficial to consider the probabili…

    Submitted 20 February, 2022; originally announced February 2022.

    Comments: IEEE Transactions on Artificial Intelligence

  43. arXiv:2112.03912  [pdf, other]

    cs.LG cs.AI stat.ML

    RID-Noise: Towards Robust Inverse Design under Noisy Environments

    Authors: Jia-Qi Yang, Ke-Bin Fan, Hao Ma, De-Chuan Zhan

    Abstract: From an engineering perspective, a design should not only perform well in an ideal condition, but should also resist noises. Such a design methodology, namely robust design, has been widely implemented in the industry for product quality control. However, classic robust design requires a lot of evaluations for a single design target, while the results of these evaluations could not be reused for a…

    Submitted 7 December, 2021; originally announced December 2021.

    Comments: AAAI'22

  44. arXiv:2111.11718  [pdf, other]

    cs.CV

    StrokeNet: Stroke Assisted and Hierarchical Graph Reasoning Networks

    Authors: Lei Li, Kai Fan, Chun Yuan

    Abstract: Scene text detection is still a challenging task, as there may be extremely small or low-resolution strokes, and close or arbitrary-shaped texts. In this paper, StrokeNet is proposed to effectively detect the texts by capturing the fine-grained strokes, and infer structural relations between the hierarchical representation in the graph. Different from existing approaches that represent the text ar…

    Submitted 23 November, 2021; originally announced November 2021.

  45. arXiv:2110.07936  [pdf, other]

    cs.CL

    Unifying Cross-lingual Summarization and Machine Translation with Compression Rate

    Authors: Yu Bai, Heyan Huang, Kai Fan, Yang Gao, Yiming Zhu, Jiaao Zhan, Zewen Chi, Boxing Chen

    Abstract: Cross-Lingual Summarization (CLS) is a task that extracts important information from a source document and summarizes it into a summary in another language. It is a challenging task that requires a system to understand, summarize, and translate at the same time, making it highly related to Monolingual Summarization (MS) and Machine Translation (MT). In practice, the training resources for Machine…

    Submitted 24 April, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: Accepted by SIGIR 2022

  46. arXiv:2110.04925  [pdf, ps, other]

    stat.ML cs.LG

    Quadratic Multiform Separation: A New Classification Model in Machine Learning

    Authors: Ko-Hui Michael Fan, Chih-Chung Chang, Kuang-Hsiao-Yin Kongguoluo

    Abstract: In this paper we present a new classification model in machine learning. Our result is threefold: 1) The model produces comparable predictive accuracy to that of most common classification models. 2) It runs significantly faster than most common classification models. 3) It has the ability to identify a portion of unseen samples for which class labels can be found with much higher predictive accur…

    Submitted 17 August, 2022; v1 submitted 10 October, 2021; originally announced October 2021.

  47. arXiv:2107.04829  [pdf, other]

    cs.CV

    CSL-YOLO: A New Lightweight Object Detection System for Edge Computing

    Authors: Yu-Ming Zhang, Chun-Chieh Lee, Jun-Wei Hsieh, Kuo-Chin Fan

    Abstract: The development of lightweight object detectors is essential due to the limited computation resources. To reduce the computation cost, how to generate redundant features plays a significant role. This paper proposes a new lightweight convolution method, the Cross-Stage Lightweight (CSL) Module, to generate redundant features from cheap operations. In the intermediate expansion stage, we replaced Pointw…

    Submitted 10 July, 2021; originally announced July 2021.

  48. How the Design of YouTube Influences User Sense of Agency

    Authors: Kai Lukoff, Ulrik Lyngs, Himanshu Zade, J. Vera Liao, James Choi, Kaiyue Fan, Sean A. Munson, Alexis Hiniker

    Abstract: In the attention economy, video apps employ design mechanisms like autoplay that exploit psychological vulnerabilities to maximize watch time. Consequently, many people feel a lack of agency over their app use, which is linked to negative life effects such as loss of sleep. Prior design research has innovated external mechanisms that police multiple apps, such as lockout timers. In this work, we s…

    Submitted 27 January, 2021; originally announced January 2021.

    Comments: 14 pages, 3 figures, Forthcoming at the CHI 2021 Conference

  49. arXiv:2009.09127  [pdf, other]

    cs.CL cs.AI

    Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

    Authors: Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

    Abstract: Many document-level neural machine translation (NMT) systems have explored the utility of context-aware architecture, usually requiring an increasing number of parameters and computational complexity. However, little attention is paid to the baseline model. In this paper, we research extensively the pros and cons of the standard transformer in document-level translation, and find that the auto-regres…

    Submitted 18 September, 2020; originally announced September 2020.

    Comments: accepted to EMNLP 2020

  50. arXiv:2009.09126  [pdf, other]

    cs.CL cs.AI

    Computer Assisted Translation with Neural Quality Estimation and Automatic Post-Editing

    Authors: Jiayi Wang, Ke Wang, Niyu Ge, Yangbing Shi, Yu Zhao, Kai Fan

    Abstract: With the advent of neural machine translation, there has been a marked shift towards leveraging and consuming the machine translation results. However, the gap between machine translation systems and human translators needs to be manually closed by post-editing. In this paper, we propose an end-to-end deep learning framework of the quality estimation and automatic post-editing of the machine trans…

    Submitted 27 September, 2020; v1 submitted 18 September, 2020; originally announced September 2020.