
Showing 1–46 of 46 results for author: Peng, F

Searching in archive cs.
  1. arXiv:2406.13532  [pdf, other]

    cs.CV

    SALI: Short-term Alignment and Long-term Interaction Network for Colonoscopy Video Polyp Segmentation

    Authors: Qiang Hu, Zhenyu Yi, Ying Zhou, Fang Peng, Mei Liu, Qiang Li, Zhiwei Wang

    Abstract: Colonoscopy videos provide richer information in polyp segmentation for rectal cancer diagnosis. However, the endoscope's fast moving and close-up observing make the current methods suffer from large spatial incoherence and continuous low-quality frames, and thus yield limited segmentation accuracy. In this context, we focus on robust video polyp segmentation by enhancing the adjacent feature cons…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted to MICCAI 2024. Code and models: https://github.com/Scatteredrain/SALI

  2. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  3. arXiv:2404.13400  [pdf, other]

    cs.CV

    HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding

    Authors: Linhui Xiao, Xiaoshan Yang, Fang Peng, Yaowei Wang, Changsheng Xu

    Abstract: Visual grounding, which aims to ground a visual region via natural language, is a task that heavily relies on cross-modal alignment. Existing works utilized uni-modal pre-trained models to transfer visual/linguistic knowledge separately while ignoring the multimodal corresponding information. Motivated by recent advancements in contrastive language-image pre-training and low-rank adaptation (LoRA)…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: The project page: https://github.com/linhuixiao/HiVG

  4. arXiv:2403.09611  [pdf, other]

    cs.CV cs.CL cs.LG

    MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

    Authors: Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Floris Weers, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Ankur Jain, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, et al. (7 additional authors not shown)

    Abstract: In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image encoder, the vision language connector, and various pre-training data choices, we identified several crucial design lessons. For example, we demonstrate that for la…

    Submitted 18 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  5. arXiv:2310.14860  [pdf, other]

    cs.RO

    Adaptive Tuning of Robotic Polishing Skills based on Force Feedback Model

    Authors: Yu Wang, Zhouyi Zheng, Chen Chen, Zezheng Wang, Zhitao Gao, Fangyu Peng, Xiaowei Tang, Rong Yan

    Abstract: Acquiring human skills offers an efficient approach to tackle complex task planning challenges. When performing a learned skill model for a continuous contact task, such as robot polishing in an uncertain environment, the robot needs to be able to adaptively modify the skill model to suit the environment and perform the desired task. The environmental perturbation of the polishing task is mainly r…

    Submitted 22 November, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: This paper has been accepted by The 2023 IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2023)

  6. CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual Grounding

    Authors: Linhui Xiao, Xiaoshan Yang, Fang Peng, Ming Yan, Yaowei Wang, Changsheng Xu

    Abstract: Visual Grounding (VG) is a crucial topic in the field of vision and language, which involves locating a specific region described by expressions within an image. To reduce the reliance on manually labeled data, unsupervised visual grounding has been developed to locate regions using pseudo-labels. However, the performance of existing unsupervised methods is highly dependent on the quality of pseu…

    Submitted 24 December, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: Accepted by IEEE Transactions on Multimedia (2023). Paper page: https://ieeexplore.ieee.org/abstract/document/10269126. Code is available at https://github.com/linhuixiao/CLIP-VG

  7. arXiv:2211.16191  [pdf, other]

    cs.CV cs.MM

    SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification

    Authors: Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu

    Abstract: Although significant progress has been made in few-shot learning, most of existing few-shot image classification methods require supervised pre-training on a large amount of samples of base classes, which limits their generalization ability in real world application. Recently, large-scale Vision-Language Pre-trained models (VLPs) have been gaining increasing attention in few-shot learning because…

    Submitted 20 January, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

  8. arXiv:2211.12713  [pdf, other]

    cs.LG cs.CR cs.NE

    Reliable Robustness Evaluation via Automatically Constructed Attack Ensembles

    Authors: Shengcai Liu, Fu Peng, Ke Tang

    Abstract: Attack Ensemble (AE), which combines multiple attacks together, provides a reliable way to evaluate adversarial robustness. In practice, AEs are often constructed and tuned by human experts, which however tends to be sub-optimal and time-consuming. In this work, we present AutoAE, a conceptually simple approach for automatically constructing AEs. In brief, AutoAE repeatedly adds the attack and its…

    Submitted 23 November, 2022; originally announced November 2022.

  9. arXiv:2210.07749   

    eess.AS cs.SD

    LeVoice ASR Systems for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge

    Authors: Yan Jia, Mi Hong, Jingyu Hou, Kailong Ren, Sifan Ma, Jin Wang, Fangzhen Peng, Yinglin Ji, Lin Yang, Junjie Wang

    Abstract: This paper describes LeVoice automatic speech recognition systems to track2 of intelligent cockpit speech recognition challenge 2022. Track2 is a speech recognition task without limits on the scope of model size. Our main points include deep learning based speech enhancement, text-to-speech based speech generation, training data augmentation via various techniques and speech recognition model fusi…

    Submitted 16 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: There are experimental errors

  10. arXiv:2210.06772  [pdf, ps, other]

    cs.CL cs.LG

    Mitigating Unintended Memorization in Language Models via Alternating Teaching

    Authors: Zhe Liu, Xuedong Zhang, Fuchun Peng

    Abstract: Recent research has shown that language models have a tendency to memorize rare or unique sequences in the training corpora which can thus leak sensitive attributes of user data. We employ a teacher-student framework and propose a novel approach called alternating teaching to mitigate unintended memorization in sequential modeling. In our method, multiple teachers are trained on disjoint training…

    Submitted 13 October, 2022; originally announced October 2022.

  11. arXiv:2210.01863  [pdf, other]

    stat.ML cs.LG

    Group Personalized Federated Learning

    Authors: Zhe Liu, Yue Hui, Fuchun Peng

    Abstract: Federated learning (FL) can help promote data privacy by training a shared model in a de-centralized manner on the physical devices of clients. In the presence of highly heterogeneous distributions of local data, personalized FL strategy seeks to mitigate the potential client drift. In this paper, we present the group personalization approach for applications of FL in which there exist inherent pa…

    Submitted 11 October, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

  12. arXiv:2209.05281  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD stat.ML

    Modeling Dependent Structure for Utterances in ASR Evaluation

    Authors: Zhe Liu, Fuchun Peng

    Abstract: The bootstrap resampling method has been popular for performing significance analysis on word error rate (WER) in automatic speech recognition (ASR) evaluation. To deal with dependent speech data, the blockwise bootstrap approach is also introduced. By dividing utterances into uncorrelated blocks, this approach resamples these blocks instead of original data. However, it is typically nontrivial to…

    Submitted 8 October, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

  13. arXiv:2201.11867  [pdf, other]

    cs.CL cs.SD eess.AS

    Neural-FST Class Language Model for End-to-End Speech Recognition

    Authors: Antoine Bruguier, Duc Le, Rohit Prabhavalkar, Dangna Li, Zhe Liu, Bo Wang, Eun Chang, Fuchun Peng, Ozlem Kalinli, Michael L. Seltzer

    Abstract: We propose Neural-FST Class Language Model (NFCLM) for end-to-end speech recognition, a novel method that combines neural network language models (NNLMs) and finite state transducers (FSTs) in a mathematically consistent framework. Our method utilizes a background NNLM which models generic background text together with a collection of domain-specific entities modeled as individual FSTs. Each outpu…

    Submitted 31 January, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

    Comments: Accepted for publication at ICASSP 2022

  14. Optimal Update for Energy Harvesting Sensor with Reliable Backup Energy

    Authors: Lixin Wang, Fuzhou Peng, Xiang Chen, Shidong Zhou

    Abstract: In this paper, we consider an information update system where a wireless sensor sends timely updates to the destination over an erasure channel with the supply of harvested energy and reliable backup energy. The metric Age of Information (AoI) is adopted to measure the timeliness of the received updates at the destination. We aim to find the optimal information updating policy that minimizes the ti…

    Submitted 5 January, 2022; originally announced January 2022.

    Comments: 9 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2110.07233

  15. arXiv:2112.14834  [pdf, other]

    cs.NE cs.AI cs.LG

    Training Quantized Deep Neural Networks via Cooperative Coevolution

    Authors: Fu Peng, Shengcai Liu, Ning Lu, Ke Tang

    Abstract: This work considers a challenging Deep Neural Network (DNN) quantization task that seeks to train quantized DNNs without involving any full-precision operations. Most previous quantization approaches are not applicable to this task since they rely on full-precision gradients to update network weights. To fill this gap, in this work we advocate using Evolutionary Algorithms (EAs) to search for the o…

    Submitted 23 May, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

    Comments: 13 pages, 4 figures, accepted for publication of ICSI

  16. arXiv:2110.10026  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Private Language Model Adaptation for Speech Recognition

    Authors: Zhe Liu, Ke Li, Shreyan Bakshi, Fuchun Peng

    Abstract: Speech model adaptation is crucial to handle the discrepancy between server-side proxy training data and actual data received on local devices of users. With the use of federated learning (FL), we introduce an efficient approach on continuously adapting neural network language models (NNLMs) on private devices with applications on automatic speech recognition (ASR). To address the potential speech…

    Submitted 15 June, 2022; v1 submitted 27 September, 2021; originally announced October 2021.

  17. arXiv:2110.07233  [pdf, ps, other]

    cs.IT

    Optimal Update in Energy Harvesting Aided Terahertz Communications with Random Blocking

    Authors: Lixin Wang, Fuzhou Peng, Xiang Chen, Shidong Zhou

    Abstract: In this paper, we consider an information update system where a wireless sensor sends timely updates to the destination over a random blocking terahertz channel with the supply of harvested energy and reliable energy backup. The paper aims to find the optimal information updating policy that minimizes the time-average weighted sum of the Age of Information (AoI) and the reliable energy costs by formul…

    Submitted 15 October, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 9 pages, 4 Postscript figures

  18. arXiv:2109.09061  [pdf, ps, other]

    stat.ML cs.CY cs.LG

    Model-Based Approach for Measuring the Fairness in ASR

    Authors: Zhe Liu, Irina-Elena Veliche, Fuchun Peng

    Abstract: The issue of fairness arises when the automatic speech recognition (ASR) systems do not perform equally well for all subgroups of the population. In any fairness measurement studies for ASR, the open questions of how to control the nuisance factors, how to handle unobserved heterogeneity across speakers, and how to trace the source of any word error rate (WER) gap among different subgroups are esp…

    Submitted 19 September, 2021; originally announced September 2021.

  19. arXiv:2107.04154  [pdf, other]

    eess.AS cs.LG

    On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models

    Authors: Xiaohui Zhang, Vimal Manohar, David Zhang, Frank Zhang, Yangyang Shi, Nayan Singhal, Julian Chan, Fuchun Peng, Yatharth Saraf, Mike Seltzer

    Abstract: Hybrid automatic speech recognition (ASR) models are typically sequentially trained with CTC or LF-MMI criteria. However, they have vastly different legacies and are usually implemented in different frameworks. In this paper, by decoupling the concepts of modeling units and label topologies and building proper numerator/denominator graphs accordingly, we establish a generalized framework for hybri…

    Submitted 26 September, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

    Comments: accepted by ASRU 2021

  20. arXiv:2105.12849  [pdf, ps, other]

    cs.LG

    CARLS: Cross-platform Asynchronous Representation Learning System

    Authors: Chun-Ta Lu, Yun Zeng, Da-Cheng Juan, Yicheng Fan, Zhe Li, Jan Dlabal, Yi-Ting Chen, Arjun Gopalan, Allan Heydon, Chun-Sung Ferng, Reah Miyara, Ariel Fuxman, Futang Peng, Zhen Li, Tom Duerig, Andrew Tomkins

    Abstract: In this work, we propose CARLS, a novel framework for augmenting the capacity of existing deep learning frameworks by enabling multiple components -- model trainers, knowledge makers and knowledge banks -- to concertedly work together in an asynchronous fashion across hardware platforms. The proposed CARLS is particularly suitable for learning paradigms where model training benefits from additiona…

    Submitted 26 May, 2021; originally announced May 2021.

  21. arXiv:2104.12369  [pdf, other]

    cs.CL

    PanGu-$α$: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation

    Authors: Wei Zeng, Xiaozhe Ren, Teng Su, Hui Wang, Yi Liao, Zhiwei Wang, Xin Jiang, ZhenZhang Yang, Kaisheng Wang, Xiaoda Zhang, Chen Li, Ziyan Gong, Yifan Yao, Xinjing Huang, Jun Wang, Jianfeng Yu, Qi Guo, Yue Yu, Yan Zhang, Jin Wang, Hengtao Tao, Dasen Yan, Zexuan Yi, Fang Peng, Fangqing Jiang, et al. (13 additional authors not shown)

    Abstract: Large-scale Pretrained Language Models (PLMs) have become the new paradigm for Natural Language Processing (NLP). PLMs with hundreds of billions parameters such as GPT-3 have demonstrated strong performances on natural language understanding and generation with few-shot in-context learning. In this work, we present our practice on training large-scale autoregressive language models named…

    Submitted 26 April, 2021; originally announced April 2021.

    Comments: The technique report for PanGu-$α$

  22. arXiv:2101.01304  [pdf, ps, other]

    cs.IT

    Algebraic Geometric Secret Sharing Schemes over Large Fields Are Asymptotically Threshold

    Authors: Fan Peng, Hao Chen, Chang-An Zhao

    Abstract: In Chen-Cramer Crypto 2006 paper \cite{cc} algebraic geometric secret sharing schemes were proposed such that the "Fundamental Theorem in Information-Theoretically Secure Multiparty Computation" by Ben-Or, Goldwasser and Wigderson \cite{BGW88} and Chaum, Crépeau and Damgård \cite{CCD88} can be established over constant-size base finite fields. These algebraic geometric secret sharing schemes defin…

    Submitted 4 January, 2021; originally announced January 2021.

  23. arXiv:2012.00898  [pdf, ps, other]

    cs.CL stat.ML

    Federated Marginal Personalization for ASR Rescoring

    Authors: Zhe Liu, Fuchun Peng

    Abstract: We introduce federated marginal personalization (FMP), a novel method for continuously updating personalized neural network language models (NNLMs) on private devices using federated learning (FL). Instead of fine-tuning the parameters of NNLMs on personal data, FMP regularly estimates global and personalized marginal distributions of words, and adjusts the probabilities from NNLMs by an adaptatio…

    Submitted 1 December, 2020; originally announced December 2020.

  24. arXiv:2011.04785  [pdf, ps, other]

    eess.AS cs.SD

    Benchmarking LF-MMI, CTC and RNN-T Criteria for Streaming ASR

    Authors: Xiaohui Zhang, Frank Zhang, Chunxi Liu, Kjell Schubert, Julian Chan, Pradyot Prakash, Jun Liu, Ching-Feng Yeh, Fuchun Peng, Yatharth Saraf, Geoffrey Zweig

    Abstract: In this work, to measure the accuracy and efficiency for a latency-controlled streaming automatic speech recognition (ASR) application, we perform comprehensive evaluations on three popular training criteria: LF-MMI, CTC and RNN-T. In transcribing social media videos of 7 languages with training data 3K-14K hours, we conduct large-scale controlled experimentation across each criterion using identi…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: Accepted for publication at IEEE Spoken Language Technology Workshop (SLT), 2021

  25. arXiv:2003.03701  [pdf, other]

    cs.CV

    Unifying Specialist Image Embedding into Universal Image Embedding

    Authors: Yang Feng, Futang Peng, Xu Zhang, Wei Zhu, Shanfeng Zhang, Howard Zhou, Zhen Li, Tom Duerig, Shih-Fu Chang, Jiebo Luo

    Abstract: Deep image embedding provides a way to measure the semantic similarity of two images. It plays a central role in many applications such as image search, face verification, and zero-shot learning. It is desirable to have a universal deep embedding model applicable to various domains of images. However, existing methods mainly rely on training specialist embedding models each of which is applicable…

    Submitted 7 March, 2020; originally announced March 2020.

  26. arXiv:2002.10242  [pdf, other]

    cs.IT cs.NI

    Age of Information Optimized MAC in V2X Sidelink via Piggyback-Based Collaboration

    Authors: Fei Peng, Zhiyuan Jiang, Shunqing Zhang, Shugong Xu

    Abstract: Real-time status update in future vehicular networks is vital to enable control-level cooperative autonomous driving. Cellular Vehicle-to-Everything (C-V2X), as one of the most promising vehicular wireless technologies, adopts a Semi-Persistent Scheduling (SPS) based Medium-Access-Control (MAC) layer protocol for its sidelink communications. Despite the recent and ongoing efforts to optimize SPS,…

    Submitted 13 April, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: Submitted to IEEE TWC for possible publication

  27. arXiv:2002.01255  [pdf, other]

    cs.IT cs.NI eess.SP

    Revealing Much While Saying Less: Predictive Wireless for Status Update

    Authors: Zhiyuan Jiang, Zixu Cao, Siyu Fu, Fei Peng, Shan Cao, Shunqing Zhang, Shugong Xu

    Abstract: Wireless communications for status update are becoming increasingly important, especially for machine-type control applications. Existing work has been mainly focused on Age of Information (AoI) optimizations. In this paper, a status-aware predictive wireless interface design, networking and implementation are presented which aim to minimize the status recovery error of a wireless networked system…

    Submitted 4 February, 2020; originally announced February 2020.

    Comments: To appear in IEEE INFOCOM 2020

  28. arXiv:1912.09508  [pdf, other]

    stat.ML cs.LG eess.AS

    Statistical Testing on ASR Performance via Blockwise Bootstrap

    Authors: Zhe Liu, Fuchun Peng

    Abstract: A common question being raised in automatic speech recognition (ASR) evaluations is how reliable is an observed word error rate (WER) improvement comparing two ASR systems, where statistical hypothesis testing and confidence interval (CI) can be utilized to tell whether this improvement is real or only due to random chance. The bootstrap resampling method has been popular for such significance ana…

    Submitted 20 May, 2020; v1 submitted 19 December, 2019; originally announced December 2019.

    Comments: 6 pages, 2 figures

  29. arXiv:1911.10235  [pdf, other]

    cs.CL cs.LG

    Improving N-gram Language Models with Pre-trained Deep Transformer

    Authors: Yiren Wang, Hongzhao Huang, Zhe Liu, Yutong Pang, Yongqiang Wang, ChengXiang Zhai, Fuchun Peng

    Abstract: Although n-gram language models (LMs) have been outperformed by the state-of-the-art neural LMs, they are still widely used in speech recognition due to its high efficiency in inference. In this paper, we demonstrate that n-gram LM can be improved by neural LMs through a text generation based data augmentation method. In contrast to previous approaches, we employ a large-scale general domain pre-t…

    Submitted 22 November, 2019; originally announced November 2019.

  30. arXiv:1911.07874  [pdf, other]

    cs.LG stat.ML

    RWNE: A Scalable Random-Walk-Based Network Embedding Framework with Personalized Higher-Order Proximity Preserved

    Authors: Jianxin Li, Cheng Ji, Hao Peng, Yu He, Yangqiu Song, Xinmiao Zhang, Fanzhang Peng

    Abstract: Higher-order proximity preserved network embedding has attracted increasing attention. In particular, due to the superior scalability, random-walk-based network embedding has also been well developed, which could efficiently explore higher-order neighborhoods via multi-hop random walks. However, despite the success of current random-walk-based methods, most of them are usually not expressive enoug…

    Submitted 7 April, 2021; v1 submitted 18 November, 2019; originally announced November 2019.

  31. arXiv:1910.12367  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Training ASR models by Generation of Contextual Information

    Authors: Kritika Singh, Dmytro Okhonko, Jun Liu, Yongqiang Wang, Frank Zhang, Ross Girshick, Sergey Edunov, Fuchun Peng, Yatharth Saraf, Geoffrey Zweig, Abdelrahman Mohamed

    Abstract: Supervised ASR models have reached unprecedented levels of accuracy, thanks in part to ever-increasing amounts of labelled training data. However, in many applications and locales, only moderate amounts of data are available, which has led to a surge in semi- and weakly-supervised learning research. In this paper, we conduct a large-scale study evaluating the effectiveness of weakly-supervised lea…

    Submitted 14 February, 2020; v1 submitted 27 October, 2019; originally announced October 2019.

  32. arXiv:1910.11450  [pdf, ps, other]

    cs.CL eess.AS

    An Empirical Study of Efficient ASR Rescoring with Transformers

    Authors: Hongzhao Huang, Fuchun Peng

    Abstract: Neural language models (LMs) have been proved to significantly outperform classical n-gram LMs for language modeling due to their superior abilities to model long-range dependencies in text and handle data sparsity problems. And recently, well configured deep Transformers have exhibited superior performance over shallow stack of recurrent neural network layers for language modeling. However, these…

    Submitted 24 October, 2019; originally announced October 2019.

    Comments: 5 pages, 5 tables

  33. arXiv:1910.10670  [pdf, other]

    cs.CL cs.LG

    Efficient Dynamic WFST Decoding for Personalized Language Models

    Authors: Jun Liu, Jiedan Zhu, Vishal Kathuria, Fuchun Peng

    Abstract: We propose a two-layer cache mechanism to speed up dynamic WFST decoding with personalized language models. The first layer is a public cache that stores most of the static part of the graph. This is shared globally among all users. A second layer is a private cache that caches the graph that represents the personalized language model, which is only shared by the utterances from a particular user.…

    Submitted 23 October, 2019; originally announced October 2019.

    Comments: 5 pages, 4 figures

  34. arXiv:1910.07117  [pdf, other]

    cs.CL cs.AI cs.LG

    Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models

    Authors: Tianxing He, Jun Liu, Kyunghyun Cho, Myle Ott, Bing Liu, James Glass, Fuchun Peng

    Abstract: In this work, we study how the finetuning stage in the pretrain-finetune framework changes the behavior of a pretrained neural language generator. We focus on the transformer encoder-decoder model for the open-domain dialogue response generation task. Our major finding is that after standard finetuning, the model forgets some of the important language generation skills acquired during large-scale…

    Submitted 16 January, 2021; v1 submitted 15 October, 2019; originally announced October 2019.

    Journal ref: EACL 2021

  35. arXiv:1907.08489  [pdf, other]

    cs.AI cs.CY cs.LG

    Empowering A* Search Algorithms with Neural Networks for Personalized Route Recommendation

    Authors: Jingyuan Wang, Ning Wu, Wayne Xin Zhao, Fanzhang Peng, Xin Lin

    Abstract: Personalized Route Recommendation (PRR) aims to generate user-specific route suggestions in response to users' route queries. Early studies cast the PRR task as a pathfinding problem on graphs, and adopt adapted search algorithms by integrating heuristic strategies. Although these methods are effective to some extent, they require setting the cost functions with heuristics. In addition, it is diff…

    Submitted 19 July, 2019; originally announced July 2019.

    Comments: 9 pages, 25TH ACM SIGKDD Conference On Knowledge Discovery And Data Mining

  36. arXiv:1902.10814  [pdf, other]

    cs.CV cs.LG stat.ML

    Graph-RISE: Graph-Regularized Image Semantic Embedding

    Authors: Da-Cheng Juan, Chun-Ta Lu, Zhen Li, Futang Peng, Aleksei Timofeev, Yi-Ting Chen, Yaxi Gao, Tom Duerig, Andrew Tomkins, Sujith Ravi

    Abstract: Learning image representations to capture fine-grained semantics has been a challenging and important task enabling many applications such as image search and clustering. In this paper, we present Graph-Regularized Image Semantic Embedding (Graph-RISE), a large-scale neural graph learning framework that allows us to train embeddings to discriminate an unprecedented O(40M) ultra-fine-grained semant…

    Submitted 13 February, 2019; originally announced February 2019.

    Comments: 9 pages, 7 figures

  37. arXiv:1811.07665  [pdf]

    cs.CV

    FD-GAN: Face-demorphing generative adversarial network for restoring accomplice's facial image

    Authors: Fei Peng, Le-bing Zhang, Min Long

    Abstract: Face morphing attack is proved to be a serious threat to the existing face recognition systems. Although a few face morphing detection methods have been put forward, the face morphing accomplice's facial restoration remains a challenging problem. In this paper, a face de-morphing generative adversarial network (FD-GAN) is proposed to restore the accomplice's facial image. It utilizes a symmetric d…

    Submitted 21 March, 2019; v1 submitted 19 November, 2018; originally announced November 2018.

    Comments: 9 pages, 7 figures

  38. arXiv:1811.00253  [pdf, other]

    cs.CL

    Hybrid Self-Attention Network for Machine Translation

    Authors: Kaitao Song, Xu Tan, Furong Peng, Jianfeng Lu

    Abstract: The encoder-decoder is the typical framework for Neural Machine Translation (NMT), and different structures have been developed for improving the translation performance. Transformer is one of the most promising structures, which can leverage the self-attention mechanism to capture the semantic dependency from global view. However, it cannot distinguish the relative position of different tokens ve…

    Submitted 10 December, 2018; v1 submitted 1 November, 2018; originally announced November 2018.

  39. A Trip to the Moon: Personalized Animated Movies for Self-reflection

    Authors: Fengjiao Peng, Veronica LaBelle, Emily Yue, Rosalind Picard

    Abstract: Self-tracking physiological and psychological data poses the challenge of presentation and interpretation. Insightful narratives for self-tracking data can motivate the user towards constructive self-reflection. One powerful form of narrative that engages audience across various culture and age groups is animated movies. We collected a week of self-reported mood and behavior data from each user an…

    Submitted 8 January, 2018; originally announced January 2018.

    ACM Class: H.5.1

    Journal ref: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 2018

  40. arXiv:1708.04670  [pdf, other]

    cs.CV cs.AI cs.LG

    DeepFaceLIFT: Interpretable Personalized Models for Automatic Estimation of Self-Reported Pain

    Authors: Dianbo Liu, Fengjiao Peng, Andrew Shea, Ognjen Rudovic, Rosalind Picard

    Abstract: Previous research on automatic pain estimation from facial expressions has focused primarily on "one-size-fits-all" metrics (such as PSPI). In this work, we focus on directly estimating each individual's self-reported visual-analog scale (VAS) pain metric, as this is considered the gold standard for pain measurement. The VAS pain score is highly subjective and context-dependent, and its range can…

    Submitted 9 August, 2017; originally announced August 2017.

  41. arXiv:1612.05793  [pdf, ps, other]

    cs.RO

    Autonomous Localization and Mapping Using a Single Mobile Device

    Authors: Tiexing Wang, Fangrong Peng, Biao Chen

    Abstract: This paper considers the problem of simultaneous 2-D room shape reconstruction and self-localization without the requirement of any pre-established infrastructure. A mobile device equipped with co-located microphone and loudspeaker as well as internal motion sensors is used to emit acoustic pulses and collect echoes reflected by the walls. Using only first order echoes, room shape recovery and sel…

    Submitted 17 December, 2016; originally announced December 2016.

    Comments: Submitted to the IEEE Transactions on Audio, Speech and Language Processing

  42. arXiv:1611.06026  [pdf, other]

    cs.CV

    Cross Domain Knowledge Transfer for Person Re-identification

    Authors: Qiqi Xiao, Kelei Cao, Haonan Chen, Fangyue Peng, Chi Zhang

    Abstract: Person Re-Identification (re-id) is a challenging task in computer vision, especially when there are limited training data from multiple camera views. In this paper, we propose a deep learning based person re-identification method by transferring knowledge of mid-level attribute features and high-level classification features. Building on the idea that identity classification, attribute recognit…

    Submitted 18 November, 2016; originally announced November 2016.

    Comments: 8 pages

  43. arXiv:1611.03264  [pdf, other]

    cs.ET

    A Memristor Crossbar-Based Computation Scheme with High Precision

    Authors: Junyi Li, Fulin Peng, Fan Yang, Xuan Zeng

    Abstract: The memristor is promising to be the basic cell of next-generation computation systems. Compared to the traditional MOSFET device, the memristor is efficient over energy and area. But one of the biggest challenges faced with researchers is how to program a memristor's resistance precisely. Recently, an algorithm designed to save 8 valid bits in each memristor is proposed, but this is still not suf…

    Submitted 19 November, 2016; v1 submitted 10 November, 2016; originally announced November 2016.

    Comments: 6 pages, 5 figures, conference

  44. arXiv:1504.00150  [pdf, other]

    cs.DB

    Discovering Restricted Regular Expressions with Interleaving

    Authors: Feifei Peng, Haiming Chen

    Abstract: Discovering a concise schema from given XML documents is an important problem in XML applications. In this paper, we focus on the problem of learning an unordered schema from a given set of XML examples, which is actually a problem of learning a restricted regular expression with interleaving using positive example strings. Schemas with interleaving could present meaningful knowledge that cannot b…

    Submitted 1 April, 2015; originally announced April 2015.

    Comments: 12 pages

  45. arXiv:1212.2514  [pdf]

    cs.LG stat.ML

    Boltzmann Machine Learning with the Latent Maximum Entropy Principle

    Authors: Shaojun Wang, Dale Schuurmans, Fuchun Peng, Yunxin Zhao

    Abstract: We present a new statistical learning paradigm for Boltzmann machines based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes maximum entropy principle and from standard maximum likelihood estimation. We demonstrate the LME principle by deriving new algorithms for Boltzmann machine parameter estimation, and show how robust a…

    Submitted 19 October, 2012; originally announced December 2012.

    Comments: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

    Report number: UAI-P-2003-PG-567-574

  46. arXiv:1207.4157  [pdf]

    cs.LG cs.DL cs.IR stat.ML

    An Integrated, Conditional Model of Information Extraction and Coreference with Applications to Citation Matching

    Authors: Ben Wellner, Andrew McCallum, Fuchun Peng, Michael Hay

    Abstract: Although information extraction and coreference resolution appear together in many applications, most current systems perform them as independent steps. This paper describes an approach to integrated inference for extraction and coreference based on conditionally-trained undirected graphical models. We discuss the advantages of conditional probability training, and of a coreference model structure…

    Submitted 11 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004)

    Report number: UAI-P-2004-PG-593-601