Showing 1–50 of 63 results for author: Pang, C

Searching in archive cs.
  1. arXiv:2406.15774  [pdf, other]

    cs.RO

    Observation Time Difference: an Online Dynamic Objects Removal Method for Ground Vehicles

    Authors: Rongguang Wu, Chenglin Pang, Xuankang Wu, Zheng Fang

    Abstract: In the process of urban environment mapping, the sequential accumulations of dynamic objects will leave a large number of traces in the map. These traces will usually have bad influences on the localization accuracy and navigation performance of the robot. Therefore, dynamic objects removal plays an important role for creating clean map. However, conventional dynamic objects removal methods usuall…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2406.13243  [pdf, ps, other]

    cs.IT

    Abelian Group Codes for Classical and Classical-Quantum Channels: One-shot and Asymptotic Rate Bounds

    Authors: James Chin-Jen Pang, Sandeep Pradhan, Hessam Mahdavifar

    Abstract: We study the problem of transmission of information over classical and classical-quantum channels in the one-shot regime where the underlying codes are constrained to be group codes. In the achievability part, we introduce a new input probability distribution that incorporates the encoding homomorphism and the underlying channel law. Using a random coding argument, we characterize the performance… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 41 pages

  3. arXiv:2406.09317  [pdf, other]

    eess.IV cs.CV

    Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases

    Authors: Meng Wang, Tian Lin, Aidi Lin, Kai Yu, Yuanyuan Peng, Lianyu Wang, Cheng Chen, Ke Zou, Huiyu Liang, Man Chen, Xue Yao, Meiqin Zhang, Binwei Huang, Chaoxin Zheng, Peixin Zhang, Wei Chen, Yilong Luo, Yifan Chen, Honghe Xia, Tingkun Shi, Qi Zhang, **ming Guo, Xiaolin Chen, **gcheng Wang, Yih Chung Tham, et al. (24 additional authors not shown)

    Abstract: Previous foundation models for retinal images were pre-trained with limited disease categories and knowledge base. Here we introduce RetiZero, a vision-language foundation model that leverages knowledge from over 400 fundus diseases. To RetiZero's pre-training, we compiled 341,896 fundus images paired with text descriptions, sourced from public datasets, ophthalmic literature, and online resources… ▽ More

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2406.04113  [pdf, other]

    cs.CL

    Uncovering Limitations of Large Language Models in Information Seeking from Tables

    Authors: Chaoxu Pang, Yixuan Cao, Chunhao Yang, ** Luo

    Abstract: Tables are recognized for their high information density and widespread usage, serving as essential sources of information. Seeking information from tables (TIS) is a crucial capability for Large Language Models (LLMs), serving as the foundation of knowledge-based Q&A systems. However, this field presently suffers from an absence of thorough and reliable evaluation. This paper introduces a more re… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Findings of ACL 2024

  5. arXiv:2405.07765  [pdf, other]

    cs.CL

    TANQ: An open domain dataset of table answered questions

    Authors: Mubashara Akhtar, Chenxi Pang, Andreea Marzoca, Yasemin Altun, Julian Martin Eisenschlos

    Abstract: Language models, potentially augmented with tool usage such as retrieval are becoming the go-to means of answering questions. Understanding and answering questions in real-world settings often requires retrieving information from different sources, processing and aggregating data to extract insights, and presenting complex findings in form of structured artifacts such as novel tables, charts, or i… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 10 pages

  6. arXiv:2403.20213  [pdf, other]

    cs.CV

    H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model

    Authors: Chao Pang, Jiang Wu, Jiayu Li, Yi Liu, Jiaxing Sun, Weijia Li, Xingxing Weng, Shuai Wang, Litong Feng, Gui-Song Xia, Conghui He

    Abstract: The generic large Vision-Language Models (VLMs) is rapidly developing, but still perform poorly in Remote Sensing (RS) domain, which is due to the unique and specialized nature of RS imagery and the comparatively limited spatial perception of current VLMs. Existing Remote Sensing specific Vision Language Models (RSVLMs) still have considerable potential for improvement, primarily owing to the lack…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Equal contribution: Chao Pang, Jiang Wu; Corresponding author: Gui-Song Xia, Conghui He

  7. arXiv:2402.18132  [pdf, other]

    cs.CV cs.NE

    Understanding the Role of Pathways in a Deep Neural Network

    Authors: Lei Lyu, Chen Pang, Jihua Wang

    Abstract: Deep neural networks have demonstrated superior performance in artificial intelligence applications, but the opaqueness of their inner working mechanism is one major drawback in their application. The prevailing unit-based interpretation is a statistical observation of stimulus-response data, which fails to show a detailed internal process of inherent mechanisms of neural networks. In this work, w… ▽ More

    Submitted 28 February, 2024; originally announced February 2024.

  8. RadarMOSEVE: A Spatial-Temporal Transformer Network for Radar-Only Moving Object Segmentation and Ego-Velocity Estimation

    Authors: Changsong Pang, Xieyuanli Chen, Yimin Liu, Huimin Lu, Yuwei Cheng

    Abstract: Moving object segmentation (MOS) and Ego velocity estimation (EVE) are vital capabilities for mobile systems to achieve full autonomy. Several approaches have attempted to achieve MOSEVE using a LiDAR sensor. However, LiDAR sensors are typically expensive and susceptible to adverse weather conditions. Instead, millimeter-wave radar (MWR) has gained popularity in robotics and autonomous driving for… ▽ More

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: Accepted at AAAI-24

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence.38(2024)4424-4432

  9. arXiv:2402.04400  [pdf, other]

    cs.LG cs.AI cs.CY

    CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines

    Authors: Chao Pang, Xinzhuo Jiang, Nishanth Parameshwar Pavinkurve, Krishna S. Kalluri, Elise L. Minto, Jason Patterson, Linying Zhang, George Hripcsak, Gamze Gürsoy, Noémie Elhadad, Karthik Natarajan

    Abstract: Synthetic Electronic Health Records (EHR) have emerged as a pivotal tool in advancing healthcare applications and machine learning models, particularly for researchers without direct access to healthcare data. Although existing methods, like rule-based approaches and generative adversarial networks (GANs), generate synthetic data that resembles real-world EHR data, these methods often use a tabula… ▽ More

    Submitted 5 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  10. arXiv:2401.10752  [pdf, other]

    cs.CV

    HiCD: Change Detection in Quality-Varied Images via Hierarchical Correlation Distillation

    Authors: Chao Pang, Xingxing Weng, Jiang Wu, Qiang Wang, Gui-Song Xia

    Abstract: Advanced change detection techniques primarily target image pairs of equal and high quality. However, variations in imaging conditions and platforms frequently lead to image pairs with distinct qualities: one image being high-quality, while the other being low-quality. These disparities in image quality present significant challenges for understanding image pairs semantically and extracting change… ▽ More

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: accepted by TGRS

  11. arXiv:2312.14557  [pdf, other]

    cs.CL

    Aurora:Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning

    Authors: Rongsheng Wang, Haoming Chen, Ruizhe Zhou, Yaofei Duan, Kunyan Cai, Han Ma, Jiaxi Cui, Jian Li, Patrick Cheong-Iao Pang, Yapeng Wang, Tao Tan

    Abstract: Existing research has demonstrated that refining large language models (LLMs) through the utilization of machine-generated instruction-following data empowers these models to exhibit impressive zero-shot capabilities for novel tasks, without requiring human-authored instructions. In this paper, we systematically investigate, preprocess, and integrate three Chinese instruction-following datasets wi… ▽ More

    Submitted 1 January, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: 10 pages, 2 figures

  12. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  13. arXiv:2312.06682  [pdf, other]

    cs.AI cs.LG

    Learning to Denoise Unreliable Interactions for Link Prediction on Biomedical Knowledge Graph

    Authors: Tengfei Ma, Yujie Chen, Wen Tao, Dashun Zheng, Xuan Lin, Patrick Cheong-lao Pang, Yi** Liu, Yijun Wang, Bosheng Song, Xiangxiang Zeng

    Abstract: Link prediction in biomedical knowledge graphs (KGs) aims at predicting unknown interactions between entities, including drug-target interaction (DTI) and drug-drug interaction (DDI), which is critical for drug discovery and therapeutics. Previous methods prefer to utilize the rich semantic relations and topological structure of the KG to predict missing links, yielding promising outcomes. However… ▽ More

    Submitted 9 December, 2023; originally announced December 2023.

  14. arXiv:2310.17901  [pdf, other]

    cs.LG stat.ML

    Improving the Knowledge Gradient Algorithm

    Authors: Yang Le, Gao Siyang, Ho Chin Pang

    Abstract: The knowledge gradient (KG) algorithm is a popular policy for the best arm identification (BAI) problem. It is built on the simple idea of always choosing the measurement that yields the greatest expected one-step improvement in the estimate of the best mean of the arms. In this research, we show that this policy has limitations, causing the algorithm not asymptotically optimal. We next provide a… ▽ More

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: 32 pages, 42 figures

  15. arXiv:2310.05066  [pdf, other]

    cs.CL cs.LG

    Guideline Learning for In-context Information Extraction

    Authors: Chaoxu Pang, Yixuan Cao, Qiang Ding, ** Luo

    Abstract: Large language models (LLMs) can perform a new task by merely conditioning on task instructions and a few input-output examples, without optimizing any parameters. This is called In-Context Learning (ICL). In-context Information Extraction (IE) has recently garnered attention in the research community. However, the performance of In-context IE generally lags behind the state-of-the-art supervised… ▽ More

    Submitted 21 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 main conference

  16. arXiv:2310.02815  [pdf, other]

    cs.CV cs.RO eess.IV

    CoBEV: Elevating Roadside 3D Object Detection with Depth and Height Complementarity

    Authors: Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang

    Abstract: Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems, which extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety. While previous studies have limitations in using only depth or height information, we find both depth and height matter and they are in fact complementary. The depth feature encompasses pre… ▽ More

    Submitted 17 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: The source code will be made publicly available at https://github.com/MasterHow/CoBEV

  17. arXiv:2307.10512  [pdf, other]

    cs.CL cs.AI

    IvyGPT: InteractiVe Chinese pathwaY language model in medical domain

    Authors: Rongsheng Wang, Yaofei Duan, ChanTong Lam, Jiexi Chen, Jiangsheng Xu, Haoming Chen, Xiaohong Liu, Patrick Cheong-Iao Pang, Tao Tan

    Abstract: General large language models (LLMs) such as ChatGPT have shown remarkable success. However, such LLMs have not been widely adopted for medical purposes, due to poor accuracy and inability to provide medical advice. We propose IvyGPT, an LLM based on LLaMA that is trained and fine-tuned with high-quality medical question-answer (QA) instances and Reinforcement Learning from Human Feedback (RLHF).… ▽ More

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: 5 pages, 3 figures

  18. arXiv:2305.07328  [pdf, other]

    cs.CV

    Configurable Spatial-Temporal Hierarchical Analysis for Flexible Video Anomaly Detection

    Authors: Kai Cheng, Xinhua Zeng, Yang Liu, Tian Wang, Chengxin Pang, **g Teng, Zhaoyang Xia, **g Liu

    Abstract: Video anomaly detection (VAD) is a vital task with great practical applications in industrial surveillance, security system, and traffic control. Unlike previous unsupervised VAD methods that adopt a fixed structure to learn normality without considering different detection demands, we design a spatial-temporal hierarchical architecture (STHA) as a configurable architecture to flexibly detect diff… ▽ More

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: submitted to IEEE TCSVT, under peer review

  19. arXiv:2304.03981  [pdf, other]

    cs.LG cs.CV

    Uncertainty-inspired Open Set Learning for Retinal Anomaly Identification

    Authors: Meng Wang, Tian Lin, Lianyu Wang, Aidi Lin, Ke Zou, Xinxing Xu, Yi Zhou, Yuanyuan Peng, Qingquan Meng, Yiming Qian, Guoyao Deng, Zhiqun Wu, Junhong Chen, Jianhong Lin, Mingzhi Zhang, Weifang Zhu, Changqing Zhang, Daoqiang Zhang, Rick Siow Mong Goh, Yong Liu, Chi Pui Pang, Xinjian Chen, Haoyu Chen, Huazhu Fu

    Abstract: Failure to recognize samples from the classes unseen during training is a major limitation of artificial intelligence in the real-world implementation for recognition and classification of retinal anomalies. We established an uncertainty-inspired open-set (UIOS) model, which was trained with fundus images of 9 retinal conditions. Besides assessing the probability of each category, UIOS also calcul… ▽ More

    Submitted 29 August, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

  20. StoryChat: Designing a Narrative-Based Viewer Participation Tool for Live Streaming Chatrooms

    Authors: Ryan Yen, Li Feng, Brinda Mehra, Ching Christie Pang, Siying Hu, Zhicong Lu

    Abstract: Live streaming platforms and existing viewer participation tools enable users to interact and engage with an online community, but the anonymity and scale of chat usually result in the spread of negative comments. However, only a few existing moderation tools investigate the influence of proactive moderation on viewers' engagement and prosocial behavior. To address this, we developed StoryChat, a… ▽ More

    Submitted 7 April, 2023; originally announced April 2023.

  21. arXiv:2303.09511  [pdf, other]

    cs.IT eess.SP

    Capacity-achieving Polar-based Codes with Sparsity Constraints on the Generator Matrices

    Authors: James Chin-Jen Pang, Hessam Mahdavifar, S. Sandeep Pradhan

    Abstract: In this paper, we leverage polar codes and the well-established channel polarization to design capacity-achieving codes with a certain constraint on the weights of all the columns in the generator matrix (GM) while having a low-complexity decoding algorithm. We first show that given a binary-input memoryless symmetric (BMS) channel $W$ and a constant $s \in (0, 1]$, there exists a polarization ker… ▽ More

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 31 pages, single column. arXiv admin note: substantial text overlap with arXiv:2012.13977

  22. Harms from Increasingly Agentic Algorithmic Systems

    Authors: Alan Chan, Rebecca Salganik, Alva Markelius, Chris Pang, Nitarshan Rajkumar, Dmitrii Krasheninnikov, Lauro Langosco, Zhonghao He, Yawen Duan, Micah Carroll, Michelle Lin, Alex Mayhew, Katherine Collins, Maryam Molamohammadi, John Burden, Wanru Zhao, Shalaleh Rismani, Konstantinos Voudouris, Umang Bhatt, Adrian Weller, David Krueger, Tegan Maharaj

    Abstract: Research in Fairness, Accountability, Transparency, and Ethics (FATE) has established many sources and forms of algorithmic harm, in domains as diverse as health care, finance, policing, and recommendations. Much work remains to be done to mitigate the serious harms of these systems, particularly those disproportionately affecting marginalized communities. Despite these ongoing harms, new systems… ▽ More

    Submitted 11 May, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Accepted at FAccT 2023

  23. arXiv:2302.05582  [pdf, other]

    eess.AS cs.CL cs.SD cs.SE

    ASDF: A Differential Testing Framework for Automatic Speech Recognition Systems

    Authors: Daniel Hao Xian Yuen, Andrew Yong Chen Pang, Zhou Yang, Chun Yong Chong, Mei Kuan Lim, David Lo

    Abstract: Recent years have witnessed wider adoption of Automated Speech Recognition (ASR) techniques in various domains. Consequently, evaluating and enhancing the quality of ASR systems is of great importance. This paper proposes ASDF, an Automated Speech Recognition Differential Testing Framework for testing ASR systems. ASDF extends an existing ASR testing tool, the CrossASR++, which synthesizes test ca… ▽ More

    Submitted 10 February, 2023; originally announced February 2023.

    Comments: Accepted by ICST 2023 Tool Demo Track

  24. arXiv:2302.04456  [pdf, other]

    cs.SD cs.AI cs.CL cs.MM eess.AS

    ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models

    Authors: Pengfei Zhu, Chao Pang, Yekun Chai, Lei Li, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu

    Abstract: In recent years, the burgeoning interest in diffusion models has led to significant advances in image and speech generation. Nevertheless, the direct synthesis of music waveforms from unrestricted textual prompts remains a relatively underexplored domain. In response to this lacuna, this paper introduces a pioneering contribution in the form of a text-to-waveform music generation model, underpinne… ▽ More

    Submitted 21 September, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted by AACL demo 2023

  25. arXiv:2301.11495  [pdf, other]

    cs.CV

    Skeleton-based Action Recognition through Contrasting Two-Stream Spatial-Temporal Networks

    Authors: Chen Pang, Xuequan Lu, Lei Lyu

    Abstract: For pursuing accurate skeleton-based action recognition, most prior methods use the strategy of combining Graph Convolution Networks (GCNs) with attention-based methods in a serial way. However, they regard the human skeleton as a complete graph, resulting in less variations between different actions (e.g., the connection between the elbow and head in action ``clapping hands''). For this, we propo…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: 14 pages, 9 figures

  26. arXiv:2301.10922  [pdf, other]

    cs.CV

    Detecting Building Changes with Off-Nadir Aerial Images

    Authors: Chao Pang, Jiang Wu, Jian Ding, Can Song, Gui-Song Xia

    Abstract: The tilted viewing nature of the off-nadir aerial images brings severe challenges to the building change detection (BCD) problem: the mismatch of the nearby buildings and the semantic ambiguity of the building facades. To tackle these challenges, we present a multi-task guided change detection network model, named as MTGCD-Net. The proposed model approaches the specific BCD problem by designing th… ▽ More

    Submitted 25 January, 2023; originally announced January 2023.

    Journal ref: SCIENCE CHINA Information Sciences (SCIS) 2023

  27. arXiv:2212.10505  [pdf, other]

    cs.CL cs.AI cs.CV

    DePlot: One-shot visual language reasoning by plot-to-table translation

    Authors: Fangyu Liu, Julian Martin Eisenschlos, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Wenhu Chen, Nigel Collier, Yasemin Altun

    Abstract: Visual language such as charts and plots is ubiquitous in the human world. Comprehending plots and charts requires strong reasoning skills. Prior state-of-the-art (SOTA) models require at least tens of thousands of training examples and their reasoning capabilities are still much limited, especially on complex human-written queries. This paper presents the first one-shot solution to visual languag… ▽ More

    Submitted 23 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023 (Findings)

  28. arXiv:2212.09662  [pdf, other]

    cs.CL cs.AI cs.CV

    MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering

    Authors: Fangyu Liu, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Yasemin Altun, Nigel Collier, Julian Martin Eisenschlos

    Abstract: Visual language data such as plots, charts, and infographics are ubiquitous in the human world. However, state-of-the-art vision-language models do not perform well on these data. We propose MatCha (Math reasoning and Chart derendering pretraining) to enhance visual language models' capabilities in jointly modeling charts/plots and language data. Specifically, we propose several pretraining tasks… ▽ More

    Submitted 23 May, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  29. arXiv:2212.06742  [pdf, other]

    cs.CL cs.LG cs.PL cs.SE

    ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages

    Authors: Yekun Chai, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu

    Abstract: Software engineers working with the same programming language (PL) may speak different natural languages (NLs) and vice versa, erecting huge barriers to communication and working efficiency. Recent studies have demonstrated the effectiveness of generative pre-training in computer programs, yet they are always English-centric. In this work, we step towards bridging the gap between multilingual NLs… ▽ More

    Submitted 19 May, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: Accepted at ACL 2023 (Findings)

  30. arXiv:2212.01575  [pdf]

    cs.LG q-bio.BM

    Multi-view deep learning based molecule design and structural optimization accelerates the SARS-CoV-2 inhibitor discovery

    Authors: Chao Pang, Yu Wang, Yi Jiang, Ruheng Wang, Ran Su, Leyi Wei

    Abstract: In this work, we propose MEDICO, a Multi-viEw Deep generative model for molecule generation, structural optimization, and the SARS-CoV-2 Inhibitor disCOvery. To the best of our knowledge, MEDICO is the first-of-this-kind graph generative model that can generate molecular graphs similar to the structure of targeted molecules, with a multi-view representation learning framework to sufficiently and a… ▽ More

    Submitted 3 December, 2022; originally announced December 2022.

  31. arXiv:2211.07454  [pdf, other]

    cs.CV

    LGN-Net: Local-Global Normality Network for Video Anomaly Detection

    Authors: Mengyang Zhao, Xinhua Zeng, Yang Liu, **g Liu, Di Li, Xing Hu, Chengxin Pang

    Abstract: Video anomaly detection (VAD) has been intensively studied for years because of its potential applications in intelligent video systems. Existing unsupervised VAD methods tend to learn normality from training sets consisting of only normal videos and regard instances deviating from such normality as anomalies. However, they often consider only local or global normality in the temporal dimension. S… ▽ More

    Submitted 8 January, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

  32. arXiv:2211.03885  [pdf, other]

    cs.CV eess.IV

    Learned Smartphone ISP on Mobile GPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Shuai Liu, Chaoyu Feng, Furui Bai, Xiaotao Wang, Lei Lei, Ziyao Yi, Yan Xiang, Zibin Liu, Shaoqing Li, Keming Shi, Dehui Kong, Ke Xu, Minsu Kwon, Yaqi Wu, Jiesi Zheng, Zhihao Fan, Xun Wu, Feng Zhang, Albert No, Minhyeok Cho, Zewen Chen, Xiaze Zhang, Ran Li, et al. (13 additional authors not shown)

    Abstract: The role of mobile cameras increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline replacing the standard mobile ISPs that can run on modern smartphone GPUs using TensorFlow Lite. Th… ▽ More

    Submitted 7 November, 2022; originally announced November 2022.

  33. arXiv:2211.03545  [pdf, other]

    eess.AS cs.CL cs.SD

    ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech

    Authors: Xiaoran Fan, Chao Pang, Tian Yuan, He Bai, Renjie Zheng, Pengfei Zhu, Shuohuan Wang, Junkun Chen, Zeyu Chen, Liang Huang, Yu Sun, Hua Wu

    Abstract: Speech representation learning has improved both speech understanding and speech synthesis tasks for single language. However, its ability in cross-lingual scenarios has not been explored. In this paper, we extend the pretraining method for cross-lingual multi-speaker speech synthesis tasks, including cross-lingual multi-speaker voice cloning and cross-lingual multi-speaker speech editing. We prop… ▽ More

    Submitted 4 December, 2022; v1 submitted 7 November, 2022; originally announced November 2022.

  34. arXiv:2210.02630  [pdf]

    cs.LG physics.chem-ph q-bio.BM

    MechRetro is a chemical-mechanism-driven graph learning framework for interpretable retrosynthesis prediction and pathway planning

    Authors: Yu Wang, Chao Pang, Yuzhe Wang, Yi Jiang, Junru **, Sirui Liang, Quan Zou, Leyi Wei

    Abstract: Leveraging artificial intelligence for automatic retrosynthesis speeds up organic pathway planning in digital laboratories. However, existing deep learning approaches are unexplainable, like "black box" with few insights, notably limiting their applications in real retrosynthesis scenarios. Here, we propose MechRetro, a chemical-mechanism-driven graph learning framework for interpretable retrosynt… ▽ More

    Submitted 5 October, 2022; originally announced October 2022.

    Journal ref: Nat Commun 14, 6155 (2023)

  35. arXiv:2204.10082  [pdf, other]

    cs.RO

    Viko 2.0: A Hierarchical Gecko-inspired Adhesive Gripper with Visuotactile Sensor

    Authors: Chohei Pang, Qicheng Wang, Kinwing Mak, Hongyu Yu, Michael Yu Wang

    Abstract: Robotic grippers with visuotactile sensors have access to rich tactile information for grasping tasks but encounter difficulty in partially encompassing large objects with sufficient grip force. While hierarchical gecko-inspired adhesives are a potential technique for bridging performance gaps, they require a large contact area for efficient usage. In this work, we present a new version of an adap…

    Submitted 21 April, 2022; originally announced April 2022.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  36. arXiv:2204.09962  [pdf, other]

    cs.CV cs.MM eess.IV

    ChildPredictor: A Child Face Prediction Framework with Disentangled Learning

    Authors: Yuzhi Zhao, Lai-Man Po, Xuehui Wang, Qiong Yan, Wei Shen, Yujia Zhang, Wei Liu, Chun-Kit Wong, Chiu-Sing Pang, Weifeng Ou, Wing-Yin Yu, Buhua Liu

    Abstract: The appearances of children are inherited from their parents, which makes it feasible to predict them. Predicting realistic children's faces may help settle many social problems, such as age-invariant face recognition, kinship verification, and missing child identification. It can be regarded as an image-to-image translation task. Existing approaches usually assume domain information in the image-… ▽ More

    Submitted 21 April, 2022; originally announced April 2022.

    Comments: accepted to IEEE Transactions on Multimedia

  37. arXiv:2202.03472  [pdf, other]

    cs.IT eess.SP

    New Bounds on the Size of Binary Codes with Large Minimum Distance

    Authors: James Chin-Jen Pang, Hessam Mahdavifar, S. Sandeep Pradhan

    Abstract: Let $A(n, d)$ denote the maximum size of a binary code of length $n$ and minimum Hamming distance $d$. Studying $A(n, d)$, including efforts to determine it as well to derive bounds on $A(n, d)$ for large $n$'s, is one of the most fundamental subjects in coding theory. In this paper, we explore new lower and upper bounds on $A(n, d)$ in the large-minimum distance regime, in particular, when… ▽ More

    Submitted 23 May, 2023; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: 23 pages

  38. arXiv:2112.12731  [pdf, other]

    cs.CL

    ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

    Authors: Shuohuan Wang, Yu Sun, Yang Xiang, Zhihua Wu, Siyu Ding, Weibao Gong, Shikun Feng, Junyuan Shang, Yanbin Zhao, Chao Pang, Jiaxiang Liu, Xuyi Chen, Yuxiang Lu, Weixin Liu, Xi Wang, Yangfan Bai, Qiuliang Chen, Li Zhao, Shiyong Li, Peng Sun, Dianhai Yu, Yanjun Ma, Hao Tian, Hua Wu, Tian Wu, et al. (4 additional authors not shown)

    Abstract: Pre-trained language models have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. GPT-3 has shown that scaling up pre-trained language models can further exploit their enormous potential. A unified framework named ERNIE 3.0 was recently proposed for pre-training large-scale knowledge enhanced models and trained a model with 10 billion parameters. ERNIE 3.0 outp… ▽ More

    Submitted 23 December, 2021; originally announced December 2021.

    Comments: arXiv admin note: text overlap with arXiv:2107.02137

  39. arXiv:2111.08585  [pdf, other]

    cs.LG

    CEHR-BERT: Incorporating temporal information from structured EHR data to improve prediction tasks

    Authors: Chao Pang, Xinzhuo Jiang, Krishna S Kalluri, Matthew Spotnitz, RuiJun Chen, Adler Perotte, Karthik Natarajan

    Abstract: Embedding algorithms are increasingly used to represent clinical concepts in healthcare for improving machine learning tasks such as clinical phenotyping and disease prediction. Recent studies have adapted state-of-the-art bidirectional encoder representations from transformers (BERT) architecture to structured electronic health records (EHR) data for the generation of contextualized concept embed…

    Submitted 10 November, 2021; originally announced November 2021.

    Journal ref: Proceedings of Machine Learning for Health, PMLR 158:239-260, 2021

  40. arXiv:2109.08635  [pdf, other]

    cs.SE cs.CR

    Facilitating Parallel Fuzzing with mutually-exclusive Task Distribution

    Authors: Yifan Wang, Yuchen Zhang, Chengbin Pang, Peng Li, Nikolaos Triandopoulos, Jun Xu

    Abstract: Fuzz testing, or fuzzing, has become one of the de facto standard techniques for bug finding in the software industry. In general, fuzzing provides various inputs to the target program to discover unhandled exceptions and crashes. In business sectors where the time budget is limited, software vendors often launch many fuzzing instances in parallel as a common means of increasing code coverage. Howev…

    Submitted 17 September, 2021; originally announced September 2021.

  41. Requirements-Aided Automatic Test Case Generation for Industrial Cyber-physical Systems

    Authors: Roopak Sinha, Cheng Pang, Gerardo Santillán Martínez, Juha Kuronen, Valeriy Vyatkin

    Abstract: Industrial cyber-physical systems require complex distributed software to orchestrate many heterogeneous mechatronic components and control multiple physical processes. Industrial automation software is typically developed in a model-driven fashion where abstractions of physical processes called plant models are co-developed and iteratively refined along with the control code. Testing such multi-d…

    Submitted 16 August, 2021; originally announced August 2021.

    Comments: Conference paper, 5 pages, 5 figures

    Journal ref: Proceedings of the 20th International Conference on Engineering of Complex Computer Systems (ICECCS2015). Gold Coast, QLD, Australia, IEEE Computer Society Press, pp.198-201

  42. arXiv:2107.02137  [pdf, other]

    cs.CL

    ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

    Authors: Yu Sun, Shuohuan Wang, Shikun Feng, Siyu Ding, Chao Pang, Junyuan Shang, Jiaxiang Liu, Xuyi Chen, Yanbin Zhao, Yuxiang Lu, Weixin Liu, Zhihua Wu, Weibao Gong, Jianzhong Liang, Zhizhou Shang, Peng Sun, Wei Liu, Xuan Ouyang, Dianhai Yu, Hao Tian, Hua Wu, Haifeng Wang

    Abstract: Pre-trained models have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. Recent works such as T5 and GPT-3 have shown that scaling up pre-trained language models can improve their generalization abilities. Particularly, the GPT-3 model with 175 billion parameters shows its strong task-agnostic zero-shot/few-shot learning capabilities. Despite their success, the…

    Submitted 5 July, 2021; originally announced July 2021.

  43. arXiv:2105.05159  [pdf, ps, other]

    cs.PL cs.FL cs.SE eess.SY

    Proving LTL Properties of Bitvector Programs and Decompiled Binaries (Extended)

    Authors: Yuandong Cyrus Liu, Chengbin Pang, Daniel Dietsch, Eric Koskinen, Ton-Chanh Le, Georgios Portokalidis, Jun Xu

    Abstract: There is increasing interest in applying verification tools to programs that have bitvector operations (e.g., binaries). SMT solvers, which serve as a foundation for these tools, have thus increased support for bitvector reasoning through bit-blasting and linear arithmetic approximations. In this paper we show that similar linear arithmetic approximation of bitvector operations can be done at the s…

    Submitted 28 August, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

    Comments: 39 pages (including Appendix), 10 tables, 4 Postscript figures, accepted to APLAS 2021

  44. arXiv:2105.00680  [pdf, other]

    cs.RO

    Viko: An Adaptive Gecko Gripper with Vision-based Tactile Sensor

    Authors: Chohei Pang, Kinwing Mak, Yazhan Zhang, Yang Yang, Yu Alexander Tse, Michael Yu Wang

    Abstract: Monitoring the state of contact is essential for robotic devices, especially grippers that implement gecko-inspired adhesives where intimate contact is crucial for a firm attachment. However, due to the lack of deformable sensors, few have demonstrated tactile sensing for gecko grippers. We present Viko, an adaptive gecko gripper that utilizes vision-based tactile sensors to monitor contact state.…

    Submitted 3 May, 2021; originally announced May 2021.

    Comments: This paper is accepted to ICRA2021, please contact corresponding author Y. Tse ([email protected]) for details

  45. arXiv:2104.03168  [pdf, other]

    cs.CR

    Towards Optimal Use of Exception Handling Information for Function Detection

    Authors: Chengbin Pang, Ruotong Yu, Dongpeng Xu, Eric Koskinen, Georgios Portokalidis, Jun Xu

    Abstract: Function entry detection is critical for security of binary code. Conventional methods heavily rely on patterns, inevitably missing true functions and introducing errors. Recently, call frames have been used in exception-handling for function start detection. However, existing methods have two problems. First, they combine call frames with heuristic-based approaches, which often brings error and u…

    Submitted 7 April, 2021; originally announced April 2021.

  46. Multi-source Transfer Learning with Ensemble for Financial Time Series Forecasting

    Authors: Qi-Qiao He, Patrick Cheong-Iao Pang, Yain-Whar Si

    Abstract: Although transfer learning is proven to be effective in computer vision and natural language processing applications, it is rarely investigated in forecasting financial time series. The majority of existing works on transfer learning are based on single-source transfer learning due to the availability of open-access large-scale datasets. However, in the financial domain, the lengths of individual time ser…

    Submitted 25 March, 2021; originally announced March 2021.

  47. arXiv:2012.15674  [pdf, other]

    cs.CL

    ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora

    Authors: Xuan Ouyang, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

    Abstract: Recent studies have demonstrated that pre-trained cross-lingual models achieve impressive performance in downstream cross-lingual tasks. This improvement benefits from learning a large amount of monolingual and parallel corpora. Although it is generally acknowledged that parallel corpora are critical for improving the model performance, existing methods are often constrained by the size of paralle…

    Submitted 17 September, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: Accepted by EMNLP 2021 (main conference, long paper)

  48. arXiv:2012.13977  [pdf, other]

    cs.IT

    Capacity-achieving Polar-based LDGM Codes

    Authors: James Chin-Jen Pang, Hessam Mahdavifar, S. Sandeep Pradhan

    Abstract: In this paper, we study codes with sparse generator matrices. More specifically, low-density generator matrix (LDGM) codes with a certain constraint on the weight of the columns in the generator matrix are considered. In this paper, it is first shown that when a BMS channel W and a constant s>0 are given, there exists a polarization kernel such that the corresponding polar code is capacity-achievi…

    Submitted 27 June, 2022; v1 submitted 27 December, 2020; originally announced December 2020.

    Comments: Extended version, now includes moderate-block length comparison with the RLE. arXiv admin note: text overlap with arXiv:2001.11986

  49. arXiv:2010.04931  [pdf, other]

    cs.RO

    Origami-based Shape Morphing Fingertip to Enhance Grasping Stability and Dexterity

    Authors: Zicheng Kan, Yazhan Zhang, Chohei Pang, Michael Yu Wang

    Abstract: Adaptation to various scene configurations and object properties, stability and dexterity in robotic grasping manipulation is far from explored. This work presents an origami-based shape morphing fingertip design to actively tackle the grasping stability and dexterity problems. The proposed fingertip utilizes origami as its skeleton providing degrees of freedom at desired positions and motor-drive…

    Submitted 10 October, 2020; originally announced October 2020.

    Comments: Accepted to CASE2020

  50. arXiv:2007.14266  [pdf]

    cs.CR

    SoK: All You Ever Wanted to Know About x86/x64 Binary Disassembly But Were Afraid to Ask

    Authors: Chengbin Pang, Ruotong Yu, Yaohui Chen, Eric Koskinen, Georgios Portokalidis, Bing Mao, Jun Xu

    Abstract: Disassembly of binary code is hard, but necessary for improving the security of binary software. Over the past few decades, research in binary disassembly has produced many tools and frameworks, which have been made available to researchers and security professionals. These tools employ a variety of strategies that grant them different characteristics. The lack of systematization, however, impedes…

    Submitted 28 July, 2020; originally announced July 2020.