Search | arXiv e-print repository

FreeTumor: Advance Tumor Segmentation via Large-Scale Tumor Synthesis

Authors: Linshan Wu, Jiaxin Zhuang, Xuefeng Ni, Hao Chen

Abstract: AI-driven tumor analysis has garnered increasing attention in healthcare. However, its progress is significantly hindered by the lack of annotated tumor cases, which require radiologists to invest considerable effort in collection and annotation. In this paper, we introduce a highly practical solution for robust tumor synthesis and segmentation, termed FreeTumor, which refers to annotation-free synthetic tumors and our desire to free patients suffering from tumors. Instead of pursuing sophisticated technical synthesis modules, we aim to design a simple yet effective tumor synthesis paradigm to unleash the power of large-scale data. Specifically, FreeTumor advances existing methods mainly from three aspects: (1) Existing methods only leverage small-scale labeled data for synthesis training, which limits their ability to generalize well on unseen data from different sources. To this end, we introduce the adversarial training strategy to leverage large-scale and diversified unlabeled data in synthesis training, significantly improving tumor synthesis. (2) Existing methods largely ignore the negative impact of low-quality synthetic tumors in segmentation training. Thus, we employ an adversarial-based discriminator to automatically filter out the low-quality synthetic tumors, which effectively alleviates their negative impact. (3) Existing methods use only hundreds of cases in tumor segmentation. In FreeTumor, we investigate the data scaling law in tumor segmentation by scaling up the dataset to 11k cases. Extensive experiments demonstrate the superiority of FreeTumor, e.g., on three tumor segmentation benchmarks, an average $+8.9\%$ DSC over the baseline that uses only real tumors and $+6.6\%$ DSC over the state-of-the-art tumor synthesis method. Code will be available.
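The DSC figures quoted in this abstract refer to the Dice similarity coefficient, the standard overlap measure for segmentation masks. The following is a minimal sketch of that metric only, assuming binary masks flattened into equal-length 0/1 sequences; the function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are illustrative assumptions, not taken from the FreeTumor code.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two flat binary masks (0/1 values)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels marked as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total tumor voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_coefficient(predicted, reference))
```

A gain of "+8.9% DSC" would then mean the average of this score over the benchmark cases rises by 8.9 percentage points.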

Submitted 3 June, 2024; originally announced June 2024.

Comments: Preprint

arXiv:2405.10251 [pdf, other]

A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks

Authors: Xuanfan Ni, Piji Li

Abstract: Recent efforts have evaluated large language models (LLMs) in areas such as commonsense reasoning, mathematical reasoning, and code generation. However, to the best of our knowledge, no work has specifically investigated the performance of LLMs in natural language generation (NLG) tasks, a pivotal criterion for determining model excellence. Thus, this paper conducts a comprehensive evaluation of well-known and high-performing LLMs, namely ChatGPT, ChatGLM, T5-based models, LLaMA-based models, and Pythia-based models, in the context of NLG tasks. We select English and Chinese datasets encompassing Dialogue Generation and Text Summarization. Moreover, we propose a common evaluation setting that incorporates input templates and post-processing strategies. Our study reports automatic results, accompanied by a detailed analysis.

Submitted 16 May, 2024; originally announced May 2024.

Comments: CCL2023

arXiv:2405.01718 [pdf, other]

Robust Risk-Sensitive Reinforcement Learning with Conditional Value-at-Risk

Authors: Xinyi Ni, Lifeng Lai

Abstract: Robust Markov Decision Processes (RMDPs) have received significant research interest, offering an alternative to standard Markov Decision Processes (MDPs) that often assume fixed transition probabilities. RMDPs address this by optimizing for the worst-case scenarios within ambiguity sets. While earlier studies on RMDPs have largely centered on risk-neutral reinforcement learning (RL), with the goal of minimizing expected total discounted costs, in this paper, we analyze the robustness of CVaR-based risk-sensitive RL under RMDP. Firstly, we consider predetermined ambiguity sets. Based on the coherency of CVaR, we establish a connection between robustness and risk sensitivity, so that techniques in risk-sensitive RL can be adopted to solve the proposed problem. Furthermore, motivated by the existence of decision-dependent uncertainty in real-world problems, we study problems with state-action-dependent ambiguity sets. To solve this, we define a new risk measure named NCVaR and build the equivalence of NCVaR optimization and robust CVaR optimization. We further propose value iteration algorithms and validate our approach in simulation experiments.
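For readers unfamiliar with the risk measure named here, the sketch below computes plain Conditional Value-at-Risk of a finite sample of costs using the well-known Rockafellar-Uryasev form, CVaR_alpha(X) = min_t { t + E[(X - t)^+] / (1 - alpha) }. It illustrates standard CVaR only, not the paper's NCVaR or its value iteration algorithms; the function name `cvar` and the equal weighting of samples are assumptions made for illustration.

```python
def cvar(costs, alpha):
    """Conditional Value-at-Risk of equally weighted cost samples at level alpha.

    Uses CVaR_alpha(X) = min_t { t + E[(X - t)^+] / (1 - alpha) }.
    The objective is convex and piecewise linear in t with breakpoints at the
    sample values, so it is enough to evaluate it at each sample.
    """
    if not costs:
        raise ValueError("need at least one cost sample")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(costs)

    def objective(t):
        # Expected excess of the cost above the threshold t.
        expected_excess = sum(max(c - t, 0.0) for c in costs) / n
        return t + expected_excess / (1.0 - alpha)

    return min(objective(t) for t in costs)


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0]
    print(cvar(samples, 0.5))  # mean of the worst half of the costs: 3.5
    print(cvar(samples, 0.0))  # alpha = 0 reduces to the plain mean: 2.5
```

At alpha = 0 the measure is risk-neutral, and as alpha approaches 1 it approaches the worst-case cost, which is the intuition behind linking risk sensitivity to robustness.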

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.12000 [pdf, other]

How far are AI-powered programming assistants from meeting developers' needs?

Authors: Xin Tan, Xiao Long, Xianjun Ni, Yinghao Zhu, Jing Jiang, Li Zhang

Abstract: Recent In-IDE AI coding assistant tools (ACATs) like GitHub Copilot have significantly impacted developers' coding habits. While some studies have examined their effectiveness, in-depth investigation into the actual assistance process is lacking. To bridge this gap, we simulate real development scenarios encompassing three typical types of software development tasks and recruit 27 computer science students to investigate their behavior with three popular ACATs. Our goal is to comprehensively assess ACATs' effectiveness, explore characteristics of recommended code, identify reasons for modifications, and understand users' challenges and expectations. To facilitate the study, we develop an experimental platform that includes a data collection plugin for VSCode IDE and provides functions for screen recording, code evaluation, and automatic generation of personalized interview and survey questions. Through analysis of the collected data, we find that ACATs generally enhance task completion rates, reduce time, improve code quality, and increase self-perceived productivity. However, the improvement is influenced by both the nature of coding tasks and users' experience level. Notably, for experienced participants, the use of ACATs may even increase completion time. We observe that "edited line completion" is the most frequently recommended way, while "comments completion" and "string completion" have the lowest acceptance rates. The primary reasons for modifying recommended code are disparities between output formats and requirements, flawed logic, and inconsistent code styles. In terms of challenges and expectations, participants are concerned not only with functionality and performance but also with the optimization of service access and help documentation. Our study provides valuable insights into the effectiveness and usability of ACATs, informing further improvements in their design and implementation.

Submitted 24 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.08978 [pdf, other]

Incremental Residual Concept Bottleneck Models

Authors: Chenming Shang, Shiji Zhou, Hengyuan Zhang, Xinzhe Ni, Yujiu Yang, Yuwang Wang

Abstract: Concept Bottleneck Models (CBMs) map the black-box visual representations extracted by deep neural networks onto a set of interpretable concepts and use the concepts to make predictions, enhancing the transparency of the decision-making process. Multimodal pre-trained models can match visual representations with textual concept embeddings, allowing the interpretable concept bottleneck to be obtained without expert concept annotations. Recent research has focused on the concept bank establishment and the high-quality concept selection. However, it is challenging to construct a comprehensive concept bank through humans or large language models, which severely limits the performance of CBMs. In this work, we propose the Incremental Residual Concept Bottleneck Model (Res-CBM) to address the challenge of concept completeness. Specifically, the residual concept bottleneck model employs a set of optimizable vectors to complete missing concepts, then the incremental concept discovery module converts the complemented vectors with unclear meanings into potential concepts in the candidate concept bank. Our approach can be applied to any user-defined concept bank, as a post-hoc processing method to enhance the performance of any CBMs. Furthermore, to measure the descriptive efficiency of CBMs, the Concept Utilization Efficiency (CUE) metric is proposed. Experiments show that the Res-CBM outperforms the current state-of-the-art methods in terms of both accuracy and efficiency and achieves comparable performance to black-box models across multiple datasets.

Submitted 17 April, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

arXiv:2404.05446 [pdf, other]

XL$^2$Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies

Authors: Xuanfan Ni, Hengyi Cai, Xiaochi Wei, Shuaiqiang Wang, Dawei Yin, Piji Li

Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across diverse tasks but are constrained by their small context window sizes. Various efforts have been proposed to expand the context window to accommodate even up to 200K input tokens. Meanwhile, building high-quality benchmarks with much longer text lengths and more demanding tasks to provide comprehensive evaluations is of immense practical interest to facilitate long context understanding research of LLMs. However, prior benchmarks create datasets that ostensibly cater to long-text comprehension by expanding the input of traditional tasks, which fails to exhibit the unique characteristics of long-text understanding, including long dependency tasks and longer text length compatible with modern LLMs' context window size. In this paper, we introduce a benchmark for extremely long context understanding with long-range dependencies, XL$^2$Bench, which includes three scenarios: Fiction Reading, Paper Reading, and Law Reading, and four tasks of increasing complexity: Memory Retrieval, Detailed Understanding, Overall Understanding, and Open-ended Generation, covering 27 subtasks in English and Chinese. It has an average length of 100K+ words (English) and 200K+ characters (Chinese). Evaluating six leading LLMs on XL$^2$Bench, we find that their performance significantly lags behind human levels. Moreover, the observed decline in performance across both the original and enhanced datasets underscores the efficacy of our approach to mitigating data contamination.

Submitted 8 April, 2024; originally announced April 2024.

Comments: Work in progress

arXiv:2404.01067 [pdf, other]

Exploring the Mystery of Influential Data for Mathematical Reasoning

Authors: Xinzhe Ni, Yeyun Gong, Zhibin Gou, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen

Abstract: Selecting influential data for fine-tuning on downstream tasks is a key factor for both performance and computation efficiency. Recent works have shown that training with only limited data can show a superior performance on general tasks. However, the feasibility on mathematical reasoning tasks has not been validated. To go further, there exist two open questions for mathematical reasoning: how to select influential data and what is an influential data composition. For the former, we propose a Quality-aware Diverse Selection (QaDS) strategy adaptable for mathematical reasoning. A comparison with other selection strategies validates the superiority of QaDS. For the latter, we first enlarge our setting and explore the influential data composition. We conduct a series of experiments and highlight that scaling up reasoning data and training with general data selected by QaDS are helpful. Then, we define our optimal mixture as OpenMathMix, an influential data mixture with open-source data selected by QaDS. With OpenMathMix, we achieve a state-of-the-art 48.8% accuracy on MATH with a 7B base model. Additionally, we showcase the use of QaDS in creating efficient fine-tuning mixtures with various selection ratios, and analyze the quality of a wide range of open-source datasets, which can serve as a reference for future works on mathematical reasoning tasks.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2312.14839 [pdf, other]

Simulating Parametric Thin Shells by Bicubic Hermite Elements

Authors: Xingyu Ni, Xuwen Chen, Cheng Yu, Bin Wang, Baoquan Chen

Abstract: In this study, we present the bicubic Hermite element method (BHEM), a new computational framework devised for the elastodynamic simulation of parametric thin-shell structures. The BHEM is constructed based on parametric quadrilateral Hermite patches, which serve as a unified representation for shell geometry, simulation, collision avoidance, as well as rendering. Compared with the commonly utilized linear FEM, the BHEM offers higher-order solution spaces, enabling the capture of more intricate and smoother geometries while employing significantly fewer finite elements. In comparison to other high-order methods, the BHEM achieves conforming $\mathcal{C}^1$ continuity for Kirchhoff-Love (KL) shells with minimal complexity. Furthermore, by leveraging the subdivision and convex hull properties of Hermite patches, we develop an efficient algorithm for ray-patch intersections, facilitating collision handling in simulations and ray tracing in rendering. This eliminates the laborious remodeling of the pre-existing parametric surface that conventional approaches require. We substantiate our claims with comprehensive experiments, which demonstrate the high accuracy and versatility of the proposed method.
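As background for the "bicubic Hermite patches" mentioned above, the sketch below evaluates a scalar bicubic Hermite patch on the unit square from corner values, first derivatives, and mixed derivatives, using the textbook cubic Hermite basis. It is a generic illustration, not the BHEM implementation; the function names and the 2x2 nested-list layout of the corner data are assumptions.

```python
def value_basis(t):
    """Cubic Hermite basis weights for the values at nodes t=0 and t=1."""
    return (2 * t**3 - 3 * t**2 + 1, -2 * t**3 + 3 * t**2)


def deriv_basis(t):
    """Cubic Hermite basis weights for the derivatives at nodes t=0 and t=1."""
    return (t**3 - 2 * t**2 + t, t**3 - t**2)


def bicubic_hermite(u, v, f, fu, fv, fuv):
    """Evaluate a scalar bicubic Hermite patch at (u, v) in [0, 1]^2.

    f, fu, fv, fuv are 2x2 nested lists indexed [a][b], holding the value,
    d/du, d/dv and d2/dudv at the corner (u=a, v=b).
    """
    vu, du = value_basis(u), deriv_basis(u)
    vv, dv = value_basis(v), deriv_basis(v)
    total = 0.0
    for a in range(2):
        for b in range(2):
            total += vu[a] * vv[b] * f[a][b]    # corner values
            total += du[a] * vv[b] * fu[a][b]   # u-derivatives
            total += vu[a] * dv[b] * fv[a][b]   # v-derivatives
            total += du[a] * dv[b] * fuv[a][b]  # mixed derivatives
    return total


if __name__ == "__main__":
    # Corner data sampled from g(u, v) = u * v, which the patch reproduces.
    f = [[0.0, 0.0], [0.0, 1.0]]
    fu = [[0.0, 1.0], [0.0, 1.0]]   # dg/du = v
    fv = [[0.0, 0.0], [1.0, 1.0]]   # dg/dv = u
    fuv = [[1.0, 1.0], [1.0, 1.0]]  # d2g/dudv = 1
    print(bicubic_hermite(0.5, 0.5, f, fu, fv, fuv))  # 0.25
```

Because neighboring patches share corner values and derivatives, this kind of element gives the smooth joins across element boundaries that the abstract's $\mathcal{C}^1$ claim refers to.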

Submitted 22 December, 2023; originally announced December 2023.

arXiv:2312.01091 [pdf, other]

doi 10.1145/3576915.3616590

Demystifying DeFi MEV Activities in Flashbots Bundle

Authors: Zihao Li, Jianfeng Li, Zheyuan He, Xiapu Luo, Ting Wang, Xiaoze Ni, Wenwu Yang, Xi Chen, Ting Chen

Abstract: Decentralized Finance, mushrooming in permissionless blockchains, has attracted a recent surge in popularity. Due to the transparency of permissionless blockchains, opportunistic traders can compete to earn revenue by extracting Miner Extractable Value (MEV), which undermines both the consensus security and efficiency of blockchain systems. The Flashbots bundle mechanism further aggravates the MEV competition because it empowers opportunistic traders with the capability of designing more sophisticated MEV extraction. In this paper, we conduct the first systematic study on DeFi MEV activities in Flashbots bundle by developing ActLifter, a novel automated tool for accurately identifying DeFi actions in transactions of each bundle, and ActCluster, a new approach that leverages iterative clustering to help us discover known/unknown DeFi MEV activities. Extensive experimental results show that ActLifter can achieve nearly 100% precision and recall in DeFi action identification, significantly outperforming state-of-the-art techniques. Moreover, with the help of ActCluster, we obtain many new observations and discover 17 new kinds of DeFi MEV activities, which occur in 53.12% of bundles but have not been reported in existing studies.

Submitted 2 December, 2023; originally announced December 2023.

Comments: This submission serves as our full paper version with the appendix

arXiv:2311.07306 [pdf, other]

What Large Language Models Bring to Text-rich VQA?

Authors: Xuejing Liu, Wei Tang, Xinzhe Ni, Jinghui Lu, Rui Zhao, Zechao Li, Fei Tan

Abstract: Text-rich VQA, namely Visual Question Answering based on text recognition in the images, is a cross-modal task that requires both image comprehension and text recognition. In this work, we focus on investigating the advantages and bottlenecks of LLM-based approaches in addressing this problem. To address the above concern, we separate the vision and language modules, where we leverage external OCR models to recognize texts in the image and Large Language Models (LLMs) to answer the question given texts. The whole framework is training-free, benefiting from the in-context ability of LLMs. This pipeline achieved superior performance compared to the majority of existing Multimodal Large Language Models (MLLMs) on four text-rich VQA datasets. Besides, based on the ablation study, we find that the LLM brings stronger comprehension ability and may introduce helpful knowledge for the VQA problem. The bottleneck for LLMs in addressing text-rich VQA problems may primarily lie in the visual part. We also combine the OCR module with MLLMs and find that this combination works as well. It's worth noting that not all MLLMs can comprehend the OCR information, which provides insights into how to train an MLLM that preserves the abilities of LLM.

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2311.01949 [pdf, other]

Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks

Authors: Yifan Wang, Qingyan Guo, Xinzhe Ni, Chufan Shi, Lemao Liu, Haiyun Jiang, Yujiu Yang

Abstract: In-context learning (ICL) ability has emerged with the increasing scale of large language models (LLMs), enabling them to learn input-label mappings from demonstrations and perform well on downstream tasks. However, under the standard ICL setting, LLMs may sometimes neglect query-related information in demonstrations, leading to incorrect predictions. To address this limitation, we propose a new paradigm called Hint-enhanced In-Context Learning (HICL) to explore the power of ICL in open-domain question answering, an important form in knowledge-intensive tasks. HICL leverages LLMs' reasoning ability to extract query-related knowledge from demonstrations, then concatenates the knowledge to prompt LLMs in a more explicit way. Furthermore, we track the source of this knowledge to identify specific examples, and introduce a Hint-related Example Retriever (HER) to select informative examples for enhanced demonstrations. We evaluate HICL with HER on 3 open-domain QA benchmarks, and observe average performance gains of 2.89 EM score and 2.52 F1 score on gpt-3.5-turbo, 7.62 EM score and 7.27 F1 score on LLaMA-2-Chat-7B compared with standard setting.
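The EM and F1 scores reported above are the usual answer-level metrics for open-domain question answering. The sketch below shows one common way to compute them for a single prediction, assuming a simple normalization (lowercasing and whitespace tokenization); the paper's exact normalization is not stated in the abstract, and the function names here are hypothetical.

```python
from collections import Counter


def normalize(text):
    """Assumed normalization: lowercase and split on whitespace."""
    return text.lower().split()


def exact_match(prediction, gold):
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction, gold):
    """Token-level F1 between a predicted answer and a gold answer."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("the Eiffel Tower", "Eiffel Tower"))
    print(f1_score("the Eiffel Tower", "Eiffel Tower"))
```

Benchmark-level scores are then averages of these per-question values, typically reported on a 0 to 100 scale.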

Submitted 18 April, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

Comments: Accepted by ICASSP 2024

arXiv:2310.06613 [pdf, other]

BandMap: Application Mapping with Bandwidth Allocation for Coarse-Grained Reconfigurable Array

Authors: Xiaobing Ni, Jiaheng Ruan, Mengke Ge, Wendi Sun, Song Chen, Yi Kang

Abstract: This paper proposes an application mapping algorithm, BandMap, for coarse-grained reconfigurable array (CGRA), which allocates the bandwidth in the PE array according to the transferring demands of data, especially the data with high spatial reuse, to reduce the routing PEs. To cover bandwidth allocation, BandMap maps the data flow graphs (DFGs), abstracted from applications, by solving the maximum independent set (MIS) on a mixture of tuple and quadruple resource occupation conflict graph. Compared to the state-of-the-art BusMap work, BandMap can achieve reduced routing PEs with the same or even smaller initiation interval (II).
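The maximum independent set (MIS) step mentioned above can be illustrated on a toy conflict graph: nodes stand for candidate placements and an edge means two candidates cannot be chosen together. The brute-force sketch below is only a demonstration of the MIS problem itself; it is exponential in the number of nodes, and neither the function name nor the graph encoding comes from BandMap, whose conflict graph of tuple and quadruple resource occupations is not reproduced here.

```python
from itertools import combinations


def maximum_independent_set(nodes, edges):
    """Return one largest set of nodes with no conflict edge between them.

    Exhaustive search from the largest subset size downward; suitable only
    for very small graphs.
    """
    conflicts = {frozenset(edge) for edge in edges}
    for size in range(len(nodes), 0, -1):
        for subset in combinations(nodes, size):
            # A subset is independent if no pair inside it conflicts.
            if all(frozenset(pair) not in conflicts
                   for pair in combinations(subset, 2)):
                return set(subset)
    return set()


if __name__ == "__main__":
    # Hypothetical candidates a..d, where neighbors in the chain conflict.
    candidates = ["a", "b", "c", "d"]
    conflict_edges = [("a", "b"), ("b", "c"), ("c", "d")]
    print(maximum_independent_set(candidates, conflict_edges))
```

Practical mappers rely on heuristics or specialized solvers for this step, since MIS is NP-hard in general.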

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2309.13063 [pdf, other]

Using Large Language Models to Generate, Validate, and Apply User Intent Taxonomies

Authors: Chirag Shah, Ryen W. White, Reid Andersen, Georg Buscher, Scott Counts, Sarkar Snigdha Sarathi Das, Ali Montazer, Sathish Manivannan, Jennifer Neville, Xiaochuan Ni, Nagu Rangan, Tara Safavi, Siddharth Suri, Mengting Wan, Leijie Wang, Longqi Yang

Abstract: Log data can reveal valuable information about how users interact with Web search services, what they want, and how satisfied they are. However, analyzing user intents in log data is not easy, especially for emerging forms of Web search such as AI-driven chat. To understand user intents from log data, we need a way to label them with meaningful categories that capture their diversity and dynamics. Existing methods rely on manual or machine-learned labeling, which are either expensive or inflexible for large and dynamic datasets. We propose a novel solution using large language models (LLMs), which can generate rich and relevant concepts, descriptions, and examples for user intents. However, using LLMs to generate a user intent taxonomy and apply it for log analysis can be problematic for two main reasons: (1) such a taxonomy is not externally validated; and (2) there may be an undesirable feedback loop. To address this, we propose a new methodology with human experts and assessors to verify the quality of the LLM-generated taxonomy. We also present an end-to-end pipeline that uses an LLM with human-in-the-loop to produce, refine, and apply labels for user intent analysis in log data. We demonstrate its effectiveness by uncovering new insights into user intents from search and chat logs from the Microsoft Bing commercial search engine. The proposed work's novelty stems from the method for generating purpose-driven user intent taxonomies with strong validation. This method not only helps remove methodological and practical bottlenecks from intent-focused research, but also provides a new framework for generating, validating, and applying other kinds of taxonomies in a scalable and adaptable way with reasonable human effort.

Submitted 9 May, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

Report number: MSR-TR-2023-32

arXiv:2308.08446 [pdf, other]

CSPM: A Contrastive Spatiotemporal Preference Model for CTR Prediction in On-Demand Food Delivery Services

Authors: Guyu Jiang, Xiaoyun Li, Rongrong Jing, Ruoqi Zhao, Xingliang Ni, Guodong Cao, Ning Hu

Abstract: Click-through rate (CTR) prediction is a crucial task in the context of an online on-demand food delivery (OFD) platform for precisely estimating the probability of a user clicking on food items. Unlike universal e-commerce platforms such as Taobao and Amazon, user behaviors and interests on the OFD platform are more location- and time-sensitive due to limited delivery ranges and regional commodity supplies. However, existing CTR prediction algorithms in OFD scenarios concentrate on capturing interest from historical behavior sequences, which fails to effectively model the complex spatiotemporal information within features, leading to poor performance. To address this challenge, this paper introduces the Contrastive Spatiotemporal Preference Model (CSPM), which [...] under different search states using three modules: contrastive spatiotemporal representation learning (CSRL), spatiotemporal preference extractor (StPE), and spatiotemporal information filter (StIF). CSRL utilizes a contrastive learning framework to generate a spatiotemporal activation representation (SAR) for the search action. StPE employs SAR to activate users' diverse preferences related to location and time from the historical behavior sequence field, using a multi-head attention mechanism. StIF incorporates SAR into a gating network to automatically capture important features with latent spatiotemporal effects. Extensive experiments conducted on two large-scale industrial datasets demonstrate the state-of-the-art performance of CSPM. Notably, CSPM has been successfully deployed in Alibaba's online OFD platform Ele.me, resulting in a significant 0.88% lift in CTR, which has substantial business implications.

Submitted 10 August, 2023; originally announced August 2023.

arXiv:2307.10168 [pdf, other]

LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs

Authors: Tongshuang Wu, Haiyi Zhu, Maya Albayrak, Alexis Axon, Amanda Bertsch, Wenxing Deng, Ziqi Ding, Bill Guo, Sireesh Gururaja, Tzu-Sheng Kuo, Jenny T. Liang, Ryan Liu, Ihita Mandal, Jeremiah Milbauer, Xiaolin Ni, Namrata Padmanabhan, Subhashini Ramkumar, Alexis Sudjianto, Jordan Taylor, Ying-Jui Tseng, Patricia Vaidos, Zhijin Wu, Wei Wu, Chenyang Yang

Abstract: LLMs have shown promise in replicating human-like behavior in crowdsourcing tasks that were previously thought to be exclusive to human abilities. However, current efforts focus mainly on simple atomic tasks. We explore whether LLMs can replicate more complex crowdsourcing pipelines. We find that modern LLMs can simulate some of crowdworkers' abilities in these "human computation algorithms," but the level of success is variable and influenced by requesters' understanding of LLM capabilities, the specific skills required for sub-tasks, and the optimal interaction modality for performing these sub-tasks. We reflect on humans' and LLMs' different sensitivities to instructions, stress the importance of enabling human-facing safeguards for LLMs, and discuss the potential of training humans and LLMs with complementary skill sets. Crucially, we show that replicating crowdsourcing pipelines offers a valuable platform to investigate (1) the relative strengths of LLMs on different tasks (by cross-comparing their performances on sub-tasks) and (2) LLMs' potential in complex tasks, where they can complete part of the tasks while leaving others to humans.

Submitted 19 July, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

arXiv:2306.00190 [pdf, other]

Contextualizing Problems to Student Interests at Scale in Intelligent Tutoring System Using Large Language Models

Authors: Gautam Yadav, Ying-Jui Tseng, Xiaolin Ni

Abstract: Contextualizing problems to align with student interests can significantly improve learning outcomes. However, this task often presents scalability challenges due to resource and time constraints. Recent advancements in Large Language Models (LLMs) like GPT-4 offer potential solutions to these issues. This study explores the ability of GPT-4 in the contextualization of problems within CTAT, an intelligent tutoring system, aiming to increase student engagement and enhance learning outcomes. Through iterative prompt engineering, we achieved meaningful contextualization that preserved the difficulty and original intent of the problem, thereby not altering values or overcomplicating the questions. While our research highlights the potential of LLMs in educational settings, we acknowledge current limitations, particularly with geometry problems, and emphasize the need for ongoing evaluation and research. Future work includes systematic studies to measure the impact of this tool on students' learning outcomes and enhancements to handle a broader range of problems.

Submitted 31 May, 2023; originally announced June 2023.

arXiv:2305.14304 [pdf, other]

A Classical Architecture For Digital Quantum Computers

Authors: Fang Zhang, Xing Zhu, Rui Chao, Cupjin Huang, Linghang Kong, Guoyang Chen, Dawei Ding, Haishan Feng, Yihuai Gao, Xiaotong Ni, Liwei Qiu, Zhe Wei, Yueming Yang, Yang Zhao, Yaoyun Shi, Weifeng Zhang, Peng Zhou, Jianxin Chen

Abstract: Scaling bottlenecks the making of digital quantum computers, posing challenges from both the quantum and the classical components. We present a classical architecture to cope with a comprehensive list of the latter challenges all at once, and implement it fully in an end-to-end system by integrating a multi-core RISC-V CPU with our in-house control electronics. Our architecture enables scalable, high-precision control of large quantum processors and accommodates evolving requirements of quantum hardware. A central feature is a microarchitecture executing quantum operations in parallel on arbitrary predefined qubit groups. Another key feature is a reconfigurable quantum instruction set that supports easy qubit re-grouping and instruction extensions. As a demonstration, we implement the widely-studied surface code quantum computing workflow, which is instructive for being demanding on both the controllers and the integrated classical computation. Our design, for the first time, reduces instruction issuing and transmission costs to constants, which do not scale with the number of qubits, without adding any overheads in decoding or dispatching. Rather than relying on specialized hardware for syndrome decoding, our system uses a dedicated multi-core CPU for both qubit control and classical computation, including syndrome decoding. This simplifies the system design and facilitates load-balancing between the quantum and classical components.
We implement recent proposals as decoding firmware on a RISC-V system-on-chip (SoC) that parallelizes general inner decoders. By using our in-house Union-Find and PyMatching 2 implementations, we can achieve unprecedented decoding capabilities of up to distances 47 and 67 with the currently available SoCs, under realistic and optimistic assumptions of physical error rate $p=0.001$ and $p=0.0001$, respectively, all in just 1 μs.

Submitted 23 May, 2023; originally announced May 2023.

Comments: 12 pages, 12 figures

arXiv:2305.10205 [pdf]

doi 10.3311/CCC2023-006

DesignTracking: Track and Replay BIM-based Design Process

Authors: Xiang-Rui Ni, Zhe Zheng, Jia-Rui Lin, Zhen-Zhong Hu, Xin Zhang

Abstract: Among different phases of the life cycle of a building or facility, design is of the utmost importance to ensure safety, efficiency and sustainability of the building or facility. How to control and improve design quality and efficiency has been explored for years, and more studies emerged with the popularization of Building Information Modelling (BIM). However, most of them focused on the extraction of design behaviors, while paying less attention to how a design is formed. Therefore, this study proposes an approach to tracking and replaying the BIM-based design process by integrating data logging and 4D visualization techniques. First of all, potential design behaviors and procedures are analyzed and extracted by observing how a designer designs a BIM model. Meanwhile, the required data for logging the design process is defined and a relevant method to collect these data is developed based on the APIs of BIM software. Then, strategies on how to visualize different design procedures are designed and implemented via 4D visualization. Finally, a prototype system is developed based on Autodesk Revit and validated through a case study. The results show that the proposed approach enables intuitive and interactive review of the design process, and makes it easier to understand design behaviors and even identify potential pitfalls, thus improving the design efficiency and quality.

Submitted 17 May, 2023; originally announced May 2023.

Journal ref: Creative Construction Conference 2023

arXiv:2303.14956 [pdf, other]

Unified Text Structuralization with Instruction-tuned Language Models

Authors: Xuanfan Ni, Piji Li, Huayang Li

Abstract: Text structuralization is an important field of natural language processing (NLP) that consists of information extraction (IE) and structure formalization. However, current studies of text structuralization suffer from a shortage of manually annotated high-quality datasets from different domains and languages, which require specialized professional knowledge. In addition, most IE methods are designed for a specific type of structured data, e.g., entities, relations, and events, making them hard to generalize to others. In this work, we propose a simple and efficient approach to instruct a large language model (LLM) to extract a variety of structures from texts. More concretely, we add a prefix and a suffix instruction to indicate the desired IE task and structure type, respectively, before feeding the text into an LLM. Experiments on two LLMs show that this approach can enable language models to perform comparably with other state-of-the-art methods on datasets of a variety of languages and knowledge, and can generalize to other IE sub-tasks via changing the content of instruction. Another benefit of our approach is that it can help researchers to build datasets in low-source and domain-specific scenarios, e.g., fields in finance and law, with low cost.

Submitted 30 March, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

Comments: 13 pages, 5 figures

arXiv:2302.14012 [pdf]

Drone-based quantum key distribution

Authors: Xiao-Hui Tian, Ran Yang, Ji-Ning Zhang, Hua Yu, Yao Zhang, Pengfei Fan, Mengwen Chen, Changsheng Gu, Xin Ni, Mingzhe Hu, Xun Cao, Xiaopeng Hu, Gang Zhao, Yan-Qing Lu, Zhi-Jun Yin, Hua-Ying Liu, Yan-Xiao Gong, Zhenda Xie, Shi-Ning Zhu

Abstract: A drone-based quantum link has the potential to realize a mobile quantum network, and entanglement distribution has been demonstrated using one and two drones. Here we report the first drone-based quantum key distribution (QKD), with an average secure key rate larger than 8 kHz using the decoy-state BB84 protocol with polarization coding. A compact acquisition, pointing, and tracking (APT) system and QKD modules are developed and loaded on a home-made octocopter, within a takeoff weight of 30 kg. A robust link is established between the flying octocopter and a ground station 200 meters away, and real-time QKD is performed for 400 seconds. This work shows the potential of drone-based quantum communication for future mobile quantum networks.

Submitted 27 February, 2023; originally announced February 2023.

arXiv:2301.04748 [pdf, other]

LSDM: Long-Short Diffeomorphic Motion for Weakly-Supervised Ultrasound Landmark Tracking

Authors: Zhihua Liu, Bin Yang, Yan Shen, Xuejun Ni, Huiyu Zhou

Abstract: Accurate tracking of an anatomical landmark over time has been of high interest for disease assessment such as minimally invasive surgery and tumor radiation therapy. Ultrasound imaging is a promising modality benefiting from low-cost and real-time acquisition. However, generating a precise landmark tracklet is very challenging, as attempts can be easily distorted by different interference such as landmark deformation, visual ambiguity and partial observation. In this paper, we propose a long-short diffeomorphic motion network, which is a multi-task framework with a learnable deformation prior to search for the plausible deformation of the landmark. Specifically, we design a novel diffeomorphism representation in both long and short temporal domains for delineating motion margins and reducing long-term cumulative tracking errors. To further mitigate local anatomical ambiguity, we propose an expectation maximisation motion alignment module to iteratively optimize both long and short deformation, aligning to the same directional and spatial representation. The proposed multi-task system can be trained in a weakly-supervised manner, which only requires few landmark annotations for tracking and zero annotation for long-short deformation learning. We conduct extensive experiments on two ultrasound landmark tracking datasets.
Experimental results show that our proposed method can achieve better or competitive landmark tracking performance compared with other state-of-the-art tracking methods, with a strong generalization capability across different scanner types and different ultrasound modalities.

Submitted 31 January, 2023; v1 submitted 11 January, 2023; originally announced January 2023.

arXiv:2212.04873 [pdf, other]

Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition

Authors: Xinzhe Ni, Yong Liu, Hao Wen, Yatai Ji, Jing Xiao, Yujiu Yang

Abstract: Current methods for few-shot action recognition mainly fall into the metric learning framework following ProtoNet, which demonstrates the importance of prototypes. Although they achieve relatively good performance, the effect of multimodal information is ignored, e.g. label texts. In this work, we propose a novel MultimOdal PRototype-ENhanced Network (MORN), which uses the semantic information of label texts as multimodal information to enhance prototypes. A CLIP visual encoder and a frozen CLIP text encoder are introduced to obtain features with good multimodal initialization. Then in the visual flow, visual prototypes are computed by a visual prototype-computed module. In the text flow, a semantic-enhanced (SE) module and an inflating operation are used to obtain text prototypes. The final multimodal prototypes are then computed by a multimodal prototype-enhanced (MPE) module. Besides, we define a PRototype SImilarity DiffErence (PRIDE) to evaluate the quality of prototypes, which is used to verify our improvement on the prototype level and effectiveness of MORN. We conduct extensive experiments on four popular few-shot action recognition datasets: HMDB51, UCF101, Kinetics and SSv2, and MORN achieves state-of-the-art results. When plugging PRIDE into the training stage, the performance can be further improved.

Submitted 21 May, 2024; v1 submitted 9 December, 2022; originally announced December 2022.

Comments: Accepted by ICMR 2024 (oral)

arXiv:2211.03789 [pdf]

doi 10.3389/fenrg.2021.708456

A Random Forest and Current Fault Texture Feature-Based Method for Current Sensor Fault Diagnosis in Three-Phase PWM VSR

Authors: Lei Kou, Xiao-dong Gong, Yi Zheng, Xiu-hui Ni, Yang Li, Quan-de Yuan, Ya-nan Dong

Abstract: Three-phase PWM voltage-source rectifier (VSR) systems have been widely used in various energy conversion systems, where current sensors are the key component for state monitoring and system control. The current sensor faults may bring hidden danger or damage to the whole system; therefore, this paper proposed a random forest (RF) and current fault texture feature-based method for current sensor fault diagnosis in three-phase PWM VSR systems. First, the three-phase alternating currents (ACs) of the three-phase PWM VSR are collected to extract the current fault texture features, and no additional hardware sensors are needed to avoid causing additional unstable factors. Then, the current fault texture features are adopted to train the random forest current sensor fault detection and diagnosis (CSFDD) classifier, which is a data-driven CSFDD classifier. Finally, the effectiveness of the proposed method is verified by simulation experiments. The result shows that the current sensor faults can be detected and located successfully and that it can effectively provide fault locations for maintenance personnel to keep the stable operation of the whole system.

Submitted 7 November, 2022; originally announced November 2022.

Comments: Frontiers in Energy Research

MSC Class: 68Q04; ACM Class: I.2

arXiv:2210.01689 [pdf]

Vision-based Warning System for Maintenance Personnel on Short-Term Roadwork Site

Authors: Xiao Ni, Walpola Layantha Perera, Carsten Kühnel, Christian Vollrath

Abstract: We propose a vision-based warning system for the maintenance personnel working on short-term construction sites. Traditional solutions use passive protection, like setting up traffic cones, safety beacons, or even nothing. However, such methods cannot function as physical safety barriers to separate working areas from used lanes. In contrast, our system provides active protection, leveraging acoustic and visual warning signals to help road workers be cautious of approaching vehicles before they pass the working area. To reduce excessive warnings and avoid disturbing road workers, we implemented our traffic flow check algorithm, by which about 80% of the useless notices can be filtered. We conduct the evaluations in laboratory conditions and the real world, proving our system's applicability and reliability.

Submitted 20 October, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

arXiv:2206.01833 [pdf, other]

Leveraging Heterogeneous Capabilities in Multi-Agent Systems for Environmental Conflict Resolution

Authors: Michael Enqi Cao, Jonas Warnke, Yunhai Han, Xinpei Ni, Ye Zhao, Samuel Coogan

Abstract: In this paper, we introduce a high-level controller synthesis framework that enables teams of heterogeneous agents to assist each other in resolving environmental conflicts that appear at runtime. This conflict resolution method is built upon temporal-logic-based reactive synthesis to guarantee safety and task completion under specific environment assumptions. In heterogeneous multi-agent systems, every agent is expected to complete its own tasks in service of a global team objective. However, at runtime, an agent may encounter un-modeled obstacles (e.g., doors or walls) that prevent it from achieving its own task. To address this problem, we employ the capabilities of other heterogeneous agents to resolve the obstacle. A controller framework is proposed to redirect agents with the capability of resolving the appropriate obstacles to the required target when such a situation is detected. Three case studies involving a bipedal robot Digit and a quadcopter are used to evaluate the controller performance in action. Additionally, we implement the proposed framework on a physical multi-agent robotic system to demonstrate its viability for real world applications.

Submitted 1 September, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

Comments: Submitted to The International Symposium on Safety, Security, and Rescue Robotics (SSRR) 2022

arXiv:2112.12509 [pdf, other]

doi 10.1038/s41534-022-00614-3

Integrating Quantum Processor Device and Control Optimization in a Gradient-based Framework

Authors: Xiaotong Ni, Hui-Hai Zhao, Lei Wang, Feng Wu, Jianxin Chen

Abstract: In a quantum processor, the device design and external controls together contribute to the quality of the target quantum operations. As we continuously seek better alternative qubit platforms, we explore the increasingly large device and control design space. Thus, optimization becomes more and more challenging. In this work, we demonstrate that the figure of merit reflecting a design goal can be made differentiable with respect to the device and control parameters. In addition, we can compute the gradient of the design objective efficiently in a similar manner to the back-propagation algorithm and then utilize the gradient to optimize the device and the control parameters jointly and efficiently. This extends the scope of the quantum optimal control to superconducting device design. We also demonstrate the viability of gradient-based joint optimization over the device and control parameters through a few examples.

Submitted 23 December, 2021; originally announced December 2021.

Journal ref: npj Quantum Information volume 8, 106 (2022)

arXiv:2110.09005 [pdf, other]

Unsupervised Learned Kalman Filtering

Authors: Guy Revach, Nir Shlezinger, Timur Locher, Xiaoyong Ni, Ruud J. G. van Sloun, Yonina C. Eldar

Abstract: In this paper we adapt KalmanNet, which is a recently proposed deep neural network (DNN)-aided system whose architecture follows the operation of the model-based Kalman filter (KF), to learn its mapping in an unsupervised manner, i.e., without requiring ground-truth states. The unsupervised adaptation is achieved by exploiting the hybrid model-based/data-driven architecture of KalmanNet, which internally predicts the next observation as the KF does. These internal features are then used to compute the loss rather than the state estimate at the output of the system. With the capability of unsupervised learning, one can use KalmanNet not only to track the hidden state, but also to adapt to variations in the state space (SS) model. We numerically demonstrate that when the noise statistics are unknown, unsupervised KalmanNet achieves a similar performance to KalmanNet with supervised learning. We also show that we can adapt a pre-trained KalmanNet to changing SS models without providing additional data thanks to the unsupervised capabilities.

Submitted 18 October, 2021; originally announced October 2021.

Comments: 5 Pages, 5 Figures, Submitted to ICASSP 2022

arXiv:2109.10393 [pdf, other]

Towards a Real-Time Facial Analysis System

Authors: Bishwo Adhikari, Xingyang Ni, Esa Rahtu, Heikki Huttunen

Abstract: Facial analysis is an active research area in computer vision, with many practical applications. Most of the existing studies focus on addressing one specific task and maximizing its performance. For a complete facial analysis system, one needs to solve these tasks efficiently to ensure a smooth experience. In this work, we present a system-level design of a real-time facial analysis system. With a collection of deep neural networks for object detection, classification, and regression, the system recognizes age, gender, facial expression, and facial similarity for each person that appears in the camera view. We investigate the parallelization and interplay of individual tasks. Results on common off-the-shelf architecture show that the system's accuracy is comparable to the state-of-the-art methods, and the recognition speed satisfies real-time requirements. Moreover, we propose a multitask network for jointly predicting the first three attributes, i.e., age, gender, and facial expression. Source code and trained models are available at https://github.com/mahehu/TUT-live-age-estimator.

Submitted 21 September, 2021; originally announced September 2021.

Comments: Accepted in IEEE MMSP 2021

arXiv:2108.07147 [pdf, other]

On the Importance of Encrypting Deep Features

Authors: Xingyang Ni, Heikki Huttunen, Esa Rahtu

Abstract: In this study, we analyze model inversion attacks with only two assumptions: feature vectors of user data are known, and a black-box API for inference is provided. On the one hand, limitations of existing studies are addressed by opting for a more practical setting. Experiments have been conducted on state-of-the-art models in person re-identification, and two attack scenarios (i.e., recognizing auxiliary attributes and reconstructing user data) are investigated. Results show that an adversary could successfully infer sensitive information even under severe constraints. On the other hand, it is advisable to encrypt feature vectors, especially for a machine learning model in production. As an alternative to traditional encryption methods such as AES, a simple yet effective method termed ShuffleBits is presented. More specifically, the binary sequence of each floating-point number gets shuffled. Deployed using the one-time pad scheme, it serves as a plug-and-play module that is applicable to any neural network, and the resulting model directly outputs deep features in encrypted form. Source code is publicly available at https://github.com/nixingyang/ShuffleBits.

Submitted 16 August, 2021; originally announced August 2021.

Comments: First Version

arXiv:2108.04089 [pdf, other]

A Self-Configurable Grouping Method for Integrated Wi-SUN FAN and TSCH-based Networks

Authors: Xinyu Ni, Michael Baddeley, Nan Jiang, Yichao Jin

Abstract: Recent applications in large-scale wireless mesh networks (WSN), e.g., Advanced Metering Infrastructure (AMI) scenarios, expect to support an extended number of nodes with higher throughput, which cannot be sufficiently supported by the current WSN protocols. Two prior protocols, Wi-SUN Field Area Network (Wi-SUN FAN) and IETF 6TiSCH standards, are popularly used that are respectively based on asynchronous Carrier Sense Multiple Access / Collision Avoidance (CSMA/CA) and IEEE 802.15.4 Time Scheduled Channel Hopping (TSCH). However, the former one with CSMA/CA can be prone to the Hidden Node Problem (HNP) that leads to a degradation of reliability, while the latter one with TSCH avoids HNP by using synchronous scheduling but requires massive control signalling that degrades the upper-bound of throughput. Accordingly, this paper tackles the challenge of how to improve the upper-bound of throughput without loss of reliability. To do so, we first present an in-depth evaluation of reliability and throughput between the two existing standards via system-level simulation. We then propose a self-configurable grouping (SCG) method to cluster nodes without using location information of each node. This SCG method eliminates around 99% of HNP in CSMA/CA, thus greatly improving its reliability while maintaining the relatively high upper-bound of throughput. Our results show that the SCG method almost doubles the network throughput compared with both 6TiSCH and Wi-SUN FAN in heavy traffic scenarios, while providing extremely high reliability of more than 99.999%.

Submitted 9 August, 2021; originally announced August 2021.

arXiv:2108.01298 [pdf, other]

Synthesizing Brain-Network-Inspired Interconnections for Large-Scale Network-on-Chips

Authors: Mengke Ge, Xiaobing Ni, Qi Xu, Song Chen, Jinglei Huang, Yi Kang, Feng Wu

Abstract: Brain network is a large-scale complex network with scale-free, small-world, and modularity properties, which largely supports this high-efficiency massive system. In this paper, we propose to synthesize brain-network-inspired interconnections for large-scale network-on-chips. Firstly, we propose a method to generate brain-network-inspired topologies with limited scale-free and power-law small-world properties, which have a low total link length and extremely low average hop count approximately proportional to the logarithm of the network size. In addition, given the large-scale applications and the modular topology, we present an application mapping method, including task mapping and deterministic deadlock-free routing, to minimize the power consumption and hop count. Finally, a cycle-accurate simulator BookSim2 is used to validate the architecture performance with different synthetic traffic patterns and large-scale test cases, including real-world communication networks for the graph processing application. Experiments show that, compared with other topologies and methods, the NoC design generated by the proposed method presents significantly lower average hop count and lower average latency. Especially in graph processing applications with a power-law and tightly coupled inter-core communication, the brain-network-inspired NoC has up to 70% lower average hop count and 75% lower average latency than mesh-based NoCs.

Submitted 26 August, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

Comments: 19 pages, 15 figures, 8 tables, accepted by ACM TODAES

arXiv:2107.10043 [pdf, other]

doi 10.1109/TSP.2022.3158588

KalmanNet: Neural Network Aided Kalman Filtering for Partially Known Dynamics

Authors: Guy Revach, Nir Shlezinger, Xiaoyong Ni, Adria Lopez Escoriza, Ruud J. G. van Sloun, Yonina C. Eldar

Abstract: State estimation of dynamical systems in real-time is a fundamental task in signal processing. For systems that are well-represented by a fully known linear Gaussian state space (SS) model, the celebrated Kalman filter (KF) is a low complexity optimal solution. However, both linearity of the underlying SS model and accurate knowledge of it are often not encountered in practice. Here, we present KalmanNet, a real-time state estimator that learns from data to carry out Kalman filtering under non-linear dynamics with partial information. By incorporating the structural SS model with a dedicated recurrent neural network module in the flow of the KF, we retain data efficiency and interpretability of the classic algorithm while implicitly learning complex dynamics from data. We demonstrate numerically that KalmanNet overcomes non-linearities and model mismatch, outperforming classic filtering methods operating with both mismatched and accurate domain knowledge.

Submitted 10 March, 2022; v1 submitted 21 July, 2021; originally announced July 2021.

Comments: Accepted for publication in IEEE Transactions on Signal Processing - TSP

arXiv:2106.11593 [pdf, other]

A Vertical Federated Learning Framework for Graph Convolutional Network

Authors: Xiang Ni, Xiaolong Xu, Lingjuan Lyu, Changhua Meng, Weiqiang Wang

Abstract: Recently, Graph Neural Network (GNN) has achieved remarkable success in various real-world problems on graph data. However in most industries, data exists in the form of isolated islands and the data privacy and security is also an important issue. In this paper, we propose FedVGCN, a federated GCN learning paradigm for privacy-preserving node classification task under data vertically partitioned setting, which can be generalized to existing GCN models. Specifically, we split the computation graph data into two parts. For each iteration of the training process, the two parties transfer intermediate results to each other under homomorphic encryption. We conduct experiments on benchmark data and the results demonstrate the effectiveness of FedVGCN in the case of GraphSage.

Submitted 22 June, 2021; originally announced June 2021.

arXiv:2105.05639 [pdf, other]

FlipReID: Closing the Gap between Training and Inference in Person Re-Identification

Authors: Xingyang Ni, Esa Rahtu

Abstract: Since neural networks are data-hungry, incorporating data augmentation in training is a widely adopted technique that enlarges datasets and improves generalization. On the other hand, aggregating predictions of multiple augmented samples (i.e., test-time augmentation) could boost performance even further. In the context of person re-identification models, it is common practice to extract embeddings for both the original images and their horizontally flipped variants. The final representation is the mean of the aforementioned feature vectors. However, such scheme results in a gap between training and inference, i.e., the mean feature vectors calculated in inference are not part of the training pipeline. In this study, we devise the FlipReID structure with the flipping loss to address this issue. More specifically, models using the FlipReID structure are trained on the original images and the flipped images simultaneously, and incorporating the flipping loss minimizes the mean squared error between feature vectors of corresponding image pairs. Extensive experiments show that our method brings consistent improvements. In particular, we set a new record for MSMT17 which is the largest person re-identification dataset. The source code is available at https://github.com/nixingyang/FlipReID.

Submitted 12 May, 2021; originally announced May 2021.

Comments: First Version

arXiv:2104.11645 [pdf, other]

Software-Defined Edge Computing: A New Architecture Paradigm to Support IoT Data Analysis

Authors: Di Wu, Xiaofeng Xie, Xiang Ni, Bin Fu, Hanhui Deng, Haibo Zeng, Zhijin Qin

Abstract: The rapid deployment of Internet of Things (IoT) applications leads to massive data that need to be processed. These IoT applications have specific communication requirements on latency and bandwidth, and present new features on their generated data such as time-dependency. Therefore, it is desirable to reshape the current IoT architectures by exploring their inherent nature of communication and computing to support smart IoT data process and analysis. We introduce in this paper features of IoT data, trends of IoT network architectures, some problems in IoT data analysis, and their solutions. Specifically, we view that software-defined edge computing is a promising architecture to support the unique needs of IoT data analysis. We further present an experiment on data anomaly detection in this architecture, and the comparison between two architectures for ECG diagnosis. Results show that our method is effective and feasible.

Submitted 25 April, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

arXiv:2102.08465 [pdf, other]

Prioritizing Original News on Facebook

Authors: Xiuyan Ni, Shujian Bu, Igor L. Markov

Abstract: This work outlines how we prioritize original news, a critical indicator of news quality. By examining the landscape and life-cycle of news posts on our social media platform, we identify challenges of building and deploying an originality score. We pursue an approach based on normalized PageRank values and three-step clustering, and refresh the score on an hourly basis to capture the dynamics of online news. We describe a near real-time system architecture, evaluate our methodology, and deploy it to production. Our empirical results validate individual components and show that prioritizing original news increases user engagement with news and improves proprietary cumulative metrics.

Submitted 14 March, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

Comments: 9 pages, 8 figures, 6 tables, 2 algorithm pseudocodes

Journal ref: CIKM 2021

arXiv:2101.06907 [pdf, other]

Quartic Perturbation-based Outage-constrained Robust Design in Two-hop One-way Relay Networks

Authors: Sissi Xiaoxiao Wu, Sherry Xue-Ying Ni, Jiaying Li, Anthony Man-Cho So

Abstract: In this work, we study a classic robust design problem in two-hop one-way relay system. We are particularly interested in the scenario where channel uncertainty exists in both the transmitter-to-relay and relay-to-receiver links. By considering the problem design that minimizes the average amplify-and-forward power budget at the relay side while satisfying SNR outage requirements, an outage-constrained robust design problem involving quartic perturbations is formulated to guarantee the robustness during transmission. This problem is in general difficult as it involves constraints on the tail probability of a high-order polynomial. Herein, we resort to moment inequality and Bernstein-type inequality to tackle this problem, which provide convex restrictions, or safe approximations, of the original design. We also analyze the relative tightness of the two safe approximations for a quadratic perturbation-based outage constrained problem. Our analysis shows that the Bernstein-type inequality approach is less conservative than the moment inequality approach when the outage rate is within some prescribed regime. To our best knowledge, this is the first provable tightness result for these two safe approximations. Our numerical simulations verify the superiority of the robust design and corroborate the tightness results.

Submitted 18 January, 2021; originally announced January 2021.

arXiv:2012.13099 [pdf, other]

Cooperative Policy Learning with Pre-trained Heterogeneous Observation Representations

Authors: Wenlei Shi, Xinran Wei, Jia Zhang, Xiaoyuan Ni, Arthur Jiang, Jiang Bian, Tie-Yan Liu

Abstract: Multi-agent reinforcement learning (MARL) has been increasingly explored to learn the cooperative policy towards maximizing a certain global reward. Many existing studies take advantage of graph neural networks (GNN) in MARL to propagate critical collaborative information over the interaction graph, built upon inter-connected agents. Nevertheless, the vanilla GNN approach yields substantial defects in dealing with complex real-world scenarios since the generic message passing mechanism is ineffective between heterogeneous vertices and, moreover, simple message aggregation functions are incapable of accurately modeling the combinational interactions from multiple neighbors. While adopting complex GNN models with more informative message passing and aggregation mechanisms can obviously benefit heterogeneous vertex representations and cooperative policy learning, it could, on the other hand, increase the training difficulty of MARL and demand more intense and direct reward signals compared to the original global reward. To address these challenges, we propose a new cooperative learning framework with pre-trained heterogeneous observation representations. Particularly, we employ an encoder-decoder based graph attention to learn the intricate interactions and heterogeneous representations that can be more easily leveraged by MARL. Moreover, we design a pre-training with local actor-critic algorithm to ease the difficulty in cooperative policy learning. Extensive experiments over real-world scenarios demonstrate that our new approach can significantly outperform existing MARL baselines as well as operational research solutions that are widely-used in industry.

Submitted 23 December, 2020; originally announced December 2020.

Comments: accepted as an oral paper in AAMAS 2021

arXiv:2011.08410 [pdf, other]

Semi-Supervised Few-Shot Atomic Action Recognition

Authors: Xiaoyuan Ni, Sizhe Song, Yu-Wing Tai, Chi-Keung Tang

Abstract: Despite excellent progress has been made, the performance on action recognition still heavily relies on specific datasets, which are difficult to extend new action classes due to labor-intensive labeling. Moreover, the high diversity in Spatio-temporal appearance requires robust and representative action feature aggregation and attention. To address the above issues, we focus on atomic actions and propose a novel model for semi-supervised few-shot atomic action recognition. Our model features unsupervised and contrastive video embedding, loose action alignment, multi-head feature comparison, and attention-based aggregation, together of which enables action recognition with only a few training examples through extracting more representative features and allowing flexibility in spatial and temporal alignment and variations in the action. Experiments show that our model can attain high accuracy on representative atomic action datasets outperforming their respective state-of-the-art classification accuracy in full supervision setting.

Submitted 16 November, 2020; originally announced November 2020.

Comments: 7 pages, 3 figures, 2 tables

arXiv:2009.04559 [pdf, other]

doi 10.1145/3299815.3314478

Developing and Improving Risk Models using Machine-learning Based Algorithms

Authors: Yan Wang, Xuelei Sherry Ni

Abstract: The objective of this study is to develop a good risk model for classifying business delinquency by simultaneously exploring several machine learning based methods including regularization, hyper-parameter optimization, and model ensembling algorithms. The rationale under the analyses is firstly to obtain good base binary classifiers (include Logistic Regression ($LR$), K-Nearest Neighbors ($KNN$), Decision Tree ($DT$), and Artificial Neural Networks ($ANN$)) via regularization and appropriate settings of hyper-parameters. Then two model ensembling algorithms including bagging and boosting are performed on the good base classifiers for further model improvement. The models are evaluated using accuracy, Area Under the Receiver Operating Characteristic Curve (AUC of ROC), recall, and F1 score via repeating 10-fold cross-validation 10 times. The results show the optimal base classifiers along with the hyper-parameter settings are $LR$ without regularization, $KNN$ by using 9 nearest neighbors, $DT$ by setting the maximum level of the tree to be 7, and $ANN$ with three hidden layers. Bagging on $KNN$ with $K$ valued 9 is the optimal model we can get for risk classification as it reaches the average accuracy, AUC, recall, and F1 score valued 0.90, 0.93, 0.82, and 0.89, respectively.

Submitted 9 September, 2020; originally announced September 2020.

arXiv:2009.04536 [pdf, other]

doi 10.1145/3374135.3385272

Improving Investment Suggestions for Peer-to-Peer (P2P) Lending via Integrating Credit Scoring into Profit Scoring

Authors: Yan Wang, Xuelei Sherry Ni

Abstract: In the peer-to-peer (P2P) lending market, lenders lend the money to the borrowers through a virtual platform and earn the possible profit generated by the interest rate. From the perspective of lenders, they want to maximize the profit while minimizing the risk. Therefore, many studies have used machine learning algorithms to help the lenders identify the "best" loans for making investments. The studies have mainly focused on two categories to guide the lenders' investments: one aims at minimizing the risk of investment (i.e., the credit scoring perspective) while the other aims at maximizing the profit (i.e., the profit scoring perspective). However, they have all focused on one category only and there is seldom research trying to integrate the two categories together. Motivated by this, we propose a two-stage framework that incorporates the credit information into a profit scoring modeling. We conducted the empirical experiment on a real-world P2P lending data from the US P2P market and used the Light Gradient Boosting Machine (lightGBM) algorithm in the two-stage framework. Results show that the proposed two-stage method could identify more profitable loans and thereby provide better investment guidance to the investors compared to the existing one-stage profit scoring alone approach. Therefore, the proposed framework serves as an innovative perspective for making investment decisions in P2P lending.

Submitted 9 September, 2020; originally announced September 2020.

arXiv:2007.07875 [pdf, other]

Adaptive L2 Regularization in Person Re-Identification

Authors: Xingyang Ni, Liang Fang, Heikki Huttunen

Abstract: We introduce an adaptive L2 regularization mechanism in the setting of person re-identification. In the literature, it is common practice to utilize hand-picked regularization factors which remain constant throughout the training procedure. Unlike existing approaches, the regularization factors in our proposed method are updated adaptively through backpropagation. This is achieved by incorporating trainable scalar variables as the regularization factors, which are further fed into a scaled hard sigmoid function. Extensive experiments on the Market-1501, DukeMTMC-reID and MSMT17 datasets validate the effectiveness of our framework. Most notably, we obtain state-of-the-art performance on MSMT17, which is the largest dataset for person re-identification. Source code is publicly available at https://github.com/nixingyang/AdaptiveL2Regularization.

Submitted 18 October, 2020; v1 submitted 15 July, 2020; originally announced July 2020.

Comments: Accepted at ICPR 2020

arXiv:2006.16400 [pdf]

doi 10.1007/s11265-020-01567-6

Vehicle Attribute Recognition by Appearance: Computer Vision Methods for Vehicle Type, Make and Model Classification

Authors: Xingyang Ni, Heikki Huttunen

Abstract: This paper studies vehicle attribute recognition by appearance. In the literature, image-based target recognition has been extensively investigated in many use cases, such as facial recognition, but less so in the field of vehicle attribute recognition. We survey a number of algorithms that identify vehicle properties ranging from coarse-grained level (vehicle type) to fine-grained level (vehicle make and model). Moreover, we discuss two alternative approaches for these tasks, including straightforward classification and a more flexible metric learning method. Furthermore, we design a simulated real-world scenario for vehicle attribute recognition and present an experimental comparison of the two approaches.

Submitted 29 June, 2020; originally announced June 2020.

Comments: Published in Journal of Signal Processing Systems

arXiv:1911.08517 [pdf, other]

Generalizable Resource Allocation in Stream Processing via Deep Reinforcement Learning

Authors: Xiang Ni, Jing Li, Mo Yu, Wang Zhou, Kun-Lung Wu

Abstract: This paper considers the problem of resource allocation in stream processing, where continuous data flows must be processed in real time in a large distributed system. To maximize system throughput, the resource allocation strategy that partitions the computation tasks of a stream processing graph onto computing devices must simultaneously balance workload distribution and minimize communication. Since this problem of graph partitioning is known to be NP-complete yet crucial to practical streaming systems, many heuristic-based algorithms have been developed to find reasonably good solutions. In this paper, we present a graph-aware encoder-decoder framework to learn a generalizable resource allocation strategy that can properly distribute computation tasks of stream processing graphs unobserved from training data. We, for the first time, propose to leverage graph embedding to learn the structural information of the stream processing graphs. Jointly trained with the graph-aware decoder using deep reinforcement learning, our approach can effectively find optimized solutions for unseen graphs. Our experiments show that the proposed model outperforms both METIS, a state-of-the-art graph partitioning algorithm, and an LSTM-based encoder-decoder model, in about 70% of the test cases.

Submitted 19 November, 2019; originally announced November 2019.

Comments: Accepted by AAAI 2020

arXiv:1903.05535 [pdf]

Predicting class-imbalanced business risk using resampling, regularization, and model ensembling algorithms

Authors: Yan Wang, Xuelei Sherry Ni

Abstract: We aim at developing and improving the imbalanced business risk modeling via jointly using proper evaluation criteria, resampling, cross-validation, classifier regularization, and ensembling techniques. Area Under the Receiver Operating Characteristic Curve (AUC of ROC) is used for model comparison based on 10-fold cross validation. Two undersampling strategies including random undersampling (RUS) and cluster centroid undersampling (CCUS), as well as two oversampling methods including random oversampling (ROS) and Synthetic Minority Oversampling Technique (SMOTE), are applied. Three highly interpretable classifiers, including logistic regression without regularization (LR), L1-regularized LR (L1LR), and decision tree (DT) are implemented. Two ensembling techniques, including Bagging and Boosting, are applied on the DT classifier for further model improvement. The results show that, Boosting on DT by using the oversampled data containing 50% positives via SMOTE is the optimal model and it can achieve AUC, recall, and F1 score valued 0.8633, 0.9260, and 0.8907, respectively.

Submitted 13 March, 2019; originally announced March 2019.

Journal ref: International Journal of Managing Information Technology (IJIMIT) Vol. 11, No. 1, February 2019

arXiv:1902.04954 [pdf, other]

doi 10.1145/3374135.3385287

Risk Prediction of Peer-to-Peer Lending Market by a LSTM Model with Macroeconomic Factor

Authors: Yan Wang, Xuelei Sherry Ni

Abstract: In the peer to peer (P2P) lending platform, investors hope to maximize their return while minimizing the risk through a comprehensive understanding of the P2P market. A low and stable average default rate across all the borrowers denotes a healthy P2P market and provides investors more confidence in a promising investment. Therefore, having a powerful model to describe the trend of the default rate in the P2P market is crucial. Different from previous studies that focus on modeling the default rate at the individual level, in this paper, we are the first to comprehensively explore the monthly trend of the default rate at the aggregative level for the P2P data from October 2007 to January 2016 in the US. We use the long short term memory (LSTM) approach to sequentially predict the default risk of the borrowers in Lending Club, which is the largest P2P lending platform in the US. Although being first applied in modeling the P2P sequential data, the LSTM approach shows its great potential by outperforming traditionally utilized time series models in our experiments. Furthermore, incorporating the macroeconomic feature \textit{unemp\_rate} (i.e., unemployment rate) can improve the LSTM performance by decreasing RMSE on both the training and the testing datasets. Our study can broaden the applications of the LSTM algorithm by using it on the sequential P2P data and guide the investors in making investment strategies.

Submitted 9 September, 2020; v1 submitted 13 February, 2019; originally announced February 2019.

arXiv:1901.08433 [pdf]

A XGBoost risk model via feature selection and Bayesian hyper-parameter optimization

Authors: Yan Wang, Xuelei Sherry Ni

Abstract: This paper aims to explore models based on the extreme gradient boosting (XGBoost) approach for business risk classification. Feature selection (FS) algorithms and hyper-parameter optimizations are simultaneously considered during model training. The five most commonly used FS methods including weight by Gini, weight by Chi-square, hierarchical variable clustering, weight by correlation, and weight by information are applied to alleviate the effect of redundant features. Two hyper-parameter optimization approaches, random search (RS) and Bayesian tree-structured Parzen Estimator (TPE), are applied in XGBoost. The effect of different FS and hyper-parameter optimization methods on the model performance are investigated by the Wilcoxon Signed Rank Test. The performance of XGBoost is compared to the traditionally utilized logistic regression (LR) model in terms of classification accuracy, area under the curve (AUC), recall, and F1 score obtained from the 10-fold cross validation. Results show that hierarchical clustering is the optimal FS method for LR while weight by Chi-square achieves the best performance in XG-Boost. Both TPE and RS optimization in XGBoost outperform LR significantly. TPE optimization shows a superiority over RS since it results in a significantly higher accuracy and a marginally higher AUC, recall and F1 score. Furthermore, XGBoost with TPE tuning shows a lower variability than the RS method. Finally, the ranking of feature importance based on XGBoost enhances the model interpretation. Therefore, XGBoost with Bayesian TPE hyper-parameter optimization serves as an operative while powerful approach for business risk modeling.

Submitted 24 January, 2019; originally announced January 2019.

Comments: Accepted by International Journal of Database Management Systems (IJDMS)

arXiv:1901.00251 [pdf, other]

An Automatic Interaction Detection Hybrid Model for Bankcard Response Classification

Authors: Yan Wang, Xuelei Sherry Ni, Brian Stone

Abstract: In this paper, we propose a hybrid bankcard response model, which integrates decision tree based chi-square automatic interaction detection (CHAID) into logistic regression. In the first stage of the hybrid model, CHAID analysis is used to detect the possibly potential variable interactions. Then in the second stage, these potential interactions are served as the additional input variables in logistic regression. The motivation of the proposed hybrid model is that adding variable interactions may improve the performance of logistic regression. To demonstrate the effectiveness of the proposed hybrid model, it is evaluated on a real credit customer response data set. As the results reveal, by identifying potential interactions among independent variables, the proposed hybrid approach outperforms the logistic regression without searching for interactions in terms of classification accuracy, the area under the receiver operating characteristic curve (ROC), and Kolmogorov-Smirnov (KS) statistics. Furthermore, CHAID analysis for interaction detection is much more computationally efficient than the stepwise search mentioned above and some identified interactions are shown to have statistically significant predictive power on the target variable. Last but not least, the customer profile created based on the CHAID tree provides a reasonable interpretation of the interactions, which is the required by regulations of the credit industry. Hence, this study provides an alternative for handling bankcard classification tasks.

Submitted 1 January, 2019; originally announced January 2019.

Journal ref: The 2018 5th International Conference on Systems and Informatics (ICSAI2018)

arXiv:1812.02546 [pdf]

A two-stage hybrid model by using artificial neural networks as feature construction algorithms

Authors: Yan Wang, Xuelei Sherry Ni, Brian Stone

Abstract: We propose a two-stage hybrid approach with neural networks as the new feature construction algorithms for bankcard response classifications. The hybrid model uses a very simple neural network structure as the new feature construction tool in the first stage, then the newly created features are used as the additional input variables in logistic regression in the second stage. The model is compared with the traditional one-stage model in credit customer response classification. It is observed that the proposed two-stage model outperforms the one-stage model in terms of accuracy, the area under ROC curve, and KS statistic. By creating new features with the neural network technique, the underlying nonlinear relationships between variables are identified. Furthermore, by using a very simple neural network structure, the model could overcome the drawbacks of neural networks in terms of its long training time, complex topology, and limited interpretability.

Submitted 6 December, 2018; originally announced December 2018.

arXiv:1809.06640 [pdf, other]

doi 10.22331/q-2020-08-24-310

Neural Network Decoders for Large-Distance 2D Toric Codes

Authors: Xiaotong Ni

Abstract: We still do not have perfect decoders for topological codes that can satisfy all needs of different experimental setups. Recently, a few neural network based decoders have been studied, with the motivation that they can adapt to a wide range of noise models, and can easily run on dedicated chips without a full-fledged computer. The later feature might lead to fast speed and the ability to operate at low temperatures. However, a question which has not been addressed in previous works is whether neural network decoders can handle 2D topological codes with large distances. In this work, we provide a positive answer for the toric code. The structure of our neural network decoder is inspired by the renormalization group decoder. With a fairly strict policy on training time, when the bit-flip error rate is lower than $9\%$ and syndrome extraction is perfect, the neural network decoder performs better when code distance increases. With a less strict policy, we find it is not hard for the neural decoder to achieve a performance close to the minimum-weight perfect matching algorithm. The numerical simulation is done up to code distance $d=64$. Last but not least, we describe and analyze a few failed approaches. They guide us to the final design of our neural decoder, but also serve as a caution when we gauge the versatility of stock deep neural networks. The source code of our neural decoder can be found at https://github.com/XiaotongNi/toric-code-neural-decoder .

Submitted 7 April, 2020; v1 submitted 18 September, 2018; originally announced September 2018.

Journal ref: Quantum 4, 310 (2020)

Showing 1–50 of 58 results for author: Ni, X