Search | arXiv e-print repository

Universal Checkpointing: Efficient and Flexible Checkpointing for Large Scale Distributed Training

Authors: Xinyu Lian, Sam Ade Jacobs, Lev Kurilenko, Masahiro Tanaka, Stas Bekman, Olatunji Ruwase, Minjia Zhang

Abstract: Existing checkpointing approaches seem ill-suited for distributed training even though hardware limitations make model parallelism, i.e., sharding model state across multiple accelerators, a requirement for model scaling. Consolidating distributed model state into a single checkpoint unacceptably slows down training, and is impractical at extreme scales. Distributed checkpoints, in contrast, are tightly coupled to the model parallelism and hardware configurations of the training run, and thus unusable on different configurations. To address this problem, we propose Universal Checkpointing, a technique that enables efficient checkpoint creation while providing the flexibility of resuming on arbitrary parallelism strategy and hardware configurations. Universal Checkpointing unlocks unprecedented capabilities for large-scale training such as improved resilience to hardware failures through continued training on remaining healthy hardware, and reduced training time through opportunistic exploitation of elastic capacity. The key insight of Universal Checkpointing is the selection of the optimal representation in each phase of the checkpointing life cycle: distributed representation for saving, and consolidated representation for loading. This is achieved using two key mechanisms. First, the universal checkpoint format, which consists of a consolidated representation of each model parameter and metadata for mapping parameter fragments into training ranks of arbitrary model-parallelism configuration. Second, the universal checkpoint language, a simple but powerful specification language for converting distributed checkpoints into the universal checkpoint format. Our evaluation demonstrates the effectiveness and generality of Universal Checkpointing on state-of-the-art model architectures and a wide range of parallelism techniques.
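
The save-distributed, load-consolidated idea described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names `consolidate` and `reshard` are hypothetical, parameters are flat Python lists, and fragments are split evenly by rank. It is not the paper's checkpoint format or specification language.

```python
from typing import Dict, List

Shard = Dict[str, List[float]]  # parameter name -> fragment held by one rank


def consolidate(shards: List[Shard]) -> Dict[str, List[float]]:
    """Rebuild each full parameter by concatenating its fragments in rank order."""
    full: Dict[str, List[float]] = {}
    for shard in shards:  # shards are assumed to be listed in rank order
        for name, fragment in shard.items():
            full.setdefault(name, []).extend(fragment)
    return full


def reshard(full: Dict[str, List[float]], world_size: int) -> List[Shard]:
    """Split every consolidated parameter into contiguous fragments, one per rank."""
    shards: List[Shard] = [{} for _ in range(world_size)]
    for name, values in full.items():
        chunk = -(-len(values) // world_size)  # ceiling division
        for rank in range(world_size):
            shards[rank][name] = values[rank * chunk:(rank + 1) * chunk]
    return shards


# Hypothetical run: saved on 4 ranks, resumed on 3 after a hardware failure.
saved = reshard({"w": [float(i) for i in range(8)]}, world_size=4)
full = consolidate(saved)
resumed = reshard(full, world_size=3)
```

In this sketch the consolidated form acts as the configuration-independent intermediate, so the number of ranks at resume time does not need to match the number used when saving.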

Submitted 27 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.04593 [pdf, other]

SynAsk: Unleashing the Power of Large Language Models in Organic Synthesis

Authors: Chonghuan Zhang, Qianghua Lin, Biwei Zhu, Haopeng Yang, Xiao Lian, Hao Deng, Jiajun Zheng, Kuangbiao Liao

Abstract: The field of natural language processing (NLP) has witnessed a transformative shift with the emergence of large language models (LLMs), revolutionizing various language tasks and applications, and the integration of LLMs into specialized domains enhances their capabilities for domain-specific applications. Notably, NLP has made significant strides in organic chemistry, particularly in predicting synthetic tasks, paving the way for the development of LLMs tailored to the organic chemistry field. In this work, we introduce SynAsk, a comprehensive organic chemistry domain-specific LLM platform developed by AIChemEco Inc. By finetuning an LLM with domain-specific data and integrating it with a chain of thought approach, SynAsk seamlessly accesses our knowledge base and advanced chemistry tools in a question-and-answer format. This includes functionalities such as a basic chemistry knowledge base, molecular information retrieval, reaction performance prediction, retrosynthesis prediction, chemical literature acquisition, and more. This novel methodology synergizes fine-tuning techniques with external resource integration, resulting in an organic chemistry-specific model poised to facilitate research and discovery in the field. Accessible via http://synask.aichemeco.com, SynAsk represents a significant advancement in leveraging NLP for synthetic applications.

Submitted 13 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.19623 [pdf, other]

A Novel Approach for Automated Design Information Mining from Issue Logs

Authors: Jiuang Zhao, Zitian Yang, Li Zhang, Xiaoli Lian, Donghao Yang

Abstract: Software architectures are usually meticulously designed to address multiple quality concerns and support long-term maintenance. However, due to the imbalance between the cost and value for developers to document design rationales (i.e., the design alternatives and the underlying arguments for making or rejecting decisions), these rationales are often obsolete or even missing. The lack of design knowledge has motivated a number of studies to extract design information from various platforms in recent years. Unfortunately, despite the wealth of discussion records related to design information provided by platforms like open-source communities, existing research often overlooks the underlying arguments behind alternatives due to challenges such as the intricate semantics of discussions and the lack of benchmarks for design rationale extraction. In this paper, we propose a novel method, named DRMiner, to automatically mine latent design rationales from developers' live discussions in open-source communities (i.e., issue logs in Jira). To better identify solutions and the arguments supporting them, DRMiner decomposes the problem into multiple text classification tasks and tackles them using prompt tuning of language models and customized text-related features. To evaluate DRMiner, we acquire issue logs from Cassandra, Flink, and Solr repositories in Jira, and then annotate and process them under a rigorous scheme, ultimately forming a dataset for design rationale mining. Experimental results show that DRMiner achieves an F1 score of 65% for mining design rationales, outperforming all baselines with a 7% improvement over GPT-4.0. Furthermore, we investigate the usefulness of the design rationales mined by DRMiner for automated program repair (APR) and find that the design rationales significantly enhance APR, achieving 14 times higher full-match repairs on average.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.01510 [pdf, other]

Reverse Influential Community Search Over Social Networks (Technical Report)

Authors: Qi Wen, Nan Zhang, Yutong Ye, Xiang Lian, Mingsong Chen

Abstract: As an important fundamental task of numerous real-world applications such as social network analysis and online advertising/marketing, several prior works studied influential community search, which retrieves a community with high structural cohesiveness and maximum influences on other users in social networks. However, previous works usually considered the influences of the community on arbitrary users in social networks, rather than specific groups (e.g., customer groups, or senior communities). Inspired by this, we propose a novel Reverse Influential Community Search (RICS) problem, which obtains a seed community with the maximum influence on a user-specified target community, satisfying both structural and keyword constraints. To efficiently tackle the RICS problem, we design effective pruning strategies to filter out false alarms of candidate seed communities, and propose an effective index mechanism to facilitate the community retrieval. We also formulate and tackle an RICS variant, named Relaxed Reverse Influential Community Search (R2ICS), which returns a subgraph with the relaxed structural constraints and having the maximum influence on a user-specified target community. Comprehensive experiments have been conducted to verify the efficiency and effectiveness of our RICS and R2ICS approaches on both real-world and synthetic social networks under various parameter settings.

Submitted 7 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

arXiv:2403.03835 [pdf, other]

Cobweb: An Incremental and Hierarchical Model of Human-Like Category Learning

Authors: Xin Lian, Sashank Varma, Christopher J. MacLellan

Abstract: Cobweb, a human-like category learning system, differs from most cognitive science models in incrementally constructing hierarchically organized tree-like structures guided by the category utility measure. Prior studies have shown that Cobweb can capture psychological effects such as basic-level, typicality, and fan effects. However, a broader evaluation of Cobweb as a model of human categorization remains lacking. The current study addresses this gap. It establishes Cobweb's alignment with classical human category learning effects. It also explores Cobweb's flexibility to exhibit both exemplar- and prototype-like learning within a single framework. These findings set the stage for further research on Cobweb as a robust model of human category learning.

Submitted 8 May, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

Comments: Accepted by CogSci-24

arXiv:2402.16933 [pdf, other]

Avoiding Catastrophic Forgetting in Visual Classification Using Human Concept Formation

Authors: Nicki Barari, Xin Lian, Christopher J. MacLellan

Abstract: Deep neural networks have excelled in machine learning, particularly in vision tasks; however, they often suffer from catastrophic forgetting when learning new tasks sequentially. In this work, we propose Cobweb4V, a novel visual classification approach that builds on Cobweb, a human-like learning system that is inspired by the way humans incrementally learn new concepts over time. In this research, we conduct a comprehensive evaluation, showcasing the proficiency of Cobweb4V in learning visual concepts, requiring less data to achieve effective learning outcomes compared to traditional methods, maintaining stable performance over time, and achieving commendable asymptotic behavior, without catastrophic forgetting effects. These characteristics align with learning strategies in human cognition, positioning Cobweb4V as a promising alternative to neural network approaches.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2311.14987 [pdf]

Reconstruction of a Long-term spatially Contiguous Solar-Induced Fluorescence (LCSIF) over 1982-2022

Authors: Jianing Fang, Xu Lian, Youngryel Ryu, Sungchan Jeong, Chongya Jiang, Pierre Gentine

Abstract: Satellite-observed solar-induced chlorophyll fluorescence (SIF) is a powerful proxy for diagnosing the photosynthetic characteristics of terrestrial ecosystems. Despite the increasing spatial and temporal resolutions of these satellite retrievals, records of SIF are primarily limited to the recent decade, impeding their application in detecting long-term dynamics of ecosystem function and structure. In this study, we leverage the two surface reflectance bands (red and near-infrared) available both from Advanced Very High-Resolution Radiometer (AVHRR, 1982-2022) and MODerate-resolution Imaging Spectroradiometer (MODIS, 2001-2022). Importantly, we calibrate and orbit-correct the AVHRR bands against their MODIS counterparts during their overlapping period. Using the long-term bias-corrected reflectance data, a neural network is then built to reproduce the Orbiting Carbon Observatory-2 SIF using AVHRR and MODIS, and used to map SIF globally over the entire 1982-2022 period. Compared with the previous MODIS-based CSIF product relying on four reflectance bands, our two-band-based product has similar skill but can be advantageously extended to the bias-corrected AVHRR period. Further comparison with three widely used vegetation indices (NDVI, kNDVI, NIRv; all based empirically on red and near-infrared bands) shows a higher or comparable correlation of LCSIF with satellite SIF and site-level GPP estimates across vegetation types, ensuring a greater capacity of LCSIF for representing terrestrial photosynthesis. Globally, LCSIF-AVHRR shows an accelerating upward trend since 1982, with an average rate of 0.0025 mW m-2 nm-1 sr-1 per decade during 1982-2000 and 0.0038 mW m-2 nm-1 sr-1 per decade during 2001-2022. Our LCSIF data provide opportunities to better understand the long-term dynamics of ecosystem photosynthesis and their underlying driving processes.

Submitted 19 June, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

arXiv:2311.13162 [pdf, other]

Top-L Most Influential Community Detection Over Social Networks (Technical Report)

Authors: Nan Zhang, Yutong Ye, Xiang Lian, Mingsong Chen

Abstract: In many real-world applications such as social network analysis and online marketing/advertising, community detection is a fundamental task to identify communities (subgraphs) in social networks with high structural cohesiveness. While previous works focus on detecting communities alone, they do not consider the collective influences of users in these communities on other user nodes in social networks. Inspired by this, in this paper, we investigate the influence propagation from some seed communities and their influential effects that result in the influenced communities. We propose a novel problem, named Top-L most Influential Community DEtection (TopL-ICDE) over social networks, which aims to retrieve top-L seed communities with the highest influences, having high structural cohesiveness, and containing user-specified query keywords. In order to efficiently tackle the TopL-ICDE problem, we design effective pruning strategies to filter out false alarms of seed communities and propose an effective index mechanism to facilitate efficient Top-L community retrieval. We develop an efficient TopL-ICDE answering algorithm by traversing the index and applying our proposed pruning strategies. We also formulate and tackle a variant of TopL-ICDE, named diversified top-L most influential community detection (DTopL-ICDE), which returns a set of L diversified communities with the highest diversity score (i.e., collaborative influences by L communities). We prove that DTopL-ICDE is NP-hard, and propose an efficient greedy algorithm with our designed diversity score pruning. Through extensive experiments, we verify the efficiency and effectiveness of our proposed TopL-ICDE and DTopL-ICDE approaches over real/synthetic social networks under various parameter settings.

Submitted 1 March, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

arXiv:2311.02914 [pdf, other]

Tight upper bound on the clique size in the square of 2-degenerate graphs

Authors: Seog-Jin Kim, Xiaopan Lian

Abstract: The square of a graph $G$, denoted $G^2$, has the same vertex set as $G$ and has an edge between two vertices if the distance between them in $G$ is at most $2$. In general, $Δ(G) + 1 \leq χ(G^2) \leq Δ(G)^2 +1$ for every graph $G$. Charpentier [1] asked whether $χ(G^2) \leq 2 Δ(G)$ if $mad(G) < 4$. But Hocquard, Kim, and Pierron [6] answered his question negatively. For every even value of $Δ(G)$, they constructed a 2-degenerate graph $G$ such that $ω(G^2) = \frac{5}{2} Δ(G)$. Note that if $G$ is a 2-degenerate graph, then $mad(G) < 4$. Thus, we have that \[ {\displaystyle \frac{5}{2} Δ(G) \leq \max \{χ(G^2) : G \mbox{ is a 2-degenerate graph} \} \leq 3 Δ(G) +1}. \] So, it was naturally asked whether there exists a constant $D_0$ such that $χ(G^2) \leq \frac{5}{2} Δ(G)$ if $G$ is a 2-degenerate graph with $Δ(G) \geq D_0$. Recently Cranston and Yu [3] showed that $ω(G^2) \leq \frac{5}{2} Δ(G)+72$ if $G$ is a 2-degenerate graph, and $ω(G^2) \leq \frac{5}{2} Δ(G)+60$ if $G$ is a 2-degenerate graph with $Δ(G) \geq 1729$. We show that there exists a constant $D_0$ such that $ω(G^2) \leq \frac{5}{2} Δ(G)$ if $G$ is a 2-degenerate graph with $Δ(G) \geq D_0$. This upper bound on $ω(G^2)$ is tight by the construction in [6].
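
The definition of $G^2$ in this abstract can be made concrete with a small sketch. It assumes an undirected graph stored as a dictionary of neighbor sets; the helper names `square` and `is_clique` are illustrative and do not come from the paper.

```python
from itertools import combinations
from typing import Dict, Hashable, Iterable, Set

Graph = Dict[Hashable, Set[Hashable]]  # vertex -> set of neighbors (symmetric)


def square(adj: Graph) -> Graph:
    """Return G^2: u and v are adjacent iff their distance in G is 1 or 2."""
    sq: Graph = {v: set(nbrs) for v, nbrs in adj.items()}
    for v, nbrs in adj.items():
        for u in nbrs:
            sq[v] |= adj[u]  # add the neighbors of each neighbor
        sq[v].discard(v)     # no self-loops
    return sq


def is_clique(adj: Graph, vertices: Iterable[Hashable]) -> bool:
    """Check that every pair of the given vertices is adjacent."""
    return all(b in adj[a] for a, b in combinations(list(vertices), 2))


# The star K_{1,3} has maximum degree 3; in its square all 4 vertices are
# pairwise adjacent, which is why Δ(G) + 1 is a lower bound on χ(G^2).
star: Graph = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
star_sq = square(star)

# In the path 0-1-2-3, vertices 0 and 3 are at distance 3, so they stay
# non-adjacent in the square.
path: Graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
path_sq = square(path)
```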

Submitted 6 November, 2023; originally announced November 2023.

Comments: 32 pages

arXiv:2310.09690 [pdf, other]

Configuration Validation with Large Language Models

Authors: Xinyu Lian, Yinfang Chen, Runxiang Cheng, Jie Huang, Parth Thakkar, Minjia Zhang, Tianyin Xu

Abstract: Misconfigurations are major causes of software failures. Existing practices rely on developer-written rules or test cases to validate configurations, which are expensive. Machine learning (ML) for configuration validation is considered a promising direction, but has been facing challenges such as the need for large-scale field data and system-specific models. Recent advances in Large Language Models (LLMs) show promise in addressing some of the long-lasting limitations of ML-based configuration validation. We present a first analysis on the feasibility and effectiveness of using LLMs for configuration validation. We empirically evaluate LLMs as configuration validators by developing a generic LLM-based configuration validation framework, named Ciri. Ciri employs effective prompt engineering with few-shot learning based on both valid configuration and misconfiguration data. Ciri checks outputs from LLMs when producing results, addressing hallucination and nondeterminism of LLMs. We evaluate Ciri's validation effectiveness on eight popular LLMs using configuration data of ten widely deployed open-source systems. Our analysis (1) confirms the potential of using LLMs for configuration validation, (2) explores design space of LLM-based validators like Ciri, and (3) reveals open challenges such as ineffectiveness in detecting certain types of misconfigurations and biases towards popular configuration parameters.

Submitted 2 April, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

arXiv:2309.15641 [pdf, other]

Efficient Exact Subgraph Matching via GNN-based Path Dominance Embedding (Technical Report)

Authors: Yutong Ye, Xiang Lian, Mingsong Chen

Abstract: The classic problem of exact subgraph matching returns those subgraphs in a large-scale data graph that are isomorphic to a given query graph, which has gained increasing importance in many real-world applications such as social network analysis, knowledge graph discovery in the Semantic Web, bibliographical network mining, and so on. In this paper, we propose a novel and effective graph neural network (GNN)-based path embedding framework (GNN-PE), which allows efficient exact subgraph matching without introducing false dismissals. Unlike traditional GNN-based graph embeddings that only produce approximate subgraph matching results, in this paper, we carefully devise GNN-based embeddings for paths, such that: if two paths (and 1-hop neighbors of vertices on them) have the subgraph relationship, their corresponding GNN-based embedding vectors will strictly follow the dominance relationship. With such a newly designed property of path dominance embeddings, we are able to propose effective pruning strategies based on path label/dominance embeddings and guarantee no false dismissals for subgraph matching. We build multidimensional indexes over path embedding vectors, and develop an efficient subgraph matching algorithm by traversing indexes over graph partitions in parallel and applying our pruning methods. We also propose a cost-model-based query plan that obtains query paths from the query graph with low query cost. Through extensive experiments, we confirm the efficiency and effectiveness of our proposed GNN-PE approach for exact subgraph matching on both real and synthetic graph data.
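
The dominance-based pruning mentioned in this abstract can be sketched as a simple filter. The direction of the dominance test (query embedding no larger than the data embedding in every dimension), the linear scan in place of the multidimensional indexes, and the names `dominated_by` and `candidate_paths` are all assumptions made for illustration, not details confirmed by the listing.

```python
from typing import Dict, List, Sequence


def dominated_by(q: Sequence[float], p: Sequence[float]) -> bool:
    """True if q[i] <= p[i] in every dimension (q is dominated by p)."""
    return len(q) == len(p) and all(a <= b for a, b in zip(q, p))


def candidate_paths(query_emb: Sequence[float],
                    data_embs: Dict[str, Sequence[float]]) -> List[str]:
    """Keep only data paths whose embedding dominates the query embedding.

    If the subgraph relationship always implies dominance, a data path that
    fails this test cannot match, so discarding it causes no false dismissal.
    Surviving paths are candidates only and still need exact verification.
    """
    return [pid for pid, emb in data_embs.items() if dominated_by(query_emb, emb)]


# Hypothetical 2-dimensional embeddings for three data paths.
data = {"p1": [0.5, 0.9], "p2": [0.2, 0.4], "p3": [0.3, 0.3]}
survivors = candidate_paths([0.3, 0.3], data)
```

Here `p2` is pruned because its first coordinate is smaller than the query's, while `p1` and `p3` remain as candidates.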

Submitted 15 January, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

arXiv:2307.05897 [pdf, ps, other]

On a variant of dichromatic number for digraphs with prescribed sets of arcs

Authors: O-joung Kwon, Xiaopan Lian

Abstract: In this paper, we consider a variant of dichromatic number on digraphs with prescribed sets of arcs. Let $D$ be a digraph and let $Z_1, Z_2$ be two sets of arcs in $D$. For a subdigraph $H$ of $D$, let $A(H)$ denote the set of all arcs of $H$. Let $μ(D, Z_1, Z_2)$ be the minimum number of parts in a vertex partition $\mathcal{P}$ of $D$ such that for every $X\in \mathcal{P}$, the subdigraph of $D$ induced by $X$ contains no directed cycle $C$ with $|A(C)\cap Z_1|\neq |A(C)\cap Z_2|$. For $Z_1=A(D)$ and $Z_2=\emptyset$, $μ(D, Z_1, Z_2)$ is equal to the dichromatic number of $D$. We prove that for every digraph $F$ and every tuple $(a_e,b_e,r_e, q_e)$ of integers with $q_e\ge 2$ and $\gcd(a_e,q_e)=\gcd(b_e,q_e)=1$ for each arc $e$ of $F$, there exists an integer $N$ such that if $μ(D, Z_1, Z_2)\ge N$, then $D$ contains a subdigraph isomorphic to a subdivision of $F$ in which each arc $e$ of $F$ is subdivided into a directed path $P_e$ such that $a_e|A(P_e)\cap Z_1|+b_e|A(P_e)\cap Z_2|\equiv {r_e}\pmod {q_e}$. This generalizes a theorem of Steiner [Subdivisions with congruence constraints in digraphs of large chromatic number, arXiv:2208.06358] which corresponds to the case when $(a_e, b_e, Z_1, Z_2)=(1, 1, A(D), \emptyset)$.

Submitted 12 July, 2023; originally announced July 2023.

Comments: 13 pages, 2 figures

arXiv:2305.05194 [pdf, other]

The square of every subcubic planar graph of girth at least 6 is 7-choosable

Authors: Seog-Jin Kim, Xiaopan Lian

Abstract: The square of a graph $G$, denoted $G^2$, has the same vertex set as $G$ and has an edge between two vertices if the distance between them in $G$ is at most $2$. Thomassen (2018) and Hartke, Jahanbekam and Thomas (2016) proved that $χ(G^2) \leq 7$ if $G$ is a subcubic planar graph. A natural question is whether $χ_{\ell}(G^2) \leq 7$ or not if $G$ is a subcubic planar graph. Cranston and Kim (2008) showed that $χ_{\ell}(G^2) \leq 7$ if $G$ is a subcubic planar graph of girth at least 7. We prove that $χ_{\ell}(G^2) \leq 7$ if $G$ is a subcubic planar graph of girth at least 6.

Submitted 9 May, 2023; originally announced May 2023.

Comments: 9 pages, 1 figure

arXiv:2305.01472 [pdf, other]

Group vertex-arboricity of group-labelled graphs

Authors: O-joung Kwon, Xiaopan Lian

Abstract: We introduce the vertex-arboricity of group-labelled graphs. For an abelian group $Γ$, a $Γ$-labelled graph is a graph whose edges are labelled by elements of $Γ$. For an abelian group $Γ$ and $A\subseteq Γ$, the $(Γ, A)$-vertex-arboricity of a $Γ$-labelled graph is the minimum integer $k$ such that its vertex set can be partitioned into $k$ parts where each part induces a subgraph having no cycle of value in $A$. We prove that for every positive integer $ω$, there is a function $f_ω:\mathbb{N}\times\mathbb{N}\to \mathbb{R}$ such that if $|Γ\setminus A|\le ω$, then every $Γ$-labelled graph with $(Γ, A)$-vertex-arboricity at least $f_ω(t,d)$ contains a subdivision of $K_t$ where all branching paths are of value in $A$ and of length at least $d$. This extends a well-known result that every graph of sufficiently large chromatic number contains a subdivision of $K_t$, in various directions.

Submitted 2 May, 2023; originally announced May 2023.

Comments: 15 pages, 1 figure

arXiv:2304.11171 [pdf, other]

Granular-ball computing: an efficient, robust, and interpretable adaptive multi-granularity representation and computation method

Authors: Shuyin Xia, Guoyin Wang, Xinbo Gao, Xiaoyu Lian

Abstract: Human cognition operates on a "Global-first" cognitive mechanism, prioritizing information processing based on coarse-grained details. This mechanism inherently possesses an adaptive multi-granularity description capacity, resulting in computational traits such as efficiency, robustness, and interpretability. Reliance on the finest granularity and on a single granularity makes most existing computational methods less efficient, robust, and interpretable, which is an important reason for the current lack of interpretability in neural networks. Multi-granularity granular-ball computing employs granular-balls of varying sizes to adaptively represent and envelop the sample space, facilitating learning based on these granular-balls. Given that the number of coarse-grained "granular-balls" is smaller than the number of sample points, granular-ball computing proves more efficient. Moreover, the inherent coarse-grained nature of granular-balls reduces susceptibility to fine-grained sample disturbances, enhancing robustness. The multi-granularity construct of granular-balls generates topological structures and coarse-grained descriptions, naturally augmenting interpretability. Granular-ball computing has successfully ventured into diverse AI domains, fostering the development of innovative theoretical methods, including granular-ball classifiers, clustering techniques, neural networks, rough sets, and evolutionary computing. This has notably improved the efficiency, noise robustness, and interpretability of traditional methods. Overall, granular-ball computing is a rare and innovative theoretical approach in AI that can adaptively and simultaneously enhance efficiency, robustness, and interpretability. This article delves into the main application landscapes for granular-ball computing, aiming to equip future researchers with references and insights to refine and expand this promising theory.

Submitted 18 January, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

arXiv:2301.12850 [pdf, other]

GE-Blender: Graph-Based Knowledge Enhancement for Blender

Authors: Xiaolei Lian, Xunzhu Tang, Yue Wang

Abstract: Despite the great success of open-domain dialogue generation, unseen entities can have a large impact on the dialogue generation task, leading to performance degradation of the model. Previous research used retrieved knowledge of seen entities as auxiliary data to enhance the representation of the model. Nevertheless, the logical explanation of unseen entities remains unexplored, such as their possible co-occurring or semantically similar words and their entity category. In this work, we propose an approach to address the challenge above. We construct a graph by extracting entity nodes, enhancing the representation of the context of the unseen entity with the entity's 1-hop surrounding nodes. Furthermore, we add a named entity tag prediction task to address the problem that the unseen entity does not exist in the graph. We conduct our experiments on the open dataset Wizard of Wikipedia, and the empirical results indicate that our approach outperforms the state-of-the-art approaches on Wizard of Wikipedia.

Submitted 30 January, 2023; originally announced January 2023.

arXiv:2212.12948 [pdf, other]

Human Health Indicator Prediction from Gait Video

Authors: Ziqing Li, Xuexin Yu, Xiaocong Lian, Yifeng Wang, Xiangyang Ji

Abstract: Body Mass Index (BMI), age, height and weight are important indicators of human health conditions, which can provide useful information for plenty of practical purposes, such as health care, monitoring and re-identification. Most existing methods of health indicator prediction mainly use front-view body or face images. These inputs are hard to be obtained in daily life and often lead to the lack of robustness for the models, considering their strict requirements on view and pose. In this paper, we propose to employ gait videos to predict health indicators, which are more prevalent in surveillance and home monitoring scenarios. However, the study of health indicator prediction from gait videos using deep learning was hindered due to the small amount of open-sourced data. To address this issue, we analyse the similarity and relationship between pose estimation and health indicator prediction tasks, and then propose a paradigm enabling deep learning for small health indicator datasets by pre-training on the pose estimation task. Furthermore, to better suit the health indicator prediction task, we bring forward Global-Local Aware aNd Centrosymmetric Encoder (GLANCE) module. It first extracts local and global features by progressive convolutions and then fuses multi-level features by a centrosymmetric double-path hourglass structure in two different ways. Experiments demonstrate that the proposed paradigm achieves state-of-the-art results for predicting health indicators on MoVi, and that the GLANCE module is also beneficial for pose estimation on 3DPW.

Submitted 25 December, 2022; originally announced December 2022.

arXiv:2211.16716 [pdf, other]

Automated Generating Natural Language Requirements based on Domain Ontology

Authors: Ziyan Zhao, Li Zhang, Xiaoyun Gao, Xiaoli Lian, Heyang Lv, Lin Shi

Abstract: Software requirements specification is undoubtedly critical for the whole software life-cycle. Nowadays, writing software requirements specifications primarily depends on human work. Although massive studies have been proposed to fasten the process via proposing advanced elicitation and analysis techniques, it is still a time-consuming and error-prone task that needs to take domain knowledge and business information into consideration. In this paper, we propose an approach, named ReqGen, which can provide recommendations by automatically generating natural language requirements specifications based on certain given keywords. Specifically, ReqGen consists of three critical steps. First, keywords-oriented knowledge is selected from domain ontology and is injected to the basic Unified pre-trained Language Model (UniLM) for domain fine-tuning. Second, a copy mechanism is integrated to ensure the occurrence of keywords in the generated statements. Finally, a requirement syntax constrained decoding is designed to close the semantic and syntax distance between the candidate and reference specifications. Experiments on two public datasets from different groups and domains show that ReqGen outperforms six popular natural language generation approaches with respect to the hard constraint of keywords(phrases) inclusion, BLEU, ROUGE and syntax compliance. We believe that ReqGen can promote the efficiency and intelligence of specifying software requirements.

Submitted 29 November, 2022; originally announced November 2022.

arXiv:2210.17479 [pdf, other]

kt-Safety: Graph Release via k-Anonymity and t-Closeness (Technical Report)

Authors: Weilong Ren, Kambiz Ghazinour, Xiang Lian

Abstract: In a wide spectrum of real-world applications, it is very important to analyze and mine graph data such as social networks, communication networks, citation networks, and so on. However, the release of such graph data often raises privacy issue, and the graph privacy preservation has recently drawn much attention from the database community. While prior works on graph privacy preservation mainly focused on protecting the privacy of either the graph structure only or vertex attributes only, in this paper, we propose a novel mechanism for graph privacy preservation by considering attacks from both graph structures and vertex attributes, which transforms the original graph to a so-called kt-safe graph, via k-anonymity and t-closeness. We prove that the generation of a kt-safe graph is NP-hard, therefore, we propose a feasible framework for effectively and efficiently anonymizing a graph with low anonymization cost. In particular, we design a cost-model-based graph partitioning approach to enable our proposed divide-and-conquer strategy for the graph anonymization, and propose effective optimization techniques such as pruning method and a tree synopsis to improve the anonymization efficiency over large-scale graphs. Extensive experiments have been conducted to verify the efficiency and effectiveness of our proposed kt-safe graph generation approach on both real and synthetic data sets.

Submitted 31 October, 2022; originally announced October 2022.

Comments: 22 pages, 31 figures, the technical report of a TKDE paper entitled "kt-Safety: Graph Release via k-Anonymity and t-Closeness"

arXiv:2210.11675 [pdf, other]

Granular-Ball Fuzzy Set and Its Implementation in SVM

Authors: Shuyin Xia, Xiaoyu Lian, Guoyin Wang, Xinbo Gao, Yabin Shao

Abstract: Most existing fuzzy set methods use points as their input, which is the finest granularity from the perspective of granular computing. Consequently, these methods are neither efficient nor robust to label noise. Therefore, we propose a frame-work called granular-ball fuzzy set by introducing granular-ball computing into fuzzy set. The computational framework is based on the granular-balls input rather than points; therefore, it is more efficient and robust than traditional fuzzy methods, and can be used in various fields of fuzzy data processing according to its extensibility. Furthermore, the framework is extended to the classifier fuzzy support vector machine (FSVM), to derive the granular ball fuzzy SVM (GBFSVM). The experimental results demonstrate the effectiveness and efficiency of GBFSVM.

Submitted 26 November, 2022; v1 submitted 20 October, 2022; originally announced October 2022.

arXiv:2210.06247 [pdf, ps, other]

Some Mader-perfect graph classes

Authors: Hui Lei, Siyan Li, Xiaopan Lian, Susu Wang

Abstract: The dichromatic number of $D$, denoted by $\overrightarrow{\chi}(D)$, is the smallest integer $k$ such that $D$ admits an acyclic $k$-coloring. We use ${\rm mader}_{\overrightarrow{\chi}}(F)$ to denote the smallest integer $k$ such that if $\overrightarrow{\chi}(D)\ge k$, then $D$ contains a subdivision of $F$. A digraph $F$ is called Mader-perfect if for every subdigraph $F'$ of $F$, ${\rm mader}_{\overrightarrow{\chi}}(F')=|V(F')|$. We extend octi digraphs to a larger class of digraphs and prove that it is Mader-perfect, which generalizes a result of Gishboliner, Steiner and Szabó [Dichromatic number and forced subdivisions, {\it J. Comb. Theory, Ser. B} {\bf 153} (2022) 1--30]. We also show that if $K$ is a proper subdigraph of $\overleftrightarrow{C_4}$ except for the digraph obtained from $\overleftrightarrow{C_4}$ by deleting an arbitrary arc, then $K$ is Mader-perfect.

Submitted 12 October, 2022; originally announced October 2022.

Comments: 12 pages, 2 figures

arXiv:2210.03120 [pdf, other]

GBSVM: Granular-ball Support Vector Machine

Authors: Shuyin Xia, Xiaoyu Lian, Guoyin Wang, Xinbo Gao, Jiancu Chen, Xiaoli Peng

Abstract: GBSVM (Granular-ball Support Vector Machine) is a significant attempt to construct a classifier using the coarse-to-fine granularity of a granular-ball as input, rather than a single data point. It is the first classifier whose input contains no points. However, the existing model has some errors, and its dual model has not been derived. As a result, the current algorithm cannot be implemented or applied. To address these problems, this paper has fixed the errors of the original model of the existing GBSVM, and derived its dual model. Furthermore, a particle swarm optimization algorithm is designed to solve the dual model. The sequential minimal optimization algorithm is also carefully designed to solve the dual model. The solution is faster and more stable than the particle swarm optimization based version. The experimental results on the UCI benchmark datasets demonstrate that GBSVM has good robustness and efficiency. All codes have been released in the open source library at http://www.cquptshuyinxia.com/GBSVM.html or https://github.com/syxiaa/GBSVM.

Submitted 11 February, 2024; v1 submitted 6 October, 2022; originally announced October 2022.

arXiv:2209.09532 [pdf, other]

doi 10.1109/ICPR56361.2022.9956358

Boosting the Discriminant Power of Naive Bayes

Authors: Shihe Wang, Jianfeng Ren, Xiaoyu Lian, Ruibin Bai, Xudong Jiang

Abstract: Naive Bayes has been widely used in many applications because of its simplicity and ability in handling both numerical data and categorical data. However, lack of modeling of correlations between features limits its performance. In addition, noise and outliers in the real-world dataset also greatly degrade the classification performance. In this paper, we propose a feature augmentation method employing a stack auto-encoder to reduce the noise in the data and boost the discriminant power of naive Bayes. The proposed stack auto-encoder consists of two auto-encoders for different purposes. The first encoder shrinks the initial features to derive a compact feature representation in order to remove the noise and redundant information. The second encoder boosts the discriminant power of the features by expanding them into a higher-dimensional space so that different classes of samples could be better separated in the higher-dimensional space. By integrating the proposed feature augmentation method with the regularized naive Bayes, the discrimination power of the model is greatly enhanced. The proposed method is evaluated on a set of machine-learning benchmark datasets. The experimental results show that the proposed method significantly and consistently outperforms the state-of-the-art naive Bayes classifiers.

Submitted 20 September, 2022; originally announced September 2022.

Comments: Accepted by 2022 International Conference on Pattern Recognition

arXiv:2208.12986 [pdf, other]

6D Robotic Assembly Based on RGB-only Object Pose Estimation

Authors: Bowen Fu, Sek Kun Leong, Xiaocong Lian, Xiangyang Ji

Abstract: Vision-based robotic assembly is a crucial yet challenging task as the interaction with multiple objects requires high levels of precision. In this paper, we propose an integrated 6D robotic system to perceive, grasp, manipulate and assemble blocks with tight tolerances. Aiming to provide an off-the-shelf RGB-only solution, our system is built upon a monocular 6D object pose estimation network trained solely with synthetic images leveraging physically-based rendering. Subsequently, pose-guided 6D transformation along with collision-free assembly is proposed to construct any designed structure with arbitrary initial poses. Our novel 3-axis calibration operation further enhances the precision and robustness by disentangling 6D pose estimation and robotic assembly. Both quantitative and qualitative results demonstrate the effectiveness of our proposed 6D robotic assembly system.

Submitted 27 August, 2022; originally announced August 2022.

Comments: Accepted by IROS 2022

arXiv:2208.06757 [pdf, other]

A Preliminary Study on the Potential Usefulness of Open Domain Model for Missing Software Requirements Recommendation

Authors: Ziyan Zhao, Li Zhang, Xiaoli Lian

Abstract: Completeness is one of the most important attributes of software requirement specifications. Unfortunately, incompleteness is meanwhile one of the most difficult problems to detect. Some approaches have been proposed to detect missing requirements based on the requirement-oriented domain model. However, this kind of models are lacking for lots of domains. Fortunately, the domain models constructed for different purposes can usually be found online. This raises a question: whether or not these domain models are helpful in finding the missing functional information in requirement specification? To explore this question, we design and conduct a preliminary study by computing the overlapping rate between the entities in domain models and the concepts of natural language software requirements and then digging into four regularities of the occurrence of these entities(concepts) based on two example domains. The usefulness of these regularities, especially the one based on our proposed metric AHME (with F2 gains of 146% and 223% on the two domains than without any regularity), has been shown in experiments.

Submitted 13 August, 2022; originally announced August 2022.

arXiv:2206.05778 [pdf, other]

Learning-Based Data Storage [Vision] (Technical Report)

Authors: Xiang Lian, Xiaofei Zhang

Abstract: Deep neural network (DNN) and its variants have been extensively used for a wide spectrum of real applications such as image classification, face/speech recognition, fraud detection, and so on. In addition to many important machine learning tasks, as artificial networks emulating the way brain cells function, DNNs also show the capability of storing non-linear relationships between input and output data, which exhibits the potential of storing data via DNNs. We envision a new paradigm of data storage, "DNN-as-a-Database", where data are encoded in well-trained machine learning models. Compared with conventional data storage that directly records data in raw formats, learning-based structures (e.g., DNN) can implicitly encode data pairs of inputs and outputs and compute/materialize actual output data of different resolutions only if input data are provided. This new paradigm can greatly enhance the data security by allowing flexible data privacy settings on different levels, achieve low space consumption and fast computation with the acceleration of new hardware (e.g., Diffractive Neural Network and AI chips), and can be generalized to distributed DNN-based storage/computing. In this paper, we propose this novel concept of learning-based data storage, which utilizes a learning structure called learning-based memory unit (LMU), to store, organize, and retrieve data. As a case study, we use DNNs as the engine in the LMU, and study the data capacity and accuracy of the DNN-based data storage. Our preliminary experimental results show the feasibility of the learning-based data storage by achieving high (100%) accuracy of the DNN storage. We explore and design effective solutions to utilize the DNN-based data storage to manage and query relational tables. We discuss how to generalize our solutions to other data types (e.g., graphs) and environments such as distributed DNN storage/computing.

Submitted 22 January, 2023; v1 submitted 12 June, 2022; originally announced June 2022.

Comments: 14 pages, 16 figures

ACM Class: E.2; H.2.1; I.2.0; I.2.11

arXiv:2206.02281 [pdf, other]

E^2VTS: Energy-Efficient Video Text Spotting from Unmanned Aerial Vehicles

Authors: Zhenyu Hu, Zhenyu Wu, Pengcheng Pi, Yunhe Xue, Jiayi Shen, Jianchao Tan, Xiangru Lian, Zhangyang Wang, Ji Liu

Abstract: Unmanned Aerial Vehicles (UAVs) based video text spotting has been extensively used in civil and military domains. UAV's limited battery capacity motivates us to develop an energy-efficient video text spotting solution. In this paper, we first revisit RCNN's crop & resize training strategy and empirically find that it outperforms aligned RoI sampling on a real-world video text dataset captured by UAV. To reduce energy consumption, we further propose a multi-stage image processor that takes videos' redundancy, continuity, and mixed degradation into account. Lastly, the model is pruned and quantized before deployed on Raspberry Pi. Our proposed energy-efficient video text spotting solution, dubbed as E^2VTS, outperforms all previous methods by achieving a competitive tradeoff between energy efficiency and performance. All our codes and pre-trained models are available at https://github.com/wuzhenyusjtu/LPCVC20-VideoTextSpotting.

Submitted 5 June, 2022; originally announced June 2022.

arXiv:2206.02114 [pdf, other]

Speech Detection Task Against Asian Hate: BERT the Central, While Data-Centric Studies the Crucial

Authors: Xin Lian

Abstract: With the COVID-19 pandemic continuing, hatred against Asians is intensifying in countries outside Asia, especially among the Chinese. There is an urgent need to detect and prevent hate speech towards Asians effectively. In this work, we first create COVID-HATE-2022, an annotated dataset including 2,025 annotated tweets fetched in early February 2022, which are labeled based on specific criteria, and we present the comprehensive collection of scenarios of hate and non-hate tweets in the dataset. Second, we fine-tune the BERT model based on the relevant datasets and demonstrate several strategies related to the "cleaning" of the tweets. Third, we investigate the performance of advanced fine-tuning strategies with various model-centric and data-centric approaches, and we show that both strategies generally improve the performance, while data-centric ones outperform the others, and it demonstrates the feasibility and effectiveness of the data-centric approaches in the associated tasks.

Submitted 21 August, 2022; v1 submitted 5 June, 2022; originally announced June 2022.

arXiv:2205.11717 [pdf]

Realization of ultra-broadband IR up-conversion imaging

Authors: X. H. Li, P. Bai, S. H. Huang, X. Q. Bai, W. J. Song, X. R. Lian, C. Hu, Z. W. Shi, W. Z. Shen, Y. H. Zhang, Z. L. Fu, D. X. Shao, Z. Y. Tan, J. C. Cao, C. Tan, G. Y. Xu

Abstract: Ultra-broadband imaging devices with high performance are in great demand for a variety of technological applications, including imaging, remote sensing, and communications. An ultra-broadband up-converter is realized based on a p-GaAs homojunction interfacial workfunction internal photoemission (HIWIP) detector-light emitting diode (LED) device. The device demonstrates an ultra-broad response ranging from visible to terahertz (THz) with good reproducibility. The peak responsivity in the mid-infrared (MIR) region is 140 mA/W at 10.5 microns. The HIWIP-LED shows enormous potential for ultra-broadband up-conversion covering all infrared atmospheric windows, as well as the THz region, and the pixel-less imaging of the MIR spot from the CO2 laser is further demonstrated. In addition, the proposed up-converter also performs as a near-infrared and visible detector under zero bias by using a bi-functional LED. Thanks to its ultra-wide response, the HIWIP-LED up-converter has great promise for stable, high-performance ultra-broadband pixel-less imaging and multi-functional analysis systems.

Submitted 23 May, 2022; originally announced May 2022.

Comments: 23 pages, 5 figures

arXiv:2204.13224 [pdf, other]

Top-k Community Similarity Search Over Large-Scale Road Networks (Technical Report)

Authors: Niranjan Rai, Xiang Lian

Abstract: With the urbanization and development of infrastructure, the community search over road networks has become increasingly important in many real applications such as urban/city planning, social study on local communities, and community recommendations by real estate agencies. In this paper, we propose a novel problem, namely top-k community similarity search (Top-kCS2) over road networks, which efficiently and effectively obtains k spatial communities that are the most similar to a given query community in road-network graphs. In order to efficiently and effectively tackle the Top-kCS2 problem, in this paper, we will design an effective similarity measure between spatial communities, and propose a framework for retrieving Top-kCS2 query answers, which integrates offline pre-processing and online computation phases. Moreover, we also consider a variant, namely continuous top-k community similarity search (CTop-kCS2), where the query community continuously moves along a query line segment. We develop an efficient algorithm to split query line segments into intervals, incrementally obtain similar candidate communities for each interval and define actual CTop-kCS2 query answers. Extensive experiments have been conducted on real and synthetic data sets to confirm the efficiency and effectiveness of our proposed Top-kCS2 and CTop-kCS2 approaches under various parameter setting

Submitted 27 April, 2022; originally announced April 2022.

arXiv:2204.11540 [pdf, other]

doi 10.1007/s11042-022-12800-8

Research Status of Deep Learning Methods for Rumor Detection

Authors: Li Tan, Ge Wang, Feiyang Jia, Xiaofeng Lian

Abstract: To manage the rumors in social media to reduce the harm of rumors in society. Many studies used methods of deep learning to detect rumors in open networks. To comprehensively sort out the research status of rumor detection from multiple perspectives, this paper analyzes the highly focused work from three perspectives: Feature Selection, Model Structure, and Research Methods. From the perspective of feature selection, we divide methods into content feature, social feature, and propagation structure feature of the rumors. Then, this work divides deep learning models of rumor detection into CNN, RNN, GNN, Transformer based on the model structure, which is convenient for comparison. Besides, this work summarizes 30 works into 7 rumor detection methods such as propagation trees, adversarial learning, cross-domain methods, multi-task learning, unsupervised and semi-supervised methods, based knowledge graph, and other methods for the first time. And compare the advantages of different methods to detect rumors. In addition, this review enumerate datasets available and discusses the potential issues and future work to help researchers advance the development of field.

Submitted 25 April, 2022; originally announced April 2022.

Comments: Accepted by MTAP

arXiv:2204.05538 [pdf, other]

NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night

Authors: Xueqing Deng, Peng Wang, Xiaochen Lian, Shawn Newsam

Abstract: The semantic segmentation of nighttime scenes is a challenging problem that is key to impactful applications like self-driving cars. Yet, it has received little attention compared to its daytime counterpart. In this paper, we propose NightLab, a novel nighttime segmentation framework that leverages multiple deep learning models imbued with night-aware features to yield State-of-The-Art (SoTA) performance on multiple night segmentation benchmarks. Notably, NightLab contains models at two levels of granularity, i.e. image and regional, and each level is composed of light adaptation and segmentation modules. Given a nighttime image, the image level model provides an initial segmentation estimate while, in parallel, a hardness detection module identifies regions and their surrounding context that need further analysis. A regional level model focuses on these difficult regions to provide a significantly improved segmentation. All the models in NightLab are trained end-to-end using a set of proposed night-aware losses without handcrafted heuristics. Extensive experiments on the NightCity and BDD100K datasets show NightLab achieves SoTA performance compared to concurrent methods.

Submitted 12 April, 2022; originally announced April 2022.

Comments: 8pages, 6 figures, accept at CVPR 2022

arXiv:2202.08427 [pdf, ps, other]

Weak-odd chromatic index of special digraph classes

Authors: Ruijuan Gu, Hui Lei, Xiaopan Lian, Zhenyu Taoqiu

Abstract: Give a digraph $D=(V(D),A(D))$, let $\partial^+_D(v)=\{vw|w\in N^+_D(v)\}$ and $\partial^-_D(v)=\{uv|u\in N^-_D(v)\}$ be semi-cuts of $v$. A mapping $\varphi:A(D)\rightarrow [k]$ is called a weak-odd $k$-edge coloring of $D$ if it satisfies the condition: for each $v\in V(D)$, there is at least one color with an odd number of occurrences on each non-empty semi-cut of $v$. We call the minimum integer $k$ the weak-odd chromatic index of $D$. When limit to 2 colors, use $def(D)$ to denote the defect of $D$, the minimum number of vertices in $D$ at which the above condition is not satisfied. In this paper, we give a descriptive characterization about the weak-odd chromatic index and the defect of semicomplete digraphs and extended tournaments, which generalize results of tournaments to broader classes. And we initiated the study of weak-odd edge covering on digraphs.

Submitted 16 February, 2022; originally announced February 2022.

Comments: 14 pages, 1 figures

arXiv:2201.08616 [pdf, other]

Diffusion Multi-unit Auctions with Diminishing Marginal Utility Buyers

Authors: Haolin Liu, Xinyuan Lian, Dengji Zhao

Abstract: We consider an auction design problem where a seller sells multiple homogeneous items to a set of connected buyers. Each buyer only knows the buyers she directly connects with and has a diminishing marginal utility valuation for the items. The seller initially only connects to some buyers who can be directly invited to the sale by the seller. Our goal is to design an auction to incentivize the buyers who are aware of the auction to further invite their neighbors to join the auction. This is challenging because the buyers are competing for the items and they would not invite each other by default. Thus, rewards need to be given to buyers who diffuse information, but the rewards should be carefully designed to guarantee both invitation incentives and the seller's revenue. Solutions have been proposed recently for the settings where each buyer requires at most one unit and demonstrated the difficulties of the design. We move this forward to propose the very first diffusion auction for the multi-unit demand settings to improve both the social welfare and the seller's revenue.

Submitted 26 February, 2023; v1 submitted 21 January, 2022; originally announced January 2022.

arXiv:2112.13387 [pdf, ps, other]

On critical graphs for the chromatic edge-stability number

Authors: Hui Lei, Xiaopan Lian, Xianhao Meng, Yongtang Shi, Yiqiao Wang

Abstract: The {\em chromatic edge-stability number} $es_χ(G)$ of a graph $G$ is the minimum number of edges whose removal results in a spanning subgraph with the chromatic number smaller than that of $G$. A graph $G$ is called {\em $(3,2)$-critical} if $χ(G)=3$, $es_χ(G)=2$ and for any edge $e\in E(G)$, $es_χ(G-e)<es_χ(G)$. In this paper, we characterize $(3,2)$-critical graphs which contain at least five odd cycles. This answers a question proposed by Brešar, Klavžar and Movarraei in [Critical graphs for the chromatic edge-stability number, {\it Discrete Math.} {\bf 343}(2020) 111845].

Submitted 26 December, 2021; originally announced December 2021.

Comments: 12 pages, 2 figures

MSC Class: 05C15

arXiv:2111.05897 [pdf, other]

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters

Authors: Xiangru Lian, Binhang Yuan, Xuefeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, et al. (2 additional authors not shown)

Abstract: Deep learning based models have dominated the current landscape of production recommender systems. Furthermore, recent years have witnessed an exponential growth of the model scale--from Google's 2016 model with 1 billion parameters to the latest Facebook's model with 12 trillion parameters. Significant quality boost has come with each jump of the model capacity, which makes us believe the era of 100 trillion parameters is around the corner. However, the training of such models is challenging even within industrial scale data centers. This difficulty is inherited from the staggering heterogeneity of the training computation--the model's embedding layer could include more than 99.99% of the total model size, which is extremely memory-intensive; while the rest of the neural network is increasingly computation-intensive. To support the training of such huge models, an efficient distributed training system is urgently needed. In this paper, we resolve this challenge by careful co-design of both the optimization algorithm and the distributed system architecture. Specifically, in order to ensure both the training efficiency and the training accuracy, we design a novel hybrid training algorithm, where the embedding layer and the dense neural network are handled by different synchronization mechanisms; then we build a system called Persia (short for parallel recommendation training system with hybrid acceleration) to support this hybrid training algorithm. Both theoretical demonstration and empirical study up to 100 trillion parameters have been conducted to justify the system design and implementation of Persia. We make Persia publicly available (at https://github.com/PersiaML/Persia) so that anyone would be able to easily train a recommender model at the scale of 100 trillion parameters.

Submitted 23 November, 2021; v1 submitted 10 November, 2021; originally announced November 2021.

arXiv:2110.07218 [pdf, other]

Deep-3D Microscope: 3D volumetric microscopy of thick scattering samples using a wide-field microscope and machine learning

Authors: Bowen Li, Shiyu Tan, Jiuyang Dong, Xiaocong Lian, Yongbing Zhang, Xiangyang Ji, Ashok Veeraraghavan

Abstract: Confocal microscopy is the standard approach for obtaining volumetric images of a sample with high axial and lateral resolution, especially when dealing with scattering samples. Unfortunately, a confocal microscope is quite expensive compared to traditional microscopes. In addition, the point scanning in a confocal leads to slow imaging speed and photobleaching due to the high dose of laser energy. In this paper, we demonstrate how the advances in machine learning can be exploited to "teach" a traditional wide-field microscope, one that's available in every lab, into producing 3D volumetric images like a confocal. The key idea is to obtain multiple images with different focus settings using a wide-field microscope and use a 3D Generative Adversarial Network (GAN) based neural network to learn the mapping between the blurry low-contrast image stack obtained using wide-field and the sharp, high-contrast images obtained using a confocal. After training the network with widefield-confocal image pairs, the network can reliably and accurately reconstruct 3D volumetric images that rival confocal in terms of its lateral resolution, z-sectioning and image contrast. Our experimental results demonstrate generalization ability to handle unseen data, stability in the reconstruction results, high spatial resolution even when imaging thick ($\sim40$ microns) highly-scattering samples. We believe that such learning-based-microscopes have the potential to bring confocal quality imaging to every lab that has a wide-field microscope.

Submitted 14 October, 2021; originally announced October 2021.

arXiv:2110.05827 [pdf, ps, other]

A characterization of 4-$χ_S$-vertex-critical graphs for packing sequences with $s_1 =1$ and $s_2\ge 3$

Authors: Sandi Klavžar, Hui Lei, Xiaopan Lian, Yongtang Shi

Abstract: If $S=(s_1,s_2,\ldots)$ is a non-decreasing sequence of positive integers, then the $S$-packing $k$-coloring of a graph $G$ is a mapping $c: V(G)\rightarrow[k]$ such that if $c(u)=c(v)=i$ for $u\neq v\in V(G)$, then $d_G(u,v)>s_i$. The $S$-packing chromatic number of $G$ is the smallest integer $k$ such that $G$ admits an $S$-packing $k$-coloring. A graph $G$ is $χ_S$-vertex-critical if $χ_S(G-u) < χ_S(G)$ for each $u\in V(G)$. If $G$ is $χ_S$-vertex-critical and $χ_S(G) = k$, then $G$ is $k$-$χ_S$-vertex-critical. In this paper, $4$-$χ_S$-vertex-critical graphs are characterized for sequences $S = (1,s_2, s_3, \ldots)$ with $s_2 \ge 3$. There are $28$ sporadic examples and two infinite families of such graphs.

Submitted 13 May, 2023; v1 submitted 12 October, 2021; originally announced October 2021.

Comments: 19 pages, 2 figures

arXiv:2110.02461 [pdf]

doi 10.1021/acsnano.1c09592

Epitaxial Growth of Ultraflat Bismuthene with Large Topological Band Inversion Enabled by Substrate-Orbital-Filtering Effect

Authors: Shuo Sun, Jing-Yang You, Sisheng Duan, Jian Gou, Yongzheng Luo, Weinan Lin, Xu Lian, Tengyu Jin, Jiawei Liu, Yuli Huang, Yihe Wang, Andrew T. S. Wee, Yuan Ping Feng, Lei Shen, Jia Lin Zhang, Jingsheng Chen, Wei Chen

Abstract: Quantum spin Hall (QSH) systems hold promises of low-power-consuming spintronic devices, yet their practical applications are extremely impeded by the small energy gaps. Fabricating QSH materials with large gaps, especially under the guidance of design principles, is essential for both scientific research and practical applications. Here, we demonstrate that large on-site atomic spin-orbit coupling can be directly exploited via the intriguing substrate-orbital-filtering effect to generate large-gap QSH systems and experimentally realized on the epitaxially synthesized ultraflat bismuthene on Ag(111). Theoretical calculations reveal that the underlying substrate selectively filters Bi pz orbitals away from the Fermi level, leading pxy orbitals with nonzero magnetic quantum numbers, resulting in large topological gap of ~1 eV at the K point. The corresponding topological edge states are identified through scanning tunneling spectroscopy combined with density functional theory calculations. Our findings provide general strategies to design large-gap QSH systems and further explore their topology-related physics.

Submitted 18 December, 2021; v1 submitted 5 October, 2021; originally announced October 2021.

arXiv:2107.01499 [pdf, other]

BAGUA: Scaling up Distributed Learning with System Relaxations

Authors: Shaoduo Gan, Xiangru Lian, Rui Wang, Jianbin Chang, Chengjun Liu, Hongmei Shi, Shengzhuo Zhang, Xianghong Li, Tengxu Sun, Jiawei Jiang, Binhang Yuan, Sen Yang, Ji Liu, Ce Zhang

Abstract: Recent years have witnessed a growing list of systems for distributed data-parallel training. Existing systems largely fit into two paradigms, i.e., parameter server and MPI-style collective operations. On the algorithmic side, researchers have proposed a wide range of techniques to lower the communication via system relaxations: quantization, decentralization, and communication delay. However, most, if not all, existing systems only rely on standard synchronous and asynchronous stochastic gradient (SG) based optimization, therefore, cannot take advantage of all possible optimizations that the machine learning community has been developing recently. Given this emerging gap between the current landscapes of systems and theory, we build BAGUA, a MPI-style communication library, providing a collection of primitives, that is both flexible and modular to support state-of-the-art system relaxation techniques of distributed training. Powered by this design, BAGUA has a great ability to implement and extend various state-of-the-art distributed learning algorithms. In a production cluster with up to 16 machines (128 GPUs), BAGUA can outperform PyTorch-DDP, Horovod and BytePS in the end-to-end training time by a significant margin (up to 2 times) across a diverse range of tasks. Moreover, we conduct a rigorous tradeoff exploration showing that different algorithms and system relaxations achieve the best performance over different network conditions.

Submitted 25 November, 2021; v1 submitted 3 July, 2021; originally announced July 2021.

arXiv:2106.06560 [pdf, other]

HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers

Authors: Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo

Abstract: High-resolution representations (HR) are essential for dense prediction tasks such as segmentation, detection, and pose estimation. Learning HR representations is typically ignored in previous Neural Architecture Search (NAS) methods that focus on image classification. This work proposes a novel NAS method, called HR-NAS, which is able to find efficient and accurate networks for different tasks, by effectively encoding multiscale contextual information while maintaining high-resolution representations. In HR-NAS, we renovate the NAS search space as well as its searching strategy. To better encode multiscale image contexts in the search space of HR-NAS, we first carefully design a lightweight transformer, whose computational complexity can be dynamically changed with respect to different objective functions and computation budgets. To maintain high-resolution representations of the learned networks, HR-NAS adopts a multi-branch architecture that provides convolutional encoding of multiple feature resolutions, inspired by HRNet. Last, we proposed an efficient fine-grained search strategy to train HR-NAS, which effectively explores the search space, and finds optimal architectures given various tasks and computation resources. HR-NAS is capable of achieving state-of-the-art trade-offs between performance and FLOPs for three dense prediction tasks and an image classification task, given only small computational budgets. For example, HR-NAS surpasses SqueezeNAS that is specially designed for semantic segmentation while improving efficiency by 45.9%. Code is available at https://github.com/dingmyu/HR-NAS

Submitted 11 June, 2021; originally announced June 2021.

Comments: Accepted by CVPR 2021 (Oral)

arXiv:2106.06135 [pdf, other]

DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning

Authors: Daochen Zha, Jingru Xie, Wenye Ma, Sheng Zhang, Xiangru Lian, Xia Hu, Ji Liu

Abstract: Games are abstractions of the real world, where artificial agents learn to compete and cooperate with other agents. While significant achievements have been made in various perfect- and imperfect-information games, DouDizhu (a.k.a. Fighting the Landlord), a three-player card game, is still unsolved. DouDizhu is a very challenging domain with competition, collaboration, imperfect information, large state space, and particularly a massive set of possible actions where the legal actions vary significantly from turn to turn. Unfortunately, modern reinforcement learning algorithms mainly focus on simple and small action spaces, and not surprisingly, are shown not to make satisfactory progress in DouDizhu. In this work, we propose a conceptually simple yet effective DouDizhu AI system, namely DouZero, which enhances traditional Monte-Carlo methods with deep neural networks, action encoding, and parallel actors. Starting from scratch in a single server with four GPUs, DouZero outperformed all the existing DouDizhu AI programs in days of training and was ranked the first in the Botzone leaderboard among 344 AI agents. Through building DouZero, we show that classic Monte-Carlo methods can be made to deliver strong results in a hard domain with a complex action space. The code and an online demo are released at https://github.com/kwai/DouZero with the hope that this insight could motivate future work.

Submitted 10 June, 2021; originally announced June 2021.

Comments: Accepted by ICML 2021

arXiv:2105.04486 [pdf, other]

Probabilistic Top-k Dominating Queries in Distributed Uncertain Databases (Technical Report)

Authors: Niranjan Rai, Xiang Lian

Abstract: In many real-world applications such as business planning and sensor data monitoring, one important, yet challenging, task is to rank objects (e.g., products, documents, or spatial objects) based on their ranking scores and efficiently return those objects with the highest scores. In practice, due to the unreliability of data sources, many real-world objects often contain noise and are thus imprecise and uncertain. In this paper, we study the problem of the probabilistic top-k dominating (PTD) query on such large-scale uncertain data in a distributed environment, which retrieves k uncertain objects from distributed uncertain databases (on multiple distributed servers), having the largest ranking scores with high confidences. In order to efficiently tackle the distributed PTD problem, we propose a MapReduce framework for processing distributed PTD queries over distributed uncertain databases. In this MapReduce framework, we design effective pruning strategies to filter out false alarms in the distributed setting, propose cost-model-based index distribution mechanisms over servers, and develop efficient distributed PTD query processing algorithms. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed distributed PTD approach on both real and synthetic data sets through various experimental settings.

Submitted 12 May, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

arXiv:2103.11886 [pdf, other]

DeepViT: Towards Deeper Vision Transformer

Authors: Daquan Zhou, Bingyi Kang, Xiaojie Jin, Linjie Yang, Xiaochen Lian, Zihang Jiang, Qibin Hou, Jiashi Feng

Abstract: Vision transformers (ViTs) have been successfully applied in image classification tasks recently. In this paper, we show that, unlike convolution neural networks (CNNs) that can be improved by stacking more convolutional layers, the performance of ViTs saturates fast when scaled to be deeper. More specifically, we empirically observe that such scaling difficulty is caused by the attention collapse issue: as the transformer goes deeper, the attention maps gradually become similar and even much the same after certain layers. In other words, the feature maps tend to be identical in the top layers of deep ViT models. This fact demonstrates that in deeper layers of ViTs, the self-attention mechanism fails to learn effective concepts for representation learning and hinders the model from getting expected performance gain. Based on the above observation, we propose a simple yet effective method, named Re-attention, to re-generate the attention maps to increase their diversity at different layers with negligible computation and memory cost. The proposed method makes it feasible to train deeper ViT models with consistent performance improvements via minor modification to existing ViT models. Notably, when training a deep ViT model with 32 transformer blocks, the Top-1 classification accuracy can be improved by 1.6% on ImageNet. Code is publicly available at https://github.com/zhoudaquan/dvit_repo.

Submitted 19 April, 2021; v1 submitted 22 March, 2021; originally announced March 2021.

arXiv:2103.11833 [pdf, other]

AutoSpace: Neural Architecture Search with Less Human Interference

Authors: Daquan Zhou, Xiaojie Jin, Xiaochen Lian, Linjie Yang, Yujing Xue, Qibin Hou, Jiashi Feng

Abstract: Current neural architecture search (NAS) algorithms still require expert knowledge and effort to design a search space for network construction. In this paper, we consider automating the search space design to minimize human interference, which however faces two challenges: the explosive complexity of the exploration space and the expensive computation cost to evaluate the quality of different search spaces. To solve them, we propose a novel differentiable evolutionary framework named AutoSpace, which evolves the search space to an optimal one with following novel techniques: a differentiable fitness scoring function to efficiently evaluate the performance of cells and a reference architecture to speedup the evolution procedure and avoid falling into sub-optimal solutions. The framework is generic and compatible with additional computational constraints, making it feasible to learn specialized search spaces that fit different computational budgets. With the learned search space, the performance of recent NAS algorithms can be improved significantly compared with using previously manually designed spaces. Remarkably, the models generated from the new search space achieve 77.8% top-1 accuracy on ImageNet under the mobile setting (MAdds < 500M), out-performing previous SOTA EfficientNet-B0 by 0.7%. All codes will be made public.

Submitted 22 March, 2021; originally announced March 2021.

arXiv:2103.08720 [pdf, other]

Online Topic-Aware Entity Resolution Over Incomplete Data Streams (Technical Report)

Authors: Weilong Ren, Xiang Lian, Kambiz Ghazinour

Abstract: In many real applications such as the data integration, social network analysis, and the Semantic Web, the entity resolution (ER) is an important and fundamental problem, which identifies and links the same real-world entities from various data sources. While prior works usually consider ER over static and complete data, in practice, application data are usually collected in a streaming fashion, and often incur missing attributes (due to the inaccuracy of data extraction techniques). Therefore, in this paper, we will formulate and tackle a novel problem, topic-aware entity resolution over incomplete data streams (TER-iDS), which online imputes incomplete tuples and detects pairs of topic-related matching entities from incomplete data streams. In order to effectively and efficiently tackle the TER-iDS problem, we propose an effective imputation strategy, carefully design effective pruning strategies, as well as indexes/synopsis, and develop an efficient TER-iDS algorithm via index joins. Extensive experiments have been conducted to evaluate the effectiveness and efficiency of our proposed TER-iDS approach over real data sets.

Submitted 15 March, 2021; originally announced March 2021.

Comments: Technical report of the paper entitled "Online Topic-Aware Entity Resolution Over Incomplete Data Streams", published on SIGMOD 2021

arXiv:2103.02255 [pdf, other]

Automatically detecting the conflicts between software requirements based on finer semantic analysis

Authors: Weize Guo, Li Zhang, Xiaoli Lian

Abstract: Context: Conflicts between software requirements bring uncertainties to product development. Some great approaches have been proposed to identify these conflicts. However, they usually require the software requirements to be represented with specific templates and/or depend on other external sources, which are often hard to build for many projects in practice. Objective: We aim to propose an approach, Finer Semantic Analysis-based Requirements Conflict Detector (FSARC), to automatically detect the conflicts between the given natural language functional requirements by analyzing their finer semantic compositions. Method: We build a harmonized semantic meta-model of functional requirements in the form of an eight-tuple. Then we propose algorithms to automatically analyze the linguistic features of requirements and to annotate the semantic elements for their semantic model construction. We define seven types of conflicts along with their heuristic detection rules, based on their text patterns and semantic dependencies. Finally, we design and implement the algorithm for conflict detection. Results: The experiment with four requirement datasets illustrates that the recall of FSARC is nearly 100% and the average precision is 83.88% on conflict detection. Conclusion: We provide a useful tool for detecting the conflicts between natural language functional requirements to improve the quality of the final requirements set. Besides, our approach is capable of transforming the natural language functional requirements into eight semantic tuples, which is useful not only for the detection of conflicts between requirements but also for other tasks such as constructing the association between requirements.

Submitted 3 March, 2021; originally announced March 2021.

Comments: 17 pages, 2 figures

MSC Class: 68N30; ACM Class: D.2.1

arXiv:2102.02888 [pdf, other]

1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed

Authors: Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He

Abstract: Scalable training of large models (like BERT and GPT-3) requires careful optimization rooted in model design, architecture, and system capabilities. From a system standpoint, communication has become a major bottleneck, especially on commodity systems with standard TCP interconnects that offer limited network bandwidth. Communication compression is an important technique to reduce training time on such systems. One of the most effective methods is error-compensated compression, which offers robust convergence speed even under 1-bit compression. However, state-of-the-art error compensation techniques only work with basic optimizers like SGD and momentum SGD, which are linearly dependent on the gradients. They do not work with non-linear gradient-based optimizers like Adam, which offer state-of-the-art convergence efficiency and accuracy for models like BERT. In this paper, we propose 1-bit Adam that reduces the communication volume by up to $5\times$, offers much better scalability, and provides the same convergence speed as uncompressed Adam. Our key finding is that Adam's variance (non-linear term) becomes stable (after a warmup phase) and can be used as a fixed precondition for the rest of the training (compression phase). Experiments on up to 256 GPUs show that 1-bit Adam enables up to $3.3\times$ higher throughput for BERT-Large pre-training and up to $2.9\times$ higher throughput for SQuAD fine-tuning. In addition, we provide theoretical analysis for our proposed work.

Submitted 29 June, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

Comments: arXiv admin note: text overlap with arXiv:2008.11343

arXiv:2101.11446 [pdf]

A study on information behavior of scholars for article keywords selection

Authors: Z. X. Lian

Abstract: This project takes the factors of keyword selection behavior as the research object. Qualitative analysis methods such as interview and grounded theory were used to construct causal influence path model. Combined with computer simulation technology such as multi-agent simulation experiment method was used to study the factors of keyword selection from two dimensions of individual to group. The research was carried out according to the path of factor analysis at individual level macro situation simulation optimization of scientific research data management. Based on the aforementioned review of existing researches and explanations of keywords selection, this study adopts a qualitative research design to expand the explanation, and macro simulation based on the results of qualitative research. There are two steps in this study, one is do interview with authors and then design macro simulation according the deductive and qualitative content analysis results.

Submitted 27 January, 2021; originally announced January 2021.

Comments: 10 pages

MSC Class: 62-11

arXiv:2008.11343 [pdf, other]

APMSqueeze: A Communication Efficient Adam-Preconditioned Momentum SGD Algorithm

Authors: Hanlin Tang, Shaoduo Gan, Samyam Rajbhandari, Xiangru Lian, Ji Liu, Yuxiong He, Ce Zhang

Abstract: Adam is an important optimization algorithm for guaranteeing efficiency and accuracy when training many important tasks such as BERT and ImageNet. However, Adam is generally not compatible with information (gradient) compression technology. Therefore, the communication usually becomes the bottleneck for parallelizing Adam. In this paper, we propose a communication efficient {\bf A}DAM {\bf p}reconditioned {\bf M}omentum SGD algorithm-- named APMSqueeze-- through an error compensated method compressing gradients. The proposed algorithm achieves a similar convergence efficiency to Adam in terms of epochs, but significantly reduces the running time per epoch. In terms of end-to-end performance (including the full-precision pre-condition step), APMSqueeze is sometimes able to provide up to $2-10\times$ speed-up, depending on network bandwidth. We also conduct theoretical analysis on the convergence and efficiency.

Submitted 27 August, 2020; v1 submitted 25 August, 2020; originally announced August 2020.

Showing 1–50 of 90 results for author: Lian, X