Search | arXiv e-print repository

AIGC-Chain: A Blockchain-Enabled Full Lifecycle Recording System for AIGC Product Copyright Management

Authors: Jiajia Jiang, Moting Su, Xiangli Xiao, Yushu Zhang, Yuming Fang

Abstract: As artificial intelligence technology becomes increasingly prevalent, Artificial Intelligence Generated Content (AIGC) is being adopted across various sectors. Although AIGC is playing an increasingly significant role in business and culture, questions surrounding its copyright have sparked widespread debate. The current legal framework for copyright and intellectual property is grounded in the concept of human authorship, but in the creation of AIGC, human creators primarily provide conceptual ideas, with AI independently responsible for the expressive elements. This disconnect creates complexity and difficulty in determining copyright ownership under existing laws. Consequently, it is imperative to reassess the intellectual contributions of all parties involved in the creation of AIGC to ensure a fair allocation of copyright ownership. To address this challenge, we introduce AIGC-Chain, a blockchain-enabled full lifecycle recording system designed to manage the copyright of AIGC products. It is engineered to meticulously document the entire lifecycle of AIGC products, providing a transparent and dependable platform for copyright management. Furthermore, we propose a copyright tracing method based on an Indistinguishable Bloom Filter, named IBFT, which enhances the efficiency of blockchain transaction queries and significantly reduces the risk of fraudulent copyright claims for AIGC products. In this way, auditors can analyze the copyright of AIGC products by reviewing all relevant information retrieved from the blockchain.
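
For readers unfamiliar with the underlying data structure, the following is a minimal sketch of an ordinary Bloom filter, the structure that the Indistinguishable Bloom Filter builds on. It is not the IBFT method from the paper, whose construction the abstract does not describe. The class name, the parameter defaults, and the transaction identifier are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter (illustrative only; not the paper's IBFT)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Both sizes are arbitrary defaults chosen for this sketch.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
bf.add("tx-001")  # hypothetical transaction identifier
print(bf.might_contain("tx-001"))
```

A membership query touches only a fixed number of bits, which is why Bloom-filter-style indexes can speed up lookups; the trade-off is that a positive answer may be a false positive and must be confirmed against the actual records.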

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.14191 [pdf, other]

Temporal Knowledge Graph Question Answering: A Survey

Authors: Miao Su, ZiXuan Li, Zhuo Chen, Long Bai, Xiaolong Jin, Jiafeng Guo

Abstract: Knowledge Base Question Answering (KBQA) has been a long-standing field to answer questions based on knowledge bases. Recently, the evolving dynamics of knowledge have attracted a growing interest in Temporal Knowledge Graph Question Answering (TKGQA), an emerging task to answer temporal questions. However, this field grapples with ambiguities in defining temporal questions and lacks a systematic categorization of existing methods for TKGQA. In response, this paper provides a thorough survey from two perspectives: the taxonomy of temporal questions and the methodological categorization for TKGQA. Specifically, we first establish a detailed taxonomy of temporal questions engaged in prior studies. Subsequently, we provide a comprehensive review of TKGQA techniques of two categories: semantic parsing-based and TKG embedding-based. Building on this review, the paper outlines potential research directions aimed at advancing the field of TKGQA. This work aims to serve as a comprehensive reference for TKGQA and to stimulate further research.

Submitted 20 June, 2024; originally announced June 2024.

Comments: 8 pages, 3 figures

arXiv:2406.05504 [pdf, other]

G-Transformer: Counterfactual Outcome Prediction under Dynamic and Time-varying Treatment Regimes

Authors: Hong Xiong, Feng Wu, Leon Deng, Megan Su, Li-wei H Lehman

Abstract: In the context of medical decision making, counterfactual prediction enables clinicians to predict treatment outcomes of interest under alternative courses of therapeutic actions given observed patient history. Prior machine learning approaches for counterfactual predictions under time-varying treatments focus on static time-varying treatment regimes where treatments do not depend on previous covariate history. In this work, we present G-Transformer, a Transformer-based framework supporting g-computation for counterfactual prediction under dynamic and time-varying treatment strategies. G-Transformer captures complex, long-range dependencies in time-varying covariates using a Transformer architecture. G-Transformer estimates the conditional distribution of relevant covariates given covariate and treatment history at each time point using an encoder architecture, then produces Monte Carlo estimates of counterfactual outcomes by simulating forward patient trajectories under treatment strategies of interest. We evaluate G-Transformer extensively using two simulated longitudinal datasets from mechanistic models, and a real-world sepsis ICU dataset from MIMIC-IV. G-Transformer outperforms both classical and state-of-the-art counterfactual prediction models in these settings. To the best of our knowledge, this is the first Transformer-based architecture for counterfactual outcome prediction under dynamic and time-varying treatment strategies.

Submitted 27 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

arXiv:2406.03136 [pdf, ps, other]

Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models

Authors: Jerry Yao-Chieh Hu, Maojiang Su, En-Jui Kuo, Zhao Song, Han Liu

Abstract: We study the computational limits of Low-Rank Adaptation (LoRA) update for finetuning transformer-based models using fine-grained complexity theory. Our key observation is that the existence of low-rank decompositions within the gradient computation of LoRA adaptation leads to possible algorithmic speedup. This allows us to (i) identify a phase transition behavior and (ii) prove the existence of nearly linear algorithms by controlling the LoRA update computation term by term, assuming the Strong Exponential Time Hypothesis (SETH). For the former, we identify a sharp transition in the efficiency of all possible rank-$r$ LoRA update algorithms for transformers, based on specific norms resulting from the multiplications of the input sequence $\mathbf{X}$, pretrained weights $\mathbf{W^\star}$, and adapter matrices $\alpha \mathbf{B} \mathbf{A} / r$. Specifically, we derive a shared upper bound threshold for such norms and show that efficient (sub-quadratic) approximation algorithms of LoRA exist only below this threshold. For the latter, we prove the existence of nearly linear approximation algorithms for LoRA adaptation by utilizing the hierarchical low-rank structures of LoRA gradients and approximating the gradients with a series of chained low-rank approximations. To showcase our theory, we consider two practical scenarios: partial (e.g., only $\mathbf{W}_V$ and $\mathbf{W}_Q$) and full adaptations (e.g., $\mathbf{W}_Q$, $\mathbf{W}_V$, and $\mathbf{W}_K$) of weights in attention heads.
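
As a reading aid, the adapter term $\alpha \mathbf{B} \mathbf{A} / r$ in the abstract corresponds to the usual LoRA weight update, written out below. The square $d \times d$ weight shape is an assumption made here for simplicity; the abstract does not state the dimensions.

```latex
% Standard LoRA update; the d x d shape of W* is an illustrative assumption.
\mathbf{W} = \mathbf{W^\star} + \frac{\alpha}{r}\,\mathbf{B}\mathbf{A},
\qquad
\mathbf{B} \in \mathbb{R}^{d \times r},\quad
\mathbf{A} \in \mathbb{R}^{r \times d},\quad
r \ll d
```

Under that assumption only $\mathbf{B}$ and $\mathbf{A}$ are trained, so the update has $2dr$ trainable parameters instead of $d^2$, while $\mathbf{W^\star}$ stays frozen.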

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2404.10354 [pdf]

Physical formula enhanced multi-task learning for pharmacokinetics prediction

Authors: Ruifeng Li, Dongzhan Zhou, Ancheng Shen, Ao Zhang, Mao Su, Mingqian Li, Hongyang Chen, Gang Chen, Yin Zhang, Shufei Zhang, Yuqiang Li, Wanli Ouyang

Abstract: Artificial intelligence (AI) technology has demonstrated remarkable potential in drug discovery, where pharmacokinetics plays a crucial role in determining the dosage, safety, and efficacy of new drugs. A major challenge for AI-driven drug discovery (AIDD) is the scarcity of high-quality data, which often requires extensive wet-lab work. A typical example of this is pharmacokinetic experiments. In this work, we develop a physical formula enhanced multi-task learning (PEMAL) method that predicts four key parameters of pharmacokinetics simultaneously. By incorporating physical formulas into the multi-task framework, PEMAL facilitates effective knowledge sharing and target alignment among the pharmacokinetic parameters, thereby enhancing the accuracy of prediction. Our experiments reveal that PEMAL significantly lowers the data demand, compared to typical Graph Neural Networks. Moreover, we demonstrate that PEMAL enhances the robustness to noise, an advantage that conventional Neural Networks do not possess. Another advantage of PEMAL is its high flexibility, which can be potentially applied to other multi-task machine learning scenarios. Overall, our work illustrates the benefits and potential of using PEMAL in AIDD and other scenarios with data scarcity and noise.

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2403.20134 [pdf, other]

User Modeling Challenges in Interactive AI Assistant Systems

Authors: Megan Su, Yuwei Bao

Abstract: Interactive Artificial Intelligence (AI) assistant systems are designed to offer timely guidance to help human users complete a variety of tasks. One of the remaining challenges is to understand users' mental states during the task for more personalized guidance. In this work, we analyze users' mental states during task executions and investigate the capabilities and challenges for large language models to interpret user profiles for more personalized user guidance.

Submitted 29 March, 2024; originally announced March 2024.

arXiv:2403.12716 [pdf, ps, other]

A New Reduction Method from Multivariate Polynomials to Univariate Polynomials

Authors: Cancan Wang, Ming Su, Gang Wang, Qingpo Zhang

Abstract: Polynomial multiplication is a fundamental problem in symbolic computation. There are efficient methods for the multiplication of two univariate polynomials. However, there are rarely efficient nontrivial methods for the multiplication of two multivariate polynomials. Therefore, we consider a new multiplication mechanism that involves a) reversibly reducing multivariate polynomials into univariate polynomials, b) calculating the product of the derived univariate polynomials by the Toom-Cook or FFT algorithm, and c) correctly recovering the product of multivariate polynomials from the product of two univariate polynomials. This work focuses on step a), expecting the degrees of the derived univariate polynomials to be as small as possible. We propose iterative Kronecker substitution, where smaller substitution exponents are selected instead of standard Kronecker substitution. We also apply the Chinese remainder theorem to polynomial reduction and find its advantages in some cases. Afterwards, we provide a hybrid reduction combining the advantages of both reduction methods. Moreover, we compare these reduction methods in terms of lower and upper bounds of the degree of the product of two derived univariate polynomials, and their computational complexities. With randomly generated multivariate polynomials, experiments show that the degree of the product of two univariate polynomials derived from the hybrid reduction can be reduced even to approximately 3% of that resulting from the standard Kronecker substitution, implying an efficient subsequent multiplication of two univariate polynomials.
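
The three steps a) to c) can be sketched for the standard Kronecker substitution, which is the baseline the abstract compares against. This is an illustrative sketch only: it is not the iterative or hybrid reduction proposed in the paper, it handles just two variables, and step b) uses schoolbook multiplication in place of Toom-Cook or FFT. All function names and the dictionary representation are assumptions of this sketch.

```python
from collections import defaultdict


def kronecker_pack(poly, d):
    """Step a): substitute y -> x**d, mapping {(i, j): c} to {i + d*j: c}."""
    out = defaultdict(int)
    for (i, j), c in poly.items():
        out[i + d * j] += c
    return dict(out)


def multiply_univariate(a, b):
    """Step b): schoolbook product (stand-in for Toom-Cook or FFT)."""
    out = defaultdict(int)
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] += ca * cb
    return {e: c for e, c in out.items() if c != 0}


def kronecker_unpack(poly, d):
    """Step c): recover (i, j) from each univariate exponent e = i + d*j."""
    return {(e % d, e // d): c for e, c in poly.items()}


def multiply_bivariate(f, g):
    # d must exceed the x-degree of the product so that exponents never collide.
    d = max(i for i, _ in f) + max(i for i, _ in g) + 1
    product = multiply_univariate(kronecker_pack(f, d), kronecker_pack(g, d))
    return kronecker_unpack(product, d)


f = {(0, 0): 1, (1, 1): 1}  # 1 + x*y
g = {(1, 0): 1, (0, 1): 1}  # x + y
print(multiply_bivariate(f, g))
```

The choice of `d` determines the degree of the derived univariate polynomials, which is the quantity the paper aims to reduce by selecting smaller substitution exponents.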

Submitted 19 March, 2024; originally announced March 2024.

Comments: 15 pages

arXiv:2403.07969 [pdf, other]

KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction

Authors: Zixuan Li, Yutao Zeng, Yuxin Zuo, Weicheng Ren, Wenxuan Liu, Miao Su, Yucan Guo, Yantao Liu, Xiang Li, Zhilei Hu, Long Bai, Wei Li, Yidan Liu, Pan Yang, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng

Abstract: In this paper, we propose KnowCoder, a Large Language Model (LLM) to conduct Universal Information Extraction (UIE) via code generation. KnowCoder aims to develop a kind of unified schema representation that LLMs can easily understand and an effective learning framework that encourages LLMs to follow schemas and extract structured knowledge accurately. To achieve these, KnowCoder introduces a code-style schema representation method to uniformly transform different schemas into Python classes, with which complex schema information, such as constraints among tasks in UIE, can be captured in an LLM-friendly manner. We further construct a code-style schema library covering over $\textbf{30,000}$ types of knowledge, which is the largest one for UIE, to the best of our knowledge. To ease the learning process of LLMs, KnowCoder contains a two-phase learning framework that enhances its schema understanding ability via code pretraining and its schema following ability via instruction tuning. After code pretraining on around $1.5$B automatically constructed data, KnowCoder already attains remarkable generalization ability and achieves relative improvements by $\textbf{49.8%}$ F1, compared to LLaMA2, under the few-shot setting. After instruction tuning, KnowCoder further exhibits strong generalization ability on unseen schemas and achieves up to $\textbf{12.5%}$ and $\textbf{21.9%}$, compared to sota baselines, under the zero-shot setting and the low resource setting, respectively. Additionally, based on our unified schema representations, various human-annotated datasets can simultaneously be utilized to refine KnowCoder, which achieves significant improvements up to $\textbf{7.5%}$ under the supervised setting.
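
To make the idea of a code-style schema concrete, here is a hypothetical sketch of how an extraction schema might be written as Python classes. The class names and fields below are invented for illustration and are not taken from KnowCoder's schema library; the abstract does not show the actual representation.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical base class for all entity types."""
    name: str


@dataclass
class Person(Entity):
    """Hypothetical entity type."""


@dataclass
class Location(Entity):
    """Hypothetical entity type."""


@dataclass
class Relation:
    """Hypothetical base class: a typed link between two entities."""
    head: Entity
    tail: Entity


@dataclass
class PlaceOfBirth(Relation):
    # Narrowing the annotations expresses a constraint between tasks:
    # this relation only links a Person to a Location.
    head: Person
    tail: Location


extraction = PlaceOfBirth(head=Person("Ada Lovelace"), tail=Location("London"))
print(extraction)
```

Python does not enforce these annotations at runtime; in this sketch they serve only as a readable statement of the schema, with inheritance carrying the type hierarchy.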

Submitted 13 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

arXiv:2402.06852 [pdf]

ChemLLM: A Chemical Large Language Model

Authors: Di Zhang, Wei Liu, Qian Tan, Jingdan Chen, Hang Yan, Yuliang Yan, Jiatong Li, Weiran Huang, Xiangyu Yue, Wanli Ouyang, Dongzhan Zhou, Shufei Zhang, Mao Su, Han-Sen Zhong, Yuqiang Li

Abstract: Large language models (LLMs) have made impressive progress in chemistry applications. However, the community lacks an LLM specifically designed for chemistry. The main challenges are two-fold: firstly, most chemical data and scientific knowledge are stored in structured databases, which limits the model's ability to sustain coherent dialogue when used directly. Secondly, there is an absence of an objective and fair benchmark that encompasses most chemistry tasks. Here, we introduce ChemLLM, a comprehensive framework that features the first LLM dedicated to chemistry. It also includes ChemData, a dataset specifically designed for instruction tuning, and ChemBench, a robust benchmark covering nine essential chemistry tasks. ChemLLM is adept at performing various tasks across chemical disciplines with fluid dialogue interaction. Notably, ChemLLM achieves results comparable to GPT-4 on the core chemical tasks and demonstrates competitive performance with LLMs of similar size in general scenarios. ChemLLM paves a new path for exploration in chemical studies, and our method of incorporating structured chemical knowledge into dialogue systems sets a new standard for developing LLMs in various scientific fields. Codes, Datasets, and Model weights are publicly accessible at https://hf.co/AI4Chem

Submitted 25 April, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

Comments: 9 pages, 5 figures

arXiv:2401.00374 [pdf, other]

EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling

Authors: Haiyang Liu, Zihao Zhu, Giorgio Becherini, Yichen Peng, Mingyang Su, You Zhou, Xuefei Zhe, Naoya Iwamoto, Bo Zheng, Michael J. Black

Abstract: We propose EMAGE, a framework to generate full-body human gestures from audio and masked gestures, encompassing facial, local body, hands, and global movements. To achieve this, we first introduce BEAT2 (BEAT-SMPLX-FLAME), a new mesh-level holistic co-speech dataset. BEAT2 combines a MoShed SMPL-X body with FLAME head parameters and further refines the modeling of head, neck, and finger movements, offering a community-standardized, high-quality 3D motion captured dataset. EMAGE leverages masked body gesture priors during training to boost inference performance. It involves a Masked Audio Gesture Transformer, facilitating joint training on audio-to-gesture generation and masked gesture reconstruction to effectively encode audio and body gesture hints. Encoded body hints from masked gestures are then separately employed to generate facial and body movements. Moreover, EMAGE adaptively merges speech features from the audio's rhythm and content and utilizes four compositional VQ-VAEs to enhance the results' fidelity and diversity. Experiments demonstrate that EMAGE generates holistic gestures with state-of-the-art performance and is flexible in accepting predefined spatial-temporal gesture inputs, generating complete, audio-synchronized results. Our code and dataset are available at https://pantomatrix.github.io/EMAGE/

Submitted 30 March, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

Comments: Fix typos; Conflict of Interest Disclosure; CVPR Camera Ready; Project Page: https://pantomatrix.github.io/EMAGE/

arXiv:2312.10359 [pdf, other]

Conformer-Based Speech Recognition On Extreme Edge-Computing Devices

Authors: Mingbin Xu, Alex Jin, Sicheng Wang, Mu Su, Tim Ng, Henry Mason, Shiyi Han, Zhihong Lei, Yaqiao Deng, Zhen Huang, Mahesh Krishnamoorthy

Abstract: With increasingly more powerful compute capabilities and resources in today's devices, traditionally compute-intensive automatic speech recognition (ASR) has been moving from the cloud to devices to better protect user privacy. However, it is still challenging to implement on-device ASR on resource-constrained devices, such as smartphones, smart wearables, and other smart home automation devices. In this paper, we propose a series of model architecture adaptions, neural network graph transformations, and numerical optimizations to fit an advanced Conformer based end-to-end streaming ASR system on resource-constrained devices without accuracy degradation. We achieve over 5.26 times faster than realtime (0.19 RTF) speech recognition on smart wearables while minimizing energy consumption and achieving state-of-the-art accuracy. The proposed methods are widely applicable to other transformer-based server-free AI applications. In addition, we provide a complete theory on optimal pre-normalizers that numerically stabilize layer normalization in any Lp-norm using any floating point precision.
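
The two figures quoted in the abstract, 5.26 times faster than realtime and 0.19 RTF, are reciprocals of each other. The short sketch below shows the arithmetic, using the common definition of real-time factor as processing time divided by audio duration. The timing values are hypothetical and chosen only to reproduce an RTF of 0.19.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent recognizing / duration of the audio recognized."""
    return processing_seconds / audio_seconds


def speedup_over_realtime(rtf):
    """An RTF below 1 means faster than realtime; the speedup is 1 / RTF."""
    return 1.0 / rtf


# Hypothetical timings: 1.9 s of compute for 10 s of audio.
rtf = real_time_factor(1.9, 10.0)
print(round(rtf, 2), round(speedup_over_realtime(rtf), 2))
```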

Submitted 13 May, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

arXiv:2311.09566 [pdf, other]

A Knowledge Distillation Approach for Sepsis Outcome Prediction from Multivariate Clinical Time Series

Authors: Anna Wong, Shu Ge, Nassim Oufattole, Adam Dejl, Megan Su, Ardavan Saeedi, Li-wei H. Lehman

Abstract: Sepsis is a life-threatening condition triggered by an extreme infection response. Our objective is to forecast sepsis patient outcomes using their medical history and treatments, while learning interpretable state representations to assess patients' risks in developing various adverse outcomes. While neural networks excel in outcome prediction, their limited interpretability remains a key issue. In this work, we use knowledge distillation via constrained variational inference to distill the knowledge of a powerful "teacher" neural network model with high predictive power to train a "student" latent variable model to learn interpretable hidden state representations to achieve high predictive performance for sepsis outcome prediction. Using real-world data from the MIMIC-IV database, we trained an LSTM as the "teacher" model to predict mortality for sepsis patients, given information about their recent history of vital signs, lab values and treatments. For our student model, we use an autoregressive hidden Markov model (AR-HMM) to learn interpretable hidden states from patients' clinical time series, and use the posterior distribution of the learned state representations to predict various downstream outcomes, including hospital mortality, pulmonary edema, need for diuretics, dialysis, and mechanical ventilation. Our results show that our approach successfully incorporates the constraint to achieve high predictive power similar to the teacher model, while maintaining the generative performance.

Submitted 16 November, 2023; originally announced November 2023.

Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 12 pages

arXiv:2311.00738 [pdf, other]

Can Foundation Models Watch, Talk and Guide You Step by Step to Make a Cake?

Authors: Yuwei Bao, Keunwoo Peter Yu, Yichi Zhang, Shane Storks, Itamar Bar-Yossef, Alexander De La Iglesia, Megan Su, Xiao Lin Zheng, Joyce Chai

Abstract: Despite tremendous advances in AI, it remains a significant challenge to develop interactive task guidance systems that can offer situated, personalized guidance and assist humans in various tasks. These systems need to have a sophisticated understanding of the user as well as the environment, and make timely accurate decisions on when and what to say. To address this issue, we created a new multimodal benchmark dataset, Watch, Talk and Guide (WTaG) based on natural interaction between a human user and a human instructor. We further proposed two tasks: User and Environment Understanding, and Instructor Decision Making. We leveraged several foundation models to study to what extent these models can be quickly adapted to perceptually enabled task guidance. Our quantitative, qualitative, and human evaluation results show that these models can demonstrate fair performances in some cases with no task-specific training, but a fast and reliable adaptation remains a significant challenge. Our benchmark and baselines will provide a stepping stone for future work on situated task guidance.

Submitted 1 November, 2023; originally announced November 2023.

Comments: Accepted to EMNLP 2023 Findings

arXiv:2309.12960 [pdf, other]

Nested Event Extraction upon Pivot Element Recogniton

Authors: Weicheng Ren, Zixuan Li, Xiaolong Jin, Long Bai, Miao Su, Yantao Liu, Saiping Guan, Jiafeng Guo, Xueqi Cheng

Abstract: Nested Event Extraction (NEE) aims to extract complex event structures where an event contains other events as its arguments recursively. Nested events involve a kind of Pivot Elements (PEs) that simultaneously act as arguments of outer-nest events and as triggers of inner-nest events, and thus connect them into nested structures. This special characteristic of PEs brings challenges to existing NEE methods, as they cannot well cope with the dual identities of PEs. Therefore, this paper proposes a new model, called PerNee, which extracts nested events mainly based on recognizing PEs. Specifically, PerNee first recognizes the triggers of both inner-nest and outer-nest events and further recognizes the PEs via classifying the relation type between trigger pairs. The model uses prompt learning to incorporate information from both event types and argument roles for better trigger and argument representations to improve NEE performance. Since existing NEE datasets (e.g., Genia11) are limited to specific domains and contain a narrow range of event types with nested structures, we systematically categorize nested events in the generic domain and construct a new NEE dataset, called ACE2005-Nest. Experimental results demonstrate that PerNee consistently achieves state-of-the-art performance on ACE2005-Nest, Genia11, and Genia13. The ACE2005-Nest dataset and the code of the PerNee model are available at https://github.com/waysonren/PerNee.

Submitted 7 April, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

Comments: Accepted at LREC-COLING 2024

arXiv:2308.02269 [pdf, other]

Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph

Authors: Hiroki Arimura, Shunsuke Inenaga, Yasuaki Kobayashi, Yuto Nakashima, Mizuki Sue

Abstract: In this paper, we present the first study of the computational complexity of converting an automata-based text index structure, called the Compact Directed Acyclic Word Graph (CDAWG), of size $e$ for a text $T$ of length $n$ into other text indexing structures for the same text, suitable for highly repetitive texts: the run-length BWT of size $r$, the irreducible PLCP array of size $r$, and the quasi-irreducible LPF array of size $e$, as well as the lex-parse of size $O(r)$ and the LZ77-parse of size $z$, where $r, z \le e$. As main results, we showed that the above structures can be optimally computed from either the CDAWG for $T$ stored in read-only memory or its self-index version of size $e$ without a text in $O(e)$ worst-case time and words of working space. To obtain the above results, we devised techniques for enumerating a particular subset of suffixes in the lexicographic and text orders using the forward and backward search on the CDAWG by extending the results by Belazzougui et al. in 2015.

Submitted 4 August, 2023; originally announced August 2023.

Comments: The short version of this paper will appear in SPIRE 2023, Pisa, Italy, September 26-28, 2023, Lecture Notes in Computer Science, Springer

arXiv:2304.06292 [pdf, ps, other]

Improved Naive Bayes with Mislabeled Data

Authors: Qianhan Zeng, Yingqiu Zhu, Xuening Zhu, Feifei Wang, Weichen Zhao, Shuning Sun, Meng Su, Hansheng Wang

Abstract: Labeling mistakes are frequently encountered in real-world applications. If not treated well, the labeling mistakes can deteriorate the classification performances of a model seriously. To address this issue, we propose an improved Naive Bayes method for text classification. It is analytically simple and free of subjective judgements on the correct and incorrect labels. By specifying the generating mechanism of incorrect labels, we optimize the corresponding log-likelihood function iteratively by using an EM algorithm. Our simulation and experiment results show that the improved Naive Bayes method greatly improves the performances of the Naive Bayes method with mislabeled data.

Submitted 13 April, 2023; originally announced April 2023.

arXiv:2212.03741 [pdf, other]

FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation

Authors: Ronghui Li, Junfan Zhao, Yachao Zhang, Mingyang Su, Zeping Ren, Han Zhang, Yansong Tang, Xiu Li

Abstract: Generating full-body and multi-genre dance sequences from given music is a challenging task, due to the limitations of existing datasets and the inherent complexity of the fine-grained hand motion and dance genres. To address these problems, we propose FineDance, which contains 14.6 hours of music-dance paired data, with fine-grained hand motions, fine-grained genres (22 dance genres), and accurate posture. To the best of our knowledge, FineDance is the largest music-dance paired dataset with the most dance genres. Additionally, to address monotonous and unnatural hand movements existing in previous methods, we propose a full-body dance generation network, which utilizes the diverse generation capabilities of the diffusion model to solve monotonous problems, and use expert nets to solve unreal problems. To further enhance the genre-matching and long-term stability of generated dances, we propose a Genre&Coherent aware Retrieval Module. Besides, we propose a novel metric named Genre Matching Score to evaluate the genre-matching degree between dance and music. Quantitative and qualitative experiments demonstrate the quality of FineDance, and the state-of-the-art performance of FineNet. The FineDance Dataset and more qualitative samples can be found at our website.

Submitted 30 August, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

Comments: Accepted by ICCV 2023

arXiv:2210.16190 [pdf]

doi 10.1038/s41524-023-01130-4

Transferable E(3) equivariant parameterization for Hamiltonian of molecules and solids

Authors: Yang Zhong, Hongyu Yu, Mao Su, Xingao Gong, Hongjun Xiang

Abstract: Using the message-passing mechanism in machine learning (ML) instead of self-consistent iterations to directly build the mapping from structures to electronic Hamiltonian matrices will greatly improve the efficiency of density functional theory (DFT) calculations. In this work, we proposed a general analytic Hamiltonian representation in an E(3) equivariant framework, which can fit the ab initio Hamiltonian of molecules and solids by a complete data-driven method and are equivariant under rotation, space inversion, and time reversal operations. Our model reached state-of-the-art precision in the benchmark test and accurately predicted the electronic Hamiltonian matrices and related properties of various periodic and aperiodic systems, showing high transferability and generalization ability. This framework provides a general transferable model that can be used to accelerate the electronic structure calculations on different large systems with the same network weights trained on small structures.

Submitted 4 February, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

Comments: 33 pages, 6 figures

arXiv:2206.00447 [pdf, other]

CD$^2$: Fine-grained 3D Mesh Reconstruction With Twice Chamfer Distance

Authors: Rongfei Zeng, Mai Su, Ruiyun Yu, Xingwei Wang

Abstract: Monocular 3D reconstruction is to reconstruct the shape of object and its other information from a single RGB image. In 3D reconstruction, polygon mesh, with detailed surface information and low computational cost, is the most prevalent expression form obtained from deep learning models. However, the state-of-the-art schemes fail to directly generate well-structured meshes, and we identify that most meshes have severe Vertices Clustering (VC) and Illegal Twist (IT) problems. By analyzing the mesh deformation process, we pinpoint that the inappropriate usage of Chamfer Distance (CD) loss is a root cause of VC and IT problems in deep learning model. In this paper, we initially demonstrate these two problems induced by CD loss with visual examples and quantitative analyses. Then, we propose a fine-grained reconstruction method CD$^2$ by employing Chamfer distance twice to perform a plausible and adaptive deformation. Extensive experiments on two 3D datasets and comparisons with five latest schemes demonstrate that our CD$^2$ directly generates a well-structured mesh and outperforms others in terms of several quantitative metrics.

Submitted 29 January, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

Comments: Just accepted by TOMM
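
For readers unfamiliar with the Chamfer Distance loss that the CD$^2$ abstract refers to, the following is a minimal sketch of one common symmetric form (mean squared nearest-neighbour distance in each direction). It is not the paper's CD$^2$ method; the function name `chamfer_distance` and the sample point sets are assumptions made for illustration, and other variants (unsquared distances, sums instead of means) are also in use.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumed variant: mean of squared nearest-neighbour distances,
    summed over both directions (A -> B and B -> A).
    """
    def one_way(src, dst):
        # For every source point, take the squared distance to its
        # nearest neighbour in the destination set, then average.
        return sum(min(math.dist(p, q) ** 2 for q in dst) for p in src) / len(src)

    return one_way(points_a, points_b) + one_way(points_b, points_a)


# Hypothetical sample data: a unit square and a copy shifted by 0.5 along z.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
shifted = [(x, y, z + 0.5) for (x, y, z) in square]

print(chamfer_distance(square, square))
print(chamfer_distance(square, shifted))
```

Because each point is matched only to its nearest neighbour, many predicted vertices can collapse onto the same target point without increasing this loss, which is one intuition for the vertex clustering the abstract describes.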

arXiv:2204.01601 [pdf, other]

Towards Privacy-Preserving and Verifiable Federated Matrix Factorization

Authors: Xicheng Wan, Yifeng Zheng, Qun Li, Anmin Fu, Mang Su, Yansong Gao

Abstract: Recent years have witnessed the rapid growth of federated learning (FL), an emerging privacy-aware machine learning paradigm that allows collaborative learning over isolated datasets distributed across multiple participants. The salient feature of FL is that the participants can keep their private datasets local and only share model updates. Very recently, some research efforts have been initiated to explore the applicability of FL for matrix factorization (MF), a prevalent method used in modern recommendation systems and services. It has been shown that sharing the gradient updates in federated MF entails privacy risks on revealing users' personal ratings, posing a demand for protecting the shared gradients. Prior art is limited in that they incur notable accuracy loss, or rely on heavy cryptosystem, with a weak threat model assumed. In this paper, we propose VPFedMF, a new design aimed at privacy-preserving and verifiable federated MF. VPFedMF provides guarantees on the confidentiality of individual gradient updates through lightweight and secure aggregation. Moreover, VPFedMF ambitiously and newly supports correctness verification of the aggregation results produced by the coordinating server in federated MF. Experiments on a real-world movie rating dataset demonstrate the practical performance of VPFedMF in terms of computation, communication, and accuracy.

Submitted 11 June, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

Comments: Accepted by Knowledge-Based Systems

arXiv:2112.01215 [pdf]

Adaptive Group Collaborative Artificial Bee Colony Algorithm

Authors: Haiquan Wang, Hans-Dietrich Haasis, Panpan Du, Xiaobin Xu, Menghao Su, Shengjun Wen, Wenxuan Yue, Shanshan Zhang

Abstract: As an effective algorithm for solving complex optimization problems, artificial bee colony (ABC) algorithm has shown to be competitive, but the same as other population-based algorithms, it is poor at balancing the abilities of global searching in the whole solution space (named as exploration) and quick searching in local solution space which is defined as exploitation. For improving the performance of ABC, an adaptive group collaborative ABC (AgABC) algorithm is introduced where the population in different phases is divided to specific groups and different search strategies with different abilities are assigned to the members in groups, and the member or strategy which obtains the best solution will be employed for further searching. Experimental results on benchmark functions show that the proposed algorithm with dynamic mechanism is superior to other algorithms in searching accuracy and stability. Furthermore, numerical experiments show that the proposed method can generate the optimal solution for the complex scheduling problem.

Submitted 2 December, 2021; originally announced December 2021.

arXiv:2110.13499 [pdf, other]

SEDML: Securely and Efficiently Harnessing Distributed Knowledge in Machine Learning

Authors: Yansong Gao, Qun Li, Yifeng Zheng, Guohong Wang, Jiannan Wei, Mang Su

Abstract: Training high-performing deep learning models requires a rich amount of data which is usually distributed among multiple data sources in practice. Simply centralizing these multi-sourced data for training would raise critical security and privacy concerns, and might be prohibited given the increasingly strict data regulations. To resolve the tension between privacy and data utilization in distributed learning, a machine learning framework called private aggregation of teacher ensembles (PATE) has been recently proposed. PATE harnesses the knowledge (label predictions for an unlabeled dataset) from distributed teacher models to train a student model, obviating access to distributed datasets. Despite being enticing, PATE does not offer protection for the individual label predictions from teacher models, which still entails privacy risks. In this paper, we propose SEDML, a new protocol which allows to securely and efficiently harness the distributed knowledge in machine learning. SEDML builds on lightweight cryptography and provides strong protection for the individual label predictions, as well as differential privacy guarantees on the aggregation results. Extensive evaluations show that while providing privacy protection, SEDML preserves the accuracy as in the plaintext baseline. Meanwhile, SEDML's performance in computing and communication is 43 times and 1.23 times higher than the latest technology, respectively.

Submitted 26 October, 2021; originally announced October 2021.

arXiv:2106.12288 [pdf, other]

MG-DVD: A Real-time Framework for Malware Variant Detection Based on Dynamic Heterogeneous Graph Learning

Authors: Chen Liu, Bo Li, Jun Zhao, Ming Su, Xu-Dong Liu

Abstract: Detecting the newly emerging malware variants in real time is crucial for mitigating cyber risks and proactively blocking intrusions. In this paper, we propose MG-DVD, a novel detection framework based on dynamic heterogeneous graph learning, to detect malware variants in real time. Particularly, MG-DVD first models the fine-grained execution event streams of malware variants into dynamic heterogeneous graphs and investigates real-world meta-graphs between malware objects, which can effectively characterize more discriminative malicious evolutionary patterns between malware and their variants. Then, MG-DVD presents two dynamic walk-based heterogeneous graph learning methods to learn more comprehensive representations of malware variants, which significantly reduces the cost of the entire graph retraining. As a result, MG-DVD is equipped with the ability to detect malware variants in real time, and it presents better interpretability by introducing meaningful meta-graphs. Comprehensive experiments on large-scale samples prove that our proposed MG-DVD outperforms state-of-the-art methods in detecting malware variants in terms of effectiveness and efficiency.

Submitted 24 June, 2021; v1 submitted 23 June, 2021; originally announced June 2021.

Comments: 8 pages, 7 figures, Accepted at the 30th International Joint Conference on Artificial Intelligence (IJCAI 2021)

arXiv:2105.08959 [pdf, other]

VSGM -- Enhance robot task understanding ability through visual semantic graph

Authors: Cheng Yu Tsai, Mu-Chun Su

Abstract: In recent years, developing AI for robotics has raised much attention. The interaction of vision and language of robots is particularly difficult. We consider that giving robots an understanding of visual semantics and language semantics will improve inference ability. In this paper, we propose a novel method-VSGM (Visual Semantic Graph Memory), which uses the semantic graph to obtain better visual image features, improve the robot's visual understanding ability. By providing prior knowledge of the robot and detecting the objects in the image, it predicts the correlation between the attributes of the object and the objects and converts them into a graph-based representation; and mapping the object in the image to be a top-down egocentric map. Finally, the important object features of the current task are extracted by Graph Neural Networks. The method proposed in this paper is verified in the ALFRED (Action Learning From Realistic Environments and Directives) dataset. In this dataset, the robot needs to perform daily indoor household tasks following the required language instructions. After the model is added to the VSGM, the task success rate can be improved by 6~10%.

Submitted 25 May, 2021; v1 submitted 19 May, 2021; originally announced May 2021.

Comments: 16 pages, 7 figures

arXiv:1906.03181 [pdf, other]

doi 10.1016/j.cose.2019.04.014

POBA-GA: Perturbation Optimized Black-Box Adversarial Attacks via Genetic Algorithm

Authors: Jinyin Chen, Mengmeng Su, Shijing Shen, Hui Xiong, Haibin Zheng

Abstract: Most deep learning models are easily vulnerable to adversarial attacks. Various adversarial attacks are designed to evaluate the robustness of models and develop defense model. Currently, adversarial attacks are brought up to attack their own target model with their own evaluation metrics. And most of the black-box adversarial attack algorithms cannot achieve the expected success rate compared with white-box attacks. In this paper, comprehensive evaluation metrics are brought up for different adversarial attack methods. A novel perturbation optimized black-box adversarial attack based on genetic algorithm (POBA-GA) is proposed for achieving white-box comparable attack performances. Approximate optimal adversarial examples are evolved through evolutionary operations including initialization, selection, crossover and mutation. Fitness function is specifically designed to evaluate the example individual in both aspects of attack ability and perturbation control. Population diversity strategy is brought up in evolutionary process to promise the approximate optimal perturbations obtained. Comprehensive experiments are carried out to testify POBA-GA's performances. Both simulation and application results prove that our method is better than current state-of-art black-box attack methods in aspects of attack capability and perturbation control.

Submitted 1 May, 2019; originally announced June 2019.

Journal ref: Computers and Security, Volume 85, August 2019, Pages 89-106

arXiv:1903.11968 [pdf, ps, other]

On the stability of periodic binary sequences with zone restriction

Authors: Ming Su, Qiang Wang

Abstract: Traditional global stability measure for sequences is hard to determine because of large search space. We propose the $k$-error linear complexity with a zone restriction for measuring the local stability of sequences. Accordingly, we can efficiently determine the global stability by studying a local stability for these sequences. For several classes of sequences, we demonstrate that the $k$-error linear complexity is identical to the $k$-error linear complexity within a zone, while the length of a zone is much smaller than the whole period when the $k$-error linear complexity is large. These sequences have periods $2^n$, or $2^v r$ ($r$ odd prime and $2$ is primitive modulo $r$), or $2^v p_1^{s_1} \cdots p_n^{s_n}$ ($p_i$ is an odd prime and $2$ is primitive modulo $p_i$ and $p_i^2$, where $1\leq i \leq n$) respectively. In particular, we completely determine the spectrum of $1$-error linear complexity with any zone length for an arbitrary $2^n$-periodic binary sequence.

Submitted 28 March, 2019; originally announced March 2019.

Comments: 17 pages
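
To make the notion of $k$-error linear complexity in the abstract above concrete for the $2^n$-periodic case, here is a small sketch. It computes linear complexity with the classical Games-Chan algorithm and then finds the $k$-error value by brute force over bit flips. This is not the paper's method: the function names are hypothetical, and the `zone` parameter assumes that a zone restriction means bit changes are confined to a chosen set of positions within one period, which may differ from the paper's precise definition.

```python
from itertools import combinations


def linear_complexity_pow2(period):
    """Games-Chan algorithm: linear complexity of a binary sequence
    whose period length is a power of two."""
    s = list(period)
    n = len(s)
    if n == 0 or n & (n - 1):
        raise ValueError("period length must be a power of two")
    complexity = 0
    while len(s) > 1:
        half = len(s) // 2
        left, right = s[:half], s[half:]
        diff = [a ^ b for a, b in zip(left, right)]
        if any(diff):
            # Halves differ: add half the length and recurse on their XOR.
            complexity += half
            s = diff
        else:
            # Halves agree: the sequence really has the shorter period.
            s = left
    return complexity + s[0]


def k_error_linear_complexity(period, k, zone=None):
    """Brute force: smallest linear complexity reachable by flipping at
    most k bits per period, with flips limited to `zone` (assumed meaning)."""
    positions = list(range(len(period))) if zone is None else list(zone)
    best = linear_complexity_pow2(period)
    for flips in range(1, k + 1):
        for chosen in combinations(positions, flips):
            candidate = list(period)
            for i in chosen:
                candidate[i] ^= 1
            best = min(best, linear_complexity_pow2(candidate))
    return best


print(linear_complexity_pow2([0, 0, 0, 1]))
print(k_error_linear_complexity([0, 0, 0, 1], 1))
print(k_error_linear_complexity([0, 0, 0, 1], 1, zone=[0, 1]))
```

The brute-force search grows combinatorially with the period and with $k$, which illustrates why a smaller search zone is attractive; the sketch is only practical for short periods.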

arXiv:1812.01713 [pdf, other]

FineFool: Fine Object Contour Attack via Attention

Authors: Jinyin Chen, Haibin Zheng, Hui Xiong, Mengmeng Su

Abstract: Machine learning models have been shown vulnerable to adversarial attacks launched by adversarial examples which are carefully crafted by attacker to defeat classifiers. Deep learning models cannot escape the attack either. Most of adversarial attack methods are focused on success rate or perturbations size, while we are more interested in the relationship between adversarial perturbation and the image itself. In this paper, we put forward a novel adversarial attack based on contour, named FineFool. FineFool not only has better attack performance compared with other state-of-art white-box attacks in aspect of higher attack success rate and smaller perturbation, but is also capable of visualizing the optimal adversarial perturbation via attention on object contour. To the best of our knowledge, FineFool is the first to combine the critical feature of the original clean image with the optimal perturbations in a visible manner. Inspired by the correlations between adversarial perturbations and object contour, slighter perturbations are produced via focusing on object contour features, which are more imperceptible and difficult to be defended, especially network add-on defense methods with the trade-off between perturbations filtering and contour feature loss. Compared with existing state-of-art attacks, extensive experiments are conducted to show that FineFool is capable of efficient attack against defensive deep models.

Submitted 1 December, 2018; originally announced December 2018.

arXiv:1705.08378 [pdf, other]

doi 10.1109/TDSC.2018.2874243

Detecting Adversarial Image Examples in Deep Networks with Adaptive Noise Reduction

Authors: Bin Liang, Hongcheng Li, Miaoqiang Su, Xirong Li, Wenchang Shi, Xiaofeng Wang

Abstract: Recently, many studies have demonstrated deep neural network (DNN) classifiers can be fooled by the adversarial example, which is crafted via introducing some perturbations into an original sample. Accordingly, some powerful defense techniques were proposed. However, existing defense techniques often require modifying the target model or depend on the prior knowledge of attacks. In this paper, we propose a straightforward method for detecting adversarial image examples, which can be directly deployed into unmodified off-the-shelf DNN models. We consider the perturbation to images as a kind of noise and introduce two classic image processing techniques, scalar quantization and smoothing spatial filter, to reduce its effect. The image entropy is employed as a metric to implement an adaptive noise reduction for different kinds of images. Consequently, the adversarial example can be effectively detected by comparing the classification results of a given sample and its denoised version, without referring to any prior knowledge of attacks. More than 20,000 adversarial examples against some state-of-the-art DNN models are used to evaluate the proposed method, which are crafted with different attack techniques. The experiments show that our detection method can achieve a high overall F1 score of 96.39% and certainly raises the bar for defense-aware attacks.

Submitted 8 January, 2019; v1 submitted 23 May, 2017; originally announced May 2017.

Comments: 14 pages, http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8482346&isnumber=4358699
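
The detection idea in the abstract above (denoise the input with scalar quantization and a smoothing filter, then compare the classifier's output before and after) can be sketched as follows. This is an illustrative simplification, not the authors' implementation: the entropy-driven adaptive choice of parameters is omitted and replaced by fixed `levels` and `radius` values, all function names are hypothetical, and `toy_classify` is a stand-in for a real DNN.

```python
def scalar_quantize(image, levels):
    """Map each 0..255 pixel to the midpoint of one of `levels` equal-width bins."""
    step = 256 / levels
    return [[int((int(p // step) + 0.5) * step) for p in row] for row in image]


def mean_filter(image, radius=1):
    """Smoothing spatial filter: each pixel becomes the integer mean of its
    (2*radius + 1)-wide neighbourhood, clipped at the image border."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            row.append(sum(window) // len(window))
        out.append(row)
    return out


def looks_adversarial(image, classify, levels=4, radius=1):
    """Flag the input when the label changes after denoising."""
    denoised = mean_filter(scalar_quantize(image, levels), radius)
    return classify(image) != classify(denoised)


def toy_classify(image):
    """Stand-in classifier (assumption): label 1 if any pixel is very bright."""
    return 1 if any(p >= 200 for row in image for p in row) else 0


# Hypothetical 3x3 grayscale inputs: a flat image and one with a single spike.
clean = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
perturbed = [[100, 100, 100], [100, 255, 100], [100, 100, 100]]

print(looks_adversarial(clean, toy_classify))
print(looks_adversarial(perturbed, toy_classify))
```

The detector needs only black-box access to `classify`, which matches the abstract's point that the target model does not have to be modified.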

arXiv:1704.08006 [pdf]

doi 10.24963/ijcai.2018/585

Deep Text Classification Can be Fooled

Authors: Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li, Wenchang Shi

Abstract: In this paper, we present an effective method to craft text adversarial samples, revealing one important yet underestimated fact that DNN-based text classifiers are also prone to adversarial sample attack. Specifically, confronted with different adversarial scenarios, the text items that are important for classification are identified by computing the cost gradients of the input (white-box attack) or generating a series of occluded test samples (black-box attack). Based on these items, we design three perturbation strategies, namely insertion, modification, and removal, to generate adversarial samples. The experiment results show that the adversarial samples generated by our method can successfully fool both state-of-the-art character-level and word-level DNN-based text classifiers. The adversarial samples can be perturbed to any desirable classes without compromising their utilities. At the same time, the introduced perturbation is difficult to be perceived.

Submitted 7 January, 2019; v1 submitted 26 April, 2017; originally announced April 2017.

Comments: 8 pages

Journal ref: https://www.ijcai.org/proceedings/2018/585

arXiv:1704.04429 [pdf, other]

3D seismic data denoising using two-dimensional sparse coding scheme

Authors: Ming-Jun Su, Jingbo Chang, Feng Qian, Guangmin Hu, Xiao-Yang Liu

Abstract: Seismic data denoising is vital to geophysical applications and the transform-based function method is one of the most widely used techniques. However, it is challenging to design a suitable sparse representation to express a transform-based function group due to the complexity of seismic data. In this paper, we apply a seismic data denoising method based on learning-type overcomplete dictionaries which uses two-dimensional sparse coding (2DSC). First, we model the input seismic data and dictionaries as third-order tensors and introduce tensor-linear combinations for data approximation. Second, we apply learning-type overcomplete dictionary, i.e., optimal sparse data representation is achieved through learning and training. Third, we exploit the alternating minimization algorithm to solve the optimization problem of seismic denoising. Finally we evaluate its denoising performance on synthetic seismic data and land data survey. Experiment results show that the two-dimensional sparse coding scheme reduces computational costs and enhances the signal-to-noise ratio.

Submitted 8 April, 2017; originally announced April 2017.

arXiv:1704.02446 [pdf, other]

doi 10.1190/geo2017-0524.1

Seismic facies recognition based on prestack data using deep convolutional autoencoder

Authors: Feng Qian, Miao Yin, Ming-Jun Su, Yaojun Wang, Guangmin Hu

Abstract: Prestack seismic data carries much useful information that can help us find more complex atypical reservoirs. Therefore, we are increasingly inclined to use prestack seismic data for seismic facies recognition. However, due to the inclusion of excessive redundancy, effective feature extraction from prestack seismic data becomes critical. In this paper, we consider seismic facies recognition based on prestack data as an image clustering problem in computer vision (CV) by thinking of each prestack seismic gather as a picture. We propose a convolutional autoencoder (CAE) network for deep feature learning from prestack seismic data, which is more effective than principal component analysis (PCA) in redundancy removing and valid information extraction. Then, using conventional classification or clustering techniques (e.g. K-means or self-organizing maps) on the extracted features, we can achieve seismic facies recognition. We applied our method to the prestack data from physical model and LZB region. The result shows that our approach is superior to the conventionals.

Submitted 8 April, 2017; originally announced April 2017.

Journal ref: GEOPHYSICS, 2018, 83(3): A39-A43

arXiv:1704.02445 [pdf, other]

Exact 3D seismic data reconstruction using Tubal-Alt-Min algorithm

Authors: Feng Qian, Quan Chen, Ming-Jun Su, Guang-Min Hu, Xiao-Yang Liu

Abstract: Data missing is a common issue in seismic data, and many methods have been proposed to solve it. In this paper, we present the low-tubal-rank tensor model and a novel tensor completion algorithm to recover 3D seismic data. This is a fast iterative algorithm, called Tubal-Alt-Min, which completes our 3D seismic data by exploiting the low-tubal-rank property expressed as the product of two much smaller tensors. Tubal-Alt-Min alternates between estimating those two tensors using least squares minimization. We evaluate its reconstruction performance both on synthetic seismic data and land data survey. The experimental results show that compared with the tensor nuclear norm minimization algorithm, Tubal-Alt-Min improves the reconstruction error by orders of magnitude.

Submitted 8 April, 2017; originally announced April 2017.

arXiv:1611.06459 [pdf, other]

Gendered Conversation in a Social Game-Streaming Platform

Authors: Supun Nakandala, Giovanni Luca Ciampaglia, Norman Makoto Su, Yong-Yeol Ahn

Abstract: Online social media and games are increasingly replacing offline social activities. Social media is now an indispensable mode of communication; online gaming is not only a genuine social activity but also a popular spectator sport. With support for anonymity and larger audiences, online interaction shrinks social and geographical barriers. Despite such benefits, social disparities such as gender inequality persist in online social media. In particular, online gaming communities have been criticized for persistent gender disparities and objectification. As gaming evolves into a social platform, persistence of gender disparity is a pressing question. Yet, there are few large-scale, systematic studies of gender inequality and objectification in social gaming platforms. Here we analyze more than one billion chat messages from Twitch, a social game-streaming platform, to study how the gender of streamers is associated with the nature of conversation. Using a combination of computational text analysis methods, we show that gendered conversation and objectification is prevalent in chats. Female streamers receive significantly more objectifying comments while male streamers receive more game-related comments. This difference is more pronounced for popular streamers. There also exists a large number of users who post only on female or male streams.
Employing a neural vector-space embedding (paragraph vector) method, we analyze gendered chat messages and create prediction models that (i) identify the gender of streamers based on messages posted in the channel and (ii) identify the gender a viewer prefers to watch based on their chat messages. Our findings suggest that disparities in social game-streaming platforms is a nuanced phenomenon that involves the gender of streamers as well as those who produce gendered and game-related conversation.

Submitted 22 November, 2016; v1 submitted 19 November, 2016; originally announced November 2016.

Comments: 10 pages, 7 figures, 5 tables

arXiv:1512.07805 [pdf, other]

RFP: A Remote Fetching Paradigm for RDMA-Accelerated Systems

Authors: Maomeng Su, Mingxing Zhang, Kang Chen, Yongwei Wu, Guoliang Li

Abstract: Remote Direct Memory Access (RDMA) is an efficient way to improve the performance of traditional client-server systems. Currently, there are two main design paradigms for RDMA-accelerated systems. The first allows the clients to directly operate the server's memory and totally bypasses the CPUs at server side. The second follows the traditional server-reply paradigm, which asks the server to write results back to the clients. However, the first method has to expose server's memory and needs tremendous re-design of upper-layer software, which is complex, unsafe, error-prone, and inefficient. The second cannot achieve high input/output operations per second (IOPS), because it employs out-bound RDMA-write at server side which is not efficient. We find that the performance of out-bound RDMA-write and in-bound RDMA-read is asymmetric and the latter is 5 times faster than the former. Based on this observation, we propose a novel design paradigm named Remote Fetching Paradigm (RFP). In RFP, the server is still responsible for processing requests from the clients. However, counter-intuitively, instead of sending results back to the clients through out-bound RDMA-write, the server only writes the results in local memory buffers, and the clients use in-bound RDMA-read to remotely fetch these results. Since in-bound RDMA-read achieves much higher IOPS than out-bound RDMA-write, our model is able to bring higher performance than the traditional models.
In order to prove the effectiveness of RFP, we design and implement an RDMA-accelerated in-memory key-value store following the RFP model. To further improve the IOPS, we propose an optimization mechanism that combines status checking and result fetching. Experiment results show that RFP can improve the IOPS by 160%~310% against state-of-the-art models for in-memory key-value stores.

Submitted 24 December, 2015; originally announced December 2015.

Comments: 11 pages, 10 figures; Key Words: RDMA and InfiniBand, Remote Fetching Paradigm, IOPS, and Small Data

arXiv:1005.4454 [pdf, other]

Montage: a grid portal and software toolkit for science-grade astronomical image mosaicking

Authors: Joseph C. Jacob, Daniel S. Katz, G. Bruce Berriman, John Good, Anastasia C. Laity, Ewa Deelman, Carl Kesselman, Gurmeet Singh, Mei-Hui Su, Thomas A. Prince, Roy Williams

Abstract: Montage is a portable software toolkit for constructing custom, science-grade mosaics by composing multiple astronomical images. The mosaics constructed by Montage preserve the astrometry (position) and photometry (intensity) of the sources in the input images. The mosaic to be constructed is specified by the user in terms of a set of parameters, including dataset and wavelength to be used, location and size on the sky, coordinate system and projection, and spatial sampling rate. Many astronomical datasets are massive, and are stored in distributed archives that are, in most cases, remote with respect to the available computational resources. Montage can be run on both single- and multi-processor computers, including clusters and grids. Standard grid tools are used to run Montage in the case where the data or computers used to construct a mosaic are located remotely on the Internet. This paper describes the architecture, algorithms, and usage of Montage as both a software toolkit and as a grid portal. Timing results are provided to show how Montage performance scales with number of processors on a cluster computer. In addition, we compare the performance of two methods of running Montage in parallel on a grid.

Submitted 24 May, 2010; originally announced May 2010.

Comments: 16 pages, 11 figures

Journal ref: Int. J. Computational Science and Engineering. 2009

Showing 1–35 of 35 results for author: Su, M