Showing 1–8 of 8 results for author: Xiong, N

Searching in archive stat.
  1. arXiv:2403.18127  [pdf, ps, other]

    cs.LG math.ST stat.ML

    A Correction of Pseudo Log-Likelihood Method

    Authors: Shi Feng, Nuoya Xiong, Zhijie Zhang, Wei Chen

    Abstract: Pseudo log-likelihood is a type of maximum likelihood estimation (MLE) method used in various fields including contextual bandits, influence maximization of social networks, and causal bandits. However, in previous literature \citep{li2017provably, zhang2022online, xiong2022combinatorial, feng2023combinatorial1, feng2023combinatorial2}, the log-likelihood function may not be bounded, which may res…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 7 pages

  2. arXiv:2310.01769  [pdf, other]

    cs.LG math.OC stat.ML

    How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization

    Authors: Nuoya Xiong, Lijun Ding, Simon S. Du

    Abstract: This paper rigorously shows how over-parameterization changes the convergence behaviors of gradient descent (GD) for the matrix sensing problem, where the goal is to recover an unknown low-rank ground-truth matrix from near-isotropic linear measurements. First, we consider the symmetric setting with the symmetric parameterization where $M^* \in \mathbb{R}^{n \times n}$ is a positive semi-definite…

    Submitted 24 November, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

  3. arXiv:2301.13392  [pdf, other]

    cs.LG math.OC stat.ML

    Combinatorial Causal Bandits without Graph Skeleton

    Authors: Shi Feng, Nuoya Xiong, Wei Chen

    Abstract: In combinatorial causal bandits (CCB), the learning agent chooses a subset of variables in each round to intervene and collects feedback from the observed variables to minimize expected regret or sample complexity. Previous works study this problem in both general causal models and binary generalized linear models (BGLMs). However, all of them require prior knowledge of causal graph structure. Thi…

    Submitted 16 September, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: 39 pages, 7 figures

  4. arXiv:2008.12522  [pdf, other]

    cs.LG stat.ML

    An Intelligent CNN-VAE Text Representation Technology Based on Text Semantics for Comprehensive Big Data

    Authors: Genggeng Liu, Canyang Guo, Lin Xie, Wenxi Liu, Naixue Xiong, Guolong Chen

    Abstract: In the era of big data, a large number of text data generated by the Internet has given birth to a variety of text representation methods. In natural language processing (NLP), text representation transforms text into vectors that can be processed by computer without losing the original semantic information. However, these methods are difficult to effectively extract the semantic features among wo…

    Submitted 28 August, 2020; originally announced August 2020.

  5. arXiv:2008.09951  [pdf]

    cs.LG stat.ML

    DSP: A Differential Spatial Prediction Scheme for Comprehensive real industrial datasets

    Authors: Junjie Zhang, Cong Zhang, Neal N. Xiong

    Abstract: Inverse Distance Weighted models (IDW) have been widely used for predicting and modeling multidimensional space in multimodal industrial processes. However, the more complex the structure of multidimensional space, the lower the performance of IDW models, and real industrial datasets tend to have more complex spatial structure. To solve this problem, a new framework for spatial prediction and mode…

    Submitted 22 August, 2020; originally announced August 2020.

    ACM Class: K.3.2

  6. arXiv:2001.05759  [pdf, other]

    cs.LG cs.DC stat.ML

    Smart Data driven Decision Trees Ensemble Methodology for Imbalanced Big Data

    Authors: Diego García-Gil, Salvador García, Ning Xiong, Francisco Herrera

    Abstract: Differences in data size per class, also known as imbalanced data distribution, have become a common problem affecting data quality. Big Data scenarios pose a new challenge to traditional imbalanced classification algorithms, since they are not prepared to work with such amount of data. Split data strategies and lack of data in the minority class due to the use of MapReduce paradigm have posed new…

    Submitted 3 September, 2021; v1 submitted 16 January, 2020; originally announced January 2020.

  7. arXiv:1911.00262  [pdf, other]

    cs.LG cs.CL cs.IR stat.ML

    Finding the most similar textual documents using Case-Based Reasoning

    Authors: Marko Mihajlovic, Ning Xiong

    Abstract: In recent years, huge amounts of unstructured textual data on the Internet are a big difficulty for AI algorithms to provide the best recommendations for users and their search queries. Since the Internet became widespread, a lot of research has been done in the field of Natural Language Processing (NLP) and machine learning. Almost every solution transforms documents into Vector Space Models (VSM…

    Submitted 1 November, 2019; originally announced November 2019.

  8. arXiv:1804.05774  [pdf, ps, other]

    cs.LG stat.ML

    BELIEF: A distance-based redundancy-proof feature selection method for Big Data

    Authors: Sergio Ramírez-Gallego, Salvador García, Ning Xiong, Francisco Herrera

    Abstract: With the advent of Big Data era, data reduction methods are highly demanded given its ability to simplify huge data, and ease complex learning processes. Concretely, algorithms that are able to filter relevant dimensions from a set of millions are of huge importance. Although effective, these techniques suffer from the "scalability" curse as well. In this work, we propose a distributed feature w…

    Submitted 16 April, 2018; originally announced April 2018.

    Comments: 30 pages, 6 figures