
Showing 1–32 of 32 results for author: Sun, K

Searching in archive stat.
  1. arXiv:2404.14052  [pdf, other]

    cs.CL stat.ME

    Differential contributions of machine learning and statistical analysis to language and cognitive sciences

    Authors: Kun Sun, Rong Wang

    Abstract: Data-driven approaches have revolutionized scientific research. Machine learning and statistical analysis are commonly utilized in this type of research. Despite their widespread use, these methodologies differ significantly in their techniques and objectives. Few studies have utilized a consistent dataset to demonstrate these differences within the social sciences, particularly in language and co…

    Submitted 22 April, 2024; originally announced April 2024.

  2. arXiv:2403.15822  [pdf, other]

    cs.CL stat.ML

    Computational Sentence-level Metrics Predicting Human Sentence Comprehension

    Authors: Kun Sun, Rong Wang

    Abstract: The majority of research in computational psycholinguistics has concentrated on the processing of words. This study introduces innovative methods for computing sentence-level metrics using multilingual large language models. The metrics developed, sentence surprisal and sentence relevance, are then tested and compared to validate whether they can predict how humans comprehend sentences as a whol…

    Submitted 15 April, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

  3. arXiv:2402.05379  [pdf, other]

    cs.LG stat.ML

    Tradeoffs of Diagonal Fisher Information Matrix Estimators

    Authors: Alexander Soen, Ke Sun

    Abstract: The Fisher information matrix characterizes the local geometry in the parameter space of neural networks. It elucidates insightful theories and useful tools to understand and optimize neural networks. Given its high computational cost, practitioners often use random estimators and evaluate only the diagonal entries. We examine two such estimators, whose accuracy and sample complexity depend on the…

    Submitted 2 April, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  4. arXiv:2303.15464  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Mathematical Challenges in Deep Learning

    Authors: Vahid Partovi Nia, Guojun Zhang, Ivan Kobyzev, Michael R. Metel, Xinlin Li, Ke Sun, Sobhan Hemati, Masoud Asgharian, Linglong Kong, Wulong Liu, Boxing Chen

    Abstract: Deep models have dominated the artificial intelligence (AI) industry since the ImageNet challenge in 2012. The size of deep models has been increasing ever since, which brings new challenges to this field with applications in cell phones, personal computers, autonomous cars, and wireless base stations. Here we list a set of problems, ranging from training, inference, generalization bound, and optimizati…

    Submitted 24 March, 2023; originally announced March 2023.

  5. arXiv:2211.01703  [pdf, other]

    cs.GT cs.IT cs.LG math.ST stat.ML

    $2 \times 2$ Zero-Sum Games with Commitments and Noisy Observations

    Authors: Ke Sun, Samir M. Perlaza, Alain Jean-Marie

    Abstract: In this paper, $2\times2$ zero-sum games are studied under the following assumptions: $(1)$ One of the players (the leader) commits to choose its actions by sampling a given probability measure (strategy); $(2)$ The leader announces its action, which is observed by its opponent (the follower) through a binary channel; and $(3)$ the follower chooses its strategy based on the knowledge of the leader…

    Submitted 11 May, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: Accepted by 2023 IEEE Int. Symp. on Information Theory (ISIT)

  6. arXiv:2204.01831  [pdf, other]

    stat.ME

    An adaptive model checking test for functional linear model

    Authors: Enze Shi, Yi Liu, Ke Sun, Lingzhu Li, Linglong Kong

    Abstract: Numerous studies have been devoted to the estimation and inference problems for functional linear models (FLM). However, few works focus on the model checking problem that ensures the reliability of results. Limited tests in this area do not have tractable null distributions or asymptotic analysis under alternatives. Also, the functional predictor is usually assumed to be fully observed, which is impr…

    Submitted 5 June, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

  7. arXiv:2202.00769  [pdf, other]

    cs.LG stat.ML

    Distributional Reinforcement Learning by Sinkhorn Divergence

    Authors: Ke Sun, Yingnan Zhao, Wulong Liu, Bei Jiang, Linglong Kong

    Abstract: The empirical success of distributional reinforcement learning (RL) highly depends on the distribution representation and the choice of distribution divergence. In this paper, we propose Sinkhorn distributional RL (SinkhornDRL) that learns unrestricted statistics from return distributions and leverages Sinkhorn divergence to minimize the difference between current and target Bellman retur…

    Submitted 2 February, 2024; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: arXiv admin note: text overlap with arXiv:2110.03155

  8. arXiv:2201.12947  [pdf, other]

    stat.ML cs.LG

    Fair Wrapping for Black-box Predictions

    Authors: Alexander Soen, Ibrahim Alabdulmohsin, Sanmi Koyejo, Yishay Mansour, Nyalleng Moorosi, Richard Nock, Ke Sun, Lexing Xie

    Abstract: We introduce a new family of techniques to post-process ("wrap") a black-box classifier in order to reduce its bias. Our technique builds on the recent analysis of improper loss functions whose optimization can correct any twist in prediction, unfairness being treated as a twist. In the post-processing, we learn a wrapper function which we define as an $α$-tree, which modifies the prediction. We p…

    Submitted 1 November, 2022; v1 submitted 30 January, 2022; originally announced January 2022.

    Comments: Published in Advances in Neural Information Processing Systems 35 (NeurIPS 2022)

  9. arXiv:2107.04205  [pdf, other]

    cs.LG stat.ML

    On the Variance of the Fisher Information for Deep Learning

    Authors: Alexander Soen, Ke Sun

    Abstract: In the realm of deep learning, the Fisher information matrix (FIM) gives novel insights and useful tools to characterize the loss landscape, perform second-order optimization, and build geometric learning theories. The exact FIM is either unavailable in closed form or too expensive to compute. In practice, it is almost always estimated based on empirical samples. We investigate two such estimators…

    Submitted 27 October, 2021; v1 submitted 9 July, 2021; originally announced July 2021.

    Comments: Published in Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

  10. arXiv:2006.07841  [pdf, other]

    cs.LG stat.ML

    Classify and Generate Reciprocally: Simultaneous Positive-Unlabelled Learning and Conditional Generation with Extra Data

    Authors: Bing Yu, Ke Sun, He Wang, Zhouchen Lin, Zhanxing Zhu

    Abstract: The scarcity of class-labeled data is a ubiquitous bottleneck in many machine learning problems. While abundant unlabeled data typically exist and provide a potential solution, it is highly challenging to exploit them. In this paper, we address this problem by leveraging Positive-Unlabeled (PU) classification and the conditional generation with extra unlabeled data simultaneously. In partic…

    Submitted 8 February, 2024; v1 submitted 14 June, 2020; originally announced June 2020.

  11. arXiv:2001.00672  [pdf, other]

    eess.SP eess.SY stat.ME

    A Two-Stage Batch Algorithm for Nonlinear Static Parameter Estimation

    Authors: Kerry Sun, Demoz Gebre-Egziabher

    Abstract: A two-stage batch estimation algorithm for solving a class of nonlinear, static parameter estimation problems that appear in aerospace engineering applications is proposed. It is shown how these problems can be recast into a form suitable for the proposed two-stage estimation process. In the first stage, linear least squares is used to obtain a subset of the unknown parameters (set 1), while a res…

    Submitted 2 January, 2020; originally announced January 2020.

    Comments: Accepted by AIAA Journal of Guidance, Control and Dynamics, Dec 2019

  12. arXiv:1911.12463  [pdf, other]

    cs.LG stat.ML

    Information-Geometric Set Embeddings (IGSE): From Sets to Probability Distributions

    Authors: Ke Sun, Frank Nielsen

    Abstract: This letter introduces an abstract learning problem called the "set embedding": The objective is to map sets into probability distributions so as to lose less information. We relate set union and intersection operations with corresponding interpolations of probability distributions. We also demonstrate a preliminary solution with experimental results on toy set embedding examples.

    Submitted 11 December, 2019; v1 submitted 27 November, 2019; originally announced November 2019.

    Comments: To be presented at Sets & Partitions (NeurIPS 2019 workshop)

  13. arXiv:1911.09307  [pdf, other]

    cs.LG stat.ML

    Patch-level Neighborhood Interpolation: A General and Effective Graph-based Regularization Strategy

    Authors: Ke Sun, Bing Yu, Zhouchen Lin, Zhanxing Zhu

    Abstract: Regularization plays a crucial role in machine learning models, especially for deep neural networks. The existing regularization techniques mainly rely on the i.i.d. assumption and only consider the knowledge from the current sample, without leveraging the neighboring relationship between samples. In this work, we propose a general regularizer called Patch-level Neighborhood Interpola…

    Submitted 22 October, 2023; v1 submitted 21 November, 2019; originally announced November 2019.

    Comments: Accepted in ACML 2023 conference track

  14. arXiv:1908.06278  [pdf, other]

    cs.LG q-bio.GN stat.ML

    Integrated Multi-omics Analysis Using Variational Autoencoders: Application to Pan-cancer Classification

    Authors: Xiaoyu Zhang, Jingqing Zhang, Kai Sun, Xian Yang, Chengliang Dai, Yike Guo

    Abstract: Different aspects of a clinical sample can be revealed by multiple types of omics data. Integrated analysis of multi-omics data provides a comprehensive view of patients, which has the potential to facilitate more accurate clinical decision making. However, omics data are normally high dimensional with a large number of molecular features and a relatively small number of available samples with clinica…

    Submitted 17 August, 2019; originally announced August 2019.

    Comments: 7 pages, 4 figures

    Journal ref: 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

  15. arXiv:1908.05081  [pdf, other]

    cs.LG stat.ML

    AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models

    Authors: Ke Sun, Zhanxing Zhu, Zhouchen Lin

    Abstract: The design of deep graph models still remains to be investigated and the crucial part is how to explore and exploit the knowledge from different hops of neighbors in an efficient way. In this paper, we propose a novel RNN-like deep graph neural network architecture by incorporating AdaBoost into the computation of network; and the proposed graph convolutional network called AdaGCN (Adaboosting Gra…

    Submitted 15 March, 2021; v1 submitted 14 August, 2019; originally announced August 2019.

    Comments: Published at the International Conference on Learning Representations (ICLR) 2021

  16. arXiv:1907.06582  [pdf, other]

    cs.LG stat.ML

    AMAD: Adversarial Multiscale Anomaly Detection on High-Dimensional and Time-Evolving Categorical Data

    Authors: Zheng Gao, Lin Guo, Chi Ma, Xiao Ma, Kai Sun, Hang Xiang, Xiaoqiang Zhu, Hongsong Li, Xiaozhong Liu

    Abstract: Anomaly detection is facing emerging challenges in many important industry domains, such as cyber security and online recommendation and advertising. The recent trend in these areas calls for anomaly detection on time-evolving data with high-dimensional categorical features without labeled samples. Also, there is an increasing demand for identifying and monitoring irregular patterns at multip…

    Submitted 12 July, 2019; originally announced July 2019.

    Comments: Accepted by 2019 KDD Workshop on Deep Learning Practice for High-Dimensional Sparse Data

  17. arXiv:1905.11027  [pdf, other]

    cs.LG stat.ML

    A Geometric Modeling of Occam's Razor in Deep Learning

    Authors: Ke Sun, Frank Nielsen

    Abstract: Why do deep neural networks (DNNs) benefit from very high dimensional parameter spaces? Their huge parameter complexities vs. stunning performances in practice is all the more intriguing and not explainable using the standard theory of model selection for regular models. In this work, we propose a geometrically flavored information-theoretic approach to study this phenomenon. Namely, we introduce…

    Submitted 6 June, 2024; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: This work first appeared under the former title "Lightlike Neuromanifolds, Occam's Razor and Deep Learning"

  18. arXiv:1903.04154  [pdf, other]

    cs.LG stat.ML

    Fisher-Bures Adversary Graph Convolutional Networks

    Authors: Ke Sun, Piotr Koniusz, Zhen Wang

    Abstract: In a graph convolutional network, we assume that the graph $G$ is generated wrt some observation noise. During learning, we make small random perturbations $ΔG$ of the graph and try to improve generalization. Based on quantum information geometry, $ΔG$ can be characterized by the eigendecomposition of the graph Laplacian matrix. We try to minimize the loss wrt the perturbed $G+Δ{G}$ while making…

    Submitted 30 June, 2019; v1 submitted 11 March, 2019; originally announced March 2019.

    Comments: Published in UAI 2019

  19. arXiv:1902.11045  [pdf, other]

    cs.LG stat.ML

    Virtual Adversarial Training on Graph Convolutional Networks in Node Classification

    Authors: Ke Sun, Zhouchen Lin, Hantao Guo, Zhanxing Zhu

    Abstract: The effectiveness of Graph Convolutional Networks (GCNs) has been demonstrated in a wide range of graph-based machine learning tasks. However, the update of parameters in GCNs is only from labeled nodes, lacking the utilization of unlabeled data. In this paper, we apply Virtual Adversarial Training (VAT), an adversarial regularization method based on both labeled and unlabeled data, on the supervi…

    Submitted 20 February, 2020; v1 submitted 28 February, 2019; originally announced February 2019.

    Comments: Chinese Conference on Pattern Recognition and Computer Vision (PRCV) 2019 Oral paper

  20. arXiv:1902.11038  [pdf, other]

    cs.LG stat.ML

    Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labels

    Authors: Ke Sun, Zhouchen Lin, Zhanxing Zhu

    Abstract: Graph Convolutional Networks (GCNs) play a crucial role in graph learning tasks; however, learning graph embedding with few supervised signals is still a difficult problem. In this paper, we propose a novel training algorithm for Graph Convolutional Network, called Multi-Stage Self-Supervised (M3S) Training Algorithm, combined with self-supervised learning approach, focusing on improving the general…

    Submitted 20 February, 2020; v1 submitted 28 February, 2019; originally announced February 2019.

    Comments: AAAI Conference on Artificial Intelligence (AAAI 2020)

  21. arXiv:1902.11029  [pdf, other]

    cs.LG stat.ML

    Enhancing the Robustness of Deep Neural Networks by Boundary Conditional GAN

    Authors: Ke Sun, Zhanxing Zhu, Zhouchen Lin

    Abstract: Deep neural networks have been widely deployed in various machine learning tasks. However, recent works have demonstrated that they are vulnerable to adversarial examples: carefully crafted small perturbations to cause misclassification by the network. In this work, we propose a novel defense mechanism called Boundary Conditional GAN to enhance the robustness of deep neural networks against advers…

    Submitted 28 February, 2019; originally announced February 2019.

  22. arXiv:1902.11019  [pdf, other]

    cs.LG stat.ML

    Towards Understanding Adversarial Examples Systematically: Exploring Data Size, Task and Model Factors

    Authors: Ke Sun, Zhanxing Zhu, Zhouchen Lin

    Abstract: Most previous works usually explained adversarial examples from several specific perspectives, lacking relatively integral comprehension about this problem. In this paper, we present a systematic study on adversarial examples from three aspects: the amount of training data, task-dependent and model-specific factors. Particularly, we show that adversarial generalization (i.e. test accuracy on adver…

    Submitted 28 February, 2019; originally announced February 2019.

  23. arXiv:1812.09427  [pdf]

    cs.LG cs.CV stat.ML

    Image Embedding of PMU Data for Deep Learning towards Transient Disturbance Classification

    Authors: Yongli Zhu, Chengxi Liu, Kai Sun

    Abstract: This paper presents a study on power grid disturbance classification by Deep Learning (DL). A real synchrophasor set composed of three different types of disturbance events from the Frequency Monitoring Network (FNET) is used. An image embedding technique called Gramian Angular Field is applied to transform each time series of event data to a two-dimensional image for learning. Two main DL algori…

    Submitted 21 December, 2018; originally announced December 2018.

    Comments: An updated version of this manuscript has been accepted by the 2018 IEEE International Conference on Energy Internet (ICEI), Beijing, China

  24. arXiv:1812.08113  [pdf, other]

    cs.LG stat.ML

    On The Chain Rule Optimal Transport Distance

    Authors: Frank Nielsen, Ke Sun

    Abstract: We define a novel class of distances between statistical multivariate distributions by modeling an optimal transport problem on their marginals with respect to a ground distance defined on their conditionals. These new distances are metrics whenever the ground distance between the marginals is a metric, generalize both the Wasserstein distances between discrete measures and a recently introduced m…

    Submitted 2 November, 2020; v1 submitted 19 December, 2018; originally announced December 2018.

    Comments: 23 pages, 6 figures

  25. arXiv:1811.02629  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

    Authors: Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, Marcel Prastawa, Esther Alberts, Jana Lipkova, John Freymann, Justin Kirby, Michel Bilello, Hassan Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Benedikt Wiestler, Rivka Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, et al. (402 additional authors not shown)

    Abstract: Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles dissem…

    Submitted 23 April, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: The International Multimodal Brain Tumor Segmentation (BraTS) Challenge

  26. arXiv:1811.01464  [pdf, other]

    cs.LG stat.ML

    Intrinsic Universal Measurements of Non-linear Embeddings

    Authors: Ke Sun

    Abstract: A basic problem in machine learning is to find a mapping $f$ from a low dimensional latent space $\mathcal{Y}$ to a high dimensional observation space $\mathcal{X}$. Modern tools such as deep neural networks are capable to represent general non-linear mappings. A learner can easily find a mapping which perfectly fits all the observations. However, such a mapping is often not considered as good, be…

    Submitted 1 August, 2022; v1 submitted 4 November, 2018; originally announced November 2018.

  27. arXiv:1809.02270  [pdf, other]

    cs.LG cs.SE cs.SI stat.ML

    Learning Embeddings of Directed Networks with Text-Associated Nodes---with Applications in Software Package Dependency Networks

    Authors: Kexuan Sun, Shudan Zhong, Hong Xu

    Abstract: A network embedding consists of a vector representation for each node in the network. Its usefulness has been shown in many real-world application domains, such as social networks and web networks. Directed networks with text associated with each node, such as software package dependency networks, are commonplace. However, to the best of our knowledge, their embeddings have hitherto not been speci…

    Submitted 26 November, 2020; v1 submitted 6 September, 2018; originally announced September 2018.

    Comments: 10 pages, 6 figures, 3 tables. 2020 BigGraphs Workshop at IEEE BigData 2020

  28. arXiv:1806.11311  [pdf, other]

    cs.LG cs.CV stat.ML

    Guaranteed Deterministic Bounds on the Total Variation Distance between Univariate Mixtures

    Authors: Frank Nielsen, Ke Sun

    Abstract: The total variation distance is a core statistical distance between probability measures that satisfies the metric axioms, with value always falling in $[0,1]$. This distance plays a fundamental role in machine learning and signal processing: It is a member of the broader class of $f$-divergences, and it is related to the probability of error in Bayesian hypothesis testing. Since the total variati…

    Submitted 29 June, 2018; originally announced June 2018.

    Comments: 11 pages, 2 figures

  29. arXiv:1806.08541  [pdf, other]

    stat.ML cs.LG

    Visualizing and Understanding Deep Neural Networks in CTR Prediction

    Authors: Lin Guo, Hui Ye, Wenbo Su, Henhuan Liu, Kai Sun, Hang Xiang

    Abstract: Although deep learning techniques have been successfully applied to many tasks, interpreting deep neural network models is still a big challenge to us. Recently, many works have been done on visualizing and analyzing the mechanism of deep neural networks in the areas of image processing and natural language processing. In this paper, we present our approaches to visualize and understand deep neura…

    Submitted 22 June, 2018; originally announced June 2018.

    Comments: Accepted by 2018 SIGIR Workshop on eCommerce

  30. arXiv:1806.00149  [pdf, other]

    cs.NE cs.LG stat.ML

    q-Neurons: Neuron Activations based on Stochastic Jackson's Derivative Operators

    Authors: Frank Nielsen, Ke Sun

    Abstract: We propose a new generic type of stochastic neurons, called $q$-neurons, that considers activation functions based on Jackson's $q$-derivatives with stochastic parameters $q$. Our generalization of neural network architectures with $q$-neurons is shown to be both scalable and very easy to implement. We demonstrate experimentally consistently improved performances over state-of-the-art standard act…

    Submitted 13 June, 2018; v1 submitted 31 May, 2018; originally announced June 2018.

    Comments: 12 pages, 5 figures, 1 table

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems (2020)

  31. arXiv:1606.05850  [pdf, other]

    cs.LG cs.IT stat.ML

    Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities

    Authors: Frank Nielsen, Ke Sun

    Abstract: Information-theoretic measures such as the entropy, cross-entropy and the Kullback-Leibler divergence between two mixture models is a core primitive in many signal processing tasks. Since the Kullback-Leibler divergence of mixtures provably does not admit a closed-form formula, it is in practice either estimated using costly Monte-Carlo stochastic integration, approximated, or bounded using variou…

    Submitted 16 August, 2016; v1 submitted 19 June, 2016; originally announced June 2016.

    Comments: 20 pages, 3 figures

  32. arXiv:1405.2798  [pdf, other]

    cs.LG cs.AI stat.ML

    Two-Stage Metric Learning

    Authors: Jun Wang, Ke Sun, Fei Sha, Stephane Marchand-Maillet, Alexandros Kalousis

    Abstract: In this paper, we present a novel two-stage metric learning algorithm. We first map each learning instance to a probability distribution by computing its similarities to a set of fixed anchor points. Then, we define the distance in the input data space as the Fisher information distance on the associated statistical manifold. This induces in the input data space a new family of distance metric wit…

    Submitted 12 May, 2014; originally announced May 2014.

    Comments: Accepted for publication in ICML 2014