Showing 1–5 of 5 results for author: Ji, P

  1. Recent Advances in Text Analysis

    Authors: Zheng Tracy Ke, Pengsheng Ji, Jiashun Jin, Wanshan Li

    Abstract: Text analysis is an interesting research area in data science and has various applications, such as in artificial intelligence, biomedical research, and engineering. We review popular methods for text analysis, ranging from topic modeling to the recent neural language models. In particular, we review Topic-SCORE, a statistical approach to topic modeling, and discuss how to use it to analyze MADSta…

    Submitted 7 February, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

    Journal ref: Annual Review of Statistics and Its Application 2024 11:1

  2. arXiv:2008.03820  [pdf, other]

    stat.ML cs.LG cs.SI math.ST

    Spectral Algorithms for Community Detection in Directed Networks

    Authors: Zhe Wang, Yingbin Liang, Pengsheng Ji

    Abstract: Community detection in large social networks is affected by degree heterogeneity of nodes. The D-SCORE algorithm for directed networks was introduced to reduce this effect by taking the element-wise ratios of the singular vectors of the adjacency matrix before clustering. Meaningful results were obtained for the statistician citation network, but a rigorous analysis of its performance was missing. F…

    Submitted 9 August, 2020; originally announced August 2020.

    Comments: Journal of Machine Learning Research 2020, to appear

    Journal ref: Journal of Machine Learning Research 2020. (153):1-45

  3. arXiv:1809.10804  [pdf, other]

    cs.CL cs.LG stat.ML

    Patient Risk Assessment and Warning Symptom Detection Using Deep Attention-Based Neural Networks

    Authors: Ivan Girardi, Pengfei Ji, An-phi Nguyen, Nora Hollenstein, Adam Ivankay, Lorenz Kuhn, Chiara Marchiori, Ce Zhang

    Abstract: We present an operational component of a real-world patient triage system. Given a specific patient presentation, the system is able to assess the level of medical urgency and issue the most appropriate recommendation in terms of best point of care and time to treat. We use an attention-based convolutional neural network architecture trained on 600,000 doctor notes in German. We compare two approa…

    Submitted 27 September, 2018; originally announced September 2018.

    Comments: 10 pages, 2 figures, EMNLP workshop LOUHI 2018

  4. arXiv:1410.2840  [pdf, other]

    stat.AP cs.DL physics.soc-ph stat.ME

    Coauthorship and Citation Networks for Statisticians

    Authors: Pengsheng Ji, Jiashun Jin

    Abstract: We have collected and cleaned two network data sets: Coauthorship and Citation networks for statisticians. The data sets are based on all research papers published in four of the top journals in statistics from 2003 to the first half of 2012. We analyze the data sets from many different perspectives, focusing on (a) centrality, (b) community structures, and (c) productivity, patterns and trend…

    Submitted 2 July, 2015; v1 submitted 10 October, 2014; originally announced October 2014.

    MSC Class: 91C20; 62H30; 62P25

    Journal ref: Annals of Applied Statistics 2016, 10(4): 1779-1812

  5. arXiv:1404.2961  [pdf, other]

    stat.ME

    Rate optimal multiple testing procedure in high-dimensional regression

    Authors: Pengsheng Ji, Zhigen Zhao

    Abstract: In high-dimensional regression analysis, when the number of predictors is much larger than the sample size, an important question is how to select the important variables that are relevant to the response variable of interest. Variable selection and multiple testing are both tools to address this issue. However, there is little discussion of the connection between these two areas. When the signal st…

    Submitted 6 January, 2023; v1 submitted 10 April, 2014; originally announced April 2014.

    Comments: 26 pages