Showing 1–7 of 7 results for author: Chou, E

Searching in archive stat.
  1. arXiv:2211.10926  [pdf, other]

    stat.AP physics.soc-ph q-bio.PE

    Unraveling implicit human behavioral effects on dynamic characteristics of Covid-19 daily infection rates in Taiwan

    Authors: Ting-Li Chen, Elizabeth P. Chou, Min-Yi Chen, Hsieh Fushing

    Abstract: We study Covid-19 spreading dynamics underlying 84 curves of daily Covid-19 infection rates pertaining to 84 districts belonging to the largest seven cities in Taiwan during her pristine surge period. Our computational developments begin with selecting and extracting 18 features from each smoothed district-specific curve. This step of computing effort allows unstructured data to be converted into…

    Submitted 20 November, 2022; originally announced November 2022.

  2. Learned practical guidelines for evaluating Conditional Entropy and Mutual Information in discovering major factors of response-vs-covariate dynamics

    Authors: Ting-Li Chen, Hsieh Fushing, Elizabeth P. Chou

    Abstract: We reformulate and reframe a series of increasingly complex parametric statistical topics into a framework of response-vs-covariate (Re-Co) dynamics that is described without any explicit functional structures. Then we resolve these topics' data analysis tasks by discovering major factors underlying such Re-Co dynamics by only making use of data's categorical nature. The major factor selection pro…

    Submitted 6 September, 2022; originally announced September 2022.

  3. arXiv:2209.02623  [pdf, other]

    stat.ME stat.AP

    Multiscale major factor selections for complex system data with structural dependency and heterogeneity

    Authors: Hsieh Fushing, Elizabeth Chou, Ting-Li Chen

    Abstract: Based on structured data derived from large complex systems, we computationally further develop and refine a major factor selection protocol by accommodating structural dependency and heterogeneity among many features to unravel data's information content. Two operational concepts: ``de-associating'' and its counterpart ``shadowing'' that play key roles in our protocol, are reasoned, explained, an…

    Submitted 6 September, 2022; originally announced September 2022.

  4. arXiv:2007.15039  [pdf, other]

    stat.AP stat.ME stat.ML

    Extreme-K categorical samples problem

    Authors: Elizabeth Chou, Catie McVey, Yin-Chen Hsieh, Sabrina Enriquez, Fushing Hsieh

    Abstract: With histograms as its foundation, we develop Categorical Exploratory Data Analysis (CEDA) under the extreme-$K$ sample problem, and illustrate its universal applicability through four 1D categorical datasets. Given a sizable $K$, CEDA's ultimate goal amounts to discover by data's information content via carrying out two data-driven computational tasks: 1) establish a tree geometry upon $K$ popula…

    Submitted 29 July, 2020; originally announced July 2020.

    Comments: 20 pages, 12 figures

  5. arXiv:2007.00077  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Similarity Search for Efficient Active Learning and Search of Rare Concepts

    Authors: Cody Coleman, Edward Chou, Julian Katz-Samuels, Sean Culatana, Peter Bailis, Alexander C. Berg, Robert Nowak, Roshan Sumbaly, Matei Zaharia, I. Zeki Yalniz

    Abstract: Many active learning and search approaches are intractable for large-scale industrial settings with billions of unlabeled examples. Existing approaches search globally for the optimal examples to label, scaling linearly or even quadratically with the unlabeled data. In this paper, we improve the computational efficiency of active learning and search methods by restricting the candidate pool for la…

    Submitted 22 July, 2021; v1 submitted 30 June, 2020; originally announced July 2020.

  6. arXiv:2006.14411  [pdf, other]

    stat.AP stat.ME

    Categorical Exploratory Data Analysis: From Multiclass Classification and Response Manifold Analytics perspectives of baseball pitching dynamics

    Authors: Fushing Hsieh, Elizabeth P. Chou

    Abstract: From two coupled Multiclass Classification (MCC) and Response Manifold Analytics (RMA) perspectives, we develop Categorical Exploratory Data Analysis (CEDA) on PITCHf/x database for the information content of Major League Baseball's (MLB) pitching dynamics. MCC and RMA information contents are represented by one collection of multi-scales pattern categories from mixing geometries and one collectio…

    Submitted 25 June, 2020; originally announced June 2020.

  7. arXiv:1711.03346  [pdf, other]

    stat.AP stat.ML

    Dimension Reduction of High-Dimensional Datasets Based on Stepwise SVM

    Authors: Elizabeth P. Chou, Tzu-Wei Ko

    Abstract: The current study proposes a dimension reduction method, stepwise support vector machine (SVM), to reduce the dimensions of large p small n datasets. The proposed method is compared with other dimension reduction methods, namely, the Pearson product difference correlation coefficient (PCCs), recursive feature elimination based on random forest (RF-RFE), and principal component analysis (PCA), by u…

    Submitted 9 November, 2017; originally announced November 2017.