Search | arXiv e-print repository

Tabular Embedding Model (TEM): Finetuning Embedding Models For Tabular RAG Applications

Abstract: In recent times Large Language Models have exhibited tremendous capabilities, especially in the areas of mathematics, code generation and general-purpose reasoning. However for specialized domains especially in applications that require parsing and analyzing large chunks of numeric or tabular data even state-of-the-art (SOTA) models struggle. In this paper, we introduce a new approach to solving domain-specific tabular data analysis tasks by presenting a unique RAG workflow that mitigates the scalability issues of existing tabular LLM solutions. Specifically, we present Tabular Embedding Model (TEM), a novel approach to fine-tune embedding models for tabular Retrieval-Augmentation Generation (RAG) applications. Embedding models form a crucial component in the RAG workflow and even current SOTA embedding models struggle as they are predominantly trained on textual datasets and thus underperform in scenarios involving complex tabular data. The evaluation results showcase that our approach not only outperforms current SOTA embedding models in this domain but also does so with a notably smaller and more efficient model structure.

Submitted 28 April, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2403.19477 [pdf, other]

Real-time Geoinformation Systems to Improve the Quality, Scalability, and Cost of Internet of Things for Agri-environment Research

Authors: Bryan C. Runck, Bobby Schulz, Jeff Bishop, Nathan Carlson, Bryan Chantigian, Gary Deters, Jesse Erdmann, Patrick M. Ewing, Michael Felzan, Xiao Fu, Jan Greyling, Christopher J. Hogan, Andrew Hollman, Ali Joglekar, Kris Junker, Michael Kantar, Lumbani Kaunda, Mohana Krishna, Benjamin Lynch, Peter Marchetto, Megan Marsolek, Troy McKay, Brad Morris, Ali Rashid Niaghi, Keerthi Pamulaparthy, et al. (19 additional authors not shown)

Abstract: With the increasing emphasis on machine learning and artificial intelligence to drive knowledge discovery in the agricultural sciences, spatial internet of things (IoT) technologies have become increasingly important for collecting real-time, high resolution data for these models. However, managing large fleets of devices while maintaining high data quality remains an ongoing challenge as scientists iterate from prototype to mature end-to-end applications. Here, we provide a set of case studies using the framework of technology readiness levels for an open source spatial IoT system. The spatial IoT systems underwent 3 major and 14 minor system versions, had over 2,727 devices manufactured both in academic and commercial contexts, and are either in active or planned deployment across four continents. Our results show the evolution of a generalizable, open source spatial IoT system designed for agricultural scientists, and provide a model for academic researchers to overcome the challenges that exist in going from one-off prototypes to thousands of internet-connected devices.

Submitted 2 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

Comments: 20 pages, 5 figures, 1 table

arXiv:2311.07762 [pdf, other]

Finite Mixtures of Multivariate Poisson-Log Normal Factor Analyzers for Clustering Count Data

Authors: Andrea Payne, Anjali Silva, Steven J. Rothstein, Paul D. McNicholas, Sanjeena Subedi

Abstract: A mixture of multivariate Poisson-log normal factor analyzers is introduced by imposing constraints on the covariance matrix, which resulted in flexible models for clustering purposes. In particular, a class of eight parsimonious mixture models based on the mixtures of factor analyzers model are introduced. Variational Gaussian approximation is used for parameter estimation, and information criteria are used for model selection. The proposed models are explored in the context of clustering discrete data arising from RNA sequencing studies. Using real and simulated data, the models are shown to give favourable clustering performance. The GitHub R package for this work is available at https://github.com/anjalisilva/mixMPLNFA and is released under the open-source MIT license.

Submitted 13 November, 2023; originally announced November 2023.

Comments: 29 pages, 2 figures

MSC Class: 62H30

arXiv:2310.02056 [pdf, other]

Leveraging Data-Driven Models for Accurate Analysis of Grid-Tied Smart Inverters Dynamics

Authors: Sunil Subedi, Nischal Guruwacharya, Bidur Poudel, Jesus D. Vasquez-Plaza, Fabio Andrade, Robert Fourney, Hossein Moradi Rekabdarkolaee, Timothy M. Hansen, Reinaldo Tonkoski

Abstract: The integration of power electronic converters (PECs) and distributed energy resources (DERs) in modern power systems has introduced dynamism and complexity. Accurate simulation becomes essential to comprehend the influence of converter domination on the power grid. This study addresses the fast-switching and stochastic behaviors exhibited by inverter-based resources in converter-dominated power systems, highlighting the necessity for precise analytical models. In the realm of modeling real-world systems, multiple methodologies exist. Notably, black-box and data-driven system identification techniques are employed to construct PEC models using experimental data, without relying on a priori knowledge of the internal system physics. This approach entails a systematic process of model class selection, parameter estimation, and model validation. While a range of linear and nonlinear model structures and estimation algorithms are at our disposal, it remains imperative to harness creativity and a profound understanding of the physical system to craft data-driven models that align seamlessly with their intended applications. These applications may encompass simulation, prediction, control, or fault detection. This report offers valuable insights into the collection of datasets from commercial off-the-shelf inverters, along with the presentation of intricate simulation models.

Submitted 3 October, 2023; originally announced October 2023.

Comments: 9 pages, 7 figures

arXiv:2307.15766 [pdf, other]

Automated Data-Driven Model Extraction and Validation of Inverter Dynamics with Grid Support Function

Authors: Sunil Subedi, Bidur Poudel, Pooja Aslami, Robert Fourney, Hossein Moradi Rekabdarkolaee, Reinaldo Tonkoski, Timothy M. Hansen

Abstract: This research focuses on the evolving dynamics of the power grid, where traditional synchronous generators are being replaced by non-synchronous power electronic converter (PEC)-interfaced renewable energy sources. The non-linear dynamics must be accurately modeled to ensure the stability of future converter-dominated power systems (CDPS). However, obtaining comprehensive dynamic models becomes more complex and computationally intensive as the system grows. This study proposes a scalable and automated data-driven partitioned modeling framework for CDPS dynamics. The method constructs reduced-ordered dynamic linear transfer function models using input-output measurements from a PEC switching model. Validation experiments were conducted on single-house and multi-house scenarios, demonstrating high accuracy (over 97%) and significant computational speed improvements (6.5 times faster) compared to comprehensive models. This framework and modeling approach offers valuable insights for efficient analysis of power system dynamics, aiding in planning, operation, and dispatch.

Submitted 28 July, 2023; originally announced July 2023.

arXiv:2305.10254 [pdf, other]

SAM for Poultry Science

Authors: Xiao Yang, Haixing Dai, Zihao Wu, Ramesh Bist, Sachin Subedi, Jin Sun, Guoyu Lu, Changying Li, Tianming Liu, Lilong Chai

Abstract: In recent years, the agricultural industry has witnessed significant advancements in artificial intelligence (AI), particularly with the development of large-scale foundational models. Among these foundation models, the Segment Anything Model (SAM), introduced by Meta AI Research, stands out as a groundbreaking solution for object segmentation tasks. While SAM has shown success in various agricultural applications, its potential in the poultry industry, specifically in the context of cage-free hens, remains relatively unexplored. This study aims to assess the zero-shot segmentation performance of SAM on representative chicken segmentation tasks, including part-based segmentation and the use of infrared thermal images, and to explore chicken-tracking tasks by using SAM as a segmentation tool. The results demonstrate SAM's superior performance compared to SegFormer and SETR in both whole and part-based chicken segmentation. SAM-based object tracking also provides valuable data on the behavior and movement patterns of broiler birds. The findings of this study contribute to a better understanding of SAM's potential in poultry science and lay the foundation for future advancements in chicken segmentation and tracking.

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2303.14211 [pdf, other]

Tackling the infinite likelihood problem when fitting mixtures of shifted asymmetric Laplace distributions

Authors: Yuan Fang, Brian C. Franczak, Sanjeena Subedi

Abstract: Mixtures of shifted asymmetric Laplace distributions were introduced as a tool for model-based clustering that allowed for the direct parameterization of skewness in addition to location and scale. Following common practices, an expectation-maximization algorithm was developed to fit these mixtures. However, adaptations to account for the `infinite likelihood problem' led to fits that gave good classification performance at the expense of parameter recovery. In this paper, we propose a more valuable solution to this problem by developing a novel Bayesian parameter estimation scheme for mixtures of shifted asymmetric Laplace distributions. Through simulation studies, we show that the proposed parameter estimation scheme gives better parameter estimates compared to the expectation-maximization based scheme. In addition, we also show that the classification performance is as good, and in some cases better, than the expectation-maximization based scheme. The performance of both schemes are also assessed using well-known real data sets.

Submitted 24 March, 2023; originally announced March 2023.

arXiv:2302.03849 [pdf, other]

Estimation of Gaussian Bi-Clusters with General Block-Diagonal Covariance Matrix and Applications

Authors: Anastasiia Livochka, Ryan Browne, Sanjeena Subedi

Abstract: Bi-clustering is a technique that allows for the simultaneous clustering of observations and features in a dataset. This technique is often used in bioinformatics, text mining, and time series analysis. An important advantage of biclustering algorithm is the ability to uncover multiple ``views'' (i.e., through rows and column groupings) in the data. Several Gaussian mixture model based biclustering approach currently exist in the literature. However, they impose severe restrictions on the structure of the covariance matrix. Here, we propose a Gaussian mixture model-based bi-clustering approach that provides a more flexible block-diagonal covariance structure. We show that the clustering accuracy of the proposed model is comparable to other known techniques but our approach provides a more flexible covariance structure and has substantially lower computational time. We demonstrate the application of the proposed model in bioinformatics and topic modelling.

Submitted 7 February, 2023; originally announced February 2023.

Comments: 48 pages, 13 figures

MSC Class: 62H30

arXiv:2208.12405 [pdf, other]

doi 10.1080/00295639.2022.2118483

Measurements of the $^{27}{\rm Al}(α,n)$ Thick Target Yield Near Threshold

Authors: K. Brandenburg, G. Hamad, Z. Meisel, C. R. Brune, D. E. Carter, J. Derkin, D. C. Ingram, Y. Jones-Alberty, B. Kenady, T. N. Massey, M. Saxena, D. Soltesz, S. K. Subedi, J. Warren

Abstract: We present results from direct measurements of the $^{27}{\rm Al}(α,n)$ thick target yield from laboratory incident energies $E_α\approx$ 3$-$5~MeV, performed with the $^{3}$HeBF$_{3}$ Giant Barrel (HeBGB) neutron detector at the Edwards Accelerator Laboratory. Our measurements have a small energy cadence in order to address discrepancies and sparseness of thick-target yield data sets existing for this energy region. We find general agreement with existing data sets, including yields derived from cross section data, while resolving a discrepancy between existing thick-target yield data sets for $E_α\approx4-5$~MeV. However, for $E_α<3.5$~MeV, our results are substantially lower than previous thick-target yield data and somewhat larger than yields calculated from existing cross section data. Our data complete the energy-range needed for estimates of the $^{27}{\rm Al}(α,n)$ contribution to neutrino and dark matter detector backgrounds and result in increased viability of $^{27}{\rm Al}(α,n)$ as a plasma diagnostic tool at fusion facilities such as the National Ignition Facility.

Submitted 25 August, 2022; originally announced August 2022.

Comments: Accepted to Nuclear Science and Engineering

arXiv:2208.06239 [pdf, other]

doi 10.1103/PhysRevC.106.025804

Measurements of the $^{96}$Zr($α$,n)$^{99}$Mo cross section for astrophysics and applications

Authors: Gula Hamad, Kristyn Brandenburg, Zach Meisel, Carl R. Brune, Don E. Carter, David C. Ingram, Yenuel Jones-Alberty, Thomas N. Massey, Mansi Saxena, Doug Soltesz, Shiv K. Subedi, Alexander V. Voinov

Abstract: The reaction $^{96}$Zr($α$,n)$^{99}$Mo plays an important role in $ν$-driven wind nucleosynthesis in core-collapse supernovae and is a possible avenue for medical isotope production. Cross section measurements were performed using the activation technique at the Edwards Accelerator Laboratory. Results were analyzed along with world data on the $^{96}{\rm Zr}(α,n)$ cross section and $^{96}{\rm Zr}(α,α)$ differential cross section using large-scale Hauser-Feshbach calculations. We compare our data, previous measurements, and a statistical description of the reaction. We find a larger cross section at low energies compared to prior experimental results, allowing for a larger astrophysical reaction rate. This may impact results of core-collapse supernova $ν$-driven wind nucleosynthesis calculations, but does not significantly alter prior conclusions about $^{99}{\rm Mo}$ production for medical physics applications. The results from our large-scale Hauser-Feshbach calculations demonstrate that phenomenological optical potentials may yet be adequate to describe $(α,n)$ reactions of interest for $ν$-driven wind nucleosynthesis, albeit with regionally-adjusted model parameters.

Submitted 12 August, 2022; originally announced August 2022.

Comments: accepted to PRC

arXiv:2206.05121 [pdf, other]

doi 10.1103/PhysRevX.13.011035

Topological spiral magnetism in the Weyl semimetal SmAlSi

Authors: Xiaohan Yao, Jonathan Gaudet, Rahul Verma, David E. Graf, Hung-Yu Yang, Faranak Bahrami, Ruiqi Zhang, Adam A. Aczel, Sujan Subedi, Darius H. Torchinsky, Jianwei Sun, Arun Bansil, Shin-Ming Huang, Bahadur Singh, Predrag Nikolic, Peter Blaha, Fazel Tafti

Abstract: Weyl electrons are intensely studied due to novel charge transport phenomena such as chiral anomaly, Fermi arcs, and photogalvanic effect. Recent theoretical works suggest that Weyl electrons can also participate in magnetic interactions, and the Weyl-mediated indirect exchange coupling between local moments is proposed as a new mechanism of spiral magnetism that involves chiral electrons. Despite reports of incommensurate and non-collinear magnetic ordering in Weyl semimetals, an actual spiral order has remained hitherto undetected. Here, we present evidence of Weyl-mediated spiral magnetism in SmAlSi from neutron diffraction, transport, and thermodynamic data. We show that the spiral order in SmAlSi results from the nesting between topologically non-trivial Fermi pockets and weak magnetocrystalline anisotropy, unlike related materials (Ce,Pr,Nd)AlSi, where a strong anisotropy prevents the spins from freely rotating. We map the magnetic phase diagram of SmAlSi and reveal an A-phase where topological magnetic excitations may exist. This is corroborated by the observation of a topological Hall effect within the A-phase.

Submitted 10 June, 2022; originally announced June 2022.

Comments: 7 pages, 4 figures

Journal ref: Phys. Rev. X 13, 011035 (2023)

arXiv:2206.02101

Selection of turbulence models via multiscaling analysis of an axisymmetric pipe flow and heat transfer

Authors: Indrajit Nandi, Saikat Saha, Sabir Subedi, Sumon Saha

Abstract: To fully evaluate a turbulent flow, Direct Numerical Simulation (DNS) is the most accurate method by far and requires considerable computational power and time; not optimum for industry standards. Developing an alternative model, providing results with reasonable accuracy would resolve this issue. Reynolds Averaged Navier Stokes (RANS) modeling has proven its worth in addressing this phenomenon. In this study, we investigated the RANS turbulence models from COMSOL for fully developed single-phase flow in a two-dimensional axisymmetric pipe domain with constant heating at the wall and periodic boundary conditions at the inlet and outlet. Heat transfer in the fluid module has been added to address the heat transfer phenomenon. We evaluated the computed results with existing DNS data to match the accuracy of the RANS models. RANS simulations are conducted for friction Reynolds number, i.e., Re_τ= 180, 314, and 395 with varying Prandtl numbers, i.e., Pr = 0.71, 2, 5, and 7. Multiscaling analyses in the flow's inner, outer, and meso scaling regions are performed for fluid and heat transfer profiles, i.e., mean streamwise velocity, Reynolds shear stress, mean streamwise temperature, and turbulent heat flux, to compare with the DNS data. The investigation reports the scaling analysis's effectiveness and shows that RANS turbulence models can be used to describe such flow with reasonable accuracy.

Submitted 1 July, 2022; v1 submitted 5 June, 2022; originally announced June 2022.

Comments: We have a major revision of this paper. I will share the updated one once I have it

arXiv:2204.01621 [pdf, other]

doi 10.1016/j.physletb.2022.137059

$^{57}$Zn $β$-delayed proton emission establishes the $^{56}$Ni $rp$-process waiting point bypass

Authors: M. Saxena, W. -J Ong, Z. Meisel, D. E. M. Hoff, N. Smirnova, P. C. Bender, S. P. Burcher, M. P. Carpenter, J. J. Carroll, A. Chester, C. J. Chiara, R. Conaway, P. A. Copp, B. P. Crider, J. Derkin, A. Estrade, G. Hamad, J. T. Harke, R. Jain, H. Jayatissa, S. N. Liddick, B. Longfellow, M. Mogannam, F. Montes, N. Nepal, et al. (10 additional authors not shown)

Abstract: We measured the $^{57}$Zn $β$-delayed proton ($β$p) and $γ$ emission at the National Superconducting Cyclotron Laboratory. We find a $^{57}$Zn half-life of 43.6 $\pm$ 0.2 ms, $β$p branching ratio of (84.7 $\pm$ 1.4)%, and identify four transitions corresponding to the exotic $β$-$γ$-$p$ decay mode, the second such identification in the $f p$-shell. The $p/γ$ ratio was used to correct for isospin mixing while determining the $^{57}$Zn mass via the isobaric multiplet mass equation. Previously, it was uncertain as to whether the rp-process flow could bypass the textbook waiting point $^{56}$Ni for astrophysical conditions relevant to Type-I X-ray bursts. Our results definitively establish the existence of the $^{56}$Ni bypass, with 14-17% of the $rp$-process flow taking this route.

Submitted 4 April, 2022; originally announced April 2022.

arXiv:2111.15472 [pdf, other]

doi 10.1088/1748-0221/17/05/P05004

The $^{3}$He BF$_{3}$ Giant Barrel (HeBGB) Neutron Detector

Authors: K. Brandenburg, G. Hamad, Z. Meisel, C. R. Brune, D. E. Carter, T. Danley, J. Derkin, Y. Jones-Alberty, B. Kenady, T. N. Massey, S. Paneru, M. Saxena, D. Soltesz, S. K. Subedi, J. Warren

Abstract: $(α,n)$ reactions play an important role in nuclear astrophysics and applications and are an important background source in neutrino and dark matter detectors. Measurements of total $(α,n)$ cross sections employing direct neutron detection often have a considerable systematic uncertainty associated with the energy-dependent neutron detection efficiency and the unknown initial neutron energy distribution. The $^{3}{\rm He}\,{\rm BF}_{3}$ Giant Barrel (HeBGB) neutron detector was built at the Edwards Accelerator Laboratory at Ohio University to overcome this challenge. HeBGB offers a near-constant neutron detection efficiency of ($7.5\pm 1.2$) \% over the neutron energy range 0.01 MeV -- 9.00 MeV, removing a significant source of systematic uncertainty present in earlier $(α,n)$ cross section measurements.

Submitted 6 April, 2022; v1 submitted 19 November, 2021; originally announced November 2021.

arXiv:2111.03363

Federated Learning Attacks Revisited: A Critical Discussion of Gaps, Assumptions, and Evaluation Setups

Authors: Aidmar Wainakh, Ephraim Zimmer, Sandeep Subedi, Jens Keim, Tim Grube, Shankar Karuppayah, Alejandro Sanchez Guinea, Max Mühlhäuser

Abstract: Federated learning (FL) enables a set of entities to collaboratively train a machine learning model without sharing their sensitive data, thus, mitigating some privacy concerns. However, an increasing number of works in the literature propose attacks that can manipulate the model and disclose information about the training data in FL. As a result, there has been a growing belief in the research community that FL is highly vulnerable to a variety of severe attacks. Although these attacks do indeed highlight security and privacy risks in FL, some of them may not be as effective in production deployment because they are feasible only under special -- sometimes impractical -- assumptions. Furthermore, some attacks are evaluated under limited setups that may not match real-world scenarios. In this paper, we investigate this issue by conducting a systematic mapping study of attacks against FL, covering 48 relevant papers from 2016 to the third quarter of 2021. On the basis of this study, we provide a quantitative analysis of the proposed attacks and their evaluation settings. This analysis reveals several research gaps with regard to the type of target ML models and their architectures. Additionally, we highlight unrealistic assumptions in the problem settings of some attacks, related to the hyper-parameters of the ML model and data distribution among clients. Furthermore, we identify and discuss several fallacies in the evaluation of attacks, which open up questions on the generalizability of the conclusions. As a remedy, we propose a set of recommendations to avoid these fallacies and to promote adequate evaluations.

Submitted 3 January, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

Comments: In Section 5.2, incomplete information are mentioned on reference [9] ("How To Backdoor Federated Learning"). This part of text will be revised and enriched

arXiv:2103.09293 [pdf, other]

doi 10.1103/PhysRevLett.127.157405

Direct Measurement of Helicoid Surface States in RhSi using Nonlinear Optics

Authors: Dylan Rees, Baozhu Lu, Yue Sun, Kaustuv Manna, Rustem Ozgur, Sujan Subedi, Claudia Felser, Joseph Orenstein, Darius H. Torchinsky

Abstract: Despite the fundamental nature of the edge state in topological physics, direct measurement of electronic and optical properties of the Fermi arcs of topological semimetals has posed a significant experimental challenge, as their response is often overwhelmed by the metallic bulk. However, laser-driven currents carried by surface and bulk states can propagate in different directions in nonsymmorphic crystals, allowing for the two components to be easily separated. Motivated by a recent theoretical prediction \cite{chang20}, we have measured the linear and circular photogalvanic effect currents deriving from the Fermi arcs of the nonsymmorphic, chiral Weyl semimetal RhSi over the $0.45 - 1.1$ eV incident photon energy range. Our data are in good agreement with the predicted magnitude of the circular photogalvanic effect as a function of photon energy, although the direction of the surface photocurrent departed from the theoretical expectation over the energy range studied. Surface currents arising from the linear photogalvanic effect were observed as well, with the unexpected result that only two of the six allowed tensor element were required to describe the measurements, suggesting an approximate emergent mirror symmetry inconsistent with the space group of the crystal.

Submitted 16 March, 2021; originally announced March 2021.

Comments: 6+5 pages, 5+3 figures

Journal ref: Phys. Rev. Lett. 127, 157405 (2021)

arXiv:2101.01871 [pdf, ps, other]

Logistic Normal Multinomial Factor Analyzers for Clustering Microbiome Data

Authors: Wangshu Tu, Sanjeena Subedi

Abstract: The human microbiome plays an important role in human health and disease status. Next generating sequencing technologies allow for quantifying the composition of the human microbiome. Clustering these microbiome data can provide valuable information by identifying underlying patterns across samples. Recently, Fang and Subedi (2020) proposed a logistic normal multinomial mixture model (LNM-MM) for clustering microbiome data. As microbiome data tends to be high dimensional, here, we develop a family of logistic normal multinomial factor analyzers (LNM-FA) by incorporating a factor analyzer structure in the LNM-MM. This family of models is more suitable for high-dimensional data as the number of parameters in LNM-FA can be greatly reduced by assuming that the number of latent factors is small. Parameter estimation is done using a computationally efficient variant of the alternating expectation conditional maximization algorithm that utilizes variational Gaussian approximations. The proposed method is illustrated using simulated and real datasets.

Submitted 6 January, 2021; originally announced January 2021.

Comments: 50 pages, 5 figures

MSC Class: 62H30

arXiv:2101.01107 [pdf, ps, other]

Potentials versus Geometry

Authors: T. Curtright, S. Subedi

Abstract: We discuss some equivalence relations between the non-relativistic quantum mechanics for particles subjected to potentials and for particles moving freely on background geometries. In particular, we illustrate how selected geometries can be used to regularize singular potentials.

Submitted 4 January, 2021; originally announced January 2021.

arXiv:2011.06682 [pdf, other]

Clustering microbiome data using mixtures of logistic normal multinomial models

Authors: Yuan Fang, Sanjeena Subedi

Abstract: Discrete data such as counts of microbiome taxa resulting from next-generation sequencing are routinely encountered in bioinformatics. Taxa count data in microbiome studies are typically high-dimensional, over-dispersed, and can only reveal relative abundance, and are therefore treated as compositional. Analyzing compositional data presents many challenges because they are restricted to a simplex. In a logistic normal multinomial model, the relative abundance is mapped from a simplex to a latent variable that exists on the real Euclidean space using the additive log-ratio transformation. While a logistic normal multinomial approach brings in flexibility for modeling the data, it comes with a heavy computational cost as the parameter estimation typically relies on Bayesian techniques. In this paper, we develop a novel mixture of logistic normal multinomial models for clustering microbiome data. Additionally, we utilize an efficient framework for parameter estimation using variational Gaussian approximations (VGA). Adopting a variational Gaussian approximation for the posterior of the latent variable reduces the computational overhead substantially. The proposed method is illustrated on simulated and real datasets.

Submitted 21 June, 2022; v1 submitted 12 November, 2020; originally announced November 2020.

Comments: 53 pages, 6 Figures

MSC Class: 62H30

arXiv:2009.05098 [pdf, ps, other]

A Family of Mixture Models for Biclustering

Authors: Wangshu Tu, Sanjeena Subedi

Abstract: Biclustering is used for simultaneous clustering of the observations and variables when there is no group structure known \textit{a priori}. It is being increasingly used in bioinformatics, text analytics, etc. Previously, biclustering has been introduced in a model-based clustering framework by utilizing a structure similar to a mixture of factor analyzers. In such models, observed variables $\mathbf{X}$ are modelled using a latent variable $\mathbf{U}$ that is assumed to be from $N(\mathbf{0}, \mathbf{I})$. Clustering of variables is introduced by constraining the entries of the factor loading matrix to be 0 or 1, which results in block-diagonal covariance matrices. However, this approach is overly restrictive as off-diagonal elements in the blocks of the covariance matrices can only be 1, which can lead to unsatisfactory model fit on complex data. Here, the latent variable $\mathbf{U}$ is assumed to be from a $N(\mathbf{0}, \mathbf{T})$ where $\mathbf{T}$ is a diagonal matrix. This ensures that the off-diagonal terms in the block matrices within the covariance matrices are non-zero and not restricted to be 1. This leads to a superior model fit on complex data. A family of models is developed by imposing constraints on the components of the covariance matrix. For parameter estimation, an alternating expectation conditional maximization (AECM) algorithm is used. Finally, the proposed method is illustrated using simulated and real datasets.

Submitted 10 September, 2020; originally announced September 2020.

Comments: 45

MSC Class: 62H30

arXiv:2009.02621 [pdf, other]

Data-Driven Power Electronic Converter Modeling for Low Inertia Power System Dynamic Studies

Authors: Nischal Guruwacharya, Niranjan Bhujel, Ujjwol Tamrakar, Manisha Rauniyar, Sunil Subedi, Sterling E. Berg, Timothy M. Hansen, Reinaldo Tonkoski

Abstract: A significant amount of converter-based generation is being integrated into the bulk electric power grid to fulfill the future electric demand through renewable energy sources, such as wind and photovoltaic. The dynamics of converter systems in the overall stability of the power system can no longer be neglected as in the past. Numerous efforts have been made in the literature to derive detailed dynamic models, but using detailed models becomes complicated and computationally prohibitive in large system-level studies. In this paper, we use a data-driven, black-box approach to model the dynamics of a power electronic converter. System identification tools are used to identify the dynamic models, while a power amplifier controlled by a real-time digital simulator is used to perturb and control the converter. A set of linear dynamic models for the converter is derived, which can be employed for system-level studies of converter-dominated electric grids.

Submitted 5 September, 2020; originally announced September 2020.

arXiv:2005.14702 [pdf, other]

doi 10.3847/1538-4357/ab9745

Sensitivity of ${^{44}}$Ti and ${^{56}}$Ni production in CCSN shock-driven nucleosynthesis to reaction rates

Authors: Shiv K. Subedi, Zach Meisel, Grant Merz

Abstract: Recent observational advances have enabled high resolution mapping of ${^{44}}$Ti in core-collapse supernova (CCSN) remnants. Comparisons between observations and models provide stringent constraints on the CCSN mechanism. However, past work has identified several uncertain nuclear reaction rates that influence ${^{44}}$Ti and ${^{56}}$Ni production in post-processing model calculations. We evolved one dimensional models of $15~M_{\odot}$, $18~M_{\odot}$, $22~M_{\odot}$ and $25~M_{\odot}$ stars from zero-age main sequence through CCSN using {\tt MESA} (Modules for Experiments in Stellar Astrophysics) and investigated the previously identified reaction rate sensitivities of ${^{44}}$Ti and ${^{56}}$Ni production. We tested the robustness of our results by making various assumptions about the CCSN explosion energy and mass-cut. We found a number of reactions that have a significant impact on the nucleosynthesis of ${^{44}}$Ti and ${^{56}}$Ni, particularly for lower progenitor masses. Notably, the reaction rates $^{13}{\rm N}(α,p)^{16}{\rm O}$, $^{17}{\rm F}(α,p)^{20}{\rm Ne}$, $^{52}{\rm Fe}(α,p)^{55}{\rm Co}$, $^{56}{\rm Ni}(α,p)^{59}{\rm Cu}$, $^{57}{\rm Ni}(n,p)^{57}{\rm Co}$, $^{56}{\rm Co}(p,n)^{56}{\rm Ni}$, $^{39}{\rm K}(p,γ)^{40}{\rm Ca}$, $^{47}{\rm V}(p,γ)^{48}{\rm Cr}$, $^{52}{\rm Mn}(p,γ)^{53}{\rm Fe}$, $^{57}{\rm Co}(p,γ)^{58}{\rm Ni}$, and $^{39}{\rm K}(p,α)^{36}{\rm Ar}$ are influential for a large number of model conditions. Furthermore, we found the list of influential reactions identified by previous post-processing studies of CCSN shock-driven nucleosynthesis is likely incomplete, motivating future larger-scale sensitivity studies.

Submitted 29 May, 2020; originally announced May 2020.

Comments: 15 pages, 7 figures, 3 tables; Accepted to the Astrophysical Journal

arXiv:2005.05324 [pdf, other]

Infinite mixtures of multivariate normal-inverse Gaussian distributions for clustering of skewed data

Authors: Yuan Fang, Dimitris Karlis, Sanjeena Subedi

Abstract: Mixtures of multivariate normal inverse Gaussian (MNIG) distributions can be used to cluster data that exhibit features such as skewness and heavy tails. However, for cluster analysis using a traditional finite mixture model framework, the number of components either needs to be known $a$-$priori$ or needs to be estimated $a$-$posteriori$ using some model selection criterion after deriving results for a range of possible numbers of components. However, different model selection criteria can sometimes result in different numbers of components, yielding uncertainty. Here, an infinite mixture model framework, also known as Dirichlet process mixture model, is proposed for the mixtures of MNIG distributions. This Dirichlet process mixture model approach allows the number of components to grow or decay freely from 1 to $\infty$ (in practice from 1 to $N$) and the number of components is inferred along with the parameter estimates in a Bayesian framework, thus alleviating the need for model selection criteria. We provide real data applications with benchmark datasets as well as a small simulation experiment to compare with other existing models. The proposed method provides clustering results competitive with other clustering approaches for both simulated and real data, and parameter recovery is illustrated using simulation studies.

Submitted 11 May, 2020; originally announced May 2020.

Comments: 61 pages. arXiv admin note: text overlap with arXiv:2005.02585

MSC Class: 62H30

arXiv:2005.02585 [pdf, other]

A Bayesian approach for clustering skewed data using mixtures of multivariate normal-inverse Gaussian distributions

Authors: Yuan Fang, Dimitris Karlis, Sanjeena Subedi

Abstract: Non-Gaussian mixture models are gaining increasing attention for mixture model-based clustering particularly when dealing with data that exhibit features such as skewness and heavy tails. Here, such a mixture distribution is presented, based on the multivariate normal inverse Gaussian (MNIG) distribution. For parameter estimation of the mixture, a Bayesian approach via Gibbs sampler is used; for this, a novel approach to simulate univariate generalized inverse Gaussian random variables and matrix generalized inverse Gaussian random matrices is provided. The proposed algorithm will be applied to both simulated and real data. Through simulation studies and real data analysis, we show parameter recovery and that our approach provides competitive clustering results compared to other clustering approaches.

Submitted 5 May, 2020; originally announced May 2020.

Comments: 40 pages, 7 figures

MSC Class: 62H30

arXiv:2004.06857 [pdf, other]

A parsimonious family of multivariate Poisson-lognormal distributions for clustering multivariate count data

Authors: Sanjeena Subedi, Ryan Browne

Abstract: Multivariate count data are commonly encountered through high-throughput sequencing technologies in bioinformatics, text mining, or in sports analytics. Although the Poisson distribution seems a natural fit to these count data, its multivariate extension is computationally expensive. In most cases mutual independence among the variables is assumed; however, this fails to take into account the correlation among the variables usually observed in the data. Recently, mixtures of multivariate Poisson-lognormal (MPLN) models have been used to analyze such multivariate count measurements with a dependence structure. In the MPLN model, each count is modeled using an independent Poisson distribution conditional on a latent multivariate Gaussian variable. Due to this hierarchical structure, the MPLN model can account for over-dispersion as opposed to the traditional Poisson distribution and allows for correlation between the variables. Rather than relying on a Monte Carlo-based estimation framework which is computationally inefficient, a fast variational-EM based framework is used here for parameter estimation. Further, a parsimonious family of mixtures of Poisson-lognormal distributions is proposed by decomposing the covariance matrix and imposing constraints on these decompositions. Utility of such models is shown using simulated and benchmark datasets.

Submitted 14 April, 2020; originally announced April 2020.

Comments: 31 Pages

MSC Class: 62H30

arXiv:2001.11600 [pdf, other]

doi 10.1103/PhysRevC.101.055805

Constraining the destruction rate of $^{40}$K in stellar nucleosynthesis through the study of the $^{40}$Ar(p,n)$^{40}$K reaction

Authors: P. Gastis, G. Perdikakis, J. Dissanayake, P. Tsintari, I. Sultana, C. R. Brune, T. N. Massey, Z. Meisel, A. V. Voinov, K. Brandenburg, T. Danley, R. Giri, Y. Jones-Alberty, S. Paneru, D. Soltesz, S. Subedi

Abstract: 40K plays a significant role in the radiogenic heating of earth-like exoplanets, which can affect the development of a habitable environment on their surfaces. The initial amount of 40K in the interior of these planets depends on the composition of the interstellar clouds from which they formed. Within this context, nuclear reactions that regulate the production of 40K during stellar evolution can play a critical role. In this study, we constrain for the first time the astrophysical reaction rate of 40K(n,p)40Ar, which is responsible for the destruction of 40K during stellar nucleosynthesis. We performed differential cross-section measurements on the 40Ar(p,n)40K reaction, for six energies in the center-of-mass between 3.2 and 4.0 MeV and various angles between 0-deg and 135-deg. The experiment took place at the Edwards Accelerator Laboratory at Ohio University using the beam swinger target location and a standard neutron time-of-flight technique. The total and partial cross-sections varied with energy due to the contribution from isobaric analog states and Ericson type fluctuations. The energy-averaged neutron angular distributions were symmetrical relative to 90-deg and consistent with the theoretical predictions of the statistical model. Based on the experimental data, local transmission coefficients were extracted and were used to calculate the astrophysical reaction rates of 40Ar(p,n)40K and 40K(n,p)40Ar reactions. Our results support that the destruction rate of 40K in massive stars via the 40K(n,p)40Ar reaction is larger compared to previous estimates. This result directly affects the predicted stellar yields of 40K from nucleosynthesis, which is a critical input parameter for the galactic chemical evolution models that are currently employed for the study of significant properties of exoplanets.

Submitted 18 March, 2020; v1 submitted 30 January, 2020; originally announced January 2020.

Comments: 20 pages, 16 figures, submitted to PRC (revised manuscript after referee's review)

Journal ref: Phys. Rev. C 101, 055805 (2020)

arXiv:1911.03602 [pdf, ps, other]

doi 10.1088/1361-6404/ab806a

Charge Densities for Conducting Ellipsoids

Authors: T L Curtright, Z Cao, S Huang, J S Sarmiento, S Subedi, D A Tarrence, T R Thapaliya

Abstract: The volume charge density for a conducting ellipsoid is expressed in simple geometrical terms, and then used to obtain the known surface charge density as well as the uniform charge per length along any principal axis. Corresponding results are presented for conducting hyperellipsoids in any number of spatial dimensions.

Submitted 25 November, 2019; v1 submitted 8 November, 2019; originally announced November 2019.

arXiv:1808.08300 [pdf, other]

Image Charges Re-Imagined

Authors: H Alshal, T Curtright, S Subedi

Abstract: We discuss the grounded, equipotential ellipse in two-dimensional electrostatics to illustrate different ways of extending the domain of the potential and placing image charges such that homogeneous boundary conditions are satisfied. In particular, we compare and contrast the Kelvin and Sommerfeld image methods.

Submitted 4 December, 2018; v1 submitted 24 August, 2018; originally announced August 2018.

Comments: One example and two Appendices added

arXiv:1807.08380 [pdf, other]

Finite mixtures of matrix-variate Poisson-log normal distributions for three-way count data

Authors: Anjali Silva, Steven J. Rothstein, Paul D. McNicholas, Xiaoke Qin, Sanjeena Subedi

Abstract: Three-way data structures, characterized by three entities, the units, the variables and the occasions, are frequent in biological studies. In RNA sequencing, three-way data structures are obtained when high-throughput transcriptome sequencing data are collected for $n$ genes across $p$ conditions at $r$ occasions. Matrix variate distributions offer a natural way to model three-way data and mixtures of matrix variate distributions can be used to cluster three-way data. Clustering of gene expression data is carried out as a means of discovering gene co-expression networks. In this work, a mixture of matrix variate Poisson-log normal distributions is proposed for clustering read counts from RNA sequencing. By considering the matrix variate structure, full information on the conditions and occasions of the RNA sequencing dataset is simultaneously considered, and the number of covariance parameters to be estimated is reduced. We propose three different frameworks for parameter estimation: a Markov chain Monte Carlo based approach, a variational Gaussian approximation based approach, and a hybrid approach. Various information criteria are used for model selection. The models are applied to both real and simulated data, and we demonstrate that the proposed approaches can recover the underlying cluster structure in both cases. In simulation studies where the true model parameters are known, our proposed approach shows good parameter recovery.

Submitted 21 June, 2022; v1 submitted 22 July, 2018; originally announced July 2018.

arXiv:1711.11190 [pdf, ps, other]

A Multivariate Poisson-Log Normal Mixture Model for Clustering Transcriptome Sequencing Data

Authors: Anjali Silva, Steven J. Rothstein, Paul D. McNicholas, Sanjeena Subedi

Abstract: High-dimensional data of discrete and skewed nature is commonly encountered in high-throughput sequencing studies. Analyzing the network itself or the interplay between genes in this type of data continues to present many challenges. As data visualization techniques become cumbersome for higher dimensions and unconvincing when there is no clear separation between homogeneous subgroups within the data, cluster analysis provides an intuitive alternative. The aim of applying mixture model-based clustering in this context is to discover groups of co-expressed genes, which can shed light on biological functions and pathways of gene products. A mixture of multivariate Poisson-Log Normal (MPLN) model is proposed for clustering of high-throughput transcriptome sequencing data. The MPLN model is able to fit a wide range of correlation and overdispersion situations, and is ideal for modeling multivariate count data from RNA sequencing studies. Parameter estimation is carried out via a Markov chain Monte Carlo expectation-maximization algorithm (MCMC-EM), and information criteria are used for model selection.

Submitted 29 November, 2017; originally announced November 2017.

arXiv:1309.1901 [pdf, other]

doi 10.1007/s11634-014-0165-7

Variational Bayes Approximations for Clustering via Mixtures of Normal Inverse Gaussian Distributions

Authors: Sanjeena Subedi, Paul D. McNicholas

Abstract: Parameter estimation for model-based clustering using a finite mixture of normal inverse Gaussian (NIG) distributions is achieved through variational Bayes approximations. Univariate NIG mixtures and multivariate NIG mixtures are considered. The use of variational Bayes approximations here is a substantial departure from the traditional EM approach and alleviates some of the associated computational complexities and uncertainties. Our variational algorithm is applied to simulated and real data. The paper concludes with discussion and suggestions for future work.

Submitted 7 September, 2013; originally announced September 2013.

arXiv:1306.5824 [pdf, ps, other]

Constrained Optimization for a Subset of the Gaussian Parsimonious Clustering Models

Authors: Ryan P. Browne, Sanjeena Subedi, Paul McNicholas

Abstract: The expectation-maximization (EM) algorithm is an iterative method for finding maximum likelihood estimates when data are incomplete or are treated as being incomplete. The EM algorithm and its variants are commonly used for parameter estimation in applications of mixture models for clustering and classification. This is despite the fact that even the Gaussian mixture model likelihood surface contains many local maxima and is singularity riddled. Previous work has focused on circumventing this problem by constraining the smallest eigenvalue of the component covariance matrices. In this paper, we consider constraining the smallest eigenvalue, the largest eigenvalue, and both the smallest and largest within the family setting. Specifically, a subset of the GPCM family is considered for model-based clustering, where we use a re-parameterized version of the famous eigenvalue decomposition of the component covariance matrices. Our approach is illustrated using various experiments with simulated and real data.

Submitted 24 June, 2013; originally announced June 2013.

arXiv:1306.5368 [pdf, other]

A Variational Approximations-DIC Rubric for Parameter Estimation and Mixture Model Selection Within a Family Setting

Authors: Sanjeena Subedi, Paul D. McNicholas

Abstract: Mixture model-based clustering has become an increasingly popular data analysis technique since its introduction over fifty years ago, and is now commonly utilized within a family setting. Families of mixture models arise when the component parameters, usually the component covariance (or scale) matrices, are decomposed and a number of constraints are imposed. Within the family setting, model selection involves choosing the member of the family, i.e., the appropriate covariance structure, in addition to the number of mixture components. To date, the Bayesian information criterion (BIC) has proved most effective for model selection, and the expectation-maximization (EM) algorithm is usually used for parameter estimation. In fact, this EM-BIC rubric has virtually monopolized the literature on families of mixture models. Deviating from this rubric, variational Bayes approximations are developed for parameter estimation and the deviance information criterion for model selection. The variational Bayes approach provides an alternate framework for parameter estimation by constructing a tight lower bound on the complex marginal likelihood and maximizing this lower bound by minimizing the associated Kullback-Leibler divergence. This approach is taken on the most commonly used family of Gaussian mixture models, and real and simulated data are used to compare the new approach to the EM-BIC rubric.

Submitted 6 November, 2019; v1 submitted 22 June, 2013; originally announced June 2013.

arXiv:1209.6463 [pdf, ps, other]

doi 10.1007/s11634-013-0124-8

Clustering and Classification via Cluster-Weighted Factor Analyzers

Authors: Sanjeena Subedi, Antonio Punzo, Salvatore Ingrassia, Paul D. McNicholas

Abstract: In model-based clustering and classification, the cluster-weighted model constitutes a convenient approach when the random vector of interest constitutes a response variable Y and a set of p explanatory variables X. However, its applicability may be limited when p is high. To overcome this problem, this paper assumes a latent factor structure for X in each mixture component. This leads to the cluster-weighted factor analyzers (CWFA) model. By imposing constraints on the variance of Y and the covariance matrix of X, a novel family of sixteen CWFA models is introduced for model-based clustering and classification. The alternating expectation-conditional maximization algorithm, for maximum likelihood estimation of the parameters of all the models in the family, is described; to initialize the algorithm, a 5-step hierarchical procedure is proposed, which uses the nested structures of the models within the family and thus guarantees the natural ranking among the sixteen likelihoods. Artificial and real data show that these models have very good clustering and classification performance and that the algorithm is able to recover the parameters very well.

Submitted 28 September, 2012; originally announced September 2012.

Comments: 36 pages, 6 figures

Showing 1–34 of 34 results for author: Subedi, S