-
Deep Learning to Improve the Sensitivity of Di-Higgs Searches in the $4b$ Channel
Authors:
Cheng-Wei Chiang,
Feng-Yang Hsieh,
Shih-Chieh Hsu,
Ian Low
Abstract:
The study of di-Higgs events, both resonant and non-resonant, plays a crucial role in understanding the fundamental interactions of the Higgs boson. In this work we consider di-Higgs events decaying into four $b$-quarks and propose to improve the experimental sensitivity by utilizing a novel machine learning algorithm known as Symmetry Preserving Attention Network (\textsc{Spa-Net}) -- a neural network structure whose architecture is designed to incorporate the inherent symmetries in particle reconstruction tasks. We demonstrate that the \textsc{Spa-Net} can enhance the experimental reach over baseline methods such as the cut-based and the Deep Neural Networks (DNN)-based analyses. At the Large Hadron Collider, with a 14-TeV centre-of-mass energy and an integrated luminosity of 300 fb$^{-1}$, the \textsc{Spa-Net} allows us to establish 95\% C.L. upper limits on resonant production cross-sections that are 10\% to 45\% stronger than those of baseline methods. For non-resonant di-Higgs production, \textsc{Spa-Net} yields a constraint on the self-coupling that is 9\% more stringent than that of the baseline method.
Submitted 25 January, 2024;
originally announced January 2024.
-
Shallow learning enables real-time inference of molecular composition from spectroscopy of brain tissue
Authors:
Ivan Ezhov,
Kevin Scibilia,
Luca Giannoni,
Florian Kofler,
Ivan Iliash,
Felix Hsieh,
Suprosanna Shit,
Charly Caredda,
Fred Lange,
Ilias Tachtsidis,
Daniel Rueckert
Abstract:
Optical imaging modalities such as near-infrared spectroscopy (NIRS) and hyperspectral imaging (HSI) represent a promising alternative for low-cost, non-invasive, and fast monitoring of functional and structural properties of living tissue. In particular, the possibility of extracting the molecular composition of the tissue from the optical spectra in real time makes spectroscopy techniques a unique diagnostic tool. However, due to a lack of paired optical and molecular profiling studies, building a mapping between a spectral signature and a corresponding set of molecular concentrations is still an unsolved problem. Furthermore, no established methods exist to streamline the inference of the biochemical composition from the optical spectrum for real-time applications such as surgical monitoring. In this paper, we analyse a technique for fast and accurate inference of changes in the molecular composition of brain tissue. We base our method on the Beer-Lambert law to analytically connect the spectra with concentrations and use a deep-learning approach to significantly speed up the concentration inference compared to traditional optimization methods. We test our approach on real data obtained from the broadband NIRS- and HSI-based optical monitoring of brain tissue. The results demonstrate that the proposed method enables real-time molecular composition inference while maintaining the accuracy of traditional linear and non-linear optimization solvers.
Submitted 26 March, 2024; v1 submitted 27 September, 2023;
originally announced September 2023.
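For readers unfamiliar with the Beer-Lambert step named in the abstract above, the following is a minimal sketch of the traditional linear solve that the paper's learned model is said to speed up. It is not the authors' implementation. It assumes the common linearised form, in which the attenuation change at each wavelength is the path length times a weighted sum of concentration changes. The function name, the extinction coefficients, and the path length are all hypothetical values chosen for illustration.

```python
# Minimal sketch: least-squares Beer-Lambert inversion for two chromophores.
# All numbers below are made-up illustrative values, not real coefficients.

def solve_two_chromophores(eps, delta_a, path_length):
    """Solve delta_a[k] = path_length * (eps[k][0]*c0 + eps[k][1]*c1) by least squares."""
    # Accumulate the 2x2 normal equations (E^T E) c = E^T a, with E = path_length * eps.
    s00 = s01 = s11 = b0 = b1 = 0.0
    for (e0, e1), a in zip(eps, delta_a):
        e0 *= path_length
        e1 *= path_length
        s00 += e0 * e0
        s01 += e0 * e1
        s11 += e1 * e1
        b0 += e0 * a
        b1 += e1 * a
    det = s00 * s11 - s01 * s01
    if abs(det) < 1e-12:
        raise ValueError("spectra are not linearly independent")
    # Cramer's rule for the symmetric 2x2 system.
    c0 = (b0 * s11 - b1 * s01) / det
    c1 = (s00 * b1 - s01 * b0) / det
    return c0, c1

# Hypothetical extinction coefficients at three wavelengths (one row per wavelength).
eps = [(1.0, 0.5), (0.4, 1.2), (0.8, 0.9)]
path_length = 2.0
true_c = (0.02, -0.01)

# Forward model: synthesise noise-free attenuation changes from the known concentrations.
delta_a = [path_length * (e0 * true_c[0] + e1 * true_c[1]) for e0, e1 in eps]

est_c = solve_two_chromophores(eps, delta_a, path_length)
```

With noise-free synthetic data the estimate recovers the concentrations used in the forward model. A real pipeline would use measured spectra and tabulated extinction coefficients.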
-
A Joint Fermi-GBM and Swift-BAT Analysis of Gravitational-Wave Candidates from the Third Gravitational-wave Observing Run
Authors:
C. Fletcher,
J. Wood,
R. Hamburg,
P. Veres,
C. M. Hui,
E. Bissaldi,
M. S. Briggs,
E. Burns,
W. H. Cleveland,
M. M. Giles,
A. Goldstein,
B. A. Hristov,
D. Kocevski,
S. Lesage,
B. Mailyan,
C. Malacaria,
S. Poolakkil,
A. von Kienlin,
C. A. Wilson-Hodge,
The Fermi Gamma-ray Burst Monitor Team,
M. Crnogorčević,
J. DeLaunay,
A. Tohuvavohu,
R. Caputo,
S. B. Cenko
, et al. (1674 additional authors not shown)
Abstract:
We present Fermi Gamma-ray Burst Monitor (Fermi-GBM) and Swift Burst Alert Telescope (Swift-BAT) searches for gamma-ray/X-ray counterparts to gravitational wave (GW) candidate events identified during the third observing run of the Advanced LIGO and Advanced Virgo detectors. Using Fermi-GBM on-board triggers and sub-threshold gamma-ray burst (GRB) candidates found in the Fermi-GBM ground analyses, the Targeted Search and the Untargeted Search, we investigate whether there are any coincident GRBs associated with the GWs. We also search the Swift-BAT rate data around the GW times to determine whether a GRB counterpart is present. No counterparts are found. Using both the Fermi-GBM Targeted Search and the Swift-BAT search, we calculate flux upper limits and present joint upper limits on the gamma-ray luminosity of each GW. Given these limits, we constrain theoretical models for the emission of gamma-rays from binary black hole mergers.
Submitted 25 August, 2023;
originally announced August 2023.
-
Unraveling heterogeneity of ADNI's time-to-event data using conditional entropy Part-I: Cross-sectional study
Authors:
Shuting Liao,
Fushing Hsieh
Abstract:
Through the Alzheimer's Disease Neuroimaging Initiative (ADNI), time-to-event data, from the pre-dementia state of mild cognitive impairment (MCI) to the diagnosis of Alzheimer's disease (AD), are collected and analyzed by explicitly unraveling prognostic heterogeneity among 346 uncensored and 557 right-censored subjects under structural dependency among covariate features. The non-informative censoring mechanism is tested and confirmed based on conditional-vs-marginal entropies evaluated upon contingency tables built by the Redistribute-to-the-right algorithm. The Categorical Exploratory Data Analysis (CEDA) paradigm is applied to evaluate conditional entropy-based associative patterns between the categorized response variable and 16 categorized covariates, each having 4 categories. Two order-1 global major factors, V9 (MEM-mean) and V8 (ADAS13.bl), are selected as sharing the highest amounts of mutual information with the response variable. This heavily censored data set is also analyzed by Cox's proportional hazard (PH) modeling. Comparisons of PH and CEDA results on a global scale are complicated under the structural dependency of covariate features. To alleviate such complications, V9 and V8 are taken as two potential perspectives of heterogeneity and the entire collection of subjects is divided into two sets of four sub-collections. The CEDA major factor selection protocol is applied to all sub-collections to figure out which features provide extra information. Graphic displays are developed to explicitly unravel conditional entropy expansions upon perspectives of heterogeneity in ADNI data. On the local scale, PH analysis is carried out and results are compared with CEDA's. We conclude that, when facing structural dependency among covariates and heterogeneity in data, CEDA and its major factor selection provide significant merits for manifesting data's multiscale information content.
Submitted 28 November, 2022;
originally announced November 2022.
-
An Encoding Approach for Stable Change Point Detection
Authors:
Xiaodong Wang,
Fushing Hsieh
Abstract:
Without imposing prior distributional knowledge underlying multivariate time series of interest, we propose a nonparametric change-point detection approach to estimate the number of change points and their locations along the temporal axis. We develop a structural subsampling procedure such that the observations are encoded into multiple sequences of Bernoulli variables. A maximum likelihood approach in conjunction with a newly developed searching algorithm is implemented to detect change points on each Bernoulli process separately. Then, aggregation statistics are proposed to collectively synthesize change-point results from all individual univariate time series into consistent and stable location estimations. We also study a weighting strategy to measure the degree of relevance for different subsampled groups. Simulation studies show that the proposed change-point methodology for multivariate time series performs favorably compared with currently popular nonparametric methods under various settings with different degrees of complexity. Real data analyses are finally performed on categorical, ordinal, and continuous time series taken from the fields of genetics, climate, and finance.
Submitted 11 May, 2021;
originally announced May 2021.
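The maximum-likelihood step mentioned in the abstract above can be illustrated in its simplest form: locating one change point in a single 0/1 sequence by maximising the two-segment Bernoulli likelihood. This is only a sketch under that single-change-point assumption. It does not reproduce the paper's searching algorithm, subsampling, or aggregation statistics, and the function names are hypothetical.

```python
import math

def bernoulli_loglik(ones, n):
    """Maximised Bernoulli log-likelihood of a segment with `ones` successes in `n` trials."""
    if n == 0 or ones == 0 or ones == n:
        return 0.0  # degenerate segment: the fitted p is 0 or 1, so the likelihood is 1
    p = ones / n
    return ones * math.log(p) + (n - ones) * math.log(1 - p)

def best_single_change_point(bits):
    """Return (k, loglik) for the split bits[:k] | bits[k:] with the highest likelihood.

    Returns (None, -inf) when the sequence has fewer than two observations.
    """
    n = len(bits)
    total = sum(bits)
    best_k, best_ll = None, -math.inf
    left = 0
    for k in range(1, n):
        left += bits[k - 1]  # number of ones in bits[:k]
        ll = bernoulli_loglik(left, k) + bernoulli_loglik(total - left, n - k)
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k, best_ll

# Toy sequence: 20 zeros followed by 20 ones, so the change sits at index 20.
bits = [0] * 20 + [1] * 20
k_hat, ll_hat = best_single_change_point(bits)
```

Detecting several change points would require a search over multiple splits plus a model-selection criterion, which this sketch deliberately leaves out.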
-
Coarse- and fine-scale geometric information content of Multiclass Classification and implied Data-driven Intelligence
Authors:
Fushing Hsieh,
Xiaodong Wang
Abstract:
Under any Multiclass Classification (MCC) setting defined by a collection of labeled point-clouds specified by a feature-set, we extract only stochastic partial orderings from all possible triplets of point-clouds without explicitly measuring the three cloud-to-cloud distances. We demonstrate that such a collective of partial orderings can efficiently compute a label embedding tree geometry on the Label-space. This tree in turn gives rise to a predictive graph, or a network with precisely weighted linkages. These two multiscale geometries are taken as the coarse-scale information content of MCC. They jointly shed light on explainable knowledge of why and how labeling comes about, and facilitate error-free prediction with potential multiple candidate labels supported by data. For revealing within-label heterogeneity, we further label naturally found clusters within each point-cloud, and likewise derive a multiscale geometry as the fine-scale information content contained in data. This fine-scale endeavor shows that our computational proposal is indeed scalable to an MCC setting having a large label-space. Overall, the computed multiscale collective of data-driven patterns and knowledge will serve as a basis for constructing visible and explainable subject matter intelligence regarding the system of interest.
Submitted 14 April, 2021;
originally announced April 2021.
-
Discovering Multiple Phases of Dynamics by Dissecting Multivariate Time Series
Authors:
Xiaodong Wang,
Fushing Hsieh
Abstract:
We propose a data-driven approach to dissect multivariate time series in order to discover multiple phases underlying the dynamics of complex systems. This computing approach is developed as a multiple-dimension version of the Hierarchical Factor Segmentation (HFS) technique. This expanded approach proposes a systematic protocol for choosing various extreme events in multi-dimensional space. Upon each chosen event, an empirical distribution of event-recurrence, or waiting time between the excursions, is fitted by a geometric distribution with time-varying parameters. Iterative fittings are performed across all chosen events. We then collect and summarize the local recurrent patterns into a global dynamic mechanism. Clustering is applied for partitioning the whole time period into alternating segments, in which variables are identically distributed. Feature weighting techniques are also considered to compensate for some drawbacks of clustering. Our simulation results show that this expanded approach can even detect systematic differences when the joint distribution varies. In real data experiments, we analyze the relationship among returns, trading volume, and transaction number of a single stock, as well as of multiple stocks, in the S&P500. We can successfully not only map out volatile periods but also provide potential associative links between stocks.
Submitted 8 March, 2021;
originally announced March 2021.
-
High Altitude Platform Stations (HAPS): Architecture and System Performance
Authors:
Yunchou Xing,
Frank Hsieh,
Amitava Ghosh,
Theodore S. Rappaport
Abstract:
High Altitude Platform Station (HAPS) has the potential to provide global wireless connectivity and data services such as high-speed wireless backhaul, industrial Internet of things (IoT), and public safety for large areas not served by terrestrial networks. A unified HAPS design is desired to support various use cases and a wide range of requirements. In this paper, we present two architecture designs of the HAPS system: i) repeater based HAPS, and ii) base station based HAPS, which are both viable technical solutions. The energy efficiency is analyzed and compared between the two architectures using consumption factor theory. The system performance of these two architectures is evaluated through Monte Carlo simulations and is characterized in metrics of spectral efficiency using LTE band 1 for both single-cell and multi-cell cases. Both designs can provide good downlink spectral efficiency and coverage, while the uplink coverage is significantly limited by UE transmit power and antenna gain. Using directional antennas at the UEs can improve the system performance for both downlink and uplink.
Submitted 4 March, 2021;
originally announced March 2021.
-
Unraveling S&P500 stock volatility and networks -- An encoding-and-decoding approach
Authors:
Xiaodong Wang,
Fushing Hsieh
Abstract:
Volatility of a financial stock refers to the degree of uncertainty or risk embedded within the stock's dynamics. Such risk has received a huge amount of attention from diverse financial researchers. Following the concept of the regime-switching model, we propose a non-parametric approach, named encoding-and-decoding, to discover multiple volatility states embedded within a discrete time series of stock returns. The encoding is performed across the entire span of temporal time points for relatively extreme events with respect to a chosen quantile-based threshold. As such, the return time series is transformed into Bernoulli-variable processes. In the decoding phase, we computationally seek the locations of change points via estimations based on a new searching algorithm in conjunction with the information criterion applied on the observed collection of recurrence times upon the binary process. Apart from the independence required for building the Geometric distributional likelihood function, the proposed approach can functionally partition the entire return time series into a collection of homogeneous segments without any assumptions of dynamic structure and underlying distributions. In the numerical experiments, our approach is found to compare favorably with parametric models such as the Hidden Markov Model. In the real data applications, we introduce the application of our approach in forecasting stock returns. Finally, the volatility dynamics of every single stock of the S&P500 are revealed, and a stock network is consequently established to represent dependency relations derived through concurrent volatility states among S&P500 stocks.
Submitted 21 October, 2021; v1 submitted 22 January, 2021;
originally announced January 2021.
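The encoding phase described in the abstract above (marking extreme returns against a quantile threshold, then reading off recurrence times) can be sketched as follows. This is an illustrative reading, not the authors' code. It assumes that "extreme" means the lower tail and uses a simple empirical quantile. The function names and the toy return series are hypothetical.

```python
def encode_extremes(returns, quantile=0.1):
    """Mark returns at or below the empirical `quantile` as 1 (extreme event), else 0."""
    ordered = sorted(returns)
    # Simple empirical quantile: the order statistic at index floor(quantile * (n - 1)).
    threshold = ordered[int(quantile * (len(ordered) - 1))]
    return [1 if r <= threshold else 0 for r in returns]

def recurrence_times(bits):
    """Gaps, in time steps, between consecutive 1s of a 0/1 sequence."""
    positions = [i for i, b in enumerate(bits) if b == 1]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]

# Hypothetical daily returns; the three large losses are the "extreme" events.
returns = [0.01, -0.05, 0.02, 0.00, -0.04, 0.03, 0.01, -0.06, 0.02, 0.01]
bits = encode_extremes(returns, quantile=0.25)
gaps = recurrence_times(bits)
```

In this toy series the three losses are evenly spaced, so every recurrence time equals 3. The decoding phase would then look for points where the distribution of such gaps changes.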
-
Categorical exploratory data analysis on goodness-of-fit issues
Authors:
Sabrina Enriquez,
Fushing Hsieh
Abstract:
If the aphorism "All models are wrong" (George Box) continues to be true in data analysis, particularly when analyzing real-world data, then we should annotate this wisdom with visible and explainable data-driven patterns. Such annotations can shed invaluable light on the validity as well as the limitations of statistical modeling as a data analysis approach. In an effort to avoid holding our real data to potentially unattainable or even unrealistic theoretical structures, we propose to utilize the data analysis paradigm called Categorical Exploratory Data Analysis (CEDA). We illustrate the merits of this proposal with two real-world data sets from the perspective of goodness-of-fit. In both data sets, the Normal distribution's bell shape seemingly fits rather well at first glance. We apply CEDA to bring out where and how each data set fits or deviates from the model shape via several important distributional aspects. We also demonstrate that CEDA affords a version of tree-based p-value, and compare it with p-values based on traditional statistical approaches. Throughout our data analysis, we invest computational effort in making graphic displays to illuminate the advantages of using CEDA as one primary way of data analysis in Data Science education.
Submitted 3 December, 2020; v1 submitted 19 November, 2020;
originally announced November 2020.
-
Extreme-K categorical samples problem
Authors:
Elizabeth Chou,
Catie McVey,
Yin-Chen Hsieh,
Sabrina Enriquez,
Fushing Hsieh
Abstract:
With histograms as its foundation, we develop Categorical Exploratory Data Analysis (CEDA) under the extreme-$K$ sample problem, and illustrate its universal applicability through four 1D categorical datasets. Given a sizable $K$, CEDA's ultimate goal amounts to discovering data's information content via carrying out two data-driven computational tasks: 1) establish a tree geometry upon $K$ populations as a platform for discovering a wide spectrum of patterns among populations; 2) evaluate each geometric pattern's reliability. In CEDA developments, each population gives rise to a row vector of category proportions. Upon the data matrix's row-axis, we discuss the pros and cons of Euclidean distance against its weighted version for building a binary clustering tree geometry. The criterion of choice rests on degrees of uniformness in column-blocks framed by this binary clustering tree. Each tree-leaf (population) is then encoded with a binary code sequence, as is each tree-based pattern. For evaluating reliability, we adopt row-wise multinomial randomness to generate an ensemble of matrix mimicries, and hence an ensemble of mimicked binary trees. The reliability of any observed pattern is its recurrence rate within the tree ensemble. A high reliability value means a deterministic pattern. Our four applications of CEDA illuminate four significant aspects of extreme-$K$ sample problems.
Submitted 29 July, 2020;
originally announced July 2020.
-
Color-complexity enabled exhaustive color-dots identification and spatial patterns testing in images
Authors:
Shuting Liao,
Li-Yu Liu,
Ting-An Chen,
Kuang-Yu Chen,
Fushing Hsieh
Abstract:
Targeted color-dots with varying shapes and sizes in images are first exhaustively identified, and then their multiscale 2D geometric patterns are extracted for testing spatial uniformness in a progressive fashion. Based on color theory in physics, we develop a new color-identification algorithm relying on highly associative relations among the three color-coordinates: RGB or HSV. Such high associations critically imply low color-complexity of a color image, and make possible the exhaustive identification of targeted color-dots of all shapes and sizes. Across heterogeneous shaded regions and lighting conditions, our algorithm is shown to be robust, practical and efficient compared with the popular Contour and OpenCV approaches. Upon all identified color-pixels, we form color-dots as individually connected networks with shapes and sizes. We construct minimum spanning trees (MST) as spatial geometries of dot-collectives of various size-scales. Given a size-scale, the distribution of distances between immediate neighbors in the observed MST is extracted, as are those of many simulated MSTs under the spatial uniformness assumption. We devise a new algorithm for testing 2D spatial uniformness based on a Hierarchical clustering tree upon all the MSTs involved. Our developments are illustrated on images obtained by mimicking chemical spraying via drone in Precision Agriculture.
Submitted 28 July, 2020;
originally announced July 2020.
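The abstract above builds minimum spanning trees over dot locations and studies the distribution of distances between immediate neighbors in the tree. A minimal sketch of that one ingredient is given below, using Prim's algorithm on the complete Euclidean graph. It is an assumption-laden illustration, not the authors' pipeline: it does not include the uniformness test or the hierarchical clustering, and the function name and toy coordinates are hypothetical.

```python
import math

def mst_edge_lengths(points):
    """Prim's algorithm on the complete Euclidean graph; returns the n-1 MST edge lengths."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each point to the growing tree
    best[0] = 0.0          # start the tree at the first point
    lengths = []
    for step in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:
            lengths.append(best[u])  # the edge that attached u to the tree
        # Relax the connection cost of the remaining points through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths

# Hypothetical dot centres: three close together and one far away.
points = [(0, 0), (1, 0), (1, 1), (5, 1)]
lengths = mst_edge_lengths(points)
```

Comparing the empirical distribution of `lengths` with the same quantity computed on uniformly scattered points is the kind of comparison the abstract describes. The quadratic-time loop is acceptable for a sketch but would need a spatial index for large images.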
-
Categorical Exploratory Data Analysis: From Multiclass Classification and Response Manifold Analytics perspectives of baseball pitching dynamics
Authors:
Fushing Hsieh,
Elizabeth P. Chou
Abstract:
From two coupled Multiclass Classification (MCC) and Response Manifold Analytics (RMA) perspectives, we develop Categorical Exploratory Data Analysis (CEDA) on the PITCHf/x database for the information content of Major League Baseball's (MLB) pitching dynamics. MCC and RMA information contents are represented by one collection of multiscale pattern categories from mixing geometries and one collection of global-to-local geometric localities from response-covariate manifolds, respectively. These collectives shed light on the pitching dynamics and map out the uncertainty of popular machine learning approaches. In the MCC setting, an indirect-distance-measure based label embedding tree leads to the discovery of asymmetry of mixing geometries among labels' point-clouds. A selected chain of complementary covariate feature groups collectively brings out multi-order mixing geometric pattern categories. Such categories then reveal the true nature of MCC predictive inferences. In the RMA setting, multiple response features couple with multiple major covariate features to demonstrate physical principles bearing manifolds with a lattice of natural localities. With minor features' heterogeneous effects being locally identified, such localities jointly weave their focal characteristics into system understanding and provide a platform for RMA predictive inferences. Our CEDA works for universal data types, adopts non-linear associations and facilitates efficient feature selection and inference.
Submitted 25 June, 2020;
originally announced June 2020.
-
From learning gait signatures of many individuals to reconstructing gait dynamics of one single individual
Authors:
Fushing Hsieh,
Xiaodong Wang
Abstract:
Based on the same databases, we address two seemingly highly related, but in fact drastically distinct, questions via computational data-driven algorithms: 1) how to precisely achieve the big task of differentiating gait signatures of many individuals? 2) how to reconstruct an individual's complex gait dynamics in full? Our brains can "effortlessly" resolve the first question, but will definitely fail in the second one, since many fine temporal scale gait patterns surely escape our eyes. Based on accelerometers' 3D gait time series databases, we link the answers to both questions via multiscale structural dependency within gait dynamics of our musculoskeletal system. Two types of dependency manifestations are explored. We first develop a simple algorithmic computation called Principle System-State Analysis (PSSA) for the coarse dependency in implicit forms. PSSA is shown to be able to efficiently classify many subjects. We then develop a multiscale Local-1st-Global-2nd (L1G2) Coding Algorithm and a landmark computing algorithm. With both algorithms, we can precisely dissect rhythmic gait cycles, and then decompose each cycle into a series of cyclic gait phases. With proper color-coding and stacking, we reconstruct and represent an individual's gait dynamics via a 3D cylinder to collectively reveal universal deterministic and stochastic structural patterns on the centisecond (10 milliseconds) scale across all rhythmic cycles. This 3D cylinder can serve as a "passtensor" for authentication purposes related to clinical diagnoses and cybersecurity.
Submitted 20 May, 2020;
originally announced May 2020.
-
Information of Epileptic Mechanism and its Systemic Change-points in a Zebrafish's Brain-wide Calcium Imaging Video Data
Authors:
Jingyi Zheng,
Fushing Hsieh
Abstract:
The epileptic mechanism is postulated to be that an animal's neurons gradually diminish their inhibition function, coupled with enhanced excitation, as an epileptic event approaches. The calcium imaging technique is designed to directly record brain-wide neuronal activity in order to discover the underlying epileptic mechanism. In this paper, using one brain-wide calcium imaging video of a Zebrafish, we compute dynamic pattern information of the epileptic mechanism, and devise three graphical displays to show the visible functional aspect of the epileptic mechanism over five inter-ictal periods. The foundation of our data-driven computations for such dynamic patterns relies on one universal phenomenon discovered across 696 informative pixels. This universality is that each pixel's progressive 5-percentile process oscillates in an irregular fashion at first, but, after the middle point of the inter-ictal period, the oscillation is replaced by a steadily increasing trend. Such dynamic patterns are collectively transformed into a visible systemic change-point as an early warning signal (EWS) of an incoming epileptic event. We conclude through the graphic displays that pattern information extracted from the calcium imaging video realistically reveals the Zebrafish's authentic epileptic mechanism.
Submitted 13 February, 2018;
originally announced March 2018.
-
Graphic displays of MLB pitching mechanics and its evolutions in PITCHf/x data
Authors:
Fushing Hsieh,
Kevin Fujii,
Tania Roy,
Cho-Jui Hsieh,
Brenda McCowan
Abstract:
Systemic and idiosyncratic patterns in the pitching mechanics of 24 top starting pitchers in Major League Baseball (MLB) are extracted and discovered from the PITCHf/x database. These evolving patterns across different pitchers or seasons are represented through three exclusively developed graphic displays. Understanding of such patterned evolutions will be beneficial for pitchers' wellbeing in signaling potential injury, and will be critical for expert knowledge in comparing pitchers. Based on data-driven computing, a universal composition of patterns is identified on all pitchers' mutual conditional entropy matrices. The first graphic display reveals that this universality accommodates physical laws as well as systemic characteristics of pitching mechanics. Such visible characters point to large-scale factors for differentiating between distinct clusters of pitchers, and simultaneously lead to detailed factors for comparing individual pitchers. The second graphic display shows choices of features that are able to express a pitcher's season-by-season pitching contents via a series of 3(+2)D point-cloud geometries. The third graphic display exhibits exquisitely a pitcher's idiosyncratic pattern-information of pitching across seasons by demonstrating all his pitch-subtype evolutions. These heatmap-based graphic displays are platforms for visualizing and understanding pitching mechanics.
Submitted 27 January, 2018;
originally announced January 2018.
-
Complexity of Possibly-gapped Histogram and Analysis of Histogram (ANOHT)
Authors:
Fushing Hsieh,
Tania Roy
Abstract:
Without unrealistic continuity and smoothness assumptions on the distributional density of a one-dimensional dataset, constructing an authentic possibly-gapped histogram becomes rather complex. The candidate ensemble is described via a two-layer Ising model, and its size is shown to grow exponentially. This exponential complexity makes any exhaustive search infeasible and all boundary parameters local. For data compression via Uniformity, the decoding error criterion is nearly independent of sample size. These characteristics nullify statistical model selection techniques, such as Minimum Description Length (MDL). Nonetheless, practical and nearly optimal solutions are algorithmically computable. A data-driven algorithm is devised to construct such histograms along the branching hierarchy of a Hierarchical Clustering tree. The resultant histograms naturally manifest the data's physical information contents: deterministic structures of bin boundaries coupled with stochastic structures of Uniformity within each bin. Without enforcing unrealistic Normality and constant variance assumptions, an application of the possibly-gapped histogram is devised, called Analysis of Histogram (ANOHT), to replace Analysis of Variance (ANOVA). Its potential applications are foreseen in digital re-normalization schemes and associative pattern extraction among features of heterogeneous data types. Thus constructing possibly-gapped histograms becomes a prerequisite for knowledge discovery, via exploratory data analysis and unsupervised Machine Learning.
Submitted 12 November, 2017; v1 submitted 20 February, 2017;
originally announced February 2017.
-
The Greenland Telescope: Antenna Retrofit Status and Future Plans
Authors:
Philippe Raffin,
Paul T. P. Ho,
Keiichi Asada,
Raymond Blundell,
Geoffrey C. Bower,
Roberto Burgos,
Chih-Cheng Chang,
Ming-Tang Chen,
You-Hua Chu,
Paul K. Grimes,
C. C. Han,
Chih-Wei L. Huang,
Yau-De Huang,
Fang-Chia Hsieh,
Makoto Inoue,
Patrick M. Koch,
Derek Kubo,
Steve Leiker,
Lupin Lin,
Ching-Tang Liu,
Shih-Hsiang Lo,
Pierre Martin-Cocher,
Satoki Matsushita,
Masanori Nakamura,
Zheng Meyer-Zhao
, et al. (10 additional authors not shown)
Abstract:
Since the ALMA North America Prototype Antenna was awarded to the Smithsonian Astrophysical Observatory (SAO), SAO and the Academia Sinica Institute of Astronomy & Astrophysics (ASIAA) have been working jointly to relocate the antenna to Greenland. This paper shows the status of the antenna retrofit and the work carried out after the recommissioning and subsequent disassembly of the antenna at the VLA. The coming months will see the start of the antenna reassembly at Thule Air Base. These activities are expected to last until the fall of 2017, when commissioning should take place. In parallel, design, fabrication, and testing of the last components are taking place in Taiwan.
Submitted 9 December, 2016;
originally announced December 2016.
-
The Atacama Large Millimeter/submillimeter Array (ALMA) Band-1 Receiver
Authors:
Yau De Huang,
Oscar Morata,
Patrick Michel Koch,
Ciska Kemper,
Yuh-Jing Hwang,
Chau-Ching Chiong,
Paul Ho,
You-Hua Chu,
Chi-Den Huang,
Ching-Tang Liu,
Fang-Chia Hsieh,
Yen-Hsiang Tseng,
Shou-Hsien Weng,
Chin-Ting Ho,
Po-Han Chiang,
Hsiao-Ling Wu,
Chih-Cheng Chang,
Shou-Ting Jian,
Chien-Feng Lee,
Yi-Wei Lee,
Satoru Iguchi,
Shin'ichiro Asayama,
Daisuke Iono,
Alvaro Gonzalez,
John Effland
, et al. (7 additional authors not shown)
Abstract:
The Atacama Large Millimeter/submillimeter Array (ALMA) Band 1 receiver covers the 35-50 GHz frequency band. Development of prototype receivers, including the key components and subsystems, has been completed, and two sets of prototype receivers were fully tested. We will provide an overview of the ALMA Band 1 science goals, and its requirements and design for use on ALMA. The receiver development status will also be discussed, and the infrastructure, integration, and evaluation of the fully assembled Band 1 receiver system will be covered. Finally, a discussion of the technical and management challenges encountered will be presented.
Submitted 2 December, 2016;
originally announced December 2016.