-
Viscoelastic materials are most energy efficient when loaded and unloaded at equal rates
Authors:
Lucien Tsai,
Paco Navarro,
Siqi Wu,
Talyor Levinson,
Elizabeth Mendoza,
M. Janneke Schwaner,
Monica A. Daley,
Emanuel Azizi,
Mark Ilton
Abstract:
Biological springs can be used in nature for energy conservation and ultra-fast motion. The loading and unloading rates of elastic materials can play an important role in determining how the properties of these springs affect movements. We investigate the mechanical energy efficiency of biological springs (American bullfrog plantaris tendons and guinea fowl lateral gastrocnemius tendons) and synthetic elastomers. We measure these materials under symmetric rates (equal loading and unloading durations) and asymmetric rates (unequal loading and unloading durations) using novel dynamic mechanical analysis measurements. We find that mechanical efficiency is highest at symmetric rates and significantly decreases with a larger degree of asymmetry. A generalized 1D Maxwell model with no fitting parameters captures the experimental results based on the independently-characterized linear viscoelastic properties of the materials. The model further shows that a broader viscoelastic relaxation spectrum enhances the effect of rate-asymmetry on efficiency. Overall, our study provides valuable insights into the interplay between material properties and unloading dynamics in both biological and synthetic elastic systems.
Submitted 21 November, 2023; v1 submitted 28 August, 2023;
originally announced August 2023.
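The efficiency measure described in this abstract can be illustrated with a small numerical sketch. It assumes a generalized Maxwell model made of one equilibrium spring in parallel with several Maxwell arms, driven by a linear strain ramp up and then down. The moduli, relaxation times, strain amplitude, and the function name `cycle_efficiency` are illustrative assumptions, not values or code from the paper.

```python
import math

def cycle_efficiency(t_load, t_unload, e_inf=1.0,
                     arms=((1.0, 0.1), (1.0, 1.0), (1.0, 10.0)),
                     max_strain=0.01, steps=2000):
    """Recovered work divided by input work for one load/unload strain ramp.

    arms holds hypothetical (modulus, relaxation_time) pairs, one per Maxwell arm.
    """
    def ramp(stresses, strain, duration, strain_change):
        rate = strain_change / duration
        dt = duration / steps
        work = 0.0
        total_prev = e_inf * strain + sum(stresses)
        for _ in range(steps):
            # Exact solution of d(sigma)/dt = E * rate - sigma / tau over one
            # step, valid because the strain rate is constant within a ramp.
            stresses = [
                s * math.exp(-dt / tau) + e * tau * rate * (1.0 - math.exp(-dt / tau))
                for s, (e, tau) in zip(stresses, arms)
            ]
            strain += rate * dt
            total = e_inf * strain + sum(stresses)
            # Trapezoidal estimate of the work integral of stress over strain.
            work += 0.5 * (total_prev + total) * rate * dt
            total_prev = total
        return stresses, strain, work

    stresses = [0.0] * len(arms)
    stresses, strain, work_in = ramp(stresses, 0.0, t_load, max_strain)
    stresses, strain, work_unload = ramp(stresses, strain, t_unload, -max_strain)
    # Work done on the material during unloading is negative when energy is returned.
    return -work_unload / work_in

if __name__ == "__main__":
    for t_load, t_unload in [(1.0, 1.0), (0.1, 1.9), (1.9, 0.1)]:
        print(t_load, t_unload, round(cycle_efficiency(t_load, t_unload), 4))
```

Passing `arms=()` reduces the model to a purely elastic spring, which is a convenient sanity check because the efficiency should then be 1.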
-
Multifunctionality in a Connectome-Based Reservoir Computer
Authors:
Jacob Morra,
Andrew Flynn,
Andreas Amann,
Mark Daley
Abstract:
Multifunctionality describes the capacity for a neural network to perform multiple mutually exclusive tasks without altering its network connections; and is an emerging area of interest in the reservoir computing machine learning paradigm. Multifunctionality has been observed in the brains of humans and other animals: particularly, in the lateral horn of the fruit fly. In this work, we transplant the connectome of the fruit fly lateral horn to a reservoir computer (RC), and investigate the extent to which this 'fruit fly RC' (FFRC) exhibits multifunctionality using the 'seeing double' problem as a benchmark test. We furthermore explore the dynamics of how this FFRC achieves multifunctionality while varying the network's spectral radius. Compared to the widely-used Erdös-Renyi Reservoir Computer (ERRC), we report that the FFRC exhibits a greater capacity for multifunctionality; is multifunctional across a broader hyperparameter range; and solves the seeing double problem far beyond the previously observed spectral radius limit, wherein the ERRC's dynamics become chaotic.
Submitted 2 June, 2023;
originally announced June 2023.
-
Biophysical Simulation Reveals the Mechanics of the Avian Lumbosacral Organ
Authors:
An Mo,
Viktoriia Kamska,
Fernanda Bribiesca-Contreras,
Janet Hauptmann,
Monica Daley,
Alexander Badri-Spröwitz
Abstract:
The lumbosacral organ (LSO) is a lumbosacral spinal canal morphology that is universally and uniquely found in birds. Recent studies suggested an intraspinal mechanosensor function that relies on the compliant motion of soft tissue in the spinal cord fluid. It has not yet been possible to observe LSO soft tissue motion in vivo due to limitations of imaging technologies. As an alternative approach, we developed an artificial biophysical model of the LSO, and characterize the dynamic responses of this model when entrained by external motion. The parametric model incorporates morphological and material properties of the LSO. We varied the model's parameters to study the influence of individual features on the system response. We characterized the system in a locomotion simulator, producing vertical oscillations similar to the trunk motions. We show how morphological and material properties effectively shape the system's oscillation characteristics. We conclude that external oscillations could entrain the soft tissue of the intraspinal lumbosacral organ during locomotion, consistent with recently proposed sensing mechanisms.
Submitted 17 May, 2023; v1 submitted 22 December, 2022;
originally announced December 2022.
-
Using Connectome Features to Constrain Echo State Networks
Authors:
Jacob Morra,
Mark Daley
Abstract:
We report an improvement to the conventional Echo State Network (ESN) across three benchmark chaotic time-series prediction tasks using fruit fly connectome data alone. We also investigate the impact of key connectome-derived structural features on prediction performance -- uniquely bridging neurobiological structure and machine learning function; and find that both increasing the global average clustering coefficient and modifying the position of weights -- by permuting their synapse-synapse partners -- can lead to increased model variance and (in some cases) degraded performance. In all we consider four topological point modifications to a connectome-derived ESN reservoir (null model): namely, we alter the network sparsity, re-draw nonzero weights from a uniform distribution, permute nonzero weight positions, and increase the network global average clustering coefficient. We compare the four resulting ESN model classes -- and the null model -- with a conventional ESN by conducting time-series prediction experiments on size-variants of the Mackey-Glass 17 (MG-17), Lorenz, and Rossler chaotic time series; denoting each model's performance and variance across train-validate trials.
Submitted 9 February, 2023; v1 submitted 5 June, 2022;
originally announced June 2022.
-
Asteroid Measurements at Millimeter Wavelengths with the South Pole Telescope
Authors:
P. M. Chichura,
A. Foster,
C. Patel,
N. Ossa-Jaen,
P. A. R. Ade,
Z. Ahmed,
A. J. Anderson,
M. Archipley,
J. E. Austermann,
J. S. Avva,
L. Balkenhol,
P. S. Barry,
R. Basu Thakur,
J. A. Beall,
K. Benabed,
A. N. Bender,
B. A. Benson,
F. Bianchini,
L. E. Bleem,
F. R. Bouchet,
L. Bryant,
K. Byrum,
J. E. Carlstrom,
F. W. Carter,
T. W. Cecil
, et al. (119 additional authors not shown)
Abstract:
We present the first measurements of asteroids in millimeter wavelength (mm) data from the South Pole Telescope (SPT), which is used primarily to study the cosmic microwave background (CMB). We analyze maps of two $\sim270$ deg$^2$ sky regions near the ecliptic plane, each observed with the SPTpol camera $\sim100$ times over one month. We subtract the mean of all maps of a given field, removing static sky signal, and then average the mean-subtracted maps at known asteroid locations. We detect three asteroids -- (324) Bamberga, (13) Egeria, and (22) Kalliope -- with signal-to-noise ratios (S/N) of 11.2, 10.4, and 6.1, respectively, at 2.0 mm (150 GHz); we also detect (324) Bamberga with S/N of 4.1 at 3.2 mm (95 GHz). We place constraints on these asteroids' effective emissivities, brightness temperatures, and light curve modulation amplitude. Our flux density measurements of (324) Bamberga and (13) Egeria roughly agree with predictions, while our measurements of (22) Kalliope suggest lower flux, corresponding to effective emissivities of $0.66 \pm 0.11$ at 2.0 mm and $<0.47$ at 3.2 mm. We predict the asteroids detectable in other SPT datasets and find good agreement with detections of (772) Tanete and (1093) Freda in recent data from the SPT-3G camera, which has $\sim10 \times$ the mapping speed of SPTpol. This work is the first focused analysis of asteroids in data from CMB surveys, and it demonstrates we can repurpose historic and future datasets for asteroid studies. Future SPT measurements can help constrain the distribution of surface properties over a larger asteroid population.
Submitted 21 April, 2023; v1 submitted 2 February, 2022;
originally announced February 2022.
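The mean-subtract-then-stack idea in this abstract can be sketched on toy data. The snippet below is a minimal illustration under strong assumptions: maps are small nested lists, the moving source sits on exactly one known pixel per map, and there is no noise, beam, or weighting. The helper names `mean_subtract` and `stack_at` are hypothetical and are not taken from the SPT analysis.

```python
def mean_subtract(maps):
    """Subtract the per-pixel mean over all maps, removing signal that does not move."""
    n = len(maps)
    rows, cols = len(maps[0]), len(maps[0][0])
    mean = [[sum(m[r][c] for m in maps) / n for c in range(cols)] for r in range(rows)]
    return [[[m[r][c] - mean[r][c] for c in range(cols)] for r in range(rows)]
            for m in maps]

def stack_at(maps, positions):
    """Average the map values at the known (row, col) of the moving source in each map."""
    return sum(m[r][c] for m, (r, c) in zip(maps, positions)) / len(maps)

# Toy field: a static source of flux 5.0 at pixel (0, 0) in every map, and a
# moving source of flux 2.0 that steps one pixel along row 1 per observation.
positions = [(1, 0), (1, 1), (1, 2), (1, 3)]
maps = []
for r, c in positions:
    sky = [[0.0] * 4 for _ in range(3)]
    sky[0][0] = 5.0
    sky[r][c] = 2.0
    maps.append(sky)

cleaned = mean_subtract(maps)
print(cleaned[0][0][0])              # static source after subtraction
print(stack_at(cleaned, positions))  # stacked flux of the moving source
```

In this toy setup the static pixel goes to zero, while the stacked value of the moving source comes out at 1.5 rather than 2.0, because the source itself contributes to the mean map at each pixel it visits. With n maps and no repeated pixels the reduction factor is (1 - 1/n), which becomes small for the roughly 100 observations mentioned in the abstract.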
-
Imposing Connectome-Derived Topology on an Echo State Network
Authors:
Jacob Morra,
Mark Daley
Abstract:
Can connectome-derived constraints inform computation? In this paper we investigate the contribution of a fruit fly connectome's topology on the performance of an Echo State Network (ESN) -- a subset of Reservoir Computing which is state of the art in chaotic time series prediction. Specifically, we replace the reservoir layer of a classical ESN -- normally a fixed, random graph represented as a 2-d matrix -- with a particular (female) fruit fly connectome-derived connectivity matrix. We refer to this experimental class of models (with connectome-derived reservoirs) as "Fruit Fly ESNs" (FFESNs). We train and validate the FFESN on a chaotic time series prediction task; here we consider four sets of trials with different training input sizes (small, large) and train-validate splits (two variants). We compare the validation performance (Mean-Squared Error) of all of the best FFESN models to a class of control model ESNs (simply referred to as "ESNs"). Overall, for all four sets of trials we find that the FFESN either significantly outperforms (and has lower variance than) the ESN; or simply has lower variance than the ESN.
Submitted 23 January, 2022;
originally announced January 2022.
-
OstrichRL: A Musculoskeletal Ostrich Simulation to Study Bio-mechanical Locomotion
Authors:
Vittorio La Barbera,
Fabio Pardo,
Yuval Tassa,
Monica Daley,
Christopher Richards,
Petar Kormushev,
John Hutchinson
Abstract:
Muscle-actuated control is a research topic that spans multiple domains, including biomechanics, neuroscience, reinforcement learning, robotics, and graphics. This type of control is particularly challenging as bodies are often overactuated and dynamics are delayed and non-linear. It is however a very well tested and tuned actuation mechanism that has undergone millions of years of evolution with interesting properties exploiting passive forces and efficient energy storage of muscle-tendon units. To facilitate research on muscle-actuated simulation, we release a 3D musculoskeletal simulation of an ostrich based on the MuJoCo physics engine. The ostrich is one of the fastest bipeds on earth and therefore makes an excellent model for studying muscle-actuated bipedal locomotion. The model is based on CT scans and dissections used to collect actual muscle data, such as insertion sites, lengths, and pennation angles. Along with this model, we also provide a set of reinforcement learning tasks, including reference motion tracking, running, and neck control, used to infer muscle actuation patterns. The reference motion data is based on motion capture clips of various behaviors that we preprocessed and adapted to our model. This paper describes how the model was built and iteratively improved using the tasks. We also evaluate the accuracy of the muscle actuation patterns by comparing them to experimentally collected electromyographic data from locomoting birds. The results demonstrate the need for rich reward signals or regularization techniques to constrain muscle excitations and produce realistic movements. Overall, we believe that this work can provide a useful bridge between fields of research interested in muscle actuation.
Submitted 24 May, 2022; v1 submitted 11 December, 2021;
originally announced December 2021.
-
Optimal CMB Lensing Reconstruction and Parameter Estimation with SPTpol Data
Authors:
M. Millea,
C. M. Daley,
T-L. Chou,
E. Anderes,
P. A. R. Ade,
A. J. Anderson,
J. E. Austermann,
J. S. Avva,
J. A. Beall,
A. N. Bender,
B. A. Benson,
F. Bianchini,
L. E. Bleem,
J. E. Carlstrom,
C. L. Chang,
P. Chaubal,
H. C. Chiang,
R. Citron,
C. Corbett Moran,
T. M. Crawford,
A. T. Crites,
T. de Haan,
M. A. Dobbs,
W. Everett,
J. Gallicchio
, et al. (44 additional authors not shown)
Abstract:
We perform the first simultaneous Bayesian parameter inference and optimal reconstruction of the gravitational lensing of the cosmic microwave background (CMB), using 100 deg$^2$ of polarization observations from the SPTpol receiver on the South Pole Telescope. These data reach noise levels as low as 5.8 $μ$K-arcmin in polarization, which are low enough that the typically used quadratic estimator (QE) technique for analyzing CMB lensing is significantly sub-optimal. Conversely, the Bayesian procedure extracts all lensing information from the data and is optimal at any noise level. We infer the amplitude of the gravitational lensing potential to be $A_φ\,{=}\,0.949\,{\pm}\,0.122$ using the Bayesian pipeline, consistent with our QE pipeline result, but with 17\% smaller error bars. The Bayesian analysis also provides a simple way to account for systematic uncertainties, performing a similar job as frequentist "bias hardening," and reducing the systematic uncertainty on $A_φ$ due to polarization calibration from almost half of the statistical error to effectively zero. Finally, we jointly constrain $A_φ$ along with $A_{\rm L}$, the amplitude of lensing-like effects on the CMB power spectra, demonstrating that the Bayesian method can be used to easily infer parameters both from an optimal lensing reconstruction and from the delensed CMB, while exactly accounting for the correlation between the two. These results demonstrate the feasibility of the Bayesian approach on real data, and pave the way for future analysis of deep CMB polarization measurements with SPT-3G, Simons Observatory, and CMB-S4, where improvements relative to the QE can reach 1.5 times tighter constraints on $A_φ$ and 7 times lower effective lensing reconstruction noise.
Submitted 3 December, 2020;
originally announced December 2020.
-
Campus Wi-Fi Coverage Mapping and Analysis
Authors:
Farhana Binte Kamrul Easha,
Robert Abbas,
Matthew Daley
Abstract:
Wireless Local Area Networks (WLANs), known as Wi-Fi, have become an essential service in university environments, giving staff, students and guests Internet access from their mobile devices. Apart from the Internet being a learning resource, students also submit their assignments online using web portals. Most campuses will have poor coverage areas for mobile networks and, as a result, the ability of the wireless network to supplement Internet access for mobile devices in these areas becomes more important. A clear understanding of WLAN traffic patterns, network handover between access points, inter-network handover between the Wi-Fi and mobile networks, and the optimal placement of networking equipment will help deliver a better wireless service. This paper presents data analyses and Wi-Fi signal coverage maps obtained by performing wireless radio surveys, coverage predictions and statistical analysis of data from the existing access points to show the current Wi-Fi performance in several locations of a large university campus. It then makes recommendations that should improve performance. These recommendations are derived from AP performance testing and made in the context of cabling length limitations and physical and aesthetic placement restrictions that are present at each location.
Submitted 3 April, 2020;
originally announced April 2020.
-
From Helmut Jürgensen's Former Students: The Game of Informatics Research
Authors:
Mark Daley,
Mark Eramian,
Christopher Power,
Ian McQuillan
Abstract:
Personal reflections are given on being students of Helmut Jürgensen. Then, we attempt to address his hypothesis that informatics follows trend-like behaviours through the use of a content analysis of university job advertisements, and then via simulation techniques from the area of quantitative economics.
Submitted 7 March, 2019;
originally announced March 2019.
-
Novelty Search for Deep Reinforcement Learning Policy Network Weights by Action Sequence Edit Metric Distance
Authors:
Ethan C. Jackson,
Mark Daley
Abstract:
Reinforcement learning (RL) problems often feature deceptive local optima, and learning methods that optimize purely for reward signal often fail to learn strategies for overcoming them. Deep neuroevolution and novelty search have been proposed as effective alternatives to gradient-based methods for learning RL policies directly from pixels. In this paper, we introduce and evaluate the use of novelty search over agent action sequences by string edit metric distance as a means for promoting innovation. We also introduce a method for stagnation detection and population resampling inspired by recent developments in the RL community that uses the same mechanisms as novelty search to promote and develop innovative policies. Our methods extend a state-of-the-art method for deep neuroevolution using a simple-yet-effective genetic algorithm (GA) designed to efficiently learn deep RL policy network weights. Experiments using four games from the Atari 2600 benchmark were conducted. Results provide further evidence that GAs are competitive with gradient-based algorithms for deep RL. Results also demonstrate that novelty search over action sequences is an effective source of selection pressure that can be integrated into existing evolutionary algorithms for deep RL.
Submitted 8 February, 2019;
originally announced February 2019.
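To make the idea of novelty over action sequences concrete, here is a minimal sketch. It assumes that the edit metric is the Levenshtein distance and that novelty is the mean distance to the k nearest sequences in an archive. Both choices, and the names `edit_distance` and `novelty`, are illustrative assumptions rather than the authors' exact formulation.

```python
def edit_distance(a, b):
    """Levenshtein distance between two action sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute, or match at no cost
        prev = cur
    return prev[-1]

def novelty(candidate, archive, k=2):
    """Mean edit distance from candidate to its k nearest archive entries.

    The archive must be non-empty.
    """
    nearest = sorted(edit_distance(candidate, other) for other in archive)[:k]
    return sum(nearest) / len(nearest)

# Each character stands for one discrete action, e.g. L(eft), R(ight), F(ire).
archive = ["LLLL", "LLLR", "RRRR"]
print(novelty("LLLL", archive))  # close to existing behaviours, low novelty
print(novelty("FFFF", archive))  # unlike anything archived, high novelty
```

A genetic algorithm could use such a score alongside, or instead of, reward when selecting parents; how the two are combined is left open here.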
-
On the Generalizability of Linear and Non-Linear Region of Interest-Based Multivariate Regression Models for fMRI Data
Authors:
Ethan C. Jackson,
James Alexander Hughes,
Mark Daley
Abstract:
In contrast to conventional, univariate analysis, various types of multivariate analysis have been applied to functional magnetic resonance imaging (fMRI) data. In this paper, we compare two contemporary approaches for multivariate regression on task-based fMRI data: linear regression with ridge regularization and non-linear symbolic regression using genetic programming. The data for this project is representative of a contemporary fMRI experimental design for visual stimuli. Linear and non-linear models were generated for 10 subjects, with another 4 withheld for validation. Model quality is evaluated by comparing $R$ scores (Pearson product-moment correlation) in various contexts, including single run self-fit, within-subject generalization, and between-subject generalization. Propensity for modelling strategies to overfit is estimated using a separate resting state scan. Results suggest that neither method is objectively or inherently better than the other.
Submitted 3 February, 2018;
originally announced February 2018.
-
In-Datacenter Performance Analysis of a Tensor Processing Unit
Authors:
Norman P. Jouppi,
Cliff Young,
Nishant Patil,
David Patterson,
Gaurav Agrawal,
Raminder Bajwa,
Sarah Bates,
Suresh Bhatia,
Nan Boden,
Al Borchers,
Rick Boyle,
Pierre-luc Cantin,
Clifford Chao,
Chris Clark,
Jeremy Coriell,
Mike Daley,
Matt Dau,
Jeffrey Dean,
Ben Gelb,
Tara Vazir Ghaemmaghami,
Rajendra Gottipati,
William Gulland,
Robert Hagmann,
C. Richard Ho,
Doug Hogberg
, et al. (50 additional authors not shown)
Abstract:
Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC---called a Tensor Processing Unit (TPU)---deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed on-chip memory. The TPU's deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs (caches, out-of-order execution, multithreading, multiprocessing, prefetching, ...) that help average throughput more than guaranteed latency. The lack of such features helps explain why, despite having myriad MACs and a big memory, the TPU is relatively small and low power. We compare the TPU to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters. Our workload, written in the high-level TensorFlow framework, uses production NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters' NN inference demand. Despite low utilization for some applications, the TPU is on average about 15X - 30X faster than its contemporary GPU or CPU, with TOPS/Watt about 30X - 80X higher. Moreover, using the GPU's GDDR5 memory in the TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and 200X the CPU.
Submitted 16 April, 2017;
originally announced April 2017.
-
Radial Surface Density Profiles of Gas and Dust in the Debris Disk around 49 Ceti
Authors:
A. M. Hughes,
J. Lieman-Sifry,
K. M. Flaherty,
C. M. Daley,
A. Roberge,
A. Kospal,
Attila Moor,
Inga Kamp,
D. J. Wilner,
S. M. Andrews,
J. H. Kastner,
P. Abraham
Abstract:
We present ~0.4 resolution images of CO(3-2) and associated continuum emission from the gas-bearing debris disk around the nearby A star 49 Ceti, observed with the Atacama Large Millimeter/Submillimeter Array (ALMA). We analyze the ALMA visibilities in tandem with the broad-band spectral energy distribution to measure the radial surface density profiles of dust and gas emission from the system. The dust surface density decreases with radius between ~100 and 310 au, with a marginally significant enhancement of surface density at a radius of ~110 au. The SED requires an inner disk of small grains in addition to the outer disk of larger grains resolved by ALMA. The gas disk exhibits a surface density profile that increases with radius, contrary to most previous spatially resolved observations of circumstellar gas disks. While ~80% of the CO flux is well described by an axisymmetric power-law disk in Keplerian rotation about the central star, residuals at ~20% of the peak flux exhibit a departure from axisymmetry suggestive of spiral arms or a warp in the gas disk. The radial extent of the gas disk (~220 au) is smaller than that of the dust disk (~300 au), consistent with recent observations of other gas-bearing debris disks. While there are so far only three broad debris disks with well characterized radial dust profiles at millimeter wavelengths, 49 Ceti's disk shows a markedly different structure from two radially resolved gas-poor debris disks, implying that the physical processes generating and sculpting the gas and dust are fundamentally different.
Submitted 6 April, 2017;
originally announced April 2017.
-
Insights into Quasar UV Spectra Using Unsupervised Clustering Analysis
Authors:
Aycha Tammour,
Sarah C. Gallagher,
Mark Daley,
Gordon T. Richards
Abstract:
Machine learning can provide powerful tools to detect patterns in multi-dimensional parameter space. We use K-means, a simple yet powerful unsupervised clustering algorithm which picks out structure in unlabeled data, to study a sample of quasar UV spectra from the Quasar Catalog of the 10th Data Release of the Sloan Digital Sky Survey of Paris et al. (2014). Detecting patterns in large datasets helps us gain insights into the physical conditions and processes giving rise to the observed properties of quasars. We use K-means to find clusters in the parameter space of the equivalent width (EW), the blue- and red-half-width at half-maximum (HWHM) of the Mg II 2800 A line, the C IV 1549 A line, and the C III] 1908 A blend in samples of Broad Absorption-Line (BAL) and non-BAL quasars at redshift 1.6-2.1. Using this method, we successfully recover correlations well-known in the UV regime such as the anti-correlation between the EW and blueshift of the C IV emission line and the shape of the ionizing spectral energy distribution (SED) probed by the strength of He II and the Si III]/C III] ratio. We find this to be particularly evident when the properties of C III] are used to find the clusters, while those of Mg II proved to be less strongly correlated with the properties of the other lines in the spectra such as the width of C IV or the Si III]/C III] ratio. We conclude that unsupervised clustering methods (such as K-means) are powerful methods for finding "natural" binning boundaries in multidimensional datasets and discuss caveats and future work.
Submitted 10 March, 2016;
originally announced March 2016.
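For readers unfamiliar with K-means, the following is a minimal sketch of Lloyd's algorithm on toy two-dimensional points. The data, the fixed initial centroids, and the function names are illustrative assumptions. The paper clusters measured line properties (EW and HWHM values), and its initialization and implementation are not described in the abstract.

```python
def nearest_index(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance to point."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])))

def kmeans(points, initial_centroids, iterations=20):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                # New centroid is the coordinate-wise mean of the assigned points.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid where it was
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels

# Hypothetical feature pairs forming two well-separated groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centroids, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(labels)
```

In practice the features would be standardized first so that no single quantity dominates the distance, and the result would be checked against several random initializations.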
-
On the Shuffle Automaton Size for Words
Authors:
Franziska Biegler,
Mark Daley,
Ian McQuillan
Abstract:
We investigate the state size of DFAs accepting the shuffle of two words. We provide words u and v, such that the minimal DFA for u shuffled with v requires an exponential number of states. We also show some conditions for the words u and v which ensure a quadratic upper bound on the state size of u shuffled with v. Moreover, switching only two letters within one of u or v is enough to trigger the change from quadratic to exponential.
Submitted 29 July, 2009;
originally announced July 2009.
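As a reminder of what the shuffle of two words is, the sketch below enumerates it directly: every interleaving of u and v that preserves the letter order within each word. It only lists the finite language and says nothing about DFA sizes, which are the subject of the paper. The function name `shuffle` is chosen for illustration.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def shuffle(u, v):
    """Set of all interleavings of u and v that keep each word's letters in order."""
    if not u:
        return frozenset({v})
    if not v:
        return frozenset({u})
    # The first letter of any interleaving comes from either u or v.
    return frozenset({u[0] + w for w in shuffle(u[1:], v)} |
                     {v[0] + w for w in shuffle(u, v[1:])})

print(sorted(shuffle("ab", "c")))
print(len(shuffle("ab", "cd")))  # distinct letters: every interleaving is a new word
print(len(shuffle("ab", "ab")))  # repeated letters: many interleavings coincide
```

The gap between the last two counts hints at why the structure of u and v matters: repeated letters collapse interleavings, so the size of the shuffle language depends on the words themselves and not only on their lengths.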
-
State complexity of orthogonal catenation
Authors:
Mark Daley,
Michael Domaratzki,
Kai Salomaa
Abstract:
A language $L$ is the orthogonal catenation of languages $L_1$ and $L_2$ if every word of $L$ can be written in a unique way as a catenation of a word in $L_1$ and a word in $L_2$. We establish a tight bound for the state complexity of orthogonal catenation of regular languages. The bound is smaller than the bound for arbitrary catenation.
Submitted 21 April, 2009;
originally announced April 2009.
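The definition of orthogonal catenation can be checked by brute force on small finite languages. The sketch below counts, for every word of $L_1 L_2$, how many ways it splits into a word of $L_1$ followed by a word of $L_2$. It handles finite sets only, whereas the paper treats regular languages in general. The helper names are hypothetical.

```python
from collections import Counter

def factorization_counts(l1, l2):
    """For each word of the catenation, count the (x, y) pairs with x in l1, y in l2."""
    return Counter(x + y for x in set(l1) for y in set(l2))

def is_orthogonal(l1, l2):
    """True if every word of the catenation has exactly one factorization."""
    return all(n == 1 for n in factorization_counts(l1, l2).values())

print(is_orthogonal({"a", "b"}, {"c", "d"}))    # every product word splits one way
print(is_orthogonal({"a", "ab"}, {"c", "bc"}))  # "abc" = "a"+"bc" = "ab"+"c"
```

In the second call the word "abc" has two factorizations, so the catenation of those two languages is not orthogonal.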