Search | arXiv e-print repository

Impact of Network Topology on Byzantine Resilience in Decentralized Federated Learning

Authors: Siddhartha Bhattacharya, Daniel Helo, Joshua Siegel

Abstract: Federated learning (FL) enables a collaborative environment for training machine learning models without sharing training data between users. This is typically achieved by aggregating model gradients on a central server. Decentralized federated learning is a rising paradigm that enables users to collaboratively train machine learning models in a peer-to-peer manner, without the need for a central aggregation server. However, before applying decentralized FL in real-world training environments, nodes that deviate from the FL process (Byzantine nodes) must be considered when selecting an aggregation function. Recent research has focused on Byzantine-robust aggregation for client-server or fully connected networks, but has not yet evaluated such aggregation schemes for the complex topologies possible with decentralized FL. Thus, the need for empirical evidence of Byzantine robustness in differing network topologies is evident. This work investigates the effects of state-of-the-art Byzantine-robust aggregation methods in complex, large-scale network structures. We find that state-of-the-art Byzantine-robust aggregation strategies are not resilient within large non-fully connected networks. As such, our findings point the field towards the development of topology-aware aggregation schemes, especially necessary within the context of large-scale real-world deployment.

Submitted 6 July, 2024; originally announced July 2024.

Comments: 8 pages, 6 figures

ACM Class: I.2.11; C.4; C.2.4
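
To make the idea of Byzantine-robust aggregation on a non-fully connected topology concrete, here is a minimal sketch. The abstract above does not name the aggregation methods it evaluates; the coordinate-wise median used here is one commonly cited robust aggregator, chosen only for illustration. The ring topology, the function names `coordinate_median` and `aggregate_round`, and the toy model vectors are all assumptions, not taken from the paper.

```python
import numpy as np

def coordinate_median(updates):
    """Coordinate-wise median of a list of equally shaped vectors."""
    return np.median(np.stack(updates), axis=0)

def aggregate_round(models, adjacency):
    """One decentralized round: every node aggregates its own model
    together with the models of its direct neighbors."""
    new_models = []
    for i, row in enumerate(adjacency):
        neighborhood = [models[i]] + [
            models[j] for j, connected in enumerate(row) if connected and j != i
        ]
        new_models.append(coordinate_median(neighborhood))
    return new_models

# Hypothetical toy setup: 5 nodes on a ring, node 0 behaves arbitrarily.
n = 5
ring = [[abs(i - j) in (1, n - 1) for j in range(n)] for i in range(n)]
models = [np.array([1.0, 1.0]) for _ in range(n)]
models[0] = np.array([100.0, -100.0])  # Byzantine update

result = aggregate_round(models, ring)
```

In this toy case every neighborhood holds three models with at most one outlier, so the median discards it. In sparser or irregular graphs a node can face a local Byzantine majority even when Byzantine nodes are a global minority, which is one reason topology can matter for such aggregators.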

arXiv:2406.19099 [pdf]

Silver-enriched Microdomain Patterns as Advanced Bactericidal Coatings for Polymer-based Medical Devices

Authors: Jana Pryjmakova, Barbora Vokata, Miroslav Slouf, Tomas Hubacek, Patricia Martinez-Garcia, Esther Rebollar, Petr Slepicka, Jakub Siegel

Abstract: Today, it would be difficult for us to live a full life without polymers, especially in medicine, where their applicability is constantly expanding, giving satisfactory results without any harmful effects on health. This study focused on the formation of hexagonal domains doped with AgNPs using a KrF excimer laser (λ=248 nm) on the polyetheretherketone (PEEK) surface that acts as an unfailing source of the antibacterial agent, silver. The hexagonal structure was formed with a grid placed in front of the incident laser beam. Surfaces with immobilized silver nanoparticles (AgNPs) were observed by AFM and SEM. Changes in surface chemistry were studied by XPS. To determine the concentration of released Ag+ ions, ICP-MS analysis was used. The antibacterial tests proved the antibacterial efficacy of Ag-doped PEEK composites against Escherichia coli and Staphylococcus aureus as the most common pathogens. Because AgNPs are also known for their strong toxicity, we also included cytotoxicity tests in this study. The findings presented here contribute to the advancement of materials design in the biomedical field, offering a novel starting point for combating bacterial infections through the innovative integration of AgNPs into inert synthetic polymers.

Submitted 28 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.13782 [pdf, other]

Clock-line-mediated Sisyphus Cooling

Authors: Chun-Chia Chen, Jacob L. Siegel, Benjamin D. Hunt, Tanner Grogan, Youssef S. Hassan, Kyle Beloy, Kurt Gibble, Roger C. Brown, Andrew D. Ludlow

Abstract: We demonstrate sub-recoil Sisyphus cooling using the long-lived $^{3}\mathrm{P}_{0}$ clock state in alkaline-earth-like ytterbium. A 1388 nm optical standing wave nearly resonant with the $^{3}\textrm{P}_{0}\,\rightarrow\,^{3}\textrm{D}_{1}$ transition creates a spatially periodic light shift of the $^{3}\textrm{P}_{0}$ clock state. Following excitation on the ultranarrow clock transition, we observe Sisyphus cooling in this potential, as the light shift is correlated with excitation to $^{3}\textrm{D}_{1}$ and subsequent spontaneous decay to the $^{1}\textrm{S}_{0}$ ground state. We observe that cooling enhances the loading efficiency of atoms into a 759 nm magic-wavelength one-dimensional (1D) optical lattice, as compared to standard Doppler cooling on the $^{1}\textrm{S}_{0}\,\rightarrow\,^{3}\textrm{P}_{1}$ transition. Sisyphus cooling yields temperatures below 200 nK in the weakly confined, transverse dimensions of the 1D optical lattice. These lower temperatures improve optical lattice clocks by facilitating the use of shallow lattices with reduced light shifts, while retaining large atom numbers to reduce the quantum projection noise. This Sisyphus cooling can be pulsed or continuous and is applicable to a range of quantum metrology applications.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 8 pages, 6 figures

arXiv:2406.09217 [pdf, ps, other]

Convergence and error control of consistent PINNs for elliptic PDEs

Authors: Andrea Bonito, Ronald DeVore, Guergana Petrova, Jonathan W. Siegel

Abstract: We provide an a priori analysis of a certain class of numerical methods, commonly referred to as collocation methods, for solving elliptic boundary value problems. They begin with information in the form of point values of the right-hand side $f$ of such equations and point values of the boundary function $g$ and utilize only this information to numerically approximate the solution $u$ of the Partial Differential Equation (PDE). For such a method to provide an approximation to $u$ with guaranteed error bounds, additional assumptions on $f$ and $g$, called model class assumptions, are needed. We determine the best error (in the energy norm) of approximating $u$, in terms of the number of point samples $m$, under all Besov class model assumptions for the right-hand side $f$ and boundary $g$. We then turn to the study of numerical procedures and ask whether a proposed numerical procedure (nearly) achieves the optimal recovery error. We analyze numerical methods which generate the numerical approximation to $u$ by minimizing a specified data-driven loss function over a set $\Sigma$ which is either a finite-dimensional linear space, or more generally, a finite-dimensional manifold. We show that the success of such a procedure depends critically on choosing a correct data-driven loss function that is consistent with the PDE and provides sharp error control. Based on this analysis a loss function $L^*$ is proposed. We also address the recent methods of Physics Informed Neural Networks (PINNs). Minimization of the new loss $L^*$ over neural network spaces $\Sigma$ is referred to as consistent PINNs (CPINNs). We prove that CPINNs provide an optimal recovery of the solution $u$, provided that the optimization problem can be numerically executed and $\Sigma$ has sufficient approximation capabilities. Finally, numerical examples illustrating the benefits of the CPINNs are given.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 48 pages

arXiv:2406.06899 [pdf, other]

Developing, Analyzing, and Evaluating Vehicular Lane Keeping Algorithms Under Dynamic Lighting and Weather Conditions Using Electric Vehicles

Authors: Michael Khalfin, Jack Volgren, Matthew Jones, Luke LeGoullon, Joshua Siegel, Chan-Jin Chung

Abstract: Self-driving vehicles have the potential to reduce accidents and fatalities on the road. Many production vehicles already come equipped with basic self-driving capabilities, but have trouble following lanes in adverse lighting and weather conditions. Therefore, we develop, analyze, and evaluate two vehicular lane-keeping algorithms under dynamic weather conditions using a combined deep learning- and hand-crafted approach and an end-to-end deep learning approach. We use image segmentation- and linear-regression based deep learning to drive the vehicle toward the center of the lane, measuring the number of laps completed, average speed, and average steering error per lap. Our hybrid model completes more laps than our end-to-end deep learning model. In the future, we are interested in combining our algorithms to form one cohesive approach to lane-following.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Supported by the National Science Foundation under Grants No. 2150292 and 2150096

arXiv:2405.17754 [pdf, other]

Differential Voltage Analysis and Patterns in Parallel-Connected Pairs of Imbalanced Cells

Authors: Clement Wong, Andrew Weng, Sravan Pannala, Jeesoon Choi, Jason B. Siegel, Anna Stefanopoulou

Abstract: Diagnosing imbalances in capacity and resistance within parallel-connected cells in battery packs is critical for battery management and fault detection, but it is challenging given that individual currents flowing into each cell are often unmeasured. This work introduces a novel method useful for identifying imbalances in capacity and resistance within a pair of parallel-connected cells using only voltage and current measurements from the pair. Our method utilizes differential voltage analysis (DVA) when the pair is under constant current discharge and demonstrates that features of the pair's differential voltage curve (dV/dQ), namely its mid-to-high SOC dV/dQ peak's height and skewness, are sensitive to imbalances in capacity and resistance. We analyze and explain how and why these dV/dQ peak shape features change in response to these imbalances, highlighting that the underlying current imbalance dynamics resulting from these imbalances contribute to these changes. Ultimately, we demonstrate that dV/dQ peak shape features can identify the product of capacity imbalance and resistance imbalance, but cannot uniquely identify the imbalances. This work lays the groundwork for identifying imbalances in capacity and resistance in parallel-connected cell groups in battery packs, where commonly only a single current sensor is placed for each parallel cell group.

Submitted 27 May, 2024; originally announced May 2024.

Comments: Accepted to American Control Conference (ACC), Toronto, Canada, July 2024
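
The differential voltage curve (dV/dQ) mentioned in the abstract above is the derivative of terminal voltage with respect to discharged capacity. The following is a minimal sketch of how it could be computed from sampled data with a finite-difference gradient. The function name `differential_voltage` and the synthetic linear discharge curve are assumptions for illustration; the paper's peak height and skewness features are not reproduced here.

```python
import numpy as np

def differential_voltage(capacity_ah, voltage_v):
    """Finite-difference estimate of dV/dQ (V/Ah) on the sampled grid."""
    return np.gradient(voltage_v, capacity_ah)

# Synthetic constant-current discharge: voltage falls linearly with capacity.
q = np.linspace(0.0, 2.0, 21)   # discharged capacity in Ah
v = 4.2 - 0.5 * q               # terminal voltage in V
dvdq = differential_voltage(q, v)
```

Real measurements are noisy, so in practice the voltage is usually smoothed before differentiating; that step is left out of this sketch.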

arXiv:2405.12028 [pdf, other]

The Case for DeepSOH: Addressing Path Dependency for Remaining Useful Life

Authors: Hamidreza Movahedi, Andrew Weng, Sravan Pannala, Jason B. Siegel, Anna G. Stefanopoulou

Abstract: The battery state of health (SOH) based on capacity fade and resistance increase is not sufficient for predicting remaining useful life (RUL). The electrochemical community blames the path-dependency of the battery degradation mechanisms for our inability to forecast the degradation. The control community knows that the path-dependency is addressed by full state estimation. We show that even the electrode-specific SOH (eSOH) estimation is not enough to fully define the degradation states by simulating infinite possible degradation trajectories and RULs from a unique eSOH. We finally define the deepSOH states that capture the individual contributions of all the common degradation mechanisms, namely, SEI, plating, and mechanical fracture to the loss of lithium inventory. We show that the addition of cell expansion measurement may allow us to estimate the deepSOH and predict the remaining useful life.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 6 pages, 3 figures

arXiv:2404.17494 [pdf, other]

doi 10.3847/1538-3881/ad34d5

The death of Vulcan: NEID reveals the planet candidate orbiting HD 26965 is stellar activity

Authors: Abigail Burrows, Samuel Halverson, Jared C. Siegel, Christian Gilbertson, Jacob Luhn, Jennifer Burt, Chad F. Bender, Arpita Roy, Ryan C. Terrien, Selma Vangstein, Suvrath Mahadevan, Jason T. Wright, Paul Robertson, Eric B. Ford, Guðmundur Stefánsson, Joe P. Ninan, Cullen H. Blake, Michael W. McElwain, Christian Schwab, Jinglin Zhao

Abstract: We revisit the long-studied radial velocity (RV) target HD26965 using recent observations from the NASA-NSF 'NEID' precision Doppler facility. Leveraging a suite of classical activity indicators, combined with line-by-line RV analyses, we demonstrate that the claimed 45-day signal previously identified as a planet candidate is most likely an activity-induced signal. Correlating the bulk (spectrally-averaged) RV with canonical line activity indicators confirms a multi-day 'lag' between the observed activity indicator time series and the measured RV. When accounting for this lag, we show that much of the observed RV signal can be removed by a linear detrending of the data. Investigating activity at the line-by-line level, we find a depth-dependent correlation between individual line RVs and the bulk RVs, further indicative of periodic suppression of convective blueshift causing the observed RV variability, rather than an orbiting planet. We conclude that the combined evidence of the activity correlations and depth dependence is consistent with a radial velocity signature dominated by a rotationally-modulated activity signal at a period of $\sim$42 days. We hypothesize that this activity signature is due to a combination of spots and convective blueshift suppression. The tools applied in our analysis are broadly applicable to other stars, and could help paint a more comprehensive picture of the manifestations of stellar activity in future Doppler RV surveys.

Submitted 26 April, 2024; originally announced April 2024.

Comments: 25 pages, 13 figures. Accepted in AJ

Journal ref: AJ 167 243 (2024)

arXiv:2404.02849 [pdf, other]

Efficient Structure-Informed Featurization and Property Prediction of Ordered, Dilute, and Random Atomic Structures

Authors: Adam M. Krajewski, Jonathan W. Siegel, Zi-Kui Liu

Abstract: Structure-informed materials informatics is a rapidly evolving discipline of materials science relying on the featurization of atomic structures or configurations to construct vector, voxel, graph, graphlet, and other representations useful for machine learning prediction of properties, fingerprinting, and generative design. This work discusses how current featurizers typically perform redundant calculations and how their efficiency could be improved by considering (1) fundamentals of crystallographic (orbits) equivalency to optimize ordered cases and (2) representation-dependent equivalency to optimize cases of dilute, doped, and defect structures with broken symmetry. It also discusses and contrasts ways of (3) approximating random solid solutions occupying arbitrary lattices under such representations. Efficiency improvements discussed in this work were implemented within pySIPFENN (python toolset for Structure-Informed Property and Feature Engineering with Neural Networks), developed by the authors since 2019, and shown to increase performance by a factor of 2 to 10 for typical inputs. Throughout this work, the authors explicitly discuss how these advances can be applied to different kinds of similar tools in the community.

Submitted 14 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: 4 figures; GitHub repository at https://git.pysipfenn.org; Documentation at https://pysipfenn.org; V2 includes minor reference and wording updates

ACM Class: I.2; J.2

arXiv:2403.00136 [pdf, other]

Developing a Taxonomy of Elements Adversarial to Autonomous Vehicles

Authors: Mohammadali Saffary, Nishan Inampudi, Joshua E. Siegel

Abstract: As highly automated vehicles reach higher deployment rates, they find themselves in increasingly dangerous situations. Knowing that the consequence of a crash is significant for the health of occupants, bystanders, and properties, as well as to the viability of autonomy and adjacent businesses, we must search for more efficacious ways to comprehensively and reliably train autonomous vehicles to better navigate the complex scenarios with which they struggle. We therefore introduce a taxonomy of potentially adversarial elements that may contribute to poor performance or system failures as a means of identifying and elucidating lesser-seen risks. This taxonomy may be used to characterize failures of automation, as well as to support simulation and real-world training efforts by providing a more comprehensive classification system for events resulting in disengagement, collision, or other negative consequences. This taxonomy is created from and tested against real collision events to ensure comprehensive coverage with minimal class overlap and few omissions. It is intended to be used both for the identification of harm-contributing adversarial events and in the generation thereof (to create extreme edge- and corner-case scenarios) in training procedures.

Submitted 29 February, 2024; originally announced March 2024.

Comments: 18 pages total, 4 pages of references, initial page left blank for IEEE submission statement. Includes 4 figures and 2 tables. Written using IEEEtran document class

arXiv:2402.16077 [pdf, other]

Equivariant Frames and the Impossibility of Continuous Canonicalization

Authors: Nadav Dym, Hannah Lawrence, Jonathan W. Siegel

Abstract: Canonicalization provides an architecture-agnostic method for enforcing equivariance, with generalizations such as frame-averaging recently gaining prominence as a lightweight and flexible alternative to equivariant architectures. Recent works have found an empirical benefit to using probabilistic frames instead, which learn weighted distributions over group elements. In this work, we provide strong theoretical justification for this phenomenon: for commonly-used groups, there is no efficiently computable choice of frame that preserves continuity of the function being averaged. In other words, unweighted frame-averaging can turn a smooth, non-symmetric function into a discontinuous, symmetric function. To address this fundamental robustness problem, we formally define and construct weighted frames, which provably preserve continuity, and demonstrate their utility by constructing efficient and continuous weighted frames for the actions of $SO(2)$, $SO(3)$, and $S_n$ on point clouds.

Submitted 18 June, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.05664 [pdf, other]

UNCOVER NIRSpec/PRISM Spectroscopy Unveils Evidence of Early Core Formation in a Massive, Centrally Dusty Quiescent Galaxy at $z_{spec}=3.97$

Authors: David J. Setton, Gourav Khullar, Tim B. Miller, Rachel Bezanson, Jenny E. Greene, Katherine A. Suess, Katherine E. Whitaker, Jacqueline Antwi-Danso, Hakim Atek, Gabriel Brammer, Sam E. Cutler, Pratika Dayal, Robert Feldmann, Lukas J. Furtak, Seiji Fujimoto, Karl Glazebrook, Andy D. Goulding, Vasily Kokorev, Ivo Labbe, Joel Leja, Yilun Ma, Danilo Marchesini, Themiya Nanayakkara, Richard Pan, Sedona H. Price , et al. (6 additional authors not shown)

Abstract: We report the spectroscopic confirmation of a massive ($\log(M_\star/M_\odot)=10.34 \pm_{0.07}^{0.06}$), HST-dark ($m_\mathrm{F150W} - m_\mathrm{F444W} = 3.6$) quiescent galaxy at $z_{spec}=3.97$ in the UNCOVER survey. NIRSpec/PRISM spectroscopy and a non-detection in deep ALMA imaging surprisingly reveal that the galaxy is consistent with a low ($<$10 $M_\odot \ \mathrm{yr^{-1}}$) star formation rate despite evidence for moderate dust attenuation. The F444W image is well modeled with a two-component Sérsic fit that favors a compact, $r_e\sim200$ pc, $n\sim2.9$ component and a more extended, $r_e\sim1.6$ kpc, $n\sim1.7$ component. The galaxy exhibits strong color gradients: the inner regions are significantly redder than the outskirts. Spectral energy distribution models that reproduce both the red colors and low star formation rate in the center of UNCOVER 18407 require both significant ($A_v\sim1.4$ mag) dust attenuation and a stellar mass-weighted age of 900 Myr, implying 50% of the stars in the core already formed by $z=7.5$. Using spatially resolved annular mass-to-light measurements enabled by the galaxy's moderate magnification ($\mu=2.12\pm_{0.01}^{0.05}$) to reconstruct a radial mass profile from the best-fitting two-component Sérsic model, we infer a total mass-weighted $r_\mathrm{eff} = 0.72 \pm_{0.11}^{0.15}$ kpc and log$(\Sigma_\mathrm{1 kpc} \ [\mathrm{M_\odot/kpc^2}]) = 9.61 \pm_{0.10}^{0.08}$. The early formation of a dense, low star formation rate, and dusty core embedded in a less attenuated stellar envelope suggests an evolutionary link between the earliest-forming massive galaxies and their elliptical descendants. Furthermore, the disparity between the global, integrated dust properties and the spatially resolved gradients highlights the importance of accounting for radially varying stellar populations when characterizing the early growth of galaxy structure.

Submitted 12 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

Comments: 17 pages, 9 figures, 2 tables. Resubmitted to ApJ after response to referee and update to include new medium band imaging from the JWST MEGASCIENCE program. Comments welcome!

arXiv:2402.04968 [pdf, other]

Excited-Band Coherent Delocalization for Improved Optical Lattice Clock Performance

Authors: Jacob Siegel, William McGrew, Youssef Hassan, Chun-Chia Chen, Kyle Beloy, Tanner Grogan, Xiaogang Zhang, Andrew Ludlow

Abstract: We implement coherent delocalization as a tool for improving the two primary metrics of atomic clock performance: systematic uncertainty and instability. By decreasing atomic density with coherent delocalization, we suppress cold-collision shifts and two-body losses. Atom loss attributed to Landau-Zener tunneling in the ground lattice band would compromise coherent delocalization at low trap depths for our $^{171}$Yb atoms; hence, we implement for the first time delocalization in excited lattice bands. Doing so increases the spatial distribution of atoms trapped in the vertically-oriented optical lattice by $\sim7$ times. At the same time we observe a reduction of the cold-collision shift by 6.5(8) times, while also making inelastic two-body loss negligible. With these advantages, we measure the trap-light-induced quenching rate and natural lifetime of the ${}^3$P${}_0$ excited-state as $5.7(7)\times10^{-4}$ $E_r^{-1}s^{-1}$ and 19(2) s, respectively.

Submitted 12 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

arXiv:2402.04407 [pdf, ps, other]

Sharp Lower Bounds on the Manifold Widths of Sobolev and Besov Spaces

Authors: Jonathan W. Siegel

Abstract: We consider the problem of determining the manifold $n$-widths of Sobolev and Besov spaces with error measured in the $L_p$-norm. The manifold widths control how efficiently these spaces can be approximated by general non-linear parametric methods with the restriction that the parameter selection and parameterization maps must be continuous. Existing upper and lower bounds only match when the Sobolev or Besov smoothness index $q$ satisfies $q\leq p$ or $1 \leq p \leq 2$. We close this gap and obtain sharp lower bounds for all $1 \leq p,q \leq \infty$ for which a compact embedding holds. A key part of our analysis is to determine the exact value of the manifold widths of finite dimensional $\ell^M_q$-balls in the $\ell_p$-norm when $p\leq q$. Although this result is not new, we provide a new proof and apply it to lower bounding the manifold widths of Sobolev and Besov spaces. Our results show that the Bernstein widths, which are typically used to lower bound the manifold widths, decay asymptotically faster than the manifold widths in many cases.

Submitted 4 July, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

MSC Class: 41A46; 41A25

arXiv:2312.02193 [pdf]

A new promising material for biological applications: multi-level physical modification of AgNPs-decorated PEEK

Authors: Jana Pryjmakova, Daniel Grossberger, Anna Kutova, Barbora Vokata, Miroslav Slouf, Petr Slepicka, Jakub Siegel

Abstract: In the case of polymer medical devices, the surface design plays a crucial role in contact with human tissue. The use of AgNPs as antibacterial agents is well known; however, their anchoring into the polymer surface can still be investigated. This work describes the change in surface morphology and behaviour in the biological environment of polyetheretherketone (PEEK) with immobilised AgNPs after different surface modifications. The initial composites were prepared by immobilisation of silver nanoparticles from a colloid solution into the upper surface layers of polyetheretherketone (PEEK). The prepared samples (Ag/PEEK) had a planar morphology and were further modified with a KrF laser, a GaN laser, and Ar plasma. The samples were studied using the AFM method to visualise changes in surface morphology and to obtain information on the height of the structures and other surface parameters. Comparative analysis of the nanoparticles and polymers was performed using FEG-SEM. The chemical composition of the surface of the samples and optical activity were studied by XPS and UV-Vis spectroscopy. Finally, drop plate antibacterial and cytotoxicity tests were performed to determine the role of Ag nanoparticles after modification and suitability of the surface, which are important for the use of the resulting composite in biomedical applications.

Submitted 2 December, 2023; originally announced December 2023.

arXiv:2310.17610 [pdf, other]

A qualitative difference between gradient flows of convex functions in finite- and infinite-dimensional Hilbert spaces

Authors: Jonathan W. Siegel, Stephan Wojtowytsch

Abstract: We consider gradient flow/gradient descent and heavy ball/accelerated gradient descent optimization for convex objective functions. In the gradient flow case, we prove the following: 1. If $f$ does not have a minimizer, the convergence $f(x_t)\to \inf f$ can be arbitrarily slow. 2. If $f$ does have a minimizer, the excess energy $f(x_t) - \inf f$ is integrable/summable in time. In particular, $f(x_t) - \inf f = o(1/t)$ as $t\to\infty$. 3. In Hilbert spaces, this is optimal: $f(x_t) - \inf f$ can decay to $0$ as slowly as any given function which is monotone decreasing and integrable at $\infty$, even for a fixed quadratic objective. 4. In finite dimension (or more generally, for all gradient flow curves of finite length), this is not optimal: We prove that there are convex monotone decreasing integrable functions $g(t)$ which decrease to zero slower than $f(x_t)-\inf f$ for the gradient flow of any convex function on $\mathbb R^d$. For instance, we show that any gradient flow $x_t$ of a convex function $f$ in finite dimension satisfies $\liminf_{t\to\infty} \big(t\cdot \log^2(t)\cdot \big\{f(x_t) -\inf f\big\}\big)=0$. This improves on the commonly reported $O(1/t)$ rate and provides a sharp characterization of the energy decay law. We also note that it is impossible to establish a rate $O(1/(t\phi(t)))$ for any function $\phi$ which satisfies $\lim_{t\to\infty}\phi(t) = \infty$, even asymptotically. Similar results are obtained in related settings for (1) discrete time gradient descent, (2) stochastic gradient descent with multiplicative noise and (3) the heavy ball ODE. In the case of stochastic gradient descent, the summability of $\mathbb E[f(x_n) - \inf f]$ is used to prove that $f(x_n)\to \inf f$ almost surely - an improvement on the convergence almost surely up to a subsequence which follows from the $O(1/n)$ decay estimate.

Submitted 26 October, 2023; originally announced October 2023.

MSC Class: 26A51; 34A34
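
The step from integrability to the $o(1/t)$ rate in item 2 of the abstract above can be seen from a standard monotonicity argument. The following is only a sketch and not necessarily the authors' proof; the symbol $e(t) := f(x_t) - \inf f$ is introduced here for illustration, and it is assumed to be nonnegative, nonincreasing in $t$ (as it is along a gradient flow), and integrable on $[0,\infty)$. Since $e(s)\geq e(t)$ for $s\leq t$, and the tail of a convergent integral vanishes:

```latex
\frac{t}{2}\, e(t) \;\le\; \int_{t/2}^{t} e(s)\,\mathrm{d}s \;\longrightarrow\; 0
\quad (t\to\infty),
\qquad\text{hence}\qquad
e(t) = o(1/t).
```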

arXiv:2310.10396 [pdf, other]

Current imbalance in dissimilar parallel-connected batteries and the fate of degradation convergence

Authors: Andrew Weng, Hamidreza Movahedi, Clement Wong, Jason B. Siegel, Anna Stefanopoulou

Abstract: This paper proposes an analytical framework describing how initial capacity and resistance variability in parallel-connected battery cells may inflict additional variability or reduce variability while the cells age. We derive closed-form equations for current and SOC imbalance dynamics within a charge or discharge cycle. These dynamics are represented by a first-order equivalent circuit model and validated against experimental data. To demonstrate how current and SOC imbalance leads to cell degradation, we developed a successive update scheme in which the inter-cycle imbalance dynamics update the intra-cycle degradation dynamics, and vice versa. Using this framework, we demonstrate that current imbalance can cause convergent degradation trajectories, consistent with previous reports. However, we also demonstrate that different degradation assumptions, such as those associated with SOC imbalance, may cause divergent degradation. We finally highlight the role of different cell chemistries, including different OCV function nonlinearities, on system behavior, and derive analytical bounds on the SOC imbalance using Lyapunov analysis.

Submitted 16 October, 2023; originally announced October 2023.

Comments: 22 pages, 14 figures, submitted to the Journal of Dynamic Systems, Measurement and Control

arXiv:2308.07998 [pdf, other]

doi 10.1038/s41467-024-47229-0

Electrostatic Steering of Thermal Emission with Active Metasurface Control of Delocalized Modes

Authors: Joel Siegel, Shinho Kim, Margaret Fortman, Chenghao Wan, Mikhail A. Kats, Phillip W. C. Hon, Luke Sweatlock, Min Seok Jang, Victor Watson Brar

Abstract: We theoretically describe and experimentally demonstrate a graphene-integrated metasurface structure that enables electrically-tunable directional control of thermal emission. This device consists of a dielectric slab that acts as a Fabry-Perot (F-P) resonator supporting long-range delocalized modes bounded on one side by an electrostatically tunable metal-graphene metasurface. By varying the Fermi level of the graphene, the accumulated phase of the F-P mode is shifted, which changes the direction of absorption and emission at a fixed frequency. We directly measure the frequency- and angle-dependent emissivity of the thermal emission from a fabricated device heated to 250$^{\circ}$. Our results show that electrostatic control allows the thermal emission at 6.61 $μ$m to be continuously steered over 16$^{\circ}$, with a peak emissivity maintained above 0.9. We analyze the dynamic behavior of the thermal emission steerer theoretically using a Fano interference model, and use the model to design optimized thermal steerer structures.

Submitted 22 April, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

Comments: 8 pages, 4 figures

arXiv:2307.15772 [pdf, ps, other]

Weighted variation spaces and approximation by shallow ReLU networks

Authors: Ronald DeVore, Robert D. Nowak, Rahul Parhi, Jonathan W. Siegel

Abstract: We investigate the approximation of functions $f$ on a bounded domain $Ω\subset \mathbb{R}^d$ by the outputs of single-hidden-layer ReLU neural networks of width $n$. This form of nonlinear $n$-term dictionary approximation has been intensely studied since it is the simplest case of neural network approximation (NNA). There are several celebrated approximation results for this form of NNA that introduce novel model classes of functions on $Ω$ whose approximation rates avoid the curse of dimensionality. These novel classes include Barron classes, and classes based on sparsity or variation such as the Radon-domain BV classes. The present paper is concerned with the definition of these novel model classes on domains $Ω$. The current definition of these model classes does not depend on the domain $Ω$. A new and more proper definition of model classes on domains is given by introducing the concept of weighted variation spaces. These new model classes are intrinsic to the domain itself. The importance of these new model classes is that they are strictly larger than the classical (domain-independent) classes. Yet, it is shown that they maintain the same NNA rates.

Submitted 28 July, 2023; originally announced July 2023.

arXiv:2307.15285 [pdf, other]

Optimal Approximation of Zonoids and Uniform Approximation by Shallow Neural Networks

Authors: Jonathan W. Siegel

Abstract: We study the following two related problems. The first is to determine to what error an arbitrary zonoid in $\mathbb{R}^{d+1}$ can be approximated in the Hausdorff distance by a sum of $n$ line segments. The second is to determine optimal approximation rates in the uniform norm for shallow ReLU$^k$ neural networks on their variation spaces. The first of these problems has been solved for $d\neq 2,3$, but when $d=2,3$ a logarithmic gap between the best upper and lower bounds remains. We close this gap, which completes the solution in all dimensions. For the second problem, our techniques significantly improve upon existing approximation rates when $k\geq 1$, and enable uniform approximation of both the target function and its derivatives.

Submitted 26 September, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

MSC Class: 41A25; 41A46; 52A21; 68T07

arXiv:2307.07679 [pdf, other]

Sharp Convergence Rates for Matching Pursuit

Authors: Jason M. Klusowski, Jonathan W. Siegel

Abstract: We study the fundamental limits of matching pursuit, or the pure greedy algorithm, for approximating a target function by a sparse linear combination of elements from a dictionary. When the target function is contained in the variation space corresponding to the dictionary, many impressive works over the past few decades have obtained upper and lower bounds on the error of matching pursuit, but they do not match. The main contribution of this paper is to close this gap and obtain a sharp characterization of the decay rate of matching pursuit. Specifically, we construct a worst case dictionary which shows that the existing best upper bound cannot be significantly improved. It turns out that, unlike other greedy algorithm variants, the convergence rate is suboptimal and is determined by the solution to a certain non-linear equation. This enables us to conclude that any amount of shrinkage improves matching pursuit in the worst case.

Submitted 25 July, 2023; v1 submitted 14 July, 2023; originally announced July 2023.
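
For readers unfamiliar with the algorithm named in the abstract above, the following is a minimal sketch of matching pursuit (the pure greedy algorithm) in finite dimensions. It is not taken from the paper: the function name `matching_pursuit`, the list-of-lists representation, and the `shrinkage` parameter (a factor multiplying each greedy step, with `1.0` giving the plain algorithm) are illustrative assumptions, and the dictionary atoms are assumed to have unit norm.

```python
def matching_pursuit(target, dictionary, steps, shrinkage=1.0):
    """Greedy sparse approximation of `target` by unit-norm atoms.

    Hypothetical helper for illustration. Returns (coefficients, residual).
    """
    residual = list(target)
    coeffs = [0.0] * len(dictionary)
    for _ in range(steps):
        # Inner product of the current residual with every atom.
        inner = [sum(r * a for r, a in zip(residual, atom)) for atom in dictionary]
        # Greedy choice: the atom most correlated with the residual.
        k = max(range(len(dictionary)), key=lambda i: abs(inner[i]))
        # Step along that atom, optionally shrunk by a factor in (0, 1].
        step = shrinkage * inner[k]
        coeffs[k] += step
        residual = [r - step * a for r, a in zip(residual, dictionary[k])]
    return coeffs, residual


if __name__ == "__main__":
    atoms = [[1.0, 0.0], [0.0, 1.0]]  # orthonormal toy dictionary
    print(matching_pursuit([3.0, 4.0], atoms, steps=2))
```

With an orthonormal dictionary the plain algorithm recovers the target exactly after one step per atom; the worst-case behavior studied in the paper arises for redundant, non-orthogonal dictionaries.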

arXiv:2305.18722 [pdf, other]

Modeling battery formation: boosted SEI growth, multi-species reactions, and irreversible expansion

Authors: Andrew Weng, Everardo Olide, Iaroslav Kovalchuk, Jason B. Siegel, Anna Stefanopoulou

Abstract: This work proposes a semi-empirical model for the SEI growth process during the early stages of lithium-ion battery formation cycling and aging. By combining a full-cell model which tracks half-cell equilibrium potentials, a zero-dimensional model of SEI growth kinetics, and a semi-empirical description of cell thickness expansion, the resulting model replicated experimental trends measured on a 2.5 Ah pouch cell, including the calculated first-cycle efficiency, measured cell thickness changes, and electrolyte reduction peaks during the first charge dQ/dV signal. This work also introduces an SEI growth boosting formalism that enables a unified description of SEI growth during both cycling and aging. This feature can enable future applications for modeling path-dependent aging over a cell's life. The model further provides a homogenized representation of multiple SEI reactions enabling the study of both solvent and additive consumption during formation. This work bridges the gap between electrochemical descriptions of SEI growth and applications towards improving industrial battery manufacturing process control where battery formation is an essential but time-consuming final step. We envision that the formation model can be used to predict the impact of formation protocols and electrolyte systems on SEI passivation and resulting battery lifetime.

Submitted 21 July, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

Comments: Submitted to the Journal of the Electrochemical Society on May 24, 2023

arXiv:2305.13400 [pdf, other]

doi 10.3847/2041-8213/acd62f

Ponderings on the Possible Preponderance of Perpendicular Planets

Authors: Jared Siegel, Joshua Winn, Simon Albrecht

Abstract: Misalignments between planetary orbits and the equatorial planes of their host stars are clues about the formation and evolution of planetary systems. Earlier work found evidence for a peak near $90^\circ$ in the distribution of stellar obliquities, based on frequentist tests. We performed hierarchical Bayesian inference on a sample of 174 planets for which either the full three-dimensional stellar obliquity has been measured (72 planets) or for which only the sky-projected stellar obliquity has been measured (102 planets). We investigated whether the obliquities are best described by a Rayleigh distribution, or by a mixture of a Rayleigh distribution representing well-aligned systems and a different distribution representing misaligned systems. The mixture models are strongly favored over the single-component distribution. For the misaligned component, we tried an isotropic distribution and a distribution peaked at 90$^\circ$, and found the evidence to be essentially the same for both models. Thus, our Bayesian inference engine did not find strong evidence favoring a "perpendicular peak," unlike the frequentist tests. We also investigated selection biases that affect the inferred obliquity distribution, such as the bias of the gravity-darkening method against obliquities near $0^\circ$ or $180^\circ$. Further progress in characterizing the obliquity distribution will probably require the construction of a more homogeneous and complete sample of measurements.

Submitted 22 May, 2023; originally announced May 2023.

Comments: 15 pages, accepted to ApJ Letters

arXiv:2304.13621 [pdf]

doi 10.3390/ma15217468

Influence of Heat Accumulation on Morphology Debris Deposition and Wetting of LIPSS on Steel upon High Repetition Rate Femtosecond Pulses Irradiation

Authors: Camilo Florian, Yasser Fuentes-Edfuf, Evangelos Skoulas, Emmanuel Stratakis, Santiago Sanchez-Cortes, Javier Solis, Jan Siegel

Abstract: The fabrication of laser-induced periodic surface structures (LIPSS) over extended areas at high processing speeds requires the use of high repetition rate femtosecond lasers. It is known that industrially relevant materials such as steel experience heat accumulation when irradiated at repetition rates above some hundreds of kHz, and significant debris redeposition can take place. However, there are few studies on how the laser repetition rate influences both the debris deposition and the final LIPSS morphology. In this work, we present a study of fs laser-induced fabrication of low spatial frequency LIPSS (LSFL), with pulse repetition rates ranging from 10 kHz to 2 MHz on commercially available steel. The morphology of the laser-structured areas as well as the redeposited debris was characterized by scanning electron microscopy (SEM) and μ-Raman spectroscopy. To identify repetition rate ranges where heat accumulation is present during the irradiations, we developed a simple heat accumulation model that solves the heat equation in 1 dimension implementing a Forward differencing in Time and Central differencing in Space (FTCS) scheme. Contact angle measurements with water demonstrated the influence of heat accumulation and debris on the functional wetting behavior. The findings are directly relevant for the processing of metals using high repetition rate femtosecond lasers, enabling the identification of optimum conditions in terms of desired morphology, functionality, and throughput.

Submitted 26 April, 2023; originally announced April 2023.

Journal ref: https://www.mdpi.com/1996-1944/15/21/7468
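
As a generic illustration of the FTCS scheme mentioned in the abstract above (not the authors' heat accumulation model, whose source terms, material parameters, and boundary conditions are not given here), the sketch below advances the 1D heat equation $u_t = α u_{xx}$ on a uniform grid. The function names, the fixed-endpoint (Dirichlet) boundaries, and the stability guard are assumptions made for this example.

```python
def ftcs_step(u, r):
    """One FTCS update with r = alpha * dt / dx**2; endpoints are held fixed."""
    if r > 0.5:
        # Classical stability limit of the explicit FTCS scheme in 1D.
        raise ValueError("FTCS is unstable for r > 0.5")
    new = list(u)
    for i in range(1, len(u) - 1):
        # Forward difference in time, central second difference in space.
        new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
    return new


def simulate(u0, alpha, dx, dt, steps):
    """Apply `steps` FTCS updates to the initial profile `u0`."""
    r = alpha * dt / dx ** 2
    u = list(u0)
    for _ in range(steps):
        u = ftcs_step(u, r)
    return u


if __name__ == "__main__":
    # A single hot grid point spreading into its neighbours.
    print(simulate([0.0, 0.0, 1.0, 0.0, 0.0], alpha=1.0, dx=1.0, dt=0.25, steps=1))
```

The explicit scheme is simple but only conditionally stable, which is why the time step must satisfy $α\,Δt/Δx^2 \leq 1/2$.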

arXiv:2304.13332 [pdf, ps, other]

doi 10.1142/S0218202524500143

Entropy-based convergence rates of greedy algorithms

Authors: Yuwen Li, Jonathan Siegel

Abstract: We present convergence estimates of two types of greedy algorithms in terms of the metric entropy of underlying compact sets. In the first part, we measure the error of a standard greedy reduced basis method for parametric PDEs by the metric entropy of the solution manifold in Banach spaces. This contrasts with the classical analysis based on the Kolmogorov n-widths and enables us to obtain direct comparisons between the greedy algorithm error and the entropy numbers, where the multiplicative constants are explicit and simple. The entropy-based convergence estimate is sharp and improves upon the classical width-based analysis of reduced basis methods for elliptic model problems. In the second part, we derive a novel and simple convergence analysis of the classical orthogonal greedy algorithm for nonlinear dictionary approximation using the metric entropy of the symmetric convex hull of the dictionary. This also improves upon existing results by giving a direct comparison between the algorithm error and the metric entropy.

Submitted 22 February, 2024; v1 submitted 26 April, 2023; originally announced April 2023.

Comments: 24 pages, no figures

MSC Class: 41A25; 41A46; 41A65; 65M12; 65N15

Journal ref: Mathematical Models and Methods in Applied Sciences (2024)

arXiv:2303.07088 [pdf, other]

Differential voltage analysis for battery manufacturing process control

Authors: Andrew Weng, Jason B. Siegel, Anna Stefanopoulou

Abstract: Voltage-based battery metrics are ubiquitous and essential in battery manufacturing diagnostics. They enable electrochemical "fingerprinting" of batteries at the end of the manufacturing line and are naturally scalable, since voltage data is already collected as part of the formation process which is the last step in battery manufacturing. Yet, despite their prevalence, interpretations of voltage-based metrics are often ambiguous and require expert judgment. In this work, we present a method for collecting and analyzing full cell near-equilibrium voltage curves for end-of-line manufacturing process control. The method builds on existing literature on differential voltage analysis (DVA or dV/dQ) by expanding the method formalism through the lens of reproducibility, interpretability, and automation. Our model revisions introduce several new derived metrics relevant to manufacturing process control, including lithium consumed during formation and the practical negative-to-positive ratio, which complement standard metrics such as positive and negative electrode capacities. To facilitate method reproducibility, we reformulate the model to account for the "inaccessible lithium problem" which quantifies the numerical differences between modeled versus true values for electrode capacities and stoichiometries. We finally outline key data collection considerations, including C-rate and charging direction for both full cell and half cell datasets, which may impact method reproducibility.
This work highlights the opportunities for leveraging voltage-based electrochemical metrics for online battery manufacturing process control.

Submitted 13 March, 2023; originally announced March 2023.

Comments: 27 pages, 6 figures; pre-print for manuscript submitted to Frontiers in Energy Research

arXiv:2302.05515 [pdf, other]

Achieving acceleration despite very noisy gradients

Authors: Kanan Gupta, Jonathan Siegel, Stephan Wojtowytsch

Abstract: We present a generalization of Nesterov's accelerated gradient descent algorithm. Our algorithm (AGNES) provably achieves acceleration for smooth convex minimization tasks with noisy gradient estimates if the noise intensity is proportional to the magnitude of the gradient. Nesterov's accelerated gradient descent does not converge under this noise model if the constant of proportionality exceeds one. AGNES fixes this deficiency and provably achieves an accelerated convergence rate no matter how small the signal to noise ratio in the gradient estimate. Empirically, we demonstrate that this is an appropriate model for mini-batch gradients in overparameterized deep learning. Finally, we show that AGNES outperforms stochastic gradient descent with momentum and Nesterov's method in the training of CNNs.

Submitted 25 May, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

MSC Class: 68T07

arXiv:2302.00834 [pdf, ps, other]

Sharp Lower Bounds on Interpolation by Deep ReLU Neural Networks at Irregularly Spaced Data

Authors: Jonathan W. Siegel

Abstract: We study the interpolation power of deep ReLU neural networks. Specifically, we consider the question of how efficiently, in terms of the number of parameters, deep ReLU networks can interpolate values at $N$ datapoints in the unit ball which are separated by a distance $δ$. We show that $Ω(N)$ parameters are required in the regime where $δ$ is exponentially small in $N$, which gives the sharp result in this regime since $O(N)$ parameters are always sufficient. This also shows that the bit-extraction technique used to prove lower bounds on the VC dimension cannot be applied to irregularly spaced datapoints. Finally, as an application we give a lower bound on the approximation rates that deep ReLU neural networks can achieve for Sobolev spaces at the embedding endpoint.

Submitted 23 February, 2024; v1 submitted 1 February, 2023; originally announced February 2023.

MSC Class: 41A05; 65D05; 41A25

arXiv:2211.14400 [pdf, ps, other]

Optimal Approximation Rates for Deep ReLU Neural Networks on Sobolev and Besov Spaces

Authors: Jonathan W. Siegel

Abstract: Let $Ω= [0,1]^d$ be the unit cube in $\mathbb{R}^d$. We study the problem of how efficiently, in terms of the number of parameters, deep neural networks with the ReLU activation function can approximate functions in the Sobolev spaces $W^s(L_q(Ω))$ and Besov spaces $B^s_r(L_q(Ω))$, with error measured in the $L_p(Ω)$ norm. This problem is important when studying the application of neural networks in a variety of fields, including scientific computing and signal processing, and has previously been solved only when $p=q=\infty$. Our contribution is to provide a complete solution for all $1\leq p,q\leq \infty$ and $s > 0$ for which the corresponding Sobolev or Besov space compactly embeds into $L_p$. The key technical tool is a novel bit-extraction technique which gives an optimal encoding of sparse vectors. This enables us to obtain sharp upper bounds in the non-linear regime where $p > q$. We also provide a novel method for deriving $L_p$-approximation lower bounds based upon VC-dimension when $p < \infty$. Our results show that very deep ReLU networks significantly outperform classical methods of approximation in terms of the number of parameters, but that this comes at the cost of parameters which are not encodable.

Submitted 7 April, 2024; v1 submitted 25 November, 2022; originally announced November 2022.

MSC Class: 41A25; 41A46; 62M45

Journal ref: Journal of Machine Learning Research 24.357 (2023): 1-52

arXiv:2211.04961 [pdf, other]

Parallel-Connected Battery Current Imbalance Dynamics

Authors: Andrew Weng, Sravan Pannala, Jason B. Siegel, Anna G. Stefanopoulou

Abstract: In this work, we derive analytical expressions governing state-of-charge and current imbalance dynamics for two parallel-connected batteries. The model, based on equivalent circuits and an affine open circuit voltage relation, describes the evolution of state-of-charge and current imbalance over the course of a complete charge and discharge cycle. Using this framework, we identify the conditions under which an aged battery will experience a higher current magnitude and state-of-charge deviation towards the end of a charge or discharge cycle. This work enables a quantitative understanding of how mismatches in battery capacities and resistances influence imbalance dynamics in parallel-connected battery systems, helping to pave a path forward for battery degradation modeling in heterogeneous battery systems.

Submitted 9 November, 2022; originally announced November 2022.

Comments: 7 pages, 4 figures, conference paper (MECC 2022)

arXiv:2209.06844 [pdf, other]

Investigating the Lower Mass Gap with Low Mass X-ray Binary Population Synthesis

Authors: Jared C. Siegel, Ilia Kiato, Vicky Kalogera, Christopher P. L. Berry, Thomas J. Maccarone, Katelyn Breivik, Jeff J. Andrews, Simone S. Bavera, Aaron Dotter, Tassos Fragos, Konstantinos Kovlakas, Devina Misra, Kyle A. Rocha, Philipp M. Srivastava, Meng Sun, Zepei Xing, Emmanouil Zapartas

Abstract: Mass measurements from low-mass black hole X-ray binaries (LMXBs) and radio pulsars have been used to identify a gap between the most massive neutron stars (NSs) and the least massive black holes (BHs). BH mass measurements in LMXBs are typically only possible for transient systems: outburst periods enable detection via all-sky X-ray monitors, while quiescent periods enable radial-velocity measurements of the low-mass donor. We quantitatively study selection biases due to the requirement of transient behavior for BH mass measurements. Using rapid population synthesis simulations (COSMIC), detailed binary stellar-evolution models (MESA), and the disk instability model of transient behavior, we demonstrate that transient-LMXB selection effects introduce observational biases, and can suppress mass-gap BHs in the observed sample. However, we find a population of transient LMXBs with mass-gap BHs form through accretion-induced collapse of a NS during the LMXB phase, which is inconsistent with observations. These results are robust against variations of binary evolution prescriptions. The significance of this accretion-induced collapse population depends upon the maximum NS birth mass $M_\mathrm{ NS, birth-max}$. To reflect the observed dearth of low-mass BHs, COSMIC and MESA models favor $M_\mathrm{ NS, birth-max} \lesssim2M_{\odot}$. In the absence of further observational biases against LMXBs with mass-gap BHs, our results indicate the need for additional physics connected to the modeling of LMXB formation and evolution.

Submitted 25 July, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

Comments: 21 pages, accepted to ApJ

arXiv:2208.14398 [pdf, other]

doi 10.3847/1538-3881/ac8985

Mass Upper Bounds for Over 50 Kepler Planets Using Low-S/N Transit Timing Variations

Authors: Jared C. Siegel, Leslie A. Rogers

Abstract: Prospects for expanding the available mass measurements of the Kepler sample are limited. Planet masses have typically been inferred via radial velocity (RV) measurements of the host star or time-series modeling of transit timing variations (TTVs) in multiplanet systems; however, the majority of Kepler hosts are too dim for RV follow-up, and only a select number of systems have strong enough TTVs for time-series modeling. Here, we develop a method of constraining planet mass in multiplanet systems using low signal-to-noise ratio (S/N) TTVs. For a sample of 175 planets in 79 multiplanet systems from the California-Kepler Survey, we infer posteriors on planet mass using publicly available TTV time-series from Kepler. For 53 planets ($>30\%$ of our sample), low-S/N TTVs yield informative upper bounds on planet mass, i.e., the mass constraint strongly deviates from the prior on mass and yields a physically reasonable bulk composition. For 25 small planets, low-S/N TTVs favor volatile-rich compositions. Where available, low-S/N TTV-based mass constraints are consistent with RV-derived masses. TTV time-series are publicly available for each Kepler planet, and the compactness of Kepler systems makes TTV-based constraints informative for a substantial fraction of multiplanet systems. Leveraging low-S/N TTVs offers a valuable path toward increasing the available mass constraints of the Kepler sample.

Submitted 30 August, 2022; originally announced August 2022.

Comments: 18 pages, accepted to AJ

arXiv:2208.04924 [pdf, other]

On the Activation Function Dependence of the Spectral Bias of Neural Networks

Authors: Qingguo Hong, Jonathan W. Siegel, Qinyang Tan, Jinchao Xu

Abstract: Neural networks are universal function approximators which are known to generalize well despite being dramatically overparameterized. We study this phenomenon from the point of view of the spectral bias of neural networks. Our contributions are two-fold. First, we provide a theoretical explanation for the spectral bias of ReLU neural networks by leveraging connections with the theory of finite element methods. Second, based upon this theory we predict that switching the activation function to a piecewise linear B-spline, namely the Hat function, will remove this spectral bias, which we verify empirically in a variety of settings. Our empirical studies also show that neural networks with the Hat activation function are trained significantly faster using stochastic gradient descent and ADAM. Combined with previous work showing that the Hat activation function also improves generalization accuracy on image classification tasks, this indicates that using the Hat activation provides significant advantages over the ReLU on certain problems.

Submitted 5 September, 2022; v1 submitted 9 August, 2022; originally announced August 2022.

arXiv:2207.10819 [pdf, other]

A Non-intrusive Approach for Physics-constrained Learning with Application to Fuel Cell Modeling

Authors: Vishal Srivastava, Valentin Sulzer, Peyman Mohtat, Jason B. Siegel, Karthik Duraisamy

Abstract: A data-driven model augmentation framework, referred to as Weakly-coupled Integrated Inference and Machine Learning (IIML), is presented to improve the predictive accuracy of physical models. In contrast to parameter calibration, this work seeks corrections to the structure of the model by a) inferring augmentation fields that are consistent with the underlying model, and b) transforming these fields into corrective model forms. The proposed approach couples the inference and learning steps in a weak sense via an alternating optimization approach. This coupling ensures that the augmentation fields remain learnable and maintain consistent functional relationships with local modeled quantities across the training dataset. An iterative solution procedure is presented in this paper, removing the need to embed the augmentation function during the inference process. This framework is used to infer an augmentation introduced within a polymer electrolyte membrane fuel cell (PEMFC) model using a small amount of training data (from only 14 training cases). These training cases belong to a dataset consisting of high-fidelity simulation data obtained from a high-fidelity model of a first generation Toyota Mirai. All cases in this dataset are characterized by different inflow and outflow conditions on the same geometry. When tested on 1224 different configurations, the inferred augmentation significantly improves the predictive accuracy for a wide range of physical conditions. Predictions and available data for the current density distribution are also compared to demonstrate the predictive capability of the model for quantities of interest which were not involved in the inference process. The results demonstrate that the weakly-coupled IIML framework offers sophisticated and robust model augmentation capabilities without requiring extensive changes to the numerical solver.

Submitted 30 June, 2022; originally announced July 2022.

arXiv:2206.09056 [pdf, ps, other]

doi 10.1103/PhysRevLett.129.113202

Sub-recoil clock-transition laser cooling enabling shallow optical lattice clocks

Authors: X. Zhang, K. Beloy, Y. S. Hassan, W. F. McGrew, C-C Chen, J. L. Siegel, T. Grogan, A. D. Ludlow

Abstract: Laser cooling is a key ingredient for quantum control of atomic systems in a variety of settings. In divalent atoms, two-stage Doppler cooling is typically used to bring atoms to the μK regime. Here, we implement a pulsed radial cooling scheme using the ultranarrow $^1S_0$-$^3P_0$ clock transition in ytterbium to realize sub-recoil temperatures, down to tens of nK. Together with sideband cooling along the one-dimensional lattice axis, we efficiently prepare atoms in shallow lattices at an energy of 6 lattice recoils. Under these conditions key limits on lattice clock accuracy and instability are reduced, opening the door to dramatic improvements. Furthermore, tunneling shifts in the shallow lattice do not compromise clock accuracy at the $10^{-19}$ level.

Submitted 17 June, 2022; originally announced June 2022.

arXiv:2205.09667 [pdf, other]

The AI Mechanic: Acoustic Vehicle Characterization Neural Networks

Authors: Adam M. Terwilliger, Joshua E. Siegel

Abstract: In a world increasingly dependent on road-based transportation, it is essential to understand vehicles. We introduce the AI mechanic, an acoustic vehicle characterization deep learning system, as an integrated approach using sound captured from mobile devices to enhance transparency and understanding of vehicles and their condition for non-expert users. We develop and implement novel cascading architectures for vehicle understanding, which we define as sequential, conditional, multi-level networks that process raw audio to extract highly-granular insights. To showcase the viability of cascading architectures, we build a multi-task convolutional neural network that predicts and cascades vehicle attributes to enhance fault detection. We train and test these models on a synthesized dataset reflecting more than 40 hours of augmented audio and achieve >92% validation set accuracy on attributes (fuel type, engine configuration, cylinder count and aspiration type). Our cascading architecture additionally achieved 93.6% validation and 86.8% test set accuracy on misfire fault prediction, demonstrating margins of 16.4% / 7.8% and 4.2% / 1.5% improvement over naïve and parallel baselines. We explore experimental studies focused on acoustic features, data augmentation, feature fusion, and data reliability. Finally, we conclude with a discussion of broader implications, future directions, and application areas for this work.

Submitted 19 May, 2022; originally announced May 2022.

Comments: 34 pages, 12 figures, 28 tables

arXiv:2204.05810 [pdf, other]

doi 10.3847/1538-3881/ac609a

Into the Depths: a new activity metric for high-precision radial velocity measurements based on line depth variations

Authors: Jared C. Siegel, Ryan A. Rubenzahl, Samuel Halverson, Andrew W. Howard

Abstract: The discovery and characterization of extrasolar planets using radial velocity (RV) measurements is limited by noise sources from the surfaces of host stars. Current techniques to suppress stellar magnetic activity rely on decorrelation using an activity indicator (e.g., strength of the Ca II lines, width of the cross-correlation function, broadband photometry) or measurement of the RVs using only a subset of spectral lines that have been shown to be insensitive to activity. Here, we combine the above techniques by constructing a high signal-to-noise activity indicator, the depth metric $\mathcal{D}(t)$, from the most activity-sensitive spectral lines using the "line-by-line" method of Dumusque (2018). Analogous to photometric decorrelation of RVs or Gaussian process regression modeling of activity indices, time series modeling of $\mathcal{D}(t)$ reduces the amplitude of magnetic activity in RV measurements; in an $α$ Cen B RV time series from HARPS, the RV RMS was reduced from 2.67 to 1.02 m s$^{-1}$. $\mathcal{D}(t)$ modeling enabled us to characterize injected planetary signals as small as 1 m s$^{-1}$. In terms of noise reduction and injected signal recovery, $\mathcal{D}(t)$ modeling outperforms activity mitigation via the selection of activity-insensitive spectral lines. For Sun-like stars with activity signals on the m s$^{-1}$ level, the depth metric independently tracks rotationally modulated and multiyear stellar activity with a level of quality similar to that of the FWHM of the CCF and log$R^{\prime}_{HK}$. The depth metric and its elaborations will be a powerful tool in the mitigation of stellar magnetic activity, particularly as a means of connecting stellar activity to physical processes within host stars.

Submitted 12 April, 2022; originally announced April 2022.

Comments: 19 pages, accepted to AJ

arXiv:2203.12376 [pdf, other]

A Fast Diagnostic to Inform Screening of Discarded or Retired Batteries

Authors: Joseph A. Drallmeier, Clement Wong, Charles E. Solbrig, Jason B. Siegel, Anna G. Stefanopoulou

Abstract: With the increased pervasiveness of Lithium-ion batteries, there is growing concern for the amount of retired batteries that will be entering the waste stream. Although these batteries no longer meet the demands of their first application, many still have a significant portion of their initial capacity remaining for use in secondary applications. Yet, direct repurposing is generally not possible and each cell in a battery must be evaluated, increasing the cost of the repurposed packs due to the time intensive screening process. In this paper, a rapid assessment of the internal resistance of a cell is proposed. First, this method of measuring the resistance is completed on cells from twelve retired battery packs and one fresh pack using a hybrid pulse power characterization (HPPC) test as a benchmark for the analysis. Results from these tests show relatively constant resistance measurements across mid to high terminal voltages, allowing this metric to be independent of state of charge (SOC). Then, the relation between internal resistance and capacity across the various packs is discussed. Initial experimental results from this study show a correlation between internal resistance and capacity which can be approximated with a linear fit, suggesting internal resistance measurements taken above a threshold cell terminal voltage may be a suitable initial screening metric for the capacity of retired cells without knowledge of the SOC.

Submitted 23 March, 2022; originally announced March 2022.

Comments: 6 pages, 7 figures, Submitted to IFAC AAC 2022 Conference

arXiv:2203.02182 [pdf]

The Role of Occlusion: Potential Extension of the ICH E9 (R1) Addendum on Estimands and Sensitivity Analysis for Time-to-Event Oncology Studies

Authors: Jonathan M Siegel, Hans-Jochen Weber, Stefan Englert

Abstract: The ICH E9 (R1) Estimands Guidance terminology does not completely address the conceptual needs of time-to-event estimands in the complex oncology context. We previously described how censoring and censoring mechanisms for time-to-event endpoints can be embedded into the ICH E9 (R1) Estimands Guidance terminology. This second paper by the Pharmaceutical Industry Working Group on Estimands in Oncology Censoring Mechanisms Subteam discusses special issues in the oncology clinical context that may require different approaches than some other therapeutic areas as well as extensions of the ICH E9 (R1) guidance. The concept of censoring is discussed in the broader context of occluding events, with occlusion representing any loss to further follow-up and/or removal of further collected data from analysis. Occlusion constitutes a broader concept than the estimand guidance's intercurrent event and terminal event terminology and is appropriate to describe and handle situations like withdrawal from assessments or situations where the requirements of different estimands conflict. We characterize, provide additional details, practical implications, and examples on the application of each estimands strategy for handling occluding events.

Submitted 4 March, 2022; originally announced March 2022.

arXiv:2203.01781 [pdf]

Time-to-event estimands and loss to follow-up in oncology in light of the estimands framework

Authors: Jonathan Siegel, Hans-Jochen Weber, Stefan Englert, Feng Liu

Abstract: Time-to-event estimands are central to many oncology clinical trials. The estimand framework (addendum to the ICH E9 guideline) calls for precisely defining the treatment effect of interest to align with the clinical question of interest and requires predefining the handling of intercurrent events that occur after treatment initiation and either preclude the observation of an event of interest or impact the interpretation of the treatment effect. We discuss a practical problem in clinical trial design and execution, i.e. in some clinical contexts it is not feasible to systematically follow patients to an event of interest. Loss to follow-up in the presence of intercurrent events can affect the meaning and interpretation of the study results. We provide recommendations for trial design, stressing the need for close alignment of the clinical question of interest and study design, impact on data collection and other practical implications. When patients cannot be systematically followed, compromise may be necessary to select the best available estimand that can be feasibly estimated under the circumstances. We discuss the use of sensitivity and supplementary analyses to examine assumptions of interest.

Submitted 21 July, 2023; v1 submitted 3 March, 2022; originally announced March 2022.

arXiv:2112.02079 [pdf, other]

Cyberphysical Sequencing for Distributed Asset Management with Broad Traceability

Authors: Joshua Siegel, Gregory Falco

Abstract: Cyber-Physical systems (CPS) have complex lifecycles involving multiple stakeholders, and the transparency of both hardware and software components' supply chain is opaque at best. This raises concerns for stakeholders who may not trust that what they receive is what was requested. There is an opportunity to build a cyberphysical titling process offering universal traceability and the ability to differentiate systems based on provenance. Today, RFID tags and barcodes address some of these needs, though they are easily manipulated due to non-linkage with an object or system's intrinsic characteristics. We propose cyberphysical sequencing as a low-cost, light-weight and pervasive means of adding track-and-trace capabilities to any asset that ties a system's physical identity to a unique and invariant digital identifier. CPS sequencing offers benefits similar to Digital Twins' for identifying and managing the provenance and identity of an asset throughout its life with far fewer computational and other resources.

Submitted 30 November, 2021; originally announced December 2021.

Comments: 14 pages, 6 figures

arXiv:2109.15205 [pdf, other]

Game and Simulation Design for Studying Pedestrian-Automated Vehicle Interactions

Authors: Georgios Pappas, Joshua E. Siegel, Jacob Rutkowski, Andrea Schaaf

Abstract: The present cross-disciplinary research explores pedestrian-autonomous vehicle interactions in a safe, virtual environment. We first present contemporary tools in the field and then propose the design and development of a new application that facilitates pedestrian point of view research. We conduct a three-step user experience experiment where participants answer questions before and after using the application in various scenarios. Behavioral results in virtuality, especially when there were consequences, tend to simulate real life sufficiently well to make design choices, and we received valuable insights into human/vehicle interaction. Our tool seemed to start raising participant awareness of autonomous vehicles and their capabilities and limitations, which is an important step in overcoming public distrust of AVs. Further, studying how users respect or take advantage of AVs may help inform future operating mode indicator design as well as algorithm biases that might support socially-optimal AV operation.

Submitted 30 September, 2021; originally announced September 2021.

Comments: 13 pages, 10 figures, 7 tables

arXiv:2109.14983 [pdf]

Single-step fabrication of high performance extraordinary transmission plasmonic metasurfaces employing ultrafast lasers

Authors: Carlota Ruiz de Galarreta, Noemi Casquero, Euan Humphreys, Jacopo Bertolotti, Javier Solis, C. David Wright, Jan Siegel

Abstract: Plasmonic metasurfaces based on the extraordinary optical transmission effect (EOT) can be designed to efficiently transmit specific spectral bands from the visible to the far-infrared regimes, offering numerous applications in important technological fields such as compact multispectral imaging, biological and chemical sensing, or color displays. However, due to their subwavelength nature, EOT metasurfaces are nowadays fabricated with nano- and micro-lithographic techniques, requiring many processing steps and carried out in expensive cleanroom environments. In this work, we propose and experimentally demonstrate a novel, single-step process for the rapid fabrication of high performance mid- and long-wave infrared EOT metasurfaces employing ultrafast direct laser writing (DLW). Micro-hole arrays composing extraordinary transmission metasurfaces were fabricated over areas of 4 mm$^2$ in timescales of units of minutes, employing single pulse ablation of 40 nm thick Au films on dielectric substrates mounted on a high-precision motorized stage. We show how by carefully characterizing the influence of only three key experimental parameters on the processed micro-morphologies (namely laser pulse energy, scan velocity and beam shaping slit), we can have on-demand control of the optical characteristics of the extraordinary transmission effect in terms of transmission wavelength, quality factor and polarization sensitivity of the resonances. To illustrate this concept, a set of EOT metasurfaces having different performances and operating in different spectral regimes has been successfully designed, fabricated and tested. Comparison between transmittance measurements and numerical simulations has revealed that all the fabricated devices behave as expected, thus demonstrating the high performance, flexibility and reliability of the proposed fabrication method.

Submitted 25 January, 2022; v1 submitted 30 September, 2021; originally announced September 2021.

arXiv:2109.01157 [pdf, other]

doi 10.3847/1538-4357/ac2305

Can the Fe K-alpha Line Reliably Predict Supernova Remnant Progenitors?

Authors: Jared Siegel, Vikram V. Dwarkadas, Kari A. Frank, David N. Burrows

Abstract: The centroid energy of the Fe K$α$ line has been used to identify the progenitors of supernova remnants (SNRs). These investigations generally considered the energy of the centroid derived from the spectrum of the entire remnant. Here we use {\it XMM-Newton} data to investigate the Fe K$α$ centroid in 6 SNRs: 3C~397, N132D, W49B, DEM L71, 1E 0102.2-7219, and Kes 73. In Kes 73 and 1E 0102.2-7219, we fail to detect any Fe K$α$ emission. We report a tentative first detection of Fe K$α$ emission in SNR DEM L71, with a centroid energy consistent with its Type Ia designation. In the remaining remnants, the spatial and spectral sensitivity is sufficient to investigate spatial variations of the Fe K$α$ centroid. We find in N132D and W49B that the centroids in different regions are consistent with that derived from the overall spectrum, although not necessarily with the remnant type identified via other means. However, in SNR 3C~397, we find statistically significant variation in the centroid of up to 100 eV, aligning with the variation in the density structure around the remnant. These variations span the intermediate space between centroid energies signifying core-collapse and Type Ia remnants. Shifting the dividing line downwards by 50 eV can place all the centroids in the CC region, but contradicts the remnant type obtained via other means. Our results show that caution must be used when employing the Fe K$α$ centroid of the entire remnant as the sole diagnostic for typing a remnant.

Submitted 2 September, 2021; originally announced September 2021.

Comments: 10 pages, 3 figures. Accepted to the Astrophysical Journal

arXiv:2108.07875 [pdf]

doi 10.1016/j.apsusc.2021.151850

Multiscale ultrafast laser texturing of marble for reduced surface wetting

Authors: Rocio Ariza, Miguel Alvarez-Alegria, Gloria Costas, Leo Tribaldo, Agustin R. Gonzalez Elipe, Jan Siegel, Javier Solis

Abstract: The modification of the wetting properties of marble surfaces upon multi-scale texturing induced by ultrafast laser processing (340 fs pulse duration, 1030 nm wavelength) has been investigated with the aim of evaluating its potential for surface protection. The contact angle (CA) of a water drop placed on the surface was used to assess the wettability of the processed areas. Although the surfaces are initially hydrophilic upon laser treatment, after a few days they develop a strong hydrophobic behavior. Marble surfaces have been irradiated with different scan line separations to elucidate the relative roles of multi-scale roughness (nano- and micro-texture) and chemical changes at the surface. The time evolution of the contact angle has then been monitored up to 11 months after treatment. A short and a long-term evolution, associated with the combined effect of multi-scale roughness and the attachment of chemical species at the surface over time, have been observed. XPS and ATR measurements are consistent with the progressive hydroxylation of the laser treated surfaces although the additional contribution of hydrocarbon adsorbates to the wettability evolution cannot be ruled out. The robustness of the results has been tested by CA measurements after cleaning in different conditions with very positive results.

Submitted 28 September, 2021; v1 submitted 19 July, 2021; originally announced August 2021.

Comments: 22 pages

arXiv:2108.07833 [pdf, other]

An Algorithmic Safety VEST For Li-ion Batteries During Fast Charging

Authors: Peyman Mohtat, Sravan Pannala, Valentin Sulzer, Jason B. Siegel, Anna G. Stefanopoulou

Abstract: Fast charging of lithium-ion batteries is crucial to increase desirability for consumers and hence accelerate the adoption of electric vehicles. A major barrier to shorter charge times is the accelerated aging of the battery at higher charging rates, which can be driven by lithium plating, increased solid electrolyte interphase growth due to elevated temperatures, and particle cracking due to mechanical stress. Lithium plating depends on the overpotential of the negative electrode, and mechanical stress depends on the concentration gradient, both of which cannot be measured directly. Techniques based on physics-based models of the battery and optimal control algorithms have been developed to this end. While these methods show promise in reducing degradation, their optimization algorithms' complexity can limit their implementation. In this paper, we present a method based on the constant current constant voltage (CC-CV) charging scheme, called CC-CV$ησ$T (VEST). The new approach is simpler to implement and can be used with any model to impose varying levels of constraints on variables pertinent to degradation, such as plating potential and mechanical stress. We demonstrate the new CC-CV$ησ$T charging using an electrochemical model with mechanical and thermal effects included. Furthermore, we discuss how uncertainties can be accounted for by considering safety margins for the plating and stress constraints.

Submitted 17 August, 2021; originally announced August 2021.

Comments: In press; Modeling, Estimation and Control Conference 2021

arXiv:2107.04466 [pdf, other]

Greedy Training Algorithms for Neural Networks and Applications to PDEs

Authors: Jonathan W. Siegel, Qingguo Hong, Xianlin Jin, Wenrui Hao, Jinchao Xu

Abstract: Recently, neural networks have been widely applied for solving partial differential equations (PDEs). Although such methods have been proven remarkably successful on practical engineering problems, they have not been shown, theoretically or empirically, to converge to the underlying PDE solution with arbitrarily high accuracy. The primary difficulty lies in solving the highly non-convex optimization problems resulting from the neural network discretization, which are difficult to treat both theoretically and practically. It is our goal in this work to take a step toward remedying this. For this purpose, we develop a novel greedy training algorithm for shallow neural networks. Our method is applicable to both the variational formulation of the PDE and also to the residual minimization formulation pioneered by physics informed neural networks (PINNs). We analyze the method and obtain a priori error bounds when solving PDEs from the function class defined by shallow networks, which rigorously establishes the convergence of the method as the network size increases. Finally, we test the algorithm on several benchmark examples, including high dimensional PDEs, to confirm the theoretical convergence rate. Although the method is expensive relative to traditional approaches such as finite element methods, we view this work as a proof of concept for neural network-based methods, which shows that numerical methods based upon neural networks can be shown to rigorously converge.

Submitted 24 March, 2023; v1 submitted 9 July, 2021; originally announced July 2021.

Comments: has been merged with arXiv:2104.02903

MSC Class: 65N30; 65N22; 65H20

arXiv:2106.15002 [pdf, ps, other]

Characterization of the Variation Spaces Corresponding to Shallow Neural Networks

Authors: Jonathan W. Siegel, Jinchao Xu

Abstract: We study the variation space corresponding to a dictionary of functions in $L^2(Ω)$ for a bounded domain $Ω\subset \mathbb{R}^d$. Specifically, we compare the variation space, which is defined in terms of a convex hull, with related notions based on integral representations. This allows us to show that three important notions relating to the approximation theory of shallow neural networks, the Barron space, the spectral Barron space, and the Radon BV space, are actually variation spaces with respect to certain natural dictionaries.

Submitted 9 April, 2022; v1 submitted 28 June, 2021; originally announced June 2021.

arXiv:2106.15000 [pdf, other]

Optimal Convergence Rates for the Orthogonal Greedy Algorithm

Authors: Jonathan W. Siegel, Jinchao Xu

Abstract: We analyze the orthogonal greedy algorithm when applied to dictionaries $\mathbb{D}$ whose convex hull has small entropy. We show that if the metric entropy of the convex hull of $\mathbb{D}$ decays at a rate of $O(n^{-\frac{1}{2}-α})$ for $α> 0$, then the orthogonal greedy algorithm converges at the same rate on the variation space of $\mathbb{D}$. This improves upon the well-known $O(n^{-\frac{1}{2}})$ convergence rate of the orthogonal greedy algorithm in many cases, most notably for dictionaries corresponding to shallow neural networks. These results hold under no additional assumptions on the dictionary beyond the decay rate of the entropy of its convex hull. In addition, they are robust to noise in the target function and can be extended to convergence rates on the interpolation spaces of the variation norm. We show empirically that the predicted rates are obtained for the dictionary corresponding to shallow neural networks with Heaviside activation function in two dimensions. Finally, we show that these improved rates are sharp and prove a negative result showing that the iterates generated by the orthogonal greedy algorithm cannot in general be bounded in the variation norm of $\mathbb{D}$.

Submitted 21 January, 2022; v1 submitted 28 June, 2021; originally announced June 2021.

MSC Class: 41A46; 41A25; 46N30
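
For readers unfamiliar with the orthogonal greedy algorithm named in the abstract above, the following is a minimal sketch of its textbook form, not the authors' implementation. It assumes a finite dictionary stored as the columns of a NumPy array and uses the Euclidean inner product in place of the $L^2(Ω)$ inner product. The function name `orthogonal_greedy` and its parameters are hypothetical. Each step selects the atom most correlated with the current residual, then projects the target orthogonally onto the span of all atoms selected so far.

```python
import numpy as np


def orthogonal_greedy(target, dictionary, n_steps):
    """Sketch of the orthogonal greedy algorithm on a finite dictionary.

    target:     array of shape (m,), the vector to approximate.
    dictionary: array of shape (m, k); each column is one atom
                (assumed here to be normalized).
    n_steps:    number of greedy iterations.
    """
    selected = []  # indices of the atoms chosen so far
    approx = np.zeros_like(target, dtype=float)
    residual_norms = []

    for _ in range(n_steps):
        residual = target - approx
        # Greedy step: pick the atom with the largest |<residual, atom>|.
        scores = np.abs(dictionary.T @ residual)
        best = int(np.argmax(scores))
        if best not in selected:
            selected.append(best)
        # Orthogonal step: least-squares projection of the target onto
        # the span of the selected atoms.
        atoms = dictionary[:, selected]
        coeffs, *_ = np.linalg.lstsq(atoms, target, rcond=None)
        approx = atoms @ coeffs
        residual_norms.append(float(np.linalg.norm(target - approx)))

    return approx, selected, residual_norms
```

Because the spans of the selected atoms are nested, the residual norms returned by this sketch are non-increasing. The convergence rates discussed in the abstract concern how fast they decay for infinite dictionaries, which this finite-dimensional illustration does not capture.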

arXiv:2106.14997

Sharp Lower Bounds on the Approximation Rate of Shallow Neural Networks

Authors: Jonathan W. Siegel, Jinchao Xu

Abstract: We consider the approximation rates of shallow neural networks with respect to the variation norm. Upper bounds on these rates have been established for sigmoidal and ReLU activation functions, but it has remained an important open problem whether these rates are sharp. In this article, we provide a solution to this problem by proving sharp lower bounds on the approximation rates for shallow neural networks, which are obtained by lower bounding the $L^2$-metric entropy of the convex hull of the neural network basis functions. In addition, our methods also give sharp lower bounds on the Kolmogorov $n$-widths of this convex hull, which show that the variation spaces corresponding to shallow neural networks cannot be efficiently approximated by linear methods. These lower bounds apply to both sigmoidal activation functions with bounded variation and to activation functions which are a power of the ReLU. Our results also quantify how much stronger the Barron spectral norm is than the variation norm and, combined with previous results, give the asymptotics of the $L^\infty$-metric entropy up to logarithmic factors in the case of the ReLU activation function.

Submitted 8 September, 2021; v1 submitted 28 June, 2021; originally announced June 2021.

Comments: This paper has been merged with arXiv:2101.12365

MSC Class: 62M45; 41A46

Showing 1–50 of 92 results for author: Siegel, J