-
Challenge-Device-Synthesis: A multi-disciplinary approach for the development of social innovation competences for students of Artificial Intelligence
Authors:
Matías Bilkis,
Joan Moya Kohler,
Fernando Vilariño
Abstract:
The advent of Artificial Intelligence is expected to bring profound changes in the short term. It is therefore imperative for Academia, and particularly for the field of Computer Science, to develop cross-disciplinary tools that bond AI developments to their social dimension. To this aim, we introduce the Challenge-Device-Synthesis methodology (CDS), in which a specific challenge is presented to the students of AI, who are required to develop a device as a solution for the challenge. The device becomes the object of study for the different dimensions of social transformation, and the conclusions reached by the students during the discussion around the device are presented in a synthesis piece in the shape of a 10-page scientific paper. The latter is evaluated taking into account both the depth of analysis and the level to which it genuinely reflects the social transformations associated with the proposed AI-based device. We provide data obtained during the pilot for the implementation phase of CDS within the subject of Social Innovation, a 6-ECTS subject from the 6th semester of the Degree of Artificial Intelligence, UAB-Barcelona. We provide details on temporalisation, task distribution, methodological tools used and assessment delivery procedure, as well as qualitative analysis of the results obtained.
Submitted 29 May, 2024;
originally announced May 2024.
-
Hierarchical Rank-One Sequence Convexification for the Relaxation of Variational Problems with Microstructures
Authors:
Maximilian Köhler,
Timo Neumeier,
Malte A. Peter,
Daniel Peterseim,
Daniel Balzani
Abstract:
This paper presents an efficient algorithm for the approximation of the rank-one convex hull in the context of nonlinear solid mechanics. It is based on hierarchical rank-one sequences and simultaneously provides first and second derivative information essential for the calculation of mechanical stresses and the computational minimization of discretized energies. For materials whose microstructure can be well approximated in terms of laminates and where each laminate stage achieves energetic optimality with respect to the current stage, the approximate envelope coincides with the rank-one convex envelope. Although the proposed method provides only an upper bound for the rank-one convex hull, a careful examination of the resulting constraints shows decent applicability to mechanical problems. Various aspects of the algorithm are discussed, including the restoration of rotational invariance, microstructure reconstruction, comparisons with other semi-convexification algorithms, and mesh independence. Overall, this paper demonstrates the efficiency of the algorithm for both well-established mathematical benchmark problems and nonconvex isotropic finite-strain continuum damage models in two and three dimensions. Thereby, for the first time, a feasible concurrent numerical relaxation is established for an incremental, dissipative large-strain model with relevant applications in engineering problems.
Submitted 27 May, 2024;
originally announced May 2024.
-
Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent
Authors:
Michael Kohler,
Adam Krzyzak,
Benjamin Walter
Abstract:
Image classification based on over-parametrized convolutional neural networks with a global average-pooling layer is considered. The weights of the network are learned by gradient descent. A bound on the rate of convergence of the difference between the misclassification risk of the newly introduced convolutional neural network estimate and the minimal possible value is derived.
Submitted 13 May, 2024;
originally announced May 2024.
-
Mind the Gap Between Synthetic and Real: Utilizing Transfer Learning to Probe the Boundaries of Stable Diffusion Generated Data
Authors:
Leonhard Hennicke,
Christian Medeiros Adriano,
Holger Giese,
Jan Mathias Koehler,
Lukas Schott
Abstract:
Generative foundation models like Stable Diffusion comprise a diverse spectrum of knowledge in computer vision with the potential for transfer learning, e.g., via generating data to train student models for downstream tasks. This could circumvent the necessity of collecting labeled real-world data, thereby presenting a form of data-free knowledge distillation. However, the resultant student models show a significant drop in accuracy compared to models trained on real data. We investigate possible causes for this drop and focus on the role of the different layers of the student model. By training these layers using either real or synthetic data, we reveal that the drop mainly stems from the model's final layers. Further, we briefly investigate other factors, such as differences in data-normalization between synthetic and real, the impact of data augmentations, texture vs. shape learning, and assuming oracle prompts. While we find that some of those factors can have an impact, they are not sufficient to close the gap towards real data. Building upon our insights that mainly later layers are responsible for the drop, we investigate the data-efficiency of fine-tuning a synthetically trained model with real data applied to only those last layers. Our results suggest an improved trade-off between the amount of real training data used and the model's accuracy. Our findings contribute to the understanding of the gap between synthetic and real data and indicate solutions to mitigate the scarcity of labeled real data.
Submitted 6 May, 2024;
originally announced May 2024.
-
TMM-Sim: A Versatile Tool for Optical Simulation of Thin-Film Solar Cells
Authors:
Leandro Benatto,
Omar Mesquita,
Kaike R. M. Pacheco,
Lucimara S. Roman,
Marlus Koehler,
Rodrigo B. Capaz,
Graziâni Candiotto
Abstract:
The Transfer Matrix Method (TMM) has become a prominent tool for the optical simulation of thin-film solar cells, particularly among researchers specializing in organic semiconductors and perovskite materials. As the commercial viability of these solar cells continues to advance, driven by rapid developments in materials and production processes, the importance of optical simulation has grown significantly. By leveraging optical simulation, researchers can gain profound insights into photovoltaic phenomena, empowering the implementation of device optimization strategies to achieve enhanced performance. However, existing TMM-based packages exhibit limitations, such as requiring programming expertise, licensing fees, or lack of support for bilayer device simulation. In response to these gaps and challenges, we present the TMM Simulator (TMM-Sim), an intuitive and user-friendly tool to calculate essential photovoltaic parameters, including the optical electric field profile, exciton generation profile, fraction of light absorbed per layer, photocurrent, external quantum efficiency, internal quantum efficiency, and parasitic losses. An additional advantage of TMM-Sim lies in its capacity to generate outcomes suitable as input parameters for electro-optical device simulations. In this work, we offer a comprehensive guide, outlining a step-by-step process to use TMM-Sim, and provide a thorough analysis of the results. TMM-Sim is freely available, accessible through our web server (nanocalc.org), or downloadable from the TMM-Sim repository (for Unix, Windows, and macOS) on GitHub. With its user-friendly interface and powerful capabilities, TMM-Sim aims to facilitate and accelerate research in thin-film solar cells, fostering advancements in renewable energy technologies.
Submitted 18 April, 2024;
originally announced April 2024.
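For readers unfamiliar with the Transfer Matrix Method named in this abstract, the following is a minimal sketch of the textbook characteristic-matrix formulation at normal incidence. It is not TMM-Sim's code; the function names, the restriction to non-absorbing (real-index) layers, and the example indices are all assumptions made for illustration.

```python
import cmath
import math


def layer_matrix(n_layer, thickness_nm, wavelength_nm):
    """Characteristic matrix of one homogeneous layer at normal incidence."""
    # Phase thickness accumulated by the wave while crossing the layer.
    delta = 2 * math.pi * n_layer * thickness_nm / wavelength_nm
    return (
        (cmath.cos(delta), 1j * cmath.sin(delta) / n_layer),
        (1j * n_layer * cmath.sin(delta), cmath.cos(delta)),
    )


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def reflectance(n_incident, layers, n_substrate, wavelength_nm):
    """Reflectance of a layer stack; layers are (index, thickness_nm) pairs
    listed from the incident medium towards the substrate."""
    m = ((1.0, 0.0), (0.0, 1.0))
    for n_layer, thickness_nm in layers:
        m = matmul2(m, layer_matrix(n_layer, thickness_nm, wavelength_nm))
    # Field amplitudes at the front surface for a unit field in the substrate.
    b = m[0][0] + m[0][1] * n_substrate
    c = m[1][0] + m[1][1] * n_substrate
    r = (n_incident * b - c) / (n_incident * b + c)
    return abs(r) ** 2


# Illustrative values: air / single coating / glass-like substrate at 550 nm.
n_air, n_glass, wavelength = 1.0, 1.5, 550.0
n_coating = math.sqrt(n_air * n_glass)          # ideal antireflection index
quarter_wave = wavelength / (4 * n_coating)     # quarter-wave thickness

bare = reflectance(n_air, [], n_glass, wavelength)
coated = reflectance(n_air, [(n_coating, quarter_wave)], n_glass, wavelength)
```

With no layers the function reduces to the Fresnel reflectance of a single interface, and an ideal quarter-wave coating suppresses the reflection, which are two quick sanity checks for any transfer-matrix implementation.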
-
Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization
Authors:
Michael Kohler,
Adam Krzyzak,
Alisha Sänger
Abstract:
Image classification from independent and identically distributed random variables is considered. Image classifiers are defined which are based on a linear combination of deep convolutional networks with a max-pooling layer. Here all the weights are learned by stochastic gradient descent. A general result is presented which shows that the image classifiers are able to approximate the best possible deep convolutional network. In case that the a posteriori probability satisfies a suitable hierarchical composition model, it is shown that the corresponding deep convolutional neural network image classifier achieves a rate of convergence which is independent of the dimension of the images.
Submitted 10 April, 2024;
originally announced April 2024.
-
RI-Calc: A User Friendly Software and Web Server for Refractive Index Calculation
Authors:
Leandro Benatto,
Omar Mesquita,
Lucimara S. Roman,
Marlus Koehler,
Rodrigo B. Capaz,
Graziâni Candiotto
Abstract:
The refractive index of an optical medium is essential for studying a variety of physical phenomena. One useful method for determining the refractive index of scalar materials (i.e., materials which are characterized by a scalar dielectric function) is to employ the Kramers-Kronig (K-K) relations. The K-K method is particularly useful in cases where ellipsometric measurements are unavailable, a situation that frequently occurs in many laboratories. Although some packages can perform this calculation, they usually lack a graphical interface and are complex to implement and use. Those deficiencies inhibit their utilization by a plethora of researchers unfamiliar with programming languages. To address the aforementioned gap, we have developed the Refractive Index Calculator (RI-Calc) program that provides an intuitive and user-friendly interface. The RI-Calc program allows users to input the absorption coefficient spectrum and then easily calculate the complex refractive index and the complex relative permittivity of a broad range of thin films, including molecules, polymers, blends, and perovskites. The program has been thoroughly tested, taking into account the Lorentz oscillator model and experimental data from a materials' refractive index database, demonstrating consistent outcomes. It is compatible with Windows, Unix, and macOS operating systems. You can download the RI-Calc binaries from our GitHub repository or conveniently access the program through our dedicated web server at nanocalc.org.
Submitted 24 January, 2024;
originally announced January 2024.
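The quantities named in this abstract are linked by standard optics relations, sketched below. This is not RI-Calc's code and it does not perform the Kramers-Kronig transform: the real part n is simply taken as an input here, and the function names, units, and sample values are assumptions for illustration.

```python
import math


def extinction_coefficient(alpha_per_cm, wavelength_nm):
    """Imaginary part k of the refractive index: k = alpha * lambda / (4 * pi)."""
    wavelength_cm = wavelength_nm * 1e-7  # convert nm to cm to match alpha
    return alpha_per_cm * wavelength_cm / (4 * math.pi)


def relative_permittivity(n, k):
    """Complex relative permittivity (n + i k)^2 = (n^2 - k^2) + i (2 n k)."""
    return complex(n * n - k * k, 2 * n * k)


# Hypothetical thin film: alpha = 1e5 cm^-1 at 500 nm, with n assumed known.
k_value = extinction_coefficient(1e5, 500.0)
epsilon = relative_permittivity(2.0, 0.5)
```

The first relation is why an absorption spectrum alone fixes k at every wavelength, leaving only the real part n to be recovered by the K-K relations.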
-
Towards a Quality Indicator for Research Data publications and Research Software publications -- A vision from the Helmholtz Association
Authors:
Wolfgang zu Castell,
Doris Dransch,
Guido Juckeland,
Marcel Meistring,
Bernadette Fritzsch,
Ronny Gey,
Britta Höpfner,
Martin Köhler,
Christian Meeßen,
Hela Mehrtens,
Felix Mühlbauer,
Sirko Schindler,
Thomas Schnicke,
Roland Bertelmann
Abstract:
Research data and software are widely accepted as an outcome of scientific work. However, in comparison to text-based publications, there is not yet an established process to assess and evaluate the quality of research data and research software publications. This paper presents an attempt to fill this gap. Initiated by the Working Group Open Science of the Helmholtz Association, the Task Group Helmholtz Quality Indicators for Data and Software Publications currently develops a quality indicator for research data and research software publications to be used within the Association. This report summarizes the group's vision of what contributes to such an indicator. The proposed approach relies on generic well-established concepts for quality criteria, such as the FAIR Principles and the COBIT Maturity Model. It does - on purpose - not limit itself to technical implementation possibilities to avoid using an existing metric for a new purpose. The intention of this paper is to share the current state for further discussion with all stakeholders, particularly with other groups also working on similar metrics but also with entities that use the metrics.
Submitted 26 January, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
On the rate of convergence of an over-parametrized Transformer classifier learned by gradient descent
Authors:
Michael Kohler,
Adam Krzyzak
Abstract:
One of the most recent and fascinating breakthroughs in artificial intelligence is ChatGPT, a chatbot which can simulate human conversation. ChatGPT is an instance of GPT-4, which is a language model based on generative pre-trained transformers. So if one wants to study, from a theoretical point of view, how powerful such artificial intelligence can be, one approach is to consider transformer networks and to study which problems one can solve with these networks theoretically. Here it is not only important what kind of models these networks can approximate, or how they can generalize their knowledge learned by choosing the best possible approximation to a concrete data set, but also how well optimization of such a transformer network based on a concrete data set works. In this article we consider all these three different aspects simultaneously and show a theoretical upper bound on the misclassification probability of a transformer network fitted to the observed data. For simplicity we focus in this context on transformer encoder networks which can be applied to define an estimate in the context of a classification problem involving natural language.
Submitted 20 June, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
Formalism for Anatomy-Independent Projection and Optimization of Transcranial Magnetic Stimulation Coils
Authors:
Max Koehler,
Stefan Goetz
Abstract:
Transcranial magnetic stimulation (TMS) is a popular method for the noninvasive stimulation of neurons in the brain. It has become a standard instrument in experimental brain research and is approved for a range of diagnostic and therapeutic applications. Various applications have been established or approved for specific coil designs with their corresponding spatial electric field distributions. However, the specific coil implementation may no longer be appropriate from the perspective of material and manufacturing opportunities or considering the latest understanding of how to achieve induced electric fields in the head most efficiently. Furthermore, in some cases, field measurements of coils with unknown winding or a user-defined field are available and require an actual implementation. Similar applications exist for magnetic resonance imaging coils. This work aims at introducing a formalism that is completely free from heuristics, iterative optimization, and ad-hoc or manual steps to form practical stimulation coils with a winding consisting of individual turns to either equivalently match an existing coil or produce a given field. The target coil can reside on practically any sufficiently large or closed surface adjacent to or around the head. The method derives an equivalent field through vector projection. In contrast to other coil design or optimization approaches recently presented, the procedure is an explicit forward Hilbert-space vector projection or basis change. For demonstration, we map a commercial figure-of-eight coil as one of the most widely used devices and a more intricate coil recently approved clinically for addiction treatment (H4) onto a bent surface close to the head for highest efficiency and lowest field energy. The resulting projections are within 4% of the target field and reduce the necessary pulse energy by more than 40%.
Submitted 23 December, 2023;
originally announced December 2023.
-
Ultra-broadband bright light emission from a one-dimensional inorganic van der Waals material
Authors:
Fateme Mahdikhany,
Sean Driskill,
Jeremy G. Philbrick,
Davoud Adinehloo,
Michael R. Koehler,
David G. Mandrus,
Takashi Taniguchi,
Kenji Watanabe,
Brian J. LeRoy,
Oliver L. A. Monti,
Vasili Perebeinos,
Tai Kong,
John R. Schaibley
Abstract:
One-dimensional (1D) van der Waals materials have emerged as an intriguing playground to explore novel electronic and optical effects. We report on inorganic one-dimensional SbPS4 nanotube bundles obtained via mechanical exfoliation from bulk crystals. The ability to mechanically exfoliate SbPS4 nanobundles offers the possibility of applying modern 2D material fabrication techniques to create mixed-dimensional van der Waals heterostructures. We find that SbPS4 can readily be exfoliated to yield long (> 10 μm) nanobundles with thicknesses that range from 1.3 to 200 nm. We investigated the optical response of semiconducting SbPS4 nanobundles and discovered that upon excitation with blue light, they emit bright and ultra-broadband red light with a quantum yield similar to that of hBN-encapsulated MoSe2. We discovered that the ultra-broadband red light emission is a result of a large ~1 eV exciton binding energy and a ~200 meV exciton self-trapping energy, unprecedented in previous material studies. Due to the bright and ultra-broadband light emission, we believe that this class of inorganic 1D van der Waals semiconductors has numerous potential applications including on-chip tunable nanolasers, and applications that require ultra-violet to visible light conversion such as lighting and sensing. Overall, our findings open avenues for harnessing the unique characteristics of these nanomaterials, advancing both fundamental research and practical optoelectronic applications.
Submitted 12 December, 2023;
originally announced December 2023.
-
FRET-Calc: A Free Software and Web Server for Förster Resonance Energy Transfer Calculation
Authors:
Leandro Benatto,
Omar Mesquita,
João L. B. Rosa,
Lucimara S. Roman,
Marlus Koehler,
Rodrigo B. Capaz,
Graziâni Candiotto
Abstract:
Förster Resonance Energy Transfer Calculator (FRET-Calc) is a program and web server that analyzes the molar extinction coefficient of the acceptor, the emission spectrum of the donor, and the refractive index spectrum of the donor/acceptor blend. Its main function is to obtain important parameters of the FRET process from experimental data, such as: (i) effective refractive index, (ii) overlap integral, (iii) Förster radius, (iv) FRET efficiency and (v) FRET rate. FRET-Calc is license-free software that can be run via a dedicated web server (nanocalc.org) or by downloading the program executables (for Unix, Windows, and macOS) from the FRET-Calc repository on GitHub. The program features a user-friendly interface, making it suitable for materials research and teaching purposes. In addition, the program is optimized to run on normal computers and is lightweight. A step-by-step example of its use and the results obtained is given.
Submitted 26 November, 2023;
originally announced November 2023.
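Two of the parameters listed in this abstract, the FRET efficiency and the FRET rate, follow from the Förster radius through the standard point-dipole relations. The sketch below shows only those two textbook formulas; it is not FRET-Calc's code, and the function names, units, and sample numbers are assumptions for illustration.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Transfer efficiency E = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def fret_rate(distance_nm, forster_radius_nm, donor_lifetime_ns):
    """Transfer rate k = (1 / tau_D) * (R0 / r)^6, in inverse nanoseconds."""
    return (forster_radius_nm / distance_nm) ** 6 / donor_lifetime_ns


# Hypothetical donor/acceptor pair: R0 = 5 nm, donor lifetime = 2 ns.
efficiency_at_r0 = fret_efficiency(5.0, 5.0)
rate_at_r0 = fret_rate(5.0, 5.0, 2.0)
efficiency_close = fret_efficiency(2.5, 5.0)
```

At a donor-acceptor distance equal to the Förster radius the efficiency is one half and the transfer rate equals the donor decay rate, which is the defining property of that radius.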
-
PLQ-Sim: A Computational Tool for Simulating Photoluminescence Quenching Dynamics in Organic Donor/Acceptor Blends
Authors:
Leandro Benatto,
Omar Mesquita,
Lucimara S. Roman,
Rodrigo B. Capaz,
Graziâni Candiotto,
Marlus Koehler
Abstract:
Photoluminescence Quenching Simulator (PLQ-Sim) is user-friendly software for studying the photoexcited state dynamics at the interface between two organic semiconductors forming a blend: an electron donor (D), and an electron acceptor (A). Its main function is to provide substantial information on the photophysical processes relevant to organic photovoltaic and photothermal devices, such as charge transfer state formation and subsequent free charge generation or exciton recombination. From input parameters provided by the user, the program calculates the transfer rates of the D/A blend and employs a kinetic model that provides the photoluminescence quenching efficiency for initial excitation in the donor or acceptor. When calculating the rates, the user can choose to use disorder parameters to better describe the system. In addition, the program was developed to address energy transfer phenomena that are commonly present in organic blends. The time evolution of state populations is also calculated, providing relevant information for the user. In this article, we present the theory behind the kinetic model, along with suggestions for methods to obtain the input parameters. A detailed demonstration of the program, its applicability, and an analysis of the outputs are also presented. PLQ-Sim is license-free software that can be run via the dedicated web server nanocalc.org or by downloading the program executables (for Unix, Windows, and macOS) from the PLQ-Sim repository on GitHub.
Submitted 26 November, 2023;
originally announced November 2023.
-
Analysis of the expected $L_2$ error of an over-parametrized deep neural network estimate learned by gradient descent without regularization
Authors:
Selina Drews,
Michael Kohler
Abstract:
Recent results show that estimates defined by over-parametrized deep neural networks learned by applying gradient descent to a regularized empirical $L_2$ risk are universally consistent and achieve good rates of convergence. In this paper, we show that the regularization term is not necessary to obtain similar results. In the case of a suitably chosen initialization of the network, a suitable number of gradient descent steps, and a suitable step size we show that an estimate without a regularization term is universally consistent for bounded predictor variables. Additionally, we show that if the regression function is Hölder smooth with Hölder exponent $1/2 \leq p \leq 1$, the $L_2$ error converges to zero with a convergence rate of approximately $n^{-1/(1+d)}$. Furthermore, in case of an interaction model, where the regression function consists of a sum of Hölder smooth functions with $d^*$ components, a rate of convergence is derived which does not depend on the input dimension $d$.
Submitted 24 November, 2023;
originally announced November 2023.
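To give a feel for the rate $n^{-1/(1+d)}$ quoted in this abstract, the short sketch below evaluates it for a few input dimensions. It ignores the constants and any additional factors hidden behind "approximately", so the numbers are only indicative; the function names and sample values are assumptions for illustration.

```python
def rate(n, d):
    """Value of n^(-1/(1+d)) for sample size n and input dimension d."""
    return n ** (-1.0 / (1.0 + d))


def samples_for_error(target, d):
    """Sample size at which n^(-1/(1+d)) drops to the given target."""
    return target ** (-(1.0 + d))


# Same sample size, growing input dimension: the bound decays more slowly.
rates = {d: rate(10**4, d) for d in (1, 3, 9)}

# Sample sizes needed to push the bound down to 0.1.
needed = {d: samples_for_error(0.1, d) for d in (1, 3, 9)}
```

The required sample size grows exponentially in d, which is why the dimension-free rate for interaction models stated at the end of the abstract is of interest.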
-
Revisiting Cont's Stylized Facts for Modern Stock Markets
Authors:
Ethan Ratliff-Crain,
Colin M. Van Oort,
James Bagrow,
Matthew T. K. Koehler,
Brian F. Tivnan
Abstract:
In 2001, Rama Cont introduced a now widely used set of 'stylized facts' to synthesize empirical studies of financial price changes (returns), resulting in 11 statistical properties common to a large set of assets and markets. These properties are viewed as constraints a model should be able to reproduce in order to accurately represent returns in a market. It has not been established whether the characteristics Cont noted in 2001 still hold for modern markets following significant regulatory shifts and technological advances. It is also not clear whether a given time series of financial returns for an asset will express all 11 stylized facts. We test both of these propositions by attempting to replicate each of Cont's 11 stylized facts for intraday returns of the individual stocks in the Dow 30, using the same authoritative data as that used by the U.S. regulator from October 2018 to March 2019. We find conclusive evidence for eight of Cont's original facts and no support for the remaining three. Our study represents the first test of Cont's 11 stylized facts against a consistent set of stocks, therefore providing insight into how these stylized facts should be viewed in the context of modern stock markets.
Submitted 20 May, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Examining Common Paradigms in Multi-Task Learning
Authors:
Cathrin Elich,
Lukas Kirchdorfer,
Jan M. Köhler,
Lukas Schott
Abstract:
While multi-task learning (MTL) has gained significant attention in recent years, its underlying mechanisms remain poorly understood. Recent methods did not yield consistent performance improvements over single task learning (STL) baselines, underscoring the importance of gaining more profound insights about challenges specific to MTL. In our study, we investigate paradigms in MTL in the context of STL: First, the impact of the choice of optimizer has only been mildly investigated in MTL. We show the pivotal role of common STL tools such as the Adam optimizer in MTL empirically in various experiments. To further investigate Adam's effectiveness, we theoretically derive a partial loss-scale invariance under mild assumptions. Second, the notion of gradient conflicts has often been phrased as a specific problem in MTL. We delve into the role of gradient conflicts in MTL and compare it to STL. For angular gradient alignment we find no evidence that this is a unique problem in MTL. We emphasize differences in gradient magnitude as the main distinguishing factor. Overall, we find surprising similarities between STL and MTL, suggesting that methods from both fields be considered in a broader context.
Submitted 27 June, 2024; v1 submitted 8 November, 2023;
originally announced November 2023.
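The abstract refers to a partial loss-scale invariance of Adam. The paper's derivation and assumptions are not reproduced here; the sketch below only illustrates the familiar underlying property that Adam's normalized update is unchanged when the loss is multiplied by a positive constant (with epsilon set to zero), whereas plain SGD is not. The quadratic loss, the scale factor of 10, the hyperparameters, and all function names are assumptions for illustration.

```python
import math


def scaled_quadratic_grad(scale):
    """Gradient of the loss scale * (x - 3)^2, a stand-in for a weighted task loss."""
    return lambda x: scale * 2.0 * (x - 3.0)


def adam_trajectory(grad_fn, x0, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=0.0):
    """Scalar Adam with bias correction; eps=0 isolates the scale invariance."""
    x, m, v = x0, 0.0, 0.0
    xs = []
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
        xs.append(x)
    return xs


def sgd_trajectory(grad_fn, x0, steps, lr=0.01):
    """Plain gradient descent for comparison."""
    x = x0
    xs = []
    for _ in range(steps):
        x -= lr * grad_fn(x)
        xs.append(x)
    return xs


adam_base = adam_trajectory(scaled_quadratic_grad(1.0), 0.0, 50)
adam_scaled = adam_trajectory(scaled_quadratic_grad(10.0), 0.0, 50)
sgd_base = sgd_trajectory(scaled_quadratic_grad(1.0), 0.0, 50)
sgd_scaled = sgd_trajectory(scaled_quadratic_grad(10.0), 0.0, 50)

# Largest gap between the two Adam runs (rounding error only) and the SGD gap.
adam_gap = max(abs(a - b) for a, b in zip(adam_base, adam_scaled))
sgd_gap = abs(sgd_base[-1] - sgd_scaled[-1])
```

Scaling the loss multiplies the first moment by the scale and the second moment by its square, so their ratio cancels; with a nonzero epsilon the cancellation is only approximate.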
-
Conditions for efficient charge generation preceded by energy transfer process in non-fullerene organic solar cells
Authors:
L. Benatto,
C. A. M. Moraes,
G. Candiotto,
K. R. A. Sousa,
J. P. A. Souza,
L. S. Roman,
M. Koehler
Abstract:
The minimum driving force strategy is applied to promote the exciton dissociation in organic solar cells (OSCs) without significant loss of open-circuit voltage. However, this strategy tends to promote Förster resonance energy transfer (FRET) from the donor to the acceptor (D-A), a consequence generally ignored until recently. In spite of the advances reported on this topic, the correlation between charge-transfer (CT) state binding energy and driving force remains unclear, especially in the presence of D-A FRET. To address this question, we employ a kinetic approach to model the charge separation in ten different D/A blends using non-fullerene acceptors. The model considers the influence of FRET on photoluminescence (PL) quenching efficiency. It successfully predicts the measured PL quenching efficiency for D or A photoexcitation in those blends, including the ones for which the D-A FRET process is relevant. Furthermore, the application of the model allows quantifying the fractions of quenching loss associated with charge transfer and energy transfer. Fundamental relationships that control the exciton dissociation were derived, evidencing the key roles played by the Marcus inverted regime, the exciton lifetime and, mainly, the correlation between the driving force and the binding energy of the CT state. Based on those findings, we propose some strategies to maximize the quenching efficiency and minimize energy loss of OSCs in the presence of D-A FRET.
Submitted 29 November, 2023; v1 submitted 7 November, 2023;
originally announced November 2023.
-
The Binding Energy of Triplet Excitons in Non-Fullerene Acceptors: The Effects of Fluorination and Chlorination
Authors:
J. P. A. Souza,
L. Benatto,
G. Candiotto,
L. S. Roman,
M. Koehler
Abstract:
One strategy to improve the photovoltaic properties of non-fullerene acceptors (NFAs) is the rational fluorination or chlorination of those molecules. Although this modification improves important acceptor properties, little is known about the effects on the triplet states. Here, we combine the polarizable continuum model with optimally tuned range-separated hybrid functional to investigate this issue. We find that fluorination or chlorination of NFAs decreases the degree of HOMO-LUMO overlap along these molecules. Consequently, the energy gap between $T_{1}$ and $S_{1}$ states, $ΔE_{ST} = E_{S_{1}} - E_{T_{1}}$, also decreases. This effect simultaneously enhances the generation of triplet excitons and reduces the binding energy of the triplet excitons ($E_{b,T}$), which favors their dissociation into free charges. Interestingly, although Cl has a lower electronegativity than F, chlorination is more effective at reducing $ΔE_{ST}$. Since chlorination of NFAs is easier than fluorination, Cl substitution can be a useful approach to enhance solar energy harvesting using triplet excitons.
Submitted 7 November, 2023;
originally announced November 2023.
-
Enhancing Chemical Stability and Photovoltaic Properties of Highly Efficient Nonfullerene Acceptors by Chalcogen Substitution: Insights from Quantum Chemical Calculations
Authors:
Leandro Benatto,
João Paulo A. Souza,
Matheus F. F. das Neves,
Lucimara S. Romana,
Rodrigo B. Capaz,
Graziâni Candiotto,
Marlus Koehler
Abstract:
The chemical stability of nonfullerene acceptors (NFAs) is the Achilles' heel of the research on state-of-the-art organic solar cells (OSCs). The fragility of the NFA is essentially due to the weak bond that links the central donor core of the molecules with their acceptor moieties at the edges. Here we proposed the replacement of thiophene at the outer-core position of traditional NFAs by tellurophene, a hitherto unexplored modification. Since tellurium is a distinctive element among chalcogens, the basic features of Te compounds cannot be deduced straightforwardly from the properties of their lighter analogues, S and Se. The modeled Te-based NFAs presented interesting features like stronger intra- and intermolecular interactions induced by a distinctive secondary bond effect between the end acceptor moiety and the outer chalcogen atom. This design strategy resulted in stiffer molecules with red-shifted absorption spectra that are less susceptible to degradation, as verified through stress tests and vibrational spectra analysis. Besides that, a weakened exciton binding energy has been found, opening the possibility of blends with a lower driving force. Our results shed light on several aspects of selenation and telluration of traditional NFAs, providing valuable insights into the possible consequences for OSC applications.
Submitted 6 November, 2023;
originally announced November 2023.
-
Fusing Hand and Body Skeletons for Human Action Recognition in Assembly
Authors:
Dustin Aganian,
Mona Köhler,
Benedict Stephan,
Markus Eisenbach,
Horst-Michael Gross
Abstract:
As collaborative robots (cobots) continue to gain popularity in industrial manufacturing, effective human-robot collaboration becomes crucial. Cobots should be able to recognize human actions to assist with assembly tasks and act autonomously. To achieve this, skeleton-based approaches are often used due to their ability to generalize across various people and environments. Although body skeleton approaches are widely used for action recognition, they may not be accurate enough for assembly actions where the worker's fingers and hands play a significant role. To address this limitation, we propose a method in which less detailed body skeletons are combined with highly detailed hand skeletons. We investigate CNNs and transformers, the latter of which are particularly adept at extracting and combining important information from both skeleton types using attention. This paper demonstrates the effectiveness of our proposed approach in enhancing action recognition in assembly scenarios.
Submitted 18 July, 2023;
originally announced July 2023.
-
How Object Information Improves Skeleton-based Human Action Recognition in Assembly Tasks
Authors:
Dustin Aganian,
Mona Köhler,
Sebastian Baake,
Markus Eisenbach,
Horst-Michael Gross
Abstract:
As the use of collaborative robots (cobots) in industrial manufacturing continues to grow, human action recognition for effective human-robot collaboration becomes increasingly important. This ability is crucial for cobots to act autonomously and assist in assembly tasks. Recently, skeleton-based approaches are often used as they tend to generalize better to different people and environments. However, when processing skeletons alone, information about the objects a human interacts with is lost. Therefore, we present a novel approach of integrating object information into skeleton-based action recognition. We enhance two state-of-the-art methods by treating object centers as further skeleton joints. Our experiments on the assembly dataset IKEA ASM show that our approach improves the performance of these state-of-the-art methods to a large extent when combining skeleton joints with objects predicted by a state-of-the-art instance segmentation model. Our research sheds light on the benefits of combining skeleton joints with object information for human action recognition in assembly tasks. We analyze the effect of the object detector on the combination for action classification and discuss the important factors that must be taken into account.
Submitted 9 June, 2023;
originally announced June 2023.
-
Distributed Model Predictive Control for Periodic Cooperation of Multi-Agent Systems
Authors:
Matthias Köhler,
Matthias A. Müller,
Frank Allgöwer
Abstract:
We consider multi-agent systems with heterogeneous, nonlinear agents subject to individual constraints that want to achieve a periodic, dynamic cooperative control goal which can be characterised by a set and a suitable cost. We propose a sequential distributed model predictive control (MPC) scheme in which agents sequentially solve an individual optimisation problem to track an artificial periodic output trajectory. The optimisation problems are coupled through these artificial periodic output trajectories, which are communicated and penalised using the cost that characterises the cooperative goal. The agents communicate only their artificial trajectories and only once per time step. We show that under suitable assumptions, the agents can incrementally move their artificial output trajectories towards the cooperative goal, and, hence, their closed-loop output trajectories asymptotically achieve it. We illustrate the scheme with a simulation example.
Submitted 6 April, 2023;
originally announced April 2023.
-
Transient Performance of MPC for Tracking
Authors:
Matthias Köhler,
Lisa Krügel,
Lars Grüne,
Matthias A. Müller,
Frank Allgöwer
Abstract:
We analyse the closed-loop performance of a model predictive control (MPC) for tracking formulation with artificial references. It has been shown that such a scheme guarantees closed-loop stability and recursive feasibility for any externally supplied reference, even if it is unreachable or time-varying. The basic idea is to consider an artificial reference as an additional decision variable and to formulate generalised terminal ingredients with respect to it. In addition, its offset is penalised in the MPC optimisation problem, leading to closed-loop convergence to the best reachable reference. In this paper, we provide a transient performance bound on the closed loop using MPC for tracking. We employ mild assumptions on the offset cost and scale it with the prediction horizon. In this case, an increasing horizon in MPC for tracking recovers the infinite horizon optimal solution.
Submitted 24 January, 2024; v1 submitted 17 March, 2023;
originally announced March 2023.
-
Molecular Modeling of Aquaporins and Artificial Transmembrane Channels: a mini-review and perspective for plants
Authors:
José Rafael Bordin,
Alexandre Vargas Ilha,
Patrick Ruam Bredow Côrtes,
W. Silva-Oliveira,
Lucas Avila Pinheiro,
Elizane E. de Moraes,
Tulio G. Grison,
Mateus H. Köhler
Abstract:
Aquaporins (AQPs) are a family of transmembrane channels that are found from archaea, eubacteria, and fungi kingdoms to plants and animals. These proteins play a major role in water and small solutes transport across biological cell membranes and maintain the osmotic balance of living cells. In this sense, many works in recent years have been devoted to understanding their behavior, including in plants, where 5 major groups of AQPs have been identified, whose physiological function details still have open questions waiting for an answer. In this direction, we observed in the literature very few Molecular Modeling studies focusing on plant AQPs. It creates a gap in the proper depiction of AQPs since Molecular Simulations allow us to get information that is usually inaccessible by experiments. Likewise, many efforts have been made to create artificial nanochannels with improved properties. It has the potential to help humanity (and plants) to face water stress -- a current problem that will be worsened by Climate Change. In this short review, we will revisit and discuss important computational studies about plant aquaporins and artificial transmembrane channels. With this, we aim to show how the Molecular Modeling community can (and should) help to understand plants' AQPs properties and function and how we can create new nanotechnology-based artificial channels.
Submitted 23 February, 2023;
originally announced February 2023.
-
Computer Science for Future -- Sustainability and Climate Protection in the Computer Science Courses of the HAW Hamburg
Authors:
Elina Eickstädt,
Martin Becke,
Martin Kohler,
Julia Padberg
Abstract:
Computer Science for Future (CS4F) is an initiative in the Department of Computer Science at HAW Hamburg. The aim of the initiative is a paradigm shift in the discipline of computer science, thus establishing sustainability goals as a primary leitmotif for teaching and research. The focus is on teaching since the most promising multipliers are the students of a university. The change in teaching influences our research, the transfer to business and civil society as well as the change in our own institution. In this article, we present the initiative CS4F and reflect primarily on the role of students as amplifiers in the transformation process of computer science.
Submitted 17 January, 2023;
originally announced January 2023.
-
SAIF: Sparse Adversarial and Imperceptible Attack Framework
Authors:
Tooba Imtiaz,
Morgan Kohler,
Jared Miller,
Zifeng Wang,
Mario Sznaier,
Octavia Camps,
Jennifer Dy
Abstract:
Adversarial attacks hamper the decision-making ability of neural networks by perturbing the input signal. The addition of calculated small distortion to images, for instance, can deceive a well-trained image classification network. In this work, we propose a novel attack technique called Sparse Adversarial and Interpretable Attack Framework (SAIF). Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a small number of pixels and leverage these sparse attacks to reveal the vulnerability of classifiers. We use the Frank-Wolfe (conditional gradient) algorithm to simultaneously optimize the attack perturbations for bounded magnitude and sparsity with $O(1/\sqrt{T})$ convergence. Empirical results show that SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on the ImageNet dataset.
Submitted 6 December, 2023; v1 submitted 14 December, 2022;
originally announced December 2022.
-
Multidimensional rank-one convexification of incremental damage models at finite strains
Authors:
Daniel Balzani,
Maximilian Köhler,
Timo Neumeier,
Malte A. Peter,
Daniel Peterseim
Abstract:
This paper presents computationally feasible rank-one relaxation algorithms for the efficient simulation of a time-incremental damage model with nonconvex incremental stress potentials in multiple spatial dimensions. While the standard model suffers from numerical issues due to the lack of convexity, the relaxation by rank-one convexification prevents non-existence of minimizers and mesh dependence of the solutions of finite element discretizations. By the combination, modification and parallelization of the underlying convexification algorithms, the novel approach becomes computationally feasible. A descent method and a Newton scheme enhanced by step-size control prevent stability issues related to local minima in the energy landscape and the computation of derivatives. Numerical techniques for the construction of continuous derivatives of the approximated rank-one convex envelope are discussed. A series of numerical experiments demonstrates the ability of the computationally relaxed model to capture softening effects and the mesh independence of the computed approximations. An interpretation in terms of microstructural damage evolution is given, based on the rank-one lamination process.
Submitted 9 February, 2023; v1 submitted 24 November, 2022;
originally announced November 2022.
-
General model for charge carriers transport in electrolyte-gated transistors
Authors:
Marcos Luginieski,
Marlus Koehler,
Jose P. M. Serbena,
Keli F. Seidel
Abstract:
Inspired by experimental observations related to electrolyte-gated transistors (EGTs), where non-ideal behaviors are shown that are not described by just one theoretical model, we proposed a charge carrier transport model able to describe the typical operation-mode profiles, as well as some non-ideal ones, of electrolyte-gated field effect transistors (EGOFETs) and organic electrochemical transistors (OECTs). Our analysis includes the effect of 2D or 3D percolation transport (PT) and also the influence of a shallow exponential trap distribution on the transport. Under these considerations, a non-constant accumulation layer thickness along the channel can be formed. This dependence was included in our model through an effective mobility parameter that depends on the accumulation thickness. The accumulation thickness can depict 2D or 3D PT or even a transition between them. This transition can produce a non-ideal profile between the linear and saturation regimes in the output curve, a region in which a protuberance/lump appears. Another phenomenon analyzed was the non-linear behavior in the low drain voltage range of the output curve, even when considering an ohmic contact. According to the proposed model, this curve behavior is attributed to the trap distribution profile in the semiconductor and the very thin accumulation layer close to the injection contact. It was also possible to analyze the conditions under which the linear field effect mobility is higher or lower than the saturation one. Finally, EGOFET and OECT experimental data were successfully fitted with this model, showing its versatility.
Submitted 16 November, 2022;
originally announced November 2022.
-
Distributed MPC for Self-Organized Cooperation of Multiagent Systems -- Extended Version
Authors:
Matthias Köhler,
Matthias A. Müller,
Frank Allgöwer
Abstract:
We present a sequential distributed model predictive control (MPC) scheme for cooperative control of multi-agent systems with dynamically decoupled heterogeneous nonlinear agents subject to individual constraints. In the scheme, we explore the idea of using tracking MPC with artificial references to let agents coordinate their cooperation without external guidance. Each agent combines a tracking MPC with artificial references, the latter penalized by a suitable coupling cost. They solve an individual optimization problem for this artificial reference and an input that tracks it, only communicating the former to its neighbors in a communication graph. This puts the cooperative problem on a different layer than the handling of the dynamics and constraints, loosening the connection between the two. We provide sufficient conditions on the formulation of the cooperative problem and the coupling cost for the closed-loop system to asymptotically achieve it. Since the dynamics and the cooperative problem are only loosely connected, classical results from distributed optimization can be used to this end. We illustrate the scheme's application to consensus and formation control.
Submitted 12 June, 2024; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent
Authors:
Michael Kohler,
Adam Krzyzak
Abstract:
Estimation of a regression function from independent and identically distributed random variables is considered. The $L_2$ error with integration with respect to the design measure is used as an error criterion. Over-parametrized deep neural network estimates are defined where all the weights are learned by the gradient descent. It is shown that the expected $L_2$ error of these estimates converges to zero with the rate close to $n^{-1/(1+d)}$ in case that the regression function is Hölder smooth with Hölder exponent $p \in [1/2,1]$. In case of an interaction model where the regression function is assumed to be a sum of Hölder smooth functions where each of the functions depends only on $d^*$ many of $d$ components of the design variable, it is shown that these estimates achieve the corresponding $d^*$-dimensional rate of convergence.
Submitted 4 October, 2022;
originally announced October 2022.
-
Evolving Microstructures in Relaxed Continuum Damage Mechanics for Strain Softening
Authors:
Maximilian Köhler,
Daniel Balzani
Abstract:
A new relaxation approach is proposed which allows for the description of stress- and strain-softening at finite strains. The model is based on the construction of a convex hull replacing the originally non-convex incremental stress potential which in turn represents damage in terms of the classical $(1-D)$ approach. This convex hull is given as the linear convex combination of weakly and strongly damaged phases and thus, it represents the homogenization of a microstructure bifurcated in the two phases. As a result thereof, damage evolves in the convexified regime mainly by an increasing volume fraction of the strongly damaged phase. In contrast to previous relaxed incremental formulations in Gürses and Miehe [16] and Balzani and Ortiz [2], where the convex hull has been kept fixated after construction, here, the strongly damaged phase is allowed to elastically unload upon further loading. At the same time, its volume fraction increases nonlinearly within the convexified regime. Thus, strain-softening in the sense of a decreasing stress with increasing strain can be modeled. The major advantage of the proposed approach is that it ensures mesh-independent structural simulations without the requirement of additional length-scale related parameters or nonlocal quantities, which simplifies an implementation using classical material subroutine interfaces. In this paper, focus is on the relaxation of one-dimensional models for fiber damage which are combined with a microsphere approach to allow for the description of three-dimensional fiber dispersions appearing in fibrous materials such as soft biological tissues. Several numerical examples are analyzed to show the overall response of the model and the mesh-independence of resulting structural calculations.
Submitted 31 August, 2022;
originally announced August 2022.
-
On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent
Authors:
Selina Drews,
Michael Kohler
Abstract:
Estimation of a multivariate regression function from independent and identically distributed data is considered. An estimate is defined which fits a deep neural network consisting of a large number of fully connected neural networks, which are computed in parallel, via gradient descent to the data. The estimate is over-parametrized in the sense that the number of its parameters is much larger than the sample size. It is shown that in case of a suitable random initialization of the network, a suitable small stepsize of the gradient descent, and a number of gradient descent steps which is slightly larger than the reciprocal of the stepsize of the gradient descent, the estimate is universally consistent in the sense that its expected L2 error converges to zero for all distributions of the data where the response variable is square integrable.
Submitted 30 August, 2022;
originally announced August 2022.
-
Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Authors:
Daniel Seichter,
Söhnke Benedikt Fischedick,
Mona Köhler,
Horst-Michael Groß
Abstract:
Semantic scene understanding is essential for mobile agents acting in various environments. Although semantic segmentation already provides a lot of information, details about individual objects as well as the general scene are missing but required for many real-world applications. However, solving multiple tasks separately is expensive and cannot be accomplished in real time given limited computing and battery capabilities on a mobile platform. In this paper, we propose an efficient multi-task approach for RGB-D scene analysis (EMSANet) that simultaneously performs semantic and instance segmentation (panoptic segmentation), instance orientation estimation, and scene classification. We show that all tasks can be accomplished using a single neural network in real time on a mobile platform without diminishing performance - by contrast, the individual tasks are able to benefit from each other. In order to evaluate our multi-task approach, we extend the annotations of the common RGB-D indoor datasets NYUv2 and SUNRGB-D for instance segmentation and orientation estimation. To the best of our knowledge, we are the first to provide results in such a comprehensive multi-task setting for indoor scene analysis on NYUv2 and SUNRGB-D.
Submitted 10 July, 2022;
originally announced July 2022.
-
Single exciton trapping in an electrostatically defined 2D semiconductor quantum dot
Authors:
Daniel N. Shanks,
Fateme Mahdikhanysarvejahany,
Michael R. Koehler,
David G. Mandrus,
Takashi Taniguchi,
Kenji Watanabe,
Brian J. LeRoy,
John R. Schaibley
Abstract:
Interlayer excitons (IXs) in 2D semiconductors have long lifetimes and spin-valley coupled physics, with a long-standing goal of single exciton trapping for valleytronic applications. In this work, we use a nano-patterned graphene gate to create an electrostatic IX trap. We measure a unique power-dependent blue-shift of IX energy, where narrow linewidth emission exhibits discrete energy jumps. We attribute these jumps to quantized increases of the number occupancy of IXs within the trap and compare to a theoretical model to assign the lowest energy emission line to single IX recombination.
Submitted 3 November, 2022; v1 submitted 27 June, 2022;
originally announced June 2022.
-
Analysis of convolutional neural network image classifiers in a rotationally symmetric model
Authors:
Michael Kohler,
Benjamin Walter
Abstract:
Convolutional neural network image classifiers are defined and the rate of convergence of the misclassification risk of the estimates towards the optimal misclassification risk is analyzed. Here we consider images as random variables with values in some functional space, where we only observe discrete samples as function values on some finite grid. Under suitable structural and smoothness assumptions on the functional a posteriori probability, which includes some kind of symmetry against rotation of subparts of the input image, it is shown that least squares plug-in classifiers based on convolutional neural networks are able to circumvent the curse of dimensionality in binary image classification if we neglect a resolution-dependent error term. The finite sample size behavior of the classifier is analyzed by applying it to simulated and real data.
Submitted 11 May, 2022;
originally announced May 2022.
-
Interlayer Exciton Diode and Transistor
Authors:
Daniel N. Shanks,
Fateme Mahdikhanysarvejahany,
Trevor G. Stanfill,
Michael R. Koehler,
David G. Mandrus,
Takashi Taniguchi,
Kenji Watanabe,
Brian J. LeRoy,
John R. Schaibley
Abstract:
Controlling the flow of charge neutral interlayer exciton (IX) quasiparticles can potentially lead to low loss excitonic circuits. Here, we report unidirectional transport of IXs along nanoscale electrostatically defined channels in an MoSe$_2$-WSe$_2$ heterostructure. These results are enabled by a lithographically defined triangular etch in a graphene gate to create a potential energy "slide". By performing spatially and temporally resolved photoluminescence measurements, we measure smoothly varying IX energy along the structure and high-speed exciton flow with a drift velocity up to $2 \times 10^6$ cm/s, an order of magnitude larger than previous experiments. Furthermore, exciton flow can be controlled by saturating exciton population in the channel using a second laser pulse, demonstrating an optically gated excitonic transistor. Our work paves the way towards low loss excitonic circuits, the study of bosonic transport in one-dimensional channels, and custom potential energy landscapes for excitons in van der Waals heterostructures.
Submitted 19 August, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
-
Localized Interlayer Excitons in MoSe2-WSe2 Heterostructures without a Moiré Potential
Authors:
Fateme Mahdikhanysarvejahany,
Daniel N. Shanks,
Mathew Klein,
Qian Wang,
Michael R. Koehler,
David G. Mandrus,
Takashi Taniguchi,
Kenji Watanabe,
Oliver L. A. Monti,
Brian J. LeRoy,
John R. Schaibley
Abstract:
Trapped interlayer excitons (IXs) in MoSe2-WSe2 heterobilayers have generated interest for use as single quantum emitter arrays and as an opportunity to study moiré physics in transition metal dichalcogenide (TMD) heterostructures. IXs are spatially indirect excitons composed of an electron in the MoSe2 layer bound to a hole in the WSe2 layer. Previous reports of spectrally narrow (<1 meV) photoluminescence (PL) emission lines at low temperature have been attributed to IXs localized by the moiré potential between the TMD layers. Here, we show that spectrally narrow IX PL lines are present even when the moiré potential is suppressed by inserting a bilayer hexagonal boron nitride (hBN) spacer between the TMD layers. We directly compare the doping, electric field, magnetic field, and temperature dependence of IXs in a directly contacted MoSe2-WSe2 region to those in a region separated by bilayer hBN. Our results show that the localization potential resulting in the narrow PL lines is independent of the moiré potential, and instead likely due to extrinsic effects such as nanobubbles or defects. We show that while the doping, electric field, and temperature dependence of the narrow IX lines is similar for both regions, their excitonic g-factors have opposite signs, indicating that the IXs in the directly contacted region are trapped by both moiré and extrinsic localization potentials.
Submitted 15 March, 2022;
originally announced March 2022.
-
Data-driven distributed MPC of dynamically coupled linear systems
Authors:
Matthias Köhler,
Julian Berberich,
Matthias A. Müller,
Frank Allgöwer
Abstract:
In this paper, we present a data-driven distributed model predictive control (MPC) scheme to stabilise the origin of dynamically coupled discrete-time linear systems subject to decoupled input constraints. The local optimisation problems solved by the subsystems rely on a distributed adaptation of the Fundamental Lemma by Willems et al., which allows system trajectories to be parametrised using only measured input-output data without explicit model knowledge. For the local predictions, the subsystems rely on communicated assumed trajectories of neighbours. Each subsystem guarantees a small deviation from these trajectories via a consistency constraint. We provide a theoretical analysis of the resulting non-iterative distributed MPC scheme, including proofs of recursive feasibility and (practical) stability. Finally, the approach is successfully applied to a numerical example.
Submitted 11 August, 2023; v1 submitted 25 February, 2022;
originally announced February 2022.
-
Direct STM Measurements of R- and H-type Twisted MoSe2/WSe2 Heterostructures
Authors:
Rachel Nieken,
Anna Roche,
Fateme Mahdikhanysarvejahany,
Takashi Taniguchi,
Kenji Watanabe,
Michael R. Koehler,
David G. Mandrus,
John Schaibley,
Brian J. LeRoy
Abstract:
When semiconducting transition metal dichalcogenide heterostructures are stacked, the twist angle and lattice mismatch lead to a periodic moiré potential. As the angle between the layers changes, so do the electronic properties. As the angle approaches 0 or 60 degrees, interesting characteristics and properties such as modulations in the band edges, flat bands, and confinement are predicted to occur. Here we report scanning tunneling microscopy and spectroscopy measurements on the band gaps and band modulations in MoSe2/WSe2 heterostructures with near 0 degree rotation (R-type) and near 60 degree rotation (H-type). We find a modulation of the band gap for both stacking configurations, with a larger modulation for R-type than for H-type, as predicted by theory. Furthermore, local density of states images show that electrons are localized differently at the valence band and conduction band edges.
Submitted 17 March, 2022; v1 submitted 6 January, 2022;
originally announced January 2022.
-
Few-Shot Object Detection: A Comprehensive Survey
Authors:
Mona Köhler,
Markus Eisenbach,
Horst-Michael Gross
Abstract:
Humans are able to learn to recognize new objects even from a few examples. In contrast, training deep-learning-based object detectors requires huge amounts of annotated data. To avoid the need to acquire and annotate these huge amounts of data, few-shot object detection aims to learn from few object instances of new categories in the target domain. In this survey, we provide an overview of the state of the art in few-shot object detection. We categorize approaches according to their training scheme and architectural layout. For each type of approach, we describe the general realization as well as concepts to improve the performance on novel categories. Whenever appropriate, we give short takeaways regarding these concepts in order to highlight the best ideas. Finally, we introduce commonly used datasets and their evaluation protocols and analyze reported benchmark results. As a result, we emphasize common challenges in evaluation and identify the most promising current trends in this emerging field of few-shot object detection.
Submitted 15 September, 2022; v1 submitted 22 December, 2021;
originally announced December 2021.
-
On the rate of convergence of a classifier based on a Transformer encoder
Authors:
Iryna Gurevych,
Michael Kohler,
Gözde Gül Sahin
Abstract:
Pattern recognition based on a high-dimensional predictor is considered. A classifier is defined which is based on a Transformer encoder. The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed. It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model. Furthermore, the differences between Transformer classifiers analyzed theoretically in this paper and Transformer classifiers used nowadays in practice are illustrated by considering classification problems in natural language processing.
Submitted 29 November, 2021;
originally announced November 2021.
-
Disentangling Electronic, Lattice and Spin Dynamics in the Chiral Helimagnet Cr1/3NbS2
Authors:
N. Sirica,
H. Hedayat,
D. Bugini,
M. R. Koehler,
L. Li,
D. S. Parker,
D. G. Mandrus,
C. Dallera,
E. Carpene,
N. Mannella
Abstract:
We investigate the static and ultrafast magneto-optical response of the hexagonal chiral helimagnet $Cr_{1/3}NbS_{2}$ above and below the helimagnetic ordering temperature. The presence of a magnetic easy plane contained within the crystallographic ab-plane is confirmed, while degenerate optical pump-probe experiments reveal significant differences in the dynamics between the parent, $NbS_{2}$, and Cr-intercalated compounds. Time resolved magneto-optical Kerr effect measurements show a two-step demagnetization process, where an initial, sub-ps relaxation and subsequent buildup ($\tau > 50$ ps) in the demagnetization dynamics scale similarly with increasing pump fluence. Despite theoretical evidence for partial gapping of the minority spin channel, suggestive of possible half metallicity in $Cr_{1/3}NbS_{2}$, such long demagnetization dynamics likely result from spin-lattice relaxation as opposed to minority state blocking. However, comparison of the two-step demagnetization process in $Cr_{1/3}NbS_{2}$ with other 3d intercalated transition metal dichalcogenides reveals a behavior that is unexpected from conventional spin-lattice relaxation, and may be attributed to the complicated interaction of local moments with itinerant electrons in this material system.
Submitted 22 November, 2021;
originally announced November 2021.
-
Steady-state nonlinear optical response of excitons in monolayer MoSe$_2$
Authors:
Muhed S. Rana,
Joshua R. Hendrickson,
Christopher E. Stevens,
Michael R. Koehler,
David G. Mandrus,
Takashi Taniguchi,
Kenji Watanabe,
Nai H. Kwong,
Rolf Binder,
John R. Schaibley
Abstract:
Monolayer transition metal dichalcogenide (TMD) semiconductors such as MoSe$_2$ host strongly bound excitons which are known to exhibit a strong resonant third-order nonlinear response. Although there have been numerous studies of the ultrafast nonlinear response of monolayer TMDs, a study of the steady-state nonlinear response is lacking. We report a comprehensive study of the steady-state two-color nonlinear response of excitons in hBN-encapsulated monolayer MoSe$_2$ at 7 K. We observe differential transmission (DT) signals associated with the neutral and charged exciton species, which are strongly dependent on the polarization of the pump and probe. Our results are compared to a theoretical model based on a T-matrix formulation for exciton-exciton, exciton-trion, and trion-trion correlations. The parameters are chosen such that the theory accurately reproduces the experimental DT spectrum, which is found to be dominated by two-exciton correlations without strong biexciton binding, exciton-trion attractive interactions, and strong spin mixing through incoherent relaxation.
Submitted 24 September, 2021;
originally announced September 2021.
-
A data acquisition setup for data driven acoustic design
Authors:
Romana Rust,
Achilleas Xydis,
Kurt Heutschi,
Nathanaël Perraudin,
Gonzalo Casas,
Chaoyu Du,
Jürgen Strauss,
Kurt Eggenschwiler,
Fernando Perez-Cruz,
Fabio Gramazio,
Matthias Kohler
Abstract:
In this paper, we present a novel interdisciplinary approach to study the relationship between diffusive surface structures and their acoustic performance. Using computational design, surface structures are iteratively generated and 3D printed at 1:10 model scale. They originate from different fabrication typologies and are designed to have acoustic diffusion and absorption effects. An automated robotic process measures the impulse responses of these surfaces by positioning a microphone and a speaker at multiple locations. The collected data serve two purposes: first, as an exploratory catalogue of different spatio-temporal-acoustic scenarios and second, as a data set for predicting the acoustic response of digitally designed surface geometries using machine learning. In this paper, we present the automated data acquisition setup, the data processing and the computational generation of diffusive surface structures. We describe first results of comparative studies of measured surface panels and conclude with steps of future research.
Submitted 24 September, 2021;
originally announced September 2021.
-
Molecular Dynamics Simulations of Water Anchored in Multi-Layered Nanoporous MoS$_2$ Membranes: Implications for Desalination
Authors:
João P. K. Abal,
Rodrigo F. Dillenburg,
Mateus H. Köhler,
Marcia C. Barbosa
Abstract:
One of the most promising applications in nanoscience is the design of new materials to improve water permeability and selectivity of nanoporous membranes. Understanding the molecular architecture behind these fascinating structures and how it impacts the water flow is an intricate but necessary task. Here, we studied the water flux through multi-layered nanoporous molybdenum disulfide (MLNMoS$_2$) membranes with different nanopore sizes and lengths. Molecular dynamics simulations show that the permeability does not increase with the inverse of the membrane thickness, violating the classical hydrodynamic behavior. The data also reveal that the water dynamics is slower than that observed in frictionless carbon nanotubes and multi-layer graphene membranes, which we explain in terms of an anchor mechanism observed in between layers. We show that the membrane permeability is critically dependent on the nanopore architecture, bringing important insights into the manufacture of new desalination membranes.
Submitted 9 September, 2021;
originally announced September 2021.
-
Tracked 3D Ultrasound and Deep Neural Network-based Thyroid Segmentation reduce Interobserver Variability in Thyroid Volumetry
Authors:
Markus Krönke,
Christine Eilers,
Desislava Dimova,
Melanie Köhler,
Gabriel Buschner,
Lilit Mirzojan,
Lemonia Konstantinidou,
Marcus R. Makowski,
James Nagarajah,
Nassir Navab,
Wolfgang Weber,
Thomas Wendler
Abstract:
Background: Thyroid volumetry is crucial in diagnosis, treatment and monitoring of thyroid diseases. However, conventional thyroid volumetry with 2D ultrasound is highly operator-dependent. This study compares 2D ultrasound and tracked 3D ultrasound with an automatic thyroid segmentation based on a deep neural network regarding inter- and intraobserver variability, time and accuracy. Volume reference was MRI. Methods: 28 healthy volunteers were scanned with 2D and 3D ultrasound as well as by MRI. Three physicians (MD 1, 2, 3) with different levels of experience (6, 4 and 1 years) performed three 2D ultrasound and three tracked 3D ultrasound scans on each volunteer. In the 2D scans the thyroid lobe volumes were calculated with the ellipsoid formula. A convolutional deep neural network (CNN) segmented the 3D thyroid lobes automatically. On MRI (T1 VIBE sequence) the thyroid was manually segmented by an experienced medical doctor. Results: The CNN was trained to obtain a Dice score of 0.94. The interobserver variability comparing two MDs showed mean differences for 2D and 3D respectively of 0.58 ml to 0.52 ml (MD1 vs. 2), -1.33 ml to -0.17 ml (MD1 vs. 3) and -1.89 ml to -0.70 ml (MD2 vs. 3). Paired samples t-tests showed significant differences in two comparisons for 2D and none for 3D. Intraobserver variability was similar for 2D and 3D ultrasound. Comparison of ultrasound volumes and MRI volumes by paired samples t-tests showed a significant difference for the 2D volumetry of all MDs, and no significant difference for 3D ultrasound. Acquisition time was significantly shorter for 3D ultrasound. Conclusion: Tracked 3D ultrasound combined with a CNN segmentation significantly reduces interobserver variability in thyroid volumetry and increases the accuracy of the measurements with shorter acquisition times.
Submitted 10 August, 2021;
originally announced August 2021.
-
FRET nanoscopy enables seamless imaging of molecular assemblies with sub-nanometer resolution
Authors:
Jan-Hendrik Budde,
Nicolaas van der Voort,
Suren Felekyan,
Julian Folz,
Ralf Kühnemuth,
Paul Lauterjung,
Markus Köhler,
Andreas Schönle,
Julian Sindram,
Marius Otten,
Matthias Karg,
Christian Herrmann,
Anders Barth,
Claus A. M. Seidel
Abstract:
By circumventing the optical diffraction limit, super-resolved fluorescence microscopies enable the study of larger cellular structures and molecular assemblies. However, fluorescence nanoscopy currently lacks the spatiotemporal resolution to resolve distances on the size of individual molecules and reveal the conformational fine structure and dynamics of molecular complexes. Here we establish FRET nanoscopy by combining colocalization STED microscopy with multiparameter FRET spectroscopy. We simultaneously localize donor and acceptor dyes of single FRET pairs with nanometer resolution and quantitatively measure intramolecular distances with sub-nanometer precision over a large dynamic range. While FRET provides isotropic 3D distance information, colocalization measures the projected distance onto the image plane. The combined information allows us to directly determine its 3D orientation using Pythagoras's theorem. Studying two DNA model systems and the human guanylate binding protein hGBP1, we demonstrate that FRET nanoscopy unravels the interplay between their spatial organization and local molecular conformation in a complex environment.
Submitted 21 January, 2022; v1 submitted 30 July, 2021;
originally announced August 2021.
-
Convergence rates for shallow neural networks learned by gradient descent
Authors:
Alina Braun,
Michael Kohler,
Sophie Langer,
Harro Walk
Abstract:
In this paper we analyze the $L_2$ error of neural network regression estimates with one hidden layer. Under the assumption that the Fourier transform of the regression function decays suitably fast, we show that an estimate, where all initial weights are chosen according to proper uniform distributions and where the weights are learned by gradient descent, achieves a rate of convergence of $1/\sqrt{n}$ (up to a logarithmic factor). Our statistical analysis implies that the key aspect behind this result is the proper choice of the initial inner weights and the adjustment of the outer weights via gradient descent. This indicates that we can also simply use linear least squares to choose the outer weights. We prove a corresponding theoretical result and compare our new linear least squares neural network estimate with standard neural network estimates via simulated data. Our simulations show that our theoretical considerations lead to an estimate with an improved performance in many cases.
Submitted 18 August, 2023; v1 submitted 20 July, 2021;
originally announced July 2021.
-
Estimation of a regression function on a manifold by fully connected deep neural networks
Authors:
Michael Kohler,
Sophie Langer,
Ulrich Reif
Abstract:
Estimation of a regression function from independent and identically distributed data is considered. The $L_2$ error with integration with respect to the distribution of the predictor variable is used as the error criterion. The rate of convergence of least squares estimates based on fully connected spaces of deep neural networks with ReLU activation function is analyzed for smooth regression functions. It is shown that in case that the distribution of the predictor variable is concentrated on a manifold, these estimates achieve a rate of convergence which depends on the dimension of the manifold and not on the number of components of the predictor variable.
Submitted 20 July, 2021;
originally announced July 2021.
-
The multiple-charm hierarchy in the statistical hadronization model
Authors:
Anton Andronic,
Peter Braun-Munzinger,
Markus K. Köhler,
Aleksas Mazeliauskas,
Krzysztof Redlich,
Johanna Stachel,
Vytautas Vislavicius
Abstract:
In relativistic nuclear collisions the production of hadrons with light (u,d,s) quarks is quantitatively described in the framework of the Statistical Hadronization Model (SHM). Charm quarks are dominantly produced in initial hard collisions but interact strongly in the hot fireball and thermalize. Therefore charmed hadrons can be incorporated into the SHM by treating charm quarks as 'impurities' with thermal distributions, while the total charm content of the fireball is fixed by the measured open charm cross section. We call this model SHMc and demonstrate that with SHMc the measured multiplicities of single charm hadrons in lead-lead collisions at LHC energies can be well described with the same thermal parameters as for (u,d,s) hadrons. Furthermore, transverse momentum distributions are computed in a blast-wave model, which includes the resonance decay kinematics. SHMc is extended to lighter collision systems down to oxygen-oxygen and includes doubly- and triply-charmed hadrons. We show predictions for production probabilities of such states exhibiting a characteristic and quite spectacular enhancement hierarchy.
Submitted 11 June, 2021; v1 submitted 26 April, 2021;
originally announced April 2021.