-
Inestability presented in the estimating of the Nelson-Siegel-Svensson model
Authors:
Ainara Rodríguez-Sánchez
Abstract:
The literature shows the possible existence of a problem called collinearity in both Nelson-Siegel and Nelson-Siegel-Svensson models due to the relationship between the slope and curvature components. The presence of this problem and the estimation of both models by Ordinary Least Squares would lead to coefficients estimates that may be unstable among other consequences. However, these estimates are used to make monetary policy decisions. For this reason, it is important to try mitigating this collinearity problem. Consequently, some authors propose traditional procedures for the treatment of collinearity such as: non-linear optimisation, to fix the shape parameter or ridge regression. Nevertheless, all these processes have their disadvantages. Alternatively, a new method with good properties called raise regression is proposed in this paper. Finally, the methodologies are illustrated with an empirical comparison on Euribor Overnight Index Swap and Euribor Interest Rates Swap data between 2011 and 2021.
Submitted 10 June, 2024;
originally announced June 2024.
-
Scalable and Efficient Continual Learning from Demonstration via a Hypernetwork-generated Stable Dynamics Model
Authors:
Sayantan Auddy,
Jakob Hollenstein,
Matteo Saveriano,
Antonio Rodríguez-Sánchez,
Justus Piater
Abstract:
Learning from demonstration (LfD) provides an efficient way to train robots. The learned motions should be convergent and stable, but to be truly effective in the real world, LfD-capable robots should also be able to remember multiple motion skills. Existing stable-LfD approaches lack the capability of multi-skill retention. Although recent work on continual-LfD has shown that hypernetwork-generated neural ordinary differential equation solvers (NODE) can learn multiple LfD tasks sequentially, this approach lacks stability guarantees. We propose an approach for stable continual-LfD in which a hypernetwork generates two networks: a trajectory learning dynamics model, and a trajectory stabilizing Lyapunov function. The introduction of stability generates convergent trajectories, but more importantly it also greatly improves continual learning performance, especially in the size-efficient chunked hypernetworks. With our approach, a single hypernetwork learns stable trajectories of the robot's end-effector position and orientation simultaneously, and does so continually for a sequence of real-world LfD tasks without retraining on past demonstrations. We also propose stochastic hypernetwork regularization with a single randomly sampled regularization term, which reduces the cumulative training time cost for N tasks from O$(N^2)$ to O$(N)$ without any loss in performance on real-world tasks. We empirically evaluate our approach on the popular LASA dataset, on high-dimensional extensions of LASA (including up to 32 dimensions) to assess scalability, and on a novel extended robotic task dataset (RoboTasks9) to assess real-world performance. In trajectory error metrics, stability metrics and continual learning metrics our approach performs favorably, compared to other baselines. Our open-source code and datasets are available at https://github.com/sayantanauddy/clfd-snode.
Submitted 9 January, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
The Euclidean Adler Function and its Interplay with $Δα^{\mathrm{had}}_{\mathrm{QED}}$ and $α_s$
Authors:
M. Davier,
D. Díaz-Calderón,
B. Malaescu,
A. Pich,
A. Rodríguez-Sánchez,
Z. Zhang
Abstract:
Three different approaches to precisely describe the Adler function in the Euclidean regime at around $2\, \mathrm{GeVs}$ are available: dispersion relations based on the hadronic production data in $e^+e^-$ annihilation, lattice simulations and perturbative QCD (pQCD). We make a comprehensive study of the perturbative approach, supplemented with the leading power corrections in the operator product expansion. All known contributions are included, with a careful assessment of uncertainties. The pQCD predictions are compared with the Adler functions extracted from $Δα^{\mathrm{had}}_{\mathrm{QED}}(Q^2)$, using both the DHMZ compilation of $e^+e^-$ data and published lattice results. Taking as input the FLAG value of $α_s$, the pQCD Adler function turns out to be in good agreement with the lattice data, while the dispersive results lie systematically below them. Finally, we explore the sensitivity to $α_s$ of the direct comparison between the data-driven, lattice and QCD Euclidean Adler functions. The precision with which the renormalisation group equation can be tested is also evaluated.
Submitted 26 April, 2023; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Constraints on the hadronic light-by-light in the Melnikov-Vainshtein regime
Authors:
Johan Bijnens,
Nils Hermansson-Truedsson,
Antonio Rodríguez-Sánchez
Abstract:
The muon anomalous magnetic moment continues to attract attention due to the possible tension between the experimentally measured value and the theoretical Standard Model prediction. With the aim to reduce the uncertainty on the hadronic light-by-light contribution to the magnetic moment, we derive short-distance constraints in the Melnikov-Vainshtein regime which are useful for data-driven determinations. In this kinematical region, two of the four electromagnetic currents are close in the four-point function defining the hadronic light-by-light tensor. To obtain the constraints, we develop a systematic operator product expansion of the tensor in question to next-to-leading order in the expansion in operators. We evaluate the leading in $α_s$ contributions and derive constraints for the next-to-leading operators that are also valid nonperturbatively.
Submitted 17 February, 2023; v1 submitted 30 November, 2022;
originally announced November 2022.
-
Affordance detection with Dynamic-Tree Capsule Networks
Authors:
Antonio Rodríguez-Sánchez,
Simon Haller-Seeber,
David Peer,
Chris Engelhardt,
Jakob Mittelberger,
Matteo Saveriano
Abstract:
Affordance detection from visual input is a fundamental step in autonomous robotic manipulation. Existing solutions to the problem of affordance detection rely on convolutional neural networks. However, these networks do not consider the spatial arrangement of the input data and miss parts-to-whole relationships. Therefore, they fall short when confronted with novel, previously unseen object instances or new viewpoints. One solution to overcome such limitations can be to resort to capsule networks. In this paper, we introduce the first affordance detection network based on dynamic tree-structured capsules for sparse 3D point clouds. We show that our capsule-based network outperforms current state-of-the-art models on viewpoint invariance and parts-segmentation of new object instances through a novel dataset we only used for evaluation and it is publicly available from github.com/gipfelen/DTCG-Net. In the experimental evaluation we will show that our algorithm is superior to current affordance detection methods when faced with grasping previously unseen objects thanks to our Capsule Network enforcing a parts-to-whole representation.
Submitted 9 November, 2022;
originally announced November 2022.
-
Short-distance constraints on the hadronic light-by-light
Authors:
Johan Bijnens,
Nils Hermansson-Truedsson,
Antonio Rodríguez-Sánchez
Abstract:
The muon anomalous magnetic moment continues to attract interest due to the potential tension between experimental measurement [1,2] and the Standard Model prediction [3]. The hadronic light-by-light contribution to the magnetic moment is one of the two diagrammatic topologies currently saturating the theoretical uncertainty. With the aim of improving precision on the hadronic light-by-light in a data-driven approach founded on dispersion theory [4,5], we derive various short-distance constraints of the underlying correlation function of four electromagnetic currents. Here, we present our previous progress in the purely short-distance regime and current efforts in the so-called Melnikov-Vainshtein limit.
Submitted 8 November, 2022;
originally announced November 2022.
-
Improving the Trainability of Deep Neural Networks through Layerwise Batch-Entropy Regularization
Authors:
David Peer,
Bart Keulen,
Sebastian Stabinger,
Justus Piater,
Antonio Rodríguez-Sánchez
Abstract:
Training deep neural networks is a very demanding task, especially challenging is how to adapt architectures to improve the performance of trained models. We can find that sometimes, shallow networks generalize better than deep networks, and the addition of more layers results in higher training and test errors. The deep residual learning framework addresses this degradation problem by adding skip connections to several neural network layers. It would at first seem counter-intuitive that such skip connections are needed to train deep networks successfully as the expressivity of a network would grow exponentially with depth. In this paper, we first analyze the flow of information through neural networks. We introduce and evaluate the batch-entropy which quantifies the flow of information through each layer of a neural network. We prove empirically and theoretically that a positive batch-entropy is required for gradient descent-based training approaches to optimize a given loss function successfully. Based on those insights, we introduce batch-entropy regularization to enable gradient descent-based training algorithms to optimize the flow of information through each hidden layer individually. With batch-entropy regularization, gradient descent optimizers can transform untrainable networks into trainable networks. We show empirically that we can therefore train a "vanilla" fully connected network and convolutional neural network -- no skip connections, batch normalization, dropout, or any other architectural tweak -- with 500 layers by simply adding the batch-entropy regularization term to the loss function. The effect of batch-entropy regularization is not only evaluated on vanilla neural networks, but also on residual networks, autoencoders, and also transformer models over a wide range of computer vision as well as natural language processing tasks.
Submitted 1 August, 2022;
originally announced August 2022.
-
On the sensitivity of the D parameter to new physics
Authors:
Adam Falkowski,
Antonio Rodríguez-Sánchez
Abstract:
Measurements of angular correlations in nuclear beta decay are important tests of the Standard Model (SM). Among those, the so-called D correlation parameter occupies a particular place because it is odd under time reversal, and because the experimental sensitivity is at the $10^{-4}$ level, with plans of further improvement in the near future. Using effective field theory~(EFT) techniques, we reassess its potential to discover or constrain new physics beyond the SM. We provide a comprehensive classification of CP-violating EFT scenarios which generate a shift of the D parameter away from the SM prediction. We show that, in each scenario, a shift larger than $10^{-5}$ is in serious tension with the existing experimental data, where bounds coming from electric dipole moments and LHC observables play a decisive role. The tension can only be avoided by fine tuning of the parameters in the UV completion of the EFT. We illustrate this using examples of leptoquark UV completions. Finally, we comment on the possibility to probe CP-conserving new physics via the D parameter.
Submitted 3 August, 2022; v1 submitted 5 July, 2022;
originally announced July 2022.
-
Violations of Quark-Hadron Duality in Low-Energy Determinations of $α_s$
Authors:
Antonio Pich,
Antonio Rodríguez-Sánchez
Abstract:
Using the spectral functions measured in $τ$ decays, we investigate the actual numerical impact of duality violations on the extraction of the strong coupling. These effects are tiny in the standard $α_s(m_τ^2)$ determinations from integrated distributions of the hadronic spectrum with pinched weights, or from the total $τ$ hadronic width. The pinched-weight factors suppress very efficiently the violations of duality, making their numerical effects negligible in comparison with the larger perturbative uncertainties. However, combined fits of $α_s$ and duality-violation parameters, performed with non-protected weights, are subject to large systematic errors associated with the assumed modelling of duality-violation effects. These uncertainties have not been taken into account in the published analyses, based on specific models of quark-hadron duality.
Submitted 11 July, 2022; v1 submitted 16 May, 2022;
originally announced May 2022.
-
Prospects for precise predictions of $a_μ$ in the Standard Model
Authors:
G. Colangelo,
M. Davier,
A. X. El-Khadra,
M. Hoferichter,
C. Lehner,
L. Lellouch,
T. Mibe,
B. L. Roberts,
T. Teubner,
H. Wittig,
B. Ananthanarayan,
A. Bashir,
J. Bijnens,
T. Blum,
P. Boyle,
N. Bray-Ali,
I. Caprini,
C. M. Carloni Calame,
O. Catà,
M. Cè,
J. Charles,
N. H. Christ,
F. Curciarello,
I. Danilkin,
D. Das
, et al. (57 additional authors not shown)
Abstract:
We discuss the prospects for improving the precision on the hadronic corrections to the anomalous magnetic moment of the muon, and the plans of the Muon $g-2$ Theory Initiative to update the Standard Model prediction.
Submitted 29 March, 2022;
originally announced March 2022.
-
The strong coupling constant: State of the art and the decade ahead
Authors:
D. d'Enterria,
S. Kluth,
G. Zanderighi,
C. Ayala,
M. A. Benitez-Rathgeb,
J. Bluemlein,
D. Boito,
N. Brambilla,
D. Britzger,
S. Camarda,
A. M. Cooper-Sarkar,
T. Cridge,
G. Cvetic,
M. Dalla Brida,
A. Deur,
F. Giuli,
M. Golterman,
A. H. Hoang,
J. Huston,
M. Jamin,
A. V. Kotikov,
V. G. Krivokhizhin,
A. S. Kronfeld,
V. Leino,
K. Lipka
, et al. (33 additional authors not shown)
Abstract:
This document provides a comprehensive summary of the state-of-the-art, challenges, and prospects in the experimental and theoretical study of the strong coupling $α_s$. The current status of the seven methods presently used to determine $α_s$ based on: (i) lattice QCD, (ii) hadronic $τ$ decays, (iii) deep-inelastic scattering and parton distribution functions fits, (iv) electroweak boson decays, hadronic final-states in (v) e+e-, (vi) e-p, and (vii) p-p collisions, and (viii) quarkonia decays and masses, are reviewed. Novel $α_s$ determinations are discussed, as well as the averaging method used to obtain the PDG world-average value at the reference Z boson mass scale, $α_s(m^2_Z)$. Each of the extraction methods proposed provides a "wish list" of experimental and theoretical developments required in order to achieve an ideal permille precision on $α_s(m^2_Z)$ within the next 10 years.
Submitted 15 March, 2022;
originally announced March 2022.
-
Continual Learning from Demonstration of Robotics Skills
Authors:
Sayantan Auddy,
Jakob Hollenstein,
Matteo Saveriano,
Antonio Rodríguez-Sánchez,
Justus Piater
Abstract:
Methods for teaching motion skills to robots focus on training for a single skill at a time. Robots capable of learning from demonstration can considerably benefit from the added ability to learn new movement skills without forgetting what was learned in the past. To this end, we propose an approach for continual learning from demonstration using hypernetworks and neural ordinary differential equation solvers. We empirically demonstrate the effectiveness of this approach in remembering long sequences of trajectory learning tasks without the need to store any data from past demonstrations. Our results show that hypernetworks outperform other state-of-the-art continual learning approaches for learning from demonstration. In our experiments, we use the popular LASA benchmark, and two new datasets of kinesthetic demonstrations collected with a real robot that we introduce in this paper called the HelloWorld and RoboTasks datasets. We evaluate our approach on a physical robot and demonstrate its effectiveness in learning real-world robotic tasks involving changing positions as well as orientations. We report both trajectory error metrics and continual learning metrics, and we propose two new continual learning metrics. Our code, along with the newly collected datasets, is available at https://github.com/sayantanauddy/clfd.
Submitted 12 April, 2023; v1 submitted 14 February, 2022;
originally announced February 2022.
-
Momentum Capsule Networks
Authors:
Josef Gugglberger,
David Peer,
Antonio Rodríguez-Sánchez
Abstract:
Capsule networks are a class of neural networks that achieved promising results on many computer vision tasks. However, baseline capsule networks have failed to reach state-of-the-art results on more complex datasets due to the high computation and memory requirements. We tackle this problem by proposing a new network architecture, called Momentum Capsule Network (MoCapsNet). MoCapsNets are inspired by Momentum ResNets, a type of network that applies reversible residual building blocks. Reversible networks allow for recalculating activations of the forward pass in the backpropagation algorithm, so those memory requirements can be drastically reduced. In this paper, we provide a framework on how invertible residual building blocks can be applied to capsule networks. We will show that MoCapsNet beats the accuracy of baseline capsule networks on MNIST, SVHN, CIFAR-10 and CIFAR-100 while using considerably less memory. The source code is available on https://github.com/moejoe95/MoCapsNet.
Submitted 25 August, 2022; v1 submitted 26 January, 2022;
originally announced January 2022.
-
Constraints on subleading interactions in beta decay Lagrangian
Authors:
Adam Falkowski,
Martín González-Alonso,
Ajdin Palavrić,
Antonio Rodríguez-Sánchez
Abstract:
We discuss the effective field theory (EFT) for nuclear beta decay. The general quark-level EFT describing charged-current interactions between quarks and leptons is matched to the nucleon-level non-relativistic EFT at the O(MeV) momentum scale characteristic for beta transitions. The matching takes into account, for the first time, the effect of all possible beyond-the-Standard-Model interactions at the subleading order in the recoil momentum. We calculate the impact of all the Wilson coefficients of the leading and subleading EFT Lagrangian on the differential decay width in allowed beta transitions. As an example application, we show how the existing experimental data constrain the subleading Wilson coefficients corresponding to pseudoscalar, weak magnetism, and induced tensor interactions. The data display a 3.5 sigma evidence for nucleon weak magnetism, in agreement with the theory prediction based on isospin symmetry.
Submitted 26 April, 2024; v1 submitted 14 December, 2021;
originally announced December 2021.
-
Semileptonic tau decays beyond the Standard Model
Authors:
Vincenzo Cirigliano,
David Díaz-Calderón,
Adam Falkowski,
Martín González-Alonso,
Antonio Rodríguez-Sánchez
Abstract:
Hadronic $τ$ decays are studied as probe of new physics. We determine the dependence of several inclusive and exclusive $τ$ observables on the Wilson coefficients of the low-energy effective theory describing charged-current interactions between light quarks and leptons. The analysis includes both strange and non-strange decay channels. The main result is the likelihood function for the Wilson coefficients in the tau sector, based on the up-to-date experimental measurements and state-of-the-art theoretical techniques. The likelihood can be readily combined with inputs from other low-energy precision observables. We discuss a combination with nuclear beta, baryon, pion, and kaon decay data. In particular, we provide a comprehensive and model-independent description of the new physics hints in the combined dataset, which are known under the name of the Cabibbo anomaly.
Submitted 5 July, 2022; v1 submitted 3 December, 2021;
originally announced December 2021.
-
2-loop short-distance constraints for the $g-2$ HLbL
Authors:
Johan Bijnens,
Nils Hermansson-Truedsson,
Laetitia Laub,
Antonio Rodríguez-Sánchez
Abstract:
The recent experimental measurement of the muon $g-2$ at Fermilab National Laboratory, at a $4.2σ$ tension with the Standard Model prediction, highlights the need for further improvements on the theoretical uncertainties associated to the hadronic sector. In the framework of the operator product expansion in the presence of a background field, the short-distance behaviour of the hadronic light-by-light contribution was recently studied. The leading term in this expansion is given by the massless quark-loop, which is numerically dominant compared to non-perturbative corrections. Here, we present the perturbative QCD correction to the massless quark-loop and estimate its size numerically. In particular, we find that for scales above 1 GeV it is relatively small, in general roughly $-10\%$ the size of the massless quark-loop. The knowledge of these short-distance constraints will in the future allow to reduce the systematic uncertainties in the Standard Model prediction of the hadronic light-by-light contribution to the $g-2$.
Submitted 29 July, 2021;
originally announced July 2021.
-
Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing
Authors:
David Peer,
Sebastian Stabinger,
Stefan Engl,
Antonio Rodriguez-Sanchez
Abstract:
Fine-tuning transformer models after unsupervised pre-training reaches a very high performance on many different natural language processing tasks. Unfortunately, transformers suffer from long inference times which greatly increases costs in production. One possible solution is to use knowledge distillation, which solves this problem by transferring information from large teacher models to smaller student models. Knowledge distillation maintains high performance and reaches high compression rates, nevertheless, the size of the student model is fixed after pre-training and can not be changed individually for a given downstream task and use-case to reach a desired performance/speedup ratio. Another solution to reduce the size of models in a much more fine-grained and computationally cheaper fashion is to prune layers after the pre-training. The price to pay is that the performance of layer-wise pruning algorithms is not on par with state-of-the-art knowledge distillation methods. In this paper, Greedy-layer pruning is introduced to (1) outperform current state-of-the-art for layer-wise pruning, (2) close the performance gap when compared to knowledge distillation, while (3) providing a method to adapt the model size dynamically to reach a desired performance/speedup tradeoff without the need of additional pre-training phases. Our source code is available on https://github.com/deepopinion/greedy-layer-pruning.
Submitted 29 March, 2022; v1 submitted 31 May, 2021;
originally announced May 2021.
-
Training Deep Capsule Networks with Residual Connections
Authors:
Josef Gugglberger,
David Peer,
Antonio Rodriguez-Sanchez
Abstract:
Capsule networks are a type of neural network that have recently gained increased popularity. They consist of groups of neurons, called capsules, which encode properties of objects or object parts. The connections between capsules encrypt part-whole relationships between objects through routing algorithms which route the output of capsules from lower level layers to upper level layers. Capsule networks can reach state-of-the-art results on many challenging computer vision tasks, such as MNIST, Fashion-MNIST, and Small-NORB. However, most capsule network implementations use two to three capsule layers, which limits their applicability as expressivity grows exponentially with depth. One approach to overcome such limitations would be to train deeper network architectures, as it has been done for convolutional neural networks with much increased success. In this paper, we propose a methodology to train deeper capsule networks using residual connections, which is evaluated on four datasets and three different routing algorithms. Our experimental results show that in fact, performance increases when training deeper capsule networks. The source code is available on https://github.com/moejoe95/res-capsnet.
Submitted 15 April, 2021;
originally announced April 2021.
-
Auto-tuning of Deep Neural Networks by Conflicting Layer Removal
Authors:
David Peer,
Sebastian Stabinger,
Antonio Rodriguez-Sanchez
Abstract:
Designing neural network architectures is a challenging task and knowing which specific layers of a model must be adapted to improve the performance is almost a mystery. In this paper, we introduce a novel methodology to identify layers that decrease the test accuracy of trained models. Conflicting layers are detected as early as the beginning of training. In the worst-case scenario, we prove that such a layer could lead to a network that cannot be trained at all. A theoretical analysis is provided on what is the origin of those layers that result in a lower overall network performance, which is complemented by our extensive empirical evaluation. More precisely, we identified those layers that worsen the performance because they would produce what we name conflicting training bundles. We will show that around 60% of the layers of trained residual networks can be completely removed from the architecture with no significant increase in the test-error. We will further present a novel neural-architecture-search (NAS) algorithm that identifies conflicting layers at the beginning of the training. Architectures found by our auto-tuning algorithm achieve competitive accuracy values when compared against more complex state-of-the-art architectures, while drastically reducing memory consumption and inference time for different computer vision tasks. The source code is available on https://github.com/peerdavid/conflicting-bundles
Submitted 7 March, 2021;
originally announced March 2021.
-
Arguments for the Unsuitability of Convolutional Neural Networks for Non-Local Tasks
Authors:
Sebastian Stabinger,
David Peer,
Antonio Rodríguez-Sánchez
Abstract:
Convolutional neural networks have established themselves over the past years as the state of the art method for image classification, and for many datasets, they even surpass humans in categorizing images. Unfortunately, the same architectures perform much worse when they have to compare parts of an image to each other to correctly classify this image.
Until now, no well-formed theoretical argument has been presented to explain this deficiency. In this paper, we will argue that convolutional layers are of little use for such problems, since comparison tasks are global by nature, but convolutional layers are local by design. We will use this insight to reformulate a comparison task into a sorting task and use findings on sorting networks to propose a lower bound for the number of parameters a neural network needs to solve comparison tasks in a generalizable way. We will use this lower bound to argue that attention, as well as iterative/recurrent processing, is needed to prevent a combinatorial explosion.
Submitted 23 February, 2021;
originally announced February 2021.
-
SU(3) analysis of four-quark operators: $K\toππ$ and vacuum matrix elements
Authors:
Antonio Pich,
Antonio Rodríguez-Sánchez
Abstract:
Hadronic matrix elements of local four-quark operators play a central role in non-leptonic kaon decays, while vacuum matrix elements involving the same kind of operators appear in inclusive dispersion relations, such as those relevant in $τ$-decay analyses. Using an $SU(3)_L\otimes SU(3)_R$ decomposition of the operators, we derive generic relations between these matrix elements, extending well-known results that link observables in the two different sectors. Two relevant phenomenological applications are presented. First, we determine the electroweak-penguin contribution to the kaon CP-violating ratio $\varepsilon'/\varepsilon$, using the measured hadronic spectral functions in $τ$ decay. Second, we fit our $SU(3)$ dynamical parameters to the most recent lattice data on $K\toππ$ matrix elements. The comparison of this numerical fit with results from previous analytical approaches provides an interesting anatomy of the $ΔI = \frac{1}{2}$ enhancement, confirming old suggestions about its underlying dynamical origin.
Submitted 16 June, 2021; v1 submitted 18 February, 2021;
originally announced February 2021.
-
The two-loop perturbative correction to the (g-2)$_μ$ HLbL at short distances
Authors:
Johan Bijnens,
Nils Hermansson-Truedsson,
Laetitia Laub,
Antonio Rodríguez-Sánchez
Abstract:
The short-distance behaviour of the hadronic light-by-light (HLbL) contribution to $(g-2)_μ$ has recently been studied by means of an operator product expansion in a background electromagnetic field. The leading term in this expansion has been shown to be given by the massless quark loop, and the non-perturbative corrections are numerically very suppressed. Here, we calculate the perturbative QCD correction to the massless quark loop. The correction is found to be fairly small compared to the quark loop as far as we study energy scales where the perturbative running for the QCD coupling is well-defined, i.e. for scales $μ\gtrsim 1\, \mathrm{GeV}$. This should allow a reduction of the large systematic uncertainty associated with high-multiplicity hadronic states.
Submitted 11 May, 2021; v1 submitted 22 January, 2021;
originally announced January 2021.
-
Short-distance HLbL contributions to the muon g-2
Authors:
Johan Bijnens,
Nils Hermansson-Truedsson,
Laetitia Laub,
Antonio Rodriguez-Sanchez
Abstract:
The current $3.7σ$ discrepancy between the Standard Model prediction and the experimental value of the muon anomalous magnetic moment could be a hint for the existence of new physics. The hadronic light-by-light contribution is one of the pieces requiring improved precision on the theory side, and an important step is to derive short-distance constraints for this quantity containing four electromagnetic currents. Here, we derive such short-distance constraints for three large photon loop virtualities and the external fourth photon in the static limit. The static photon is considered as a background field and we construct a systematic operator product expansion in the presence of this field. We show that the massless quark loop, i.e. the leading term, is numerically dominant over non-perturbative contributions up to next-to-next-to leading order, both those suppressed by quark masses and those that are not.
Submitted 24 November, 2020;
originally announced November 2020.
-
Conflicting Bundles: Adapting Architectures Towards the Improved Training of Deep Neural Networks
Authors:
David Peer,
Sebastian Stabinger,
Antonio Rodriguez-Sanchez
Abstract:
Designing neural network architectures is a challenging task and knowing which specific layers of a model must be adapted to improve the performance is almost a mystery. In this paper, we introduce a novel theory and metric to identify layers that decrease the test accuracy of the trained models; this identification is done as early as the beginning of training. In the worst case, such a layer could lead to a network that cannot be trained at all. More precisely, we identified those layers that worsen the performance because they produce conflicting training bundles as we show in our novel theoretical analysis, complemented by our extensive empirical studies. Based on these findings, a novel algorithm is introduced to remove performance-decreasing layers automatically. Architectures found by this algorithm achieve a competitive accuracy when compared against the state-of-the-art architectures. While keeping such high accuracy, our approach drastically reduces memory consumption and inference time for different computer vision tasks.
Submitted 5 November, 2020;
originally announced November 2020.
-
Short-distance HLbL contributions to the muon anomalous magnetic moment beyond perturbation theory
Authors:
Johan Bijnens,
Nils Hermansson-Truedsson,
Laetitia Laub,
Antonio Rodríguez-Sánchez
Abstract:
The hadronic light-by-light contribution to the muon anomalous magnetic moment depends on an integration over three off-shell momenta squared ($Q_i^2$) of the correlator of four electromagnetic currents and the fourth leg at zero momentum. We derive the short-distance expansion of this correlator in the limit where all three $Q_i^2$ are large and in the Euclidean domain in QCD. This is done via a systematic operator product expansion (OPE) in a background field which we construct. The leading order term in the expansion is the massless quark loop. We also compute the non-perturbative part of the next-to-leading contribution, which is suppressed by quark masses, and the chiral limit part of the next-to-next-to leading contributions to the OPE. We build a renormalisation program for the OPE. The numerical role of the higher-order contributions is estimated and found to be small.
Submitted 27 October, 2020; v1 submitted 31 August, 2020;
originally announced August 2020.
-
The anomalous magnetic moment of the muon in the Standard Model
Authors:
T. Aoyama,
N. Asmussen,
M. Benayoun,
J. Bijnens,
T. Blum,
M. Bruno,
I. Caprini,
C. M. Carloni Calame,
M. Cè,
G. Colangelo,
F. Curciarello,
H. Czyż,
I. Danilkin,
M. Davier,
C. T. H. Davies,
M. Della Morte,
S. I. Eidelman,
A. X. El-Khadra,
A. Gérardin,
D. Giusti,
M. Golterman,
Steven Gottlieb,
V. Gülpers,
F. Hagelstein,
M. Hayakawa
, et al. (107 additional authors not shown)
Abstract:
We review the present status of the Standard Model calculation of the anomalous magnetic moment of the muon. This is performed in a perturbative expansion in the fine-structure constant $α$ and is broken down into pure QED, electroweak, and hadronic contributions. The pure QED contribution is by far the largest and has been evaluated up to and including $\mathcal{O}(α^5)$ with negligible numerical uncertainty. The electroweak contribution is suppressed by $(m_μ/M_W)^2$ and only shows up at the level of the seventh significant digit. It has been evaluated up to two loops and is known to better than one percent. Hadronic contributions are the most difficult to calculate and are responsible for almost all of the theoretical uncertainty. The leading hadronic contribution appears at $\mathcal{O}(α^2)$ and is due to hadronic vacuum polarization, whereas at $\mathcal{O}(α^3)$ the hadronic light-by-light scattering contribution appears. Given the low characteristic scale of this observable, these contributions have to be calculated with nonperturbative methods, in particular, dispersion relations and the lattice approach to QCD. The largest part of this review is dedicated to a detailed account of recent efforts to improve the calculation of these two contributions with either a data-driven, dispersive approach, or a first-principle, lattice-QCD approach. The final result reads $a_μ^\text{SM}=116\,591\,810(43)\times 10^{-11}$ and is smaller than the Brookhaven measurement by 3.7$σ$. The experimental uncertainty will soon be reduced by up to a factor four by the new experiment currently running at Fermilab, and also by the future J-PARC experiment. This and the prospects to further reduce the theoretical uncertainty in the near future (which are also discussed here) make this quantity one of the most promising places to look for evidence of new physics.
Submitted 13 November, 2020; v1 submitted 8 June, 2020;
originally announced June 2020.
-
Evaluating the Progress of Deep Learning for Visual Relational Concepts
Authors:
Sebastian Stabinger,
Peer David,
Justus Piater,
Antonio Rodríguez-Sánchez
Abstract:
Convolutional Neural Networks (CNNs) have become the state of the art method for image classification in the last ten years. Despite the fact that they achieve superhuman classification accuracy on many popular datasets, they often perform much worse on more abstract image classification tasks. We will show that these difficult tasks are linked to relational concepts from cognitive psychology and that despite progress over the last few years, such relational reasoning tasks still remain difficult for current neural network architectures.
We will review deep learning research that is linked to relational concept learning, even if it was not originally presented from this angle. Reviewing the current literature, we will argue that some form of attention will be an important component of future systems to solve relational tasks.
In addition, we will point out the shortcomings of currently used datasets, and we will recommend steps to make future datasets more relevant for testing systems on relational reasoning.
Submitted 13 September, 2021; v1 submitted 29 January, 2020;
originally announced January 2020.
-
Isospin-breaking contributions to $\varepsilon'/\varepsilon$
Authors:
V. Cirigliano,
H. Gisbert,
A. Pich,
A. Rodríguez-Sánchez
Abstract:
We present an updated analysis of isospin-violating corrections to $\varepsilon'/\varepsilon$ in the framework of chiral perturbation theory, taking advantage of the currently improved knowledge on quark masses and nonperturbative parameters. The role of the different ingredients entering into the analysis is carefully assessed. Our final result is $Ω_{\mathrm{eff}}=0.110\,{}^{+0.090}_{-0.088}$.
Submitted 10 December, 2019;
originally announced December 2019.
-
Theoretical status of $\varepsilon'/\varepsilon$
Authors:
V. Cirigliano,
H. Gisbert,
A. Pich,
A. Rodríguez-Sánchez
Abstract:
We briefly overview the historical controversy around Standard Model predictions of $\varepsilon'/\varepsilon$ and clarify the underlying physics. A full update of this important observable is presented, with all known short- and long-distance contributions, including isospin-breaking corrections. The current Standard Model prediction, $\mathrm{Re}(\varepsilon'/\varepsilon) = (14\pm 5)\cdot 10^{-4}$, is in excellent agreement with the experimentally measured value.
Submitted 10 December, 2019;
originally announced December 2019.
-
A complete update of $\varepsilon'/\varepsilon$ in the Standard Model
Authors:
V. Cirigliano,
H. Gisbert,
A. Pich,
A. Rodríguez-Sánchez
Abstract:
The recent release of improved lattice data has revived again the interest on precise theoretical calculations of the direct CP-violation ratio $\varepsilon'/\varepsilon$. We present a complete update of the Standard Model prediction [1,2], including a new re-analysis of isospin-breaking corrections which are of vital importance in the theoretical determination of this observable. The Standard Model prediction, $\mathrm{Re} (ε'/ε) = (14\pm 5)\cdot 10^{-4}$, turns out to be in good agreement with the experimental measurement.
Submitted 30 November, 2019; v1 submitted 15 November, 2019;
originally announced November 2019.
-
Isospin-Violating Contributions to $ε'/ε$
Authors:
V. Cirigliano,
H. Gisbert,
A. Pich,
A. Rodríguez-Sánchez
Abstract:
The known isospin-breaking contributions to the $K\rightarrow ππ$ amplitudes are reanalyzed, taking into account our current understanding of the quark masses and the relevant non-perturbative inputs. We present a complete numerical reappraisal of the direct CP-violating ratio $ε'/ε$, where these corrections play a quite significant role. We obtain the Standard Model prediction $\text{Re}\left(ε'/ε\right)\, =\,\left(14\,\pm\,5\right)\cdot 10^{-4}$, which is in very good agreement with the measured ratio. The uncertainty, which has been estimated conservatively, is dominated by our current ignorance about $1/N_C$-suppressed contributions to some relevant chiral-perturbation-theory low-energy constants.
Submitted 4 November, 2019;
originally announced November 2019.
-
Analytical results for hadronic contributions to the muon $g-2$
Authors:
Johan Bijnens,
Nils Hermansson-Truedsson,
Antonio Rodriguez-Sánchez
Abstract:
This talk discusses two analytical calculations relevant for the Standard Model calculation of the muon $g-2$. The first part is the recent derivation of the quark-loop as the first term in a well-defined operator-product expansion for the short-distance part of the hadronic light-by-light contribution, as well as the calculation of the next term. The second part is the calculation of finite volume effects relevant for lattice QCD calculations of the electromagnetic contribution to the lowest-order hadronic vacuum-polarization contribution and the proof they only start at $1/L^3$.
Submitted 10 October, 2019;
originally announced October 2019.
-
Short-distance constraints for the HLbL contribution to the muon anomalous magnetic moment
Authors:
Johan Bijnens,
Nils Hermansson-Truedsson,
Antonio Rodríguez-Sánchez
Abstract:
We derive short-distance constraints for the hadronic light-by-light contribution (HLbL) to the anomalous magnetic moment of the muon in the kinematic region where the three virtual momenta are all large. We include the external soft photon via an external field leading to a well-defined Operator Product Expansion. We establish that the perturbative quark loop gives the leading contribution in a well-defined expansion. We compute the first nonzero power correction. It is related to the magnetic susceptibility of the QCD vacuum. The results can be used as model-independent short-distance constraints for the very many different approaches to the HLbL contribution. Numerically the power correction is found to be small.
△ Less
Submitted 9 August, 2019;
originally announced August 2019.
-
Limitation of capsule networks
Authors:
David Peer,
Sebastian Stabinger,
Antonio Rodriguez-Sanchez
Abstract:
A recently proposed method in deep learning groups multiple neurons to capsules such that each capsule represents an object or part of an object. Routing algorithms route the output of capsules from lower-level layers to upper-level layers. In this paper, we prove that state-of-the-art routing procedures decrease the expressivity of capsule networks. More precisely, it is shown that EM-routing and routing-by-agreement prevent capsule networks from distinguishing inputs and their negative counterpart. Therefore, only symmetric functions can be expressed by capsule networks, and it can be concluded that they are not universal approximators. We also theoretically motivate and empirically show that this limitation affects the training of deep capsule networks negatively. Therefore, we present an incremental improvement for state-of-the-art routing algorithms that solves the aforementioned limitation and stabilizes the training of capsule networks.
Submitted 19 January, 2021; v1 submitted 21 May, 2019;
originally announced May 2019.
-
Increasing the adversarial robustness and explainability of capsule networks with $γ$-capsules
Authors:
David Peer,
Sebastian Stabinger,
Antonio Rodriguez-Sanchez
Abstract:
In this paper we introduce a new inductive bias for capsule networks and call networks that use this prior $γ$-capsule networks. Our inductive bias that is inspired by TE neurons of the inferior temporal cortex increases the adversarial robustness and the explainability of capsule networks. A theoretical framework with formal definitions of $γ$-capsule networks and metrics for evaluation are also provided. Under our framework we show that common capsule networks do not necessarily make use of this inductive bias. For this reason we introduce a novel routing algorithm and use a different training algorithm to be able to implement $γ$-capsule networks. We then show experimentally that $γ$-capsule networks are indeed more transparent and more robust against adversarial attacks than regular capsule networks.
Submitted 5 December, 2019; v1 submitted 23 December, 2018;
originally announced December 2018.
-
Hadronic tau decays as New Physics probes in the LHC era
Authors:
Vincenzo Cirigliano,
Adam Falkowski,
Martín González-Alonso,
Antonio Rodríguez-Sánchez
Abstract:
We analyze the sensitivity of hadronic tau decays to non-standard interactions within the model-independent framework of the Standard Model Effective Field Theory (SMEFT). Both exclusive and inclusive decays are studied, using the latest lattice data and QCD dispersion relations. We show that there are enough theoretically clean channels to disentangle all the effective couplings contributing to these decays, with the $τ\to ππν_τ$ channel representing an unexpectedly powerful New Physics probe. We find that the ratios of non-standard couplings to the Fermi constant are bound at the sub-percent level. These bounds are complementary to the ones from electroweak precision observables and $p p \to τν_τ$ measurements at the LHC. The combination of tau decay and LHC data puts tighter constraints on lepton universality violation in the gauge boson-lepton vertex corrections.
Submitted 12 June, 2019; v1 submitted 4 September, 2018;
originally announced September 2018.
-
Guided Labeling using Convolutional Neural Networks
Authors:
Sebastian Stabinger,
Antonio Rodriguez-Sanchez
Abstract:
Over the last couple of years, deep learning and especially convolutional neural networks have become one of the work horses of computer vision. One limiting factor for the applicability of supervised deep learning to more areas is the need for large, manually labeled datasets. In this paper we propose an easy to implement method we call guided labeling, which automatically determines which samples from an unlabeled dataset should be labeled. We show that using this procedure, the amount of samples that need to be labeled is reduced considerably in comparison to labeling images arbitrarily.
Submitted 6 December, 2017;
originally announced December 2017.
-
Evaluation of Deep Learning on an Abstract Image Classification Dataset
Authors:
Sebastian Stabinger,
Antonio Rodriguez-Sanchez
Abstract:
Convolutional Neural Networks have become state of the art methods for image classification over the last couple of years. By now they perform better than human subjects on many of the image classification datasets. Most of these datasets are based on the notion of concrete classes (i.e. images are classified by the type of object in the image). In this paper we present a novel image classification dataset, using abstract classes, which should be easy to solve for humans, but variations of it are challenging for CNNs. The classification performance of popular CNN architectures is evaluated on this dataset and variations of the dataset that might be interesting for further research are identified.
Submitted 25 August, 2017;
originally announced August 2017.
-
25 years of CNNs: Can we compare to human abstraction capabilities?
Authors:
Sebastian Stabinger,
Antonio Rodríguez-Sánchez,
Justus Piater
Abstract:
We try to determine the progress made by convolutional neural networks over the past 25 years in classifying images into abstract classes. For this purpose we compare the performance of LeNet to that of GoogLeNet at classifying randomly generated images which are differentiated by an abstract property (e.g., one class contains two objects of the same size, the other class two objects of different sizes). Our results show that there is still work to do in order to solve vision problems humans are able to solve without much difficulty.
Submitted 28 July, 2016;
originally announced July 2016.
-
Learning Abstract Classes using Deep Learning
Authors:
Sebastian Stabinger,
Antonio Rodriguez-Sanchez,
Justus Piater
Abstract:
Humans are generally good at learning abstract concepts about objects and scenes (e.g. spatial orientation, relative sizes, etc.). Over the last years convolutional neural networks have achieved almost human performance in recognizing concrete classes (i.e. specific object categories). This paper tests the performance of a current CNN (GoogLeNet) on the task of differentiating between abstract classes which are trivially differentiable for humans. We trained and tested the CNN on the two abstract classes of horizontal and vertical orientation and determined how well the network is able to transfer the learned classes to other, previously unseen objects.
Submitted 17 June, 2016;
originally announced June 2016.
-
Determination of the QCD Coupling from ALEPH $τ$ Decay Data
Authors:
Antonio Pich,
Antonio Rodríguez-Sánchez
Abstract:
We present a comprehensive study of the determination of the strong coupling from $τ$ decay, using the most recent release of the experimental ALEPH data. We critically review all theoretical strategies used in previous works and put forward various novel approaches which allow us to study complementary aspects of the problem. We investigate the advantages and disadvantages of the different methods, trying to uncover their potential hidden weaknesses and test the stability of the obtained results under slight variations of the assumed inputs. We perform several determinations, using different methodologies, and find a very consistent set of results. All determinations are in excellent agreement, and allow us to extract a very reliable value for $α_s(m_τ^2)$. The main uncertainty originates in the pure perturbative error from unknown higher orders. Taking into account the systematic differences between the results obtained with the CIPT and FOPT prescriptions, we find $α_{s}^{(n_f=3)}(m_τ^2) = 0.328 \pm 0.013$, which implies $α_{s}^{(n_f=5)}(M_Z^{2}) = 0.1197\pm 0.0015$.
Submitted 7 September, 2016; v1 submitted 22 May, 2016;
originally announced May 2016.
-
Updated determination of chiral couplings and vacuum condensates from hadronic tau decay data
Authors:
A. Rodríguez-Sánchez,
M. González-Alonso,
A. Pich
Abstract:
We analyze the lowest spectral moments of the left-right two-point correlation function, using all known short-distance constraints and the recently updated ALEPH V-A spectral function from tau decays. This information is used to determine the low-energy couplings L10 and C87 of chiral perturbation theory and the lowest-dimensional contributions to the Operator Product Expansion of the left-right correlator. A detailed statistical analysis is implemented to assess the theoretical uncertainties, including violations of quark-hadron duality.
Submitted 19 February, 2016;
originally announced February 2016.
-
ChPT parameters from tau-decay data
Authors:
A. Rodríguez-Sánchez,
M. González-Alonso,
A. Pich
Abstract:
Using the updated ALEPH V-A spectral function from tau decays, we determine the lowest spectral moments of the left-right correlator and extract dynamical information on order parameters of the QCD chiral symmetry breaking. Uncertainties associated with violations of quark-hadron duality are estimated from the data, imposing all known short-distance constraints on a resonance-based parametrization. Employing proper pinched weight functions, we obtain an accurate determination of the effective chiral couplings L10 and C87 and the dimension-six and -eight contributions in the Operator Product Expansion.
Submitted 28 September, 2015;
originally announced September 2015.
-
Radiative corrections to $M_h$ from three generations of Majorana neutrinos and sneutrinos
Authors:
S. Heinemeyer,
J. Hernandez-Garcia,
M. J. Herrero,
X. Marcano,
A. M. Rodriguez-Sanchez
Abstract:
In this work we study the radiative corrections to the mass of the lightest Higgs boson of the MSSM from three generations of Majorana neutrinos and sneutrinos. The spectrum of the MSSM is augmented by three right-handed neutrinos and their supersymmetric partners. A seesaw mechanism of type I is used to generate the physical neutrino masses and oscillations, which we require to be in agreement with present neutrino data. We present a full one-loop computation of these Higgs mass corrections, and analyze in full detail their numerical size in terms of both the MSSM and the new (s)neutrino parameters. A critical discussion of the different possible renormalization schemes and their implications is included.
Submitted 31 July, 2015; v1 submitted 3 July, 2014;
originally announced July 2014.
-
Proceedings of the 37th Annual Workshop of the Austrian Association for Pattern Recognition (ÖAGM/AAPR), 2013
Authors:
Justus Piater,
Antonio Rodríguez-Sánchez
Abstract:
This volume represents the proceedings of the 37th Annual Workshop of the Austrian Association for Pattern Recognition (ÖAGM/AAPR), held May 23-24, 2013, in Innsbruck, Austria.
Submitted 28 May, 2013; v1 submitted 6 April, 2013;
originally announced April 2013.
-
Mh in the MSSM-seesaw scenario with ILC precision
Authors:
S. Heinemeyer,
M. J. Herrero,
S. Penaranda,
A. M. Rodriguez-Sanchez
Abstract:
We review the computation of the one-loop radiative corrections from the neutrino/sneutrino sector to the lightest Higgs boson mass, Mh, within the context of the so-called MSSM-seesaw scenario. This model introduces right-handed neutrinos and their supersymmetric partners, the sneutrinos, including Majorana mass terms. We find negative and sizeable corrections to Mh, up to -5 GeV for a large Majorana scale, 10^{13}-10^{15} GeV, and for the lightest neutrino mass in a range 0.1-1 eV. The corrections to Mh are substantially larger than the anticipated ILC precision for large regions of the MSSM-seesaw parameter space.
Submitted 30 January, 2012;
originally announced January 2012.
-
Heavy Majorana neutrino effects on MSSM-Mh
Authors:
M. J. Herrero,
S. Heinemeyer,
S. Penaranda,
A. M. Rodriguez-Sanchez
Abstract:
We study the effects of heavy Majorana neutrinos on the Higgs sector of the MSSM via radiative corrections. We work within the SUSY context where the MSSM particle content is enlarged with right-handed neutrinos and their corresponding SUSY partners, the sneutrinos, and where compatibility with neutrino data is required. We compute the one-loop corrections to the mass of the lightest MSSM CP-even neutral Higgs boson from Majorana neutrinos and their SUSY partners and assume a seesaw mechanism of type I for neutrino mass generation. A negative and sizeable Higgs mass correction of up to -5 GeV is found for a heavy Majorana mass of up to 10^{15} GeV. This negative correction can grow up to several tens of GeV if the soft SUSY breaking mass associated with their sneutrino partners is similarly heavy to the Majorana mass.
Submitted 23 January, 2012;
originally announced January 2012.
-
M_h in MSSM with Heavy Majorana Neutrinos
Authors:
S. Heinemeyer,
M. J. Herrero,
S. Penaranda,
A. M. Rodriguez-Sanchez
Abstract:
We review the main results of the one-loop radiative corrections from the neutrino/sneutrino sector to the lightest Higgs boson mass, M_h, within the context of the so-called MSSM-seesaw scenario where right-handed neutrinos and their supersymmetric partners are included in order to explain neutrino masses. For simplicity, we have restricted ourselves to the one-generation case. We find sizable corrections to M_h, which are negative in the region where the Majorana scale is large (10^{13} - 10^{15} GeV) and the lightest neutrino mass is within a range inspired by data (0.1 - 1 eV). For some regions of the MSSM-seesaw parameter space, the corrections to M_h are substantially larger than the anticipated LHC precision.
Submitted 1 July, 2011;
originally announced July 2011.
-
Higgs Boson Masses in the MSSM with Heavy Majorana Neutrinos
Authors:
S. Heinemeyer,
M. J. Herrero,
S. Penaranda,
A. M. Rodriguez-Sanchez
Abstract:
We present a full diagrammatic computation of the one-loop corrections from the neutrino/sneutrino sector to the renormalized neutral CP-even Higgs boson self-energies and the lightest Higgs boson mass, Mh, within the context of the so-called MSSM-seesaw scenario. This consists of the Minimal Supersymmetric Standard Model with the addition of massive right-handed Majorana neutrinos and their supersymmetric partners, and where the seesaw mechanism is used for the lightest neutrino mass generation. We explore the dependence on all the parameters involved, with particular emphasis on the role played by the heavy Majorana scale. We restrict ourselves to the case of one generation of neutrinos/sneutrinos. For the numerical part of the study, we consider a very wide range of values for all the parameters involved. We find sizeable corrections to Mh, which are negative in the region where the Majorana scale is large (10^{13}-10^{15} GeV) and the lightest neutrino mass is within a range inspired by data (0.1-1 eV). For some regions of the MSSM-seesaw parameter space, the corrections to Mh are substantially larger than the anticipated Large Hadron Collider precision.
Submitted 26 May, 2011; v1 submitted 30 July, 2010;
originally announced July 2010.
-
Sensitivity to the Higgs sector of SUSY-seesaw models via LFV tau decays
Authors:
M. Herrero,
J. Portoles,
A. Rodriguez-Sanchez
Abstract:
Here we study and compare the sensitivity to the Higgs sector of the SUSY-seesaw models via the LFV tau decays: tau-> 3 mu, tau->K^{+}K^{-}, tau->mu eta and tau-> mu f_{0}. We emphasize that, at present, the two latter channels are the most efficient ones for indirectly testing the Higgs particles.
Submitted 3 September, 2009;
originally announced September 2009.