Inestability presented in the estimating of the Nelson-Siegel-Svensson model

Abstract: The literature shows the possible existence of a problem called collinearity in both Nelson-Siegel and Nelson-Siegel-Svensson models due to the relationship between the slope and curvature components. The presence of this problem and the estimation of both models by Ordinary Least Squares would lead to coefficients estimates that may be unstable among other consequences. However, these estimates are used to make monetary policy decisions. For this reason, it is important to try mitigating this collinearity problem. Consequently, some authors propose traditional procedures for the treatment of collinearity such as: non-linear optimisation, to fix the shape parameter or ridge regression. Nevertheless, all these processes have their disadvantages. Alternatively, a new method with good properties called raise regression is proposed in this paper. Finally, the methodologies are illustrated with an empirical comparison on Euribor Overnight Index Swap and Euribor Interest Rates Swap data between 2011 and 2021.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Working paper with 11 pages, 6 tables, 4 figures

arXiv:2311.03600 [pdf, other]

Scalable and Efficient Continual Learning from Demonstration via a Hypernetwork-generated Stable Dynamics Model

Authors: Sayantan Auddy, Jakob Hollenstein, Matteo Saveriano, Antonio Rodríguez-Sánchez, Justus Piater

Abstract: Learning from demonstration (LfD) provides an efficient way to train robots. The learned motions should be convergent and stable, but to be truly effective in the real world, LfD-capable robots should also be able to remember multiple motion skills. Existing stable-LfD approaches lack the capability of multi-skill retention. Although recent work on continual-LfD has shown that hypernetwork-generated neural ordinary differential equation solvers (NODE) can learn multiple LfD tasks sequentially, this approach lacks stability guarantees. We propose an approach for stable continual-LfD in which a hypernetwork generates two networks: a trajectory learning dynamics model, and a trajectory stabilizing Lyapunov function. The introduction of stability generates convergent trajectories, but more importantly it also greatly improves continual learning performance, especially in the size-efficient chunked hypernetworks. With our approach, a single hypernetwork learns stable trajectories of the robot's end-effector position and orientation simultaneously, and does so continually for a sequence of real-world LfD tasks without retraining on past demonstrations. We also propose stochastic hypernetwork regularization with a single randomly sampled regularization term, which reduces the cumulative training time cost for N tasks from O$(N^2)$ to O$(N)$ without any loss in performance on real-world tasks.
We empirically evaluate our approach on the popular LASA dataset, on high-dimensional extensions of LASA (including up to 32 dimensions) to assess scalability, and on a novel extended robotic task dataset (RoboTasks9) to assess real-world performance. In trajectory error metrics, stability metrics and continual learning metrics our approach performs favorably, compared to other baselines. Our open-source code and datasets are available at https://github.com/sayantanauddy/clfd-snode.

Submitted 9 January, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

Comments: This paper is currently under peer review

arXiv:2302.01359 [pdf, other]

doi 10.1007/JHEP04(2023)067

The Euclidean Adler Function and its Interplay with $Δα^{\mathrm{had}}_{\mathrm{QED}}$ and $α_s$

Authors: M. Davier, D. Díaz-Calderón, B. Malaescu, A. Pich, A. Rodríguez-Sánchez, Z. Zhang

Abstract: Three different approaches to precisely describe the Adler function in the Euclidean regime at around $2\, \mathrm{GeVs}$ are available: dispersion relations based on the hadronic production data in $e^+e^-$ annihilation, lattice simulations and perturbative QCD (pQCD). We make a comprehensive study of the perturbative approach, supplemented with the leading power corrections in the operator product expansion. All known contributions are included, with a careful assessment of uncertainties. The pQCD predictions are compared with the Adler functions extracted from $Δα^{\mathrm{had}}_{\mathrm{QED}}(Q^2)$, using both the DHMZ compilation of $e^+e^-$ data and published lattice results. Taking as input the FLAG value of $α_s$, the pQCD Adler function turns out to be in good agreement with the lattice data, while the dispersive results lie systematically below them. Finally, we explore the sensitivity to $α_s$ of the direct comparison between the data-driven, lattice and QCD Euclidean Adler functions. The precision with which the renormalisation group equation can be tested is also evaluated.

Submitted 26 April, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

Comments: 56 pages, 22 figures, 14 tables. Published version

Journal ref: JHEP 04 (2023) 067

arXiv:2211.17183 [pdf, other]

doi 10.1007/JHEP02(2023)167

Constraints on the hadronic light-by-light in the Melnikov-Vainshtein regime

Authors: Johan Bijnens, Nils Hermansson-Truedsson, Antonio Rodríguez-Sánchez

Abstract: The muon anomalous magnetic moment continues to attract attention due to the possible tension between the experimentally measured value and the theoretical Standard Model prediction. With the aim to reduce the uncertainty on the hadronic light-by-light contribution to the magnetic moment, we derive short-distance constraints in the Melnikov-Vainshtein regime which are useful for data-driven determinations. In this kinematical region, two of the four electromagnetic currents are close in the four-point function defining the hadronic light-by-light tensor. To obtain the constraints, we develop a systematic operator product expansion of the tensor in question to next-to-leading order in the expansion in operators. We evaluate the leading in $α_s$ contributions and derive constraints for the next-to-leading operators that are also valid nonperturbatively.

Submitted 17 February, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

Comments: 35 pages, 4 figures. Published version

Report number: LU-TP 22-63

arXiv:2211.05200 [pdf, other]

Affordance detection with Dynamic-Tree Capsule Networks

Authors: Antonio Rodríguez-Sánchez, Simon Haller-Seeber, David Peer, Chris Engelhardt, Jakob Mittelberger, Matteo Saveriano

Abstract: Affordance detection from visual input is a fundamental step in autonomous robotic manipulation. Existing solutions to the problem of affordance detection rely on convolutional neural networks. However, these networks do not consider the spatial arrangement of the input data and miss parts-to-whole relationships. Therefore, they fall short when confronted with novel, previously unseen object instances or new viewpoints. One solution to overcome such limitations can be to resort to capsule networks. In this paper, we introduce the first affordance detection network based on dynamic tree-structured capsules for sparse 3D point clouds. We show that our capsule-based network outperforms current state-of-the-art models on viewpoint invariance and parts-segmentation of new object instances through a novel dataset we only used for evaluation and it is publicly available from github.com/gipfelen/DTCG-Net. In the experimental evaluation we will show that our algorithm is superior to current affordance detection methods when faced with grasping previously unseen objects thanks to our Capsule Network enforcing a parts-to-whole representation.

Submitted 9 November, 2022; originally announced November 2022.

Comments: IEEE-RAS International Conference on Humanoid Robots (Humanoids 2022)

arXiv:2211.04068 [pdf, other]

doi 10.1051/epjconf/202227406010

Short-distance constraints on the hadronic light-by-light

Authors: Johan Bijnens, Nils Hermansson-Truedsson, Antonio Rodríguez-Sánchez

Abstract: The muon anomalous magnetic moment continues to attract interest due to the potential tension between experimental measurement [1,2] and the Standard Model prediction [3]. The hadronic light-by-light contribution to the magnetic moment is one of the two diagrammatic topologies currently saturating the theoretical uncertainty. With the aim of improving precision on the hadronic light-by-light in a data-driven approach founded on dispersion theory [4,5], we derive various short-distance constraints of the underlying correlation function of four electromagnetic currents. Here, we present our previous progress in the purely short-distance regime and current efforts in the so-called Melnikov-Vainshtein limit.

Submitted 8 November, 2022; originally announced November 2022.

Comments: Proceedings for "The XVth Quark Confinement and the Hadron Spectrum conference", Stavanger, Norway, August 2022

arXiv:2208.01134 [pdf, other]

Improving the Trainability of Deep Neural Networks through Layerwise Batch-Entropy Regularization

Authors: David Peer, Bart Keulen, Sebastian Stabinger, Justus Piater, Antonio Rodríguez-Sánchez

Abstract: Training deep neural networks is a very demanding task, especially challenging is how to adapt architectures to improve the performance of trained models. We can find that sometimes, shallow networks generalize better than deep networks, and the addition of more layers results in higher training and test errors. The deep residual learning framework addresses this degradation problem by adding skip connections to several neural network layers. It would at first seem counter-intuitive that such skip connections are needed to train deep networks successfully as the expressivity of a network would grow exponentially with depth. In this paper, we first analyze the flow of information through neural networks. We introduce and evaluate the batch-entropy which quantifies the flow of information through each layer of a neural network. We prove empirically and theoretically that a positive batch-entropy is required for gradient descent-based training approaches to optimize a given loss function successfully. Based on those insights, we introduce batch-entropy regularization to enable gradient descent-based training algorithms to optimize the flow of information through each hidden layer individually. With batch-entropy regularization, gradient descent optimizers can transform untrainable networks into trainable networks.
We show empirically that we can therefore train a "vanilla" fully connected network and convolutional neural network -- no skip connections, batch normalization, dropout, or any other architectural tweak -- with 500 layers by simply adding the batch-entropy regularization term to the loss function. The effect of batch-entropy regularization is not only evaluated on vanilla neural networks, but also on residual networks, autoencoders, and also transformer models over a wide range of computer vision as well as natural language processing tasks.

Submitted 1 August, 2022; originally announced August 2022.

Comments: Accepted at TMLR (07/2022): https://openreview.net/forum?id=LJohl5DnZf

arXiv:2207.02161 [pdf, other]

doi 10.1140/epjc/s10052-022-11085-3

On the sensitivity of the D parameter to new physics

Authors: Adam Falkowski, Antonio Rodríguez-Sánchez

Abstract: Measurements of angular correlations in nuclear beta decay are important tests of the Standard Model (SM). Among those, the so-called D correlation parameter occupies a particular place because it is odd under time reversal, and because the experimental sensitivity is at the $10^{-4}$ level, with plans of further improvement in the near future. Using effective field theory (EFT) techniques, we reassess its potential to discover or constrain new physics beyond the SM. We provide a comprehensive classification of CP-violating EFT scenarios which generate a shift of the D parameter away from the SM prediction. We show that, in each scenario, a shift larger than $10^{-5}$ is in serious tension with the existing experimental data, where bounds coming from electric dipole moments and LHC observables play a decisive role. The tension can only be avoided by fine tuning of the parameters in the UV completion of the EFT. We illustrate this using examples of leptoquark UV completions. Finally, we comment on the possibility to probe CP-conserving new physics via the D parameter.

Submitted 3 August, 2022; v1 submitted 5 July, 2022; originally announced July 2022.

Comments: 33 pages; V2: updated reference to HighPT

arXiv:2205.07587 [pdf, other]

doi 10.1007/JHEP07(2022)145

Violations of Quark-Hadron Duality in Low-Energy Determinations of $α_s$

Authors: Antonio Pich, Antonio Rodríguez-Sánchez

Abstract: Using the spectral functions measured in $τ$ decays, we investigate the actual numerical impact of duality violations on the extraction of the strong coupling. These effects are tiny in the standard $α_s(m_τ^2)$ determinations from integrated distributions of the hadronic spectrum with pinched weights, or from the total $τ$ hadronic width. The pinched-weight factors suppress very efficiently the violations of duality, making their numerical effects negligible in comparison with the larger perturbative uncertainties. However, combined fits of $α_s$ and duality-violation parameters, performed with non-protected weights, are subject to large systematic errors associated with the assumed modelling of duality-violation effects. These uncertainties have not been taken into account in the published analyses, based on specific models of quark-hadron duality.

Submitted 11 July, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

Comments: 39 pages, 8 figures. References added. Published version

arXiv:2203.15810 [pdf, other]

Prospects for precise predictions of $a_μ$ in the Standard Model

Authors: G. Colangelo, M. Davier, A. X. El-Khadra, M. Hoferichter, C. Lehner, L. Lellouch, T. Mibe, B. L. Roberts, T. Teubner, H. Wittig, B. Ananthanarayan, A. Bashir, J. Bijnens, T. Blum, P. Boyle, N. Bray-Ali, I. Caprini, C. M. Carloni Calame, O. Catà, M. Cè, J. Charles, N. H. Christ, F. Curciarello, I. Danilkin, D. Das, et al. (57 additional authors not shown)

Abstract: We discuss the prospects for improving the precision on the hadronic corrections to the anomalous magnetic moment of the muon, and the plans of the Muon $g-2$ Theory Initiative to update the Standard Model prediction.

Submitted 29 March, 2022; originally announced March 2022.

Comments: Contribution to the US Community Study on the Future of Particle Physics (Snowmass 2021)

Report number: FERMILAB-CONF-22-236-T, LTH 1303, MITP-22-030

arXiv:2203.08271 [pdf, other]

The strong coupling constant: State of the art and the decade ahead

Authors: D. d'Enterria, S. Kluth, G. Zanderighi, C. Ayala, M. A. Benitez-Rathgeb, J. Bluemlein, D. Boito, N. Brambilla, D. Britzger, S. Camarda, A. M. Cooper-Sarkar, T. Cridge, G. Cvetic, M. Dalla Brida, A. Deur, F. Giuli, M. Golterman, A. H. Hoang, J. Huston, M. Jamin, A. V. Kotikov, V. G. Krivokhizhin, A. S. Kronfeld, V. Leino, K. Lipka, et al. (33 additional authors not shown)

Abstract: This document provides a comprehensive summary of the state-of-the-art, challenges, and prospects in the experimental and theoretical study of the strong coupling $α_s$. The current status of the seven methods presently used to determine $α_s$ based on: (i) lattice QCD, (ii) hadronic $τ$ decays, (iii) deep-inelastic scattering and parton distribution functions fits, (iv) electroweak boson decays, hadronic final-states in (v) e+e-, (vi) e-p, and (vii) p-p collisions, and (viii) quarkonia decays and masses, are reviewed. Novel $α_s$ determinations are discussed, as well as the averaging method used to obtain the PDG world-average value at the reference Z boson mass scale, $α_s(m^2_Z)$. Each of the extraction methods proposed provides a "wish list" of experimental and theoretical developments required in order to achieve an ideal permille precision on $α_s(m^2_Z)$ within the next 10 years.

Submitted 15 March, 2022; originally announced March 2022.

Comments: 130 pages, 82 figures. White paper submitted to the Energy Frontier "Proceedings of the US Community Study on the Future of Particle Physics" (Snowmass 2021)

arXiv:2202.06843 [pdf, other]

Continual Learning from Demonstration of Robotics Skills

Authors: Sayantan Auddy, Jakob Hollenstein, Matteo Saveriano, Antonio Rodríguez-Sánchez, Justus Piater

Abstract: Methods for teaching motion skills to robots focus on training for a single skill at a time. Robots capable of learning from demonstration can considerably benefit from the added ability to learn new movement skills without forgetting what was learned in the past. To this end, we propose an approach for continual learning from demonstration using hypernetworks and neural ordinary differential equation solvers. We empirically demonstrate the effectiveness of this approach in remembering long sequences of trajectory learning tasks without the need to store any data from past demonstrations. Our results show that hypernetworks outperform other state-of-the-art continual learning approaches for learning from demonstration. In our experiments, we use the popular LASA benchmark, and two new datasets of kinesthetic demonstrations collected with a real robot that we introduce in this paper called the HelloWorld and RoboTasks datasets. We evaluate our approach on a physical robot and demonstrate its effectiveness in learning real-world robotic tasks involving changing positions as well as orientations. We report both trajectory error metrics and continual learning metrics, and we propose two new continual learning metrics. Our code, along with the newly collected datasets, is available at https://github.com/sayantanauddy/clfd.

Submitted 12 April, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

Comments: To appear in Robotics and Autonomous Systems

arXiv:2201.11091 [pdf, ps, other]

Momentum Capsule Networks

Authors: Josef Gugglberger, David Peer, Antonio Rodríguez-Sánchez

Abstract: Capsule networks are a class of neural networks that achieved promising results on many computer vision tasks. However, baseline capsule networks have failed to reach state-of-the-art results on more complex datasets due to the high computation and memory requirements. We tackle this problem by proposing a new network architecture, called Momentum Capsule Network (MoCapsNet). MoCapsNets are inspired by Momentum ResNets, a type of network that applies reversible residual building blocks. Reversible networks allow for recalculating activations of the forward pass in the backpropagation algorithm, so those memory requirements can be drastically reduced. In this paper, we provide a framework on how invertible residual building blocks can be applied to capsule networks. We will show that MoCapsNet beats the accuracy of baseline capsule networks on MNIST, SVHN, CIFAR-10 and CIFAR-100 while using considerably less memory. The source code is available on https://github.com/moejoe95/MoCapsNet.

Submitted 25 August, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

arXiv:2112.07688 [pdf, other]

doi 10.1007/JHEP02(2024)091

Constraints on subleading interactions in beta decay Lagrangian

Authors: Adam Falkowski, Martín González-Alonso, Ajdin Palavrić, Antonio Rodríguez-Sánchez

Abstract: We discuss the effective field theory (EFT) for nuclear beta decay. The general quark-level EFT describing charged-current interactions between quarks and leptons is matched to the nucleon-level non-relativistic EFT at the O(MeV) momentum scale characteristic for beta transitions. The matching takes into account, for the first time, the effect of all possible beyond-the-Standard-Model interactions at the subleading order in the recoil momentum. We calculate the impact of all the Wilson coefficients of the leading and subleading EFT Lagrangian on the differential decay width in allowed beta transitions. As an example application, we show how the existing experimental data constrain the subleading Wilson coefficients corresponding to pseudoscalar, weak magnetism, and induced tensor interactions. The data display a 3.5 sigma evidence for nucleon weak magnetism, in agreement with the theory prediction based on isospin symmetry.

Submitted 26 April, 2024; v1 submitted 14 December, 2021; originally announced December 2021.

Comments: 55 pages; V2: corrected tensor matrix element, discussion of matrix elements expanded in Appendix A; V3: Discussion of many-body currents added in Appendix D. Published version

Journal ref: JHEP 02 (2024) 091

arXiv:2112.02087 [pdf, other]

doi 10.1007/JHEP04(2022)152

Semileptonic tau decays beyond the Standard Model

Authors: Vincenzo Cirigliano, David Díaz-Calderón, Adam Falkowski, Martín González-Alonso, Antonio Rodríguez-Sánchez

Abstract: Hadronic $τ$ decays are studied as probe of new physics. We determine the dependence of several inclusive and exclusive $τ$ observables on the Wilson coefficients of the low-energy effective theory describing charged-current interactions between light quarks and leptons. The analysis includes both strange and non-strange decay channels. The main result is the likelihood function for the Wilson coefficients in the tau sector, based on the up-to-date experimental measurements and state-of-the-art theoretical techniques. The likelihood can be readily combined with inputs from other low-energy precision observables. We discuss a combination with nuclear beta, baryon, pion, and kaon decay data. In particular, we provide a comprehensive and model-independent description of the new physics hints in the combined dataset, which are known under the name of the Cabibbo anomaly.

Submitted 5 July, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

Comments: 58 pages; V2: Table 1 added, final published version

arXiv:2107.13886 [pdf, other]

2-loop short-distance constraints for the $g-2$ HLbL

Authors: Johan Bijnens, Nils Hermansson-Truedsson, Laetitia Laub, Antonio Rodríguez-Sánchez

Abstract: The recent experimental measurement of the muon $g-2$ at Fermilab National Laboratory, at a $4.2σ$ tension with the Standard Model prediction, highlights the need for further improvements on the theoretical uncertainties associated to the hadronic sector. In the framework of the operator product expansion in the presence of a background field, the short-distance behaviour of the hadronic light-by-light contribution was recently studied. The leading term in this expansion is given by the massless quark-loop, which is numerically dominant compared to non-perturbative corrections. Here, we present the perturbative QCD correction to the massless quark-loop and estimate its size numerically. In particular, we find that for scales above 1 GeV it is relatively small, in general roughly $-10\%$ the size of the massless quark-loop. The knowledge of these short-distance constraints will in the future allow to reduce the systematic uncertainties in the Standard Model prediction of the hadronic light-by-light contribution to the $g-2$.

Submitted 29 July, 2021; originally announced July 2021.

Comments: Proceedings for QCD 21 - 24th High-Energy Physics International Conference in Quantum Chromodynamics

arXiv:2105.14839 [pdf, other]

doi 10.1016/j.patrec.2022.03.023

Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing

Authors: David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-Sanchez

Abstract: Fine-tuning transformer models after unsupervised pre-training reaches a very high performance on many different natural language processing tasks. Unfortunately, transformers suffer from long inference times which greatly increases costs in production. One possible solution is to use knowledge distillation, which solves this problem by transferring information from large teacher models to smaller student models. Knowledge distillation maintains high performance and reaches high compression rates, nevertheless, the size of the student model is fixed after pre-training and can not be changed individually for a given downstream task and use-case to reach a desired performance/speedup ratio. Another solution to reduce the size of models in a much more fine-grained and computationally cheaper fashion is to prune layers after the pre-training. The price to pay is that the performance of layer-wise pruning algorithms is not on par with state-of-the-art knowledge distillation methods. In this paper, Greedy-layer pruning is introduced to (1) outperform current state-of-the-art for layer-wise pruning, (2) close the performance gap when compared to knowledge distillation, while (3) providing a method to adapt the model size dynamically to reach a desired performance/speedup tradeoff without the need of additional pre-training phases. Our source code is available on https://github.com/deepopinion/greedy-layer-pruning.

Submitted 29 March, 2022; v1 submitted 31 May, 2021; originally announced May 2021.

Comments: Accepted at Pattern Recognition Letters

arXiv:2104.07393 [pdf, ps, other]

Training Deep Capsule Networks with Residual Connections

Authors: Josef Gugglberger, David Peer, Antonio Rodriguez-Sanchez

Abstract: Capsule networks are a type of neural network that have recently gained increased popularity. They consist of groups of neurons, called capsules, which encode properties of objects or object parts. The connections between capsules encrypt part-whole relationships between objects through routing algorithms which route the output of capsules from lower level layers to upper level layers. Capsule networks can reach state-of-the-art results on many challenging computer vision tasks, such as MNIST, Fashion-MNIST, and Small-NORB. However, most capsule network implementations use two to three capsule layers, which limits their applicability as expressivity grows exponentially with depth. One approach to overcome such limitations would be to train deeper network architectures, as it has been done for convolutional neural networks with much increased success. In this paper, we propose a methodology to train deeper capsule networks using residual connections, which is evaluated on four datasets and three different routing algorithms. Our experimental results show that in fact, performance increases when training deeper capsule networks. The source code is available on https://github.com/moejoe95/res-capsnet.

Submitted 15 April, 2021; originally announced April 2021.

Comments: 12 pages

arXiv:2103.04331 [pdf, other]

Auto-tuning of Deep Neural Networks by Conflicting Layer Removal

Authors: David Peer, Sebastian Stabinger, Antonio Rodriguez-Sanchez

Abstract: Designing neural network architectures is a challenging task and knowing which specific layers of a model must be adapted to improve the performance is almost a mystery. In this paper, we introduce a novel methodology to identify layers that decrease the test accuracy of trained models. Conflicting layers are detected as early as the beginning of training. In the worst-case scenario, we prove that such a layer could lead to a network that cannot be trained at all. A theoretical analysis is provided on what is the origin of those layers that result in a lower overall network performance, which is complemented by our extensive empirical evaluation. More precisely, we identified those layers that worsen the performance because they would produce what we name conflicting training bundles. We will show that around 60% of the layers of trained residual networks can be completely removed from the architecture with no significant increase in the test-error. We will further present a novel neural-architecture-search (NAS) algorithm that identifies conflicting layers at the beginning of the training. Architectures found by our auto-tuning algorithm achieve competitive accuracy values when compared against more complex state-of-the-art architectures, while drastically reducing memory consumption and inference time for different computer vision tasks. The source code is available on https://github.com/peerdavid/conflicting-bundles

Submitted 7 March, 2021; originally announced March 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2011.02956

arXiv:2102.11944 [pdf, other]

Arguments for the Unsuitability of Convolutional Neural Networks for Non-Local Tasks

Authors: Sebastian Stabinger, David Peer, Antonio Rodríguez-Sánchez

Abstract: Convolutional neural networks have established themselves over the past years as the state of the art method for image classification, and for many datasets, they even surpass humans in categorizing images. Unfortunately, the same architectures perform much worse when they have to compare parts of an image to each other to correctly classify this image. Until now, no well-formed theoretical argument has been presented to explain this deficiency. In this paper, we will argue that convolutional layers are of little use for such problems, since comparison tasks are global by nature, but convolutional layers are local by design. We will use this insight to reformulate a comparison task into a sorting task and use findings on sorting networks to propose a lower bound for the number of parameters a neural network needs to solve comparison tasks in a generalizable way. We will use this lower bound to argue that attention, as well as iterative/recurrent processing, is needed to prevent a combinatorial explosion.

Submitted 23 February, 2021; originally announced February 2021.

Comments: Under review at Neural Networks Journal

arXiv:2102.09308 [pdf, other]

doi 10.1007/JHEP06(2021)005

SU(3) analysis of four-quark operators: $K\toππ$ and vacuum matrix elements

Authors: Antonio Pich, Antonio Rodríguez-Sánchez

Abstract: Hadronic matrix elements of local four-quark operators play a central role in non-leptonic kaon decays, while vacuum matrix elements involving the same kind of operators appear in inclusive dispersion relations, such as those relevant in $τ$-decay analyses. Using an $SU(3)_L\otimes SU(3)_R$ decomposition of the operators, we derive generic relations between these matrix elements, extending well-known results that link observables in the two different sectors. Two relevant phenomenological applications are presented. First, we determine the electroweak-penguin contribution to the kaon CP-violating ratio $\varepsilon'/\varepsilon$, using the measured hadronic spectral functions in $τ$ decay. Second, we fit our $SU(3)$ dynamical parameters to the most recent lattice data on $K\toππ$ matrix elements. The comparison of this numerical fit with results from previous analytical approaches provides an interesting anatomy of the $ΔI = \frac{1}{2}$ enhancement, confirming old suggestions about its underlying dynamical origin.

Submitted 16 June, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

Comments: 46 pages, 7 figures. Published version

arXiv:2101.09169 [pdf, other]

doi 10.1007/JHEP04(2021)240

The two-loop perturbative correction to the (g-2)$_μ$ HLbL at short distances

Authors: Johan Bijnens, Nils Hermansson-Truedsson, Laetitia Laub, Antonio Rodríguez-Sánchez

Abstract: The short-distance behaviour of the hadronic light-by-light (HLbL) contribution to $(g-2)_μ$ has recently been studied by means of an operator product expansion in a background electromagnetic field. The leading term in this expansion has been shown to be given by the massless quark loop, and the non-perturbative corrections are numerically very suppressed. Here, we calculate the perturbative QCD correction to the massless quark loop. The correction is found to be fairly small compared to the quark loop as far as we study energy scales where the perturbative running for the QCD coupling is well-defined, i.e. for scales $μ\gtrsim 1\, \mathrm{GeV}$. This should allow to reduce the large systematic uncertainty associated to high-multiplicity hadronic states.

Submitted 11 May, 2021; v1 submitted 22 January, 2021; originally announced January 2021.

Comments: 28 pages, the expressions and the many expansions that are too long are included in the supplementary files. Minor misprints corrected, short discussion about the MV limit and explicit expressions at the symmetric point added

Report number: LU TP 21-03

Journal ref: JHEP 04 (2021) 240

arXiv:2011.12123 [pdf, other]

Short-distance HLbL contributions to the muon g-2

Authors: Johan Bijnens, Nils Hermansson-Truedsson, Laetitia Laub, Antonio Rodriguez-Sanchez

Abstract: The current $3.7σ$ discrepancy between the Standard Model prediction and the experimental value of the muon anomalous magnetic moment could be a hint for the existence of new physics. The hadronic light-by-light contribution is one of the pieces requiring improved precision on the theory side, and an important step is to derive short-distance constraints for this quantity containing four electromagnetic currents. Here, we derive such short-distance constraints for three large photon loop virtualities and the external fourth photon in the static limit. The static photon is considered as a background field and we construct a systematic operator product expansion in the presence of this field. We show that the massless quark loop, i.e. the leading term, is numerically dominant over non-perturbative contributions up to next-to-next-to leading order, both those suppressed by quark masses and those that are not.

Submitted 24 November, 2020; originally announced November 2020.

Comments: Proceedings for talk given at 23rd International Conference in Quantum Chromodynamics (QCD 20), 27 October - 30 October 2020, Montpellier - FR

arXiv:2011.02956 [pdf, other]

Conflicting Bundles: Adapting Architectures Towards the Improved Training of Deep Neural Networks

Authors: David Peer, Sebastian Stabinger, Antonio Rodriguez-Sanchez

Abstract: Designing neural network architectures is a challenging task and knowing which specific layers of a model must be adapted to improve the performance is almost a mystery. In this paper, we introduce a novel theory and metric to identify layers that decrease the test accuracy of the trained models, this identification is done as early as at the beginning of training. In the worst-case, such a layer could lead to a network that can not be trained at all. More precisely, we identified those layers that worsen the performance because they produce conflicting training bundles as we show in our novel theoretical analysis, complemented by our extensive empirical studies. Based on these findings, a novel algorithm is introduced to remove performance decreasing layers automatically. Architectures found by this algorithm achieve a competitive accuracy when compared against the state-of-the-art architectures. While keeping such high accuracy, our approach drastically reduces memory consumption and inference time for different computer vision tasks.

Submitted 5 November, 2020; originally announced November 2020.

Comments: Accepted at WACV2021

arXiv:2008.13487 [pdf, other]

doi 10.1007/JHEP10(2020)203

Short-distance HLbL contributions to the muon anomalous magnetic moment beyond perturbation theory

Authors: Johan Bijnens, Nils Hermansson-Truedsson, Laetitia Laub, Antonio Rodríguez-Sánchez

Abstract: The hadronic light-by-light contribution to the muon anomalous magnetic moment depends on an integration over three off-shell momenta squared ($Q_i^2$) of the correlator of four electromagnetic currents and the fourth leg at zero momentum. We derive the short-distance expansion of this correlator in the limit where all three $Q_i^2$ are large and in the Euclidean domain in QCD. This is done via a systematic operator product expansion (OPE) in a background field which we construct. The leading order term in the expansion is the massless quark loop. We also compute the non-perturbative part of the next-to-leading contribution, which is suppressed by quark masses, and the chiral limit part of the next-to-next-to leading contributions to the OPE. We build a renormalisation program for the OPE. The numerical role of the higher-order contributions is estimated and found to be small.

Submitted 27 October, 2020; v1 submitted 31 August, 2020; originally announced August 2020.

Comments: 53 pages, analytical results as FORM output included as results.txt, some references added, misprints corrected, one figure changed

Report number: LU TP 20-47

arXiv:2006.04822 [pdf, other]

doi 10.1016/j.physrep.2020.07.006

The anomalous magnetic moment of the muon in the Standard Model

Authors: T. Aoyama, N. Asmussen, M. Benayoun, J. Bijnens, T. Blum, M. Bruno, I. Caprini, C. M. Carloni Calame, M. Cè, G. Colangelo, F. Curciarello, H. Czyż, I. Danilkin, M. Davier, C. T. H. Davies, M. Della Morte, S. I. Eidelman, A. X. El-Khadra, A. Gérardin, D. Giusti, M. Golterman, Steven Gottlieb, V. Gülpers, F. Hagelstein, M. Hayakawa, et al. (107 additional authors not shown)

Abstract: We review the present status of the Standard Model calculation of the anomalous magnetic moment of the muon. This is performed in a perturbative expansion in the fine-structure constant $α$ and is broken down into pure QED, electroweak, and hadronic contributions. The pure QED contribution is by far the largest and has been evaluated up to and including $\mathcal{O}(α^5)$ with negligible numerical uncertainty. The electroweak contribution is suppressed by $(m_μ/M_W)^2$ and only shows up at the level of the seventh significant digit. It has been evaluated up to two loops and is known to better than one percent. Hadronic contributions are the most difficult to calculate and are responsible for almost all of the theoretical uncertainty. The leading hadronic contribution appears at $\mathcal{O}(α^2)$ and is due to hadronic vacuum polarization, whereas at $\mathcal{O}(α^3)$ the hadronic light-by-light scattering contribution appears. Given the low characteristic scale of this observable, these contributions have to be calculated with nonperturbative methods, in particular, dispersion relations and the lattice approach to QCD. The largest part of this review is dedicated to a detailed account of recent efforts to improve the calculation of these two contributions with either a data-driven, dispersive approach, or a first-principle, lattice-QCD approach. The final result reads $a_μ^\text{SM}=116\,591\,810(43)\times 10^{-11}$ and is smaller than the Brookhaven measurement by 3.7$σ$.
The experimental uncertainty will soon be reduced by up to a factor four by the new experiment currently running at Fermilab, and also by the future J-PARC experiment. This and the prospects to further reduce the theoretical uncertainty in the near future (which are also discussed here) make this quantity one of the most promising places to look for evidence of new physics.

Submitted 13 November, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

Comments: 196 pages, 103 figures, version published in Phys. Rept., bib files for the citation references are available from: https://muon-gm2-theory.illinois.edu

Report number: FERMILAB-PUB-20-207-T, INT-PUB-20-021, KEK Preprint 2020-5, MITP/20-028, CERN-TH-2020-075, IFT-UAM/CSIC-20-74, LMU-ASC 18/20, LTH 1234, LU TP 20-20, MAN/HEP/2020/003, PSI-PR-20-06, UWThPh 2020-14, ZU-TH 18/20

Journal ref: Phys. Rept. 887 (2020) 1-166

arXiv:2001.10857 [pdf, other]

Evaluating the Progress of Deep Learning for Visual Relational Concepts

Authors: Sebastian Stabinger, Peer David, Justus Piater, Antonio Rodríguez-Sánchez

Abstract: Convolutional Neural Networks (CNNs) have become the state of the art method for image classification in the last ten years. Despite the fact that they achieve superhuman classification accuracy on many popular datasets, they often perform much worse on more abstract image classification tasks. We will show that these difficult tasks are linked to relational concepts from cognitive psychology and that despite progress over the last few years, such relational reasoning tasks still remain difficult for current neural network architectures. We will review deep learning research that is linked to relational concept learning, even if it was not originally presented from this angle. Reviewing the current literature, we will argue that some form of attention will be an important component of future systems to solve relational tasks. In addition, we will point out the shortcomings of currently used datasets, and we will recommend steps to make future datasets more relevant for testing systems on relational reasoning.

Submitted 13 September, 2021; v1 submitted 29 January, 2020; originally announced January 2020.

Comments: Accepted for publication at Journal of Vision

arXiv:1912.04811 [pdf, ps, other]

doi 10.1088/1742-6596/1526/1/012010

Isospin-breaking contributions to $\varepsilon'/\varepsilon$

Authors: V. Cirigliano, H. Gisbert, A. Pich, A. Rodríguez-Sánchez

Abstract: We present an updated analysis of isospin-violating corrections to $\varepsilon'/\varepsilon$ in the framework of chiral perturbation theory, taking advantage of the currently improved knowledge on quark masses and nonperturbative parameters. The role of the different ingredients entering into the analysis is carefully assessed. Our final result is $Ω_{\mathrm{eff}}=0.110\,{}^{+0.090}_{-0.088}$.

Submitted 10 December, 2019; originally announced December 2019.

Comments: 6 pages. Contribution to the Proceedings of the International Conference on Kaon Physics 2019

Report number: IFIC/19-55, DO-TH 19/30, LU TP/19-56

arXiv:1912.04736 [pdf, other]

doi 10.1088/1742-6596/1526/1/012011

Theoretical status of $\varepsilon'/\varepsilon$

Authors: V. Cirigliano, H. Gisbert, A. Pich, A. Rodríguez-Sánchez

Abstract: We briefly overview the historical controversy around Standard Model predictions of $\varepsilon'/\varepsilon$ and clarify the underlying physics. A full update of this important observable is presented, with all known short- and long-distance contributions, including isospin-breaking corrections. The current Standard Model prediction, $\mathrm{Re}(\varepsilon'/\varepsilon) = (14\pm 5)\cdot 10^{-4}$, is in excellent agreement with the experimentally measured value.

Submitted 10 December, 2019; originally announced December 2019.

Comments: Invited talk at Kaon 2019 (Perugia, 10-13 September 2019). 6 pages, 3 figures

Report number: DO-TH 19/29, IFIC/19-54, LU TP/19-55

arXiv:1911.06554 [pdf, other]

A complete update of $\varepsilon'/\varepsilon$ in the Standard Model

Authors: V. Cirigliano, H. Gisbert, A. Pich, A. Rodríguez-Sánchez

Abstract: The recent release of improved lattice data has revived again the interest on precise theoretical calculations of the direct CP-violation ratio $\varepsilon'/\varepsilon$. We present a complete update of the Standard Model prediction [1,2], including a new re-analysis of isospin-breaking corrections which are of vital importance in the theoretical determination of this observable. The Standard Model prediction, $\mathrm{Re} (ε'/ε) = (14\pm 5)\cdot 10^{-4}$, turns out to be in good agreement with the experimental measurement.

Submitted 30 November, 2019; v1 submitted 15 November, 2019; originally announced November 2019.

Comments: 6 pages, 1 figure, Contribution to the Proceedings of the EPS-HEP 2019 Conference

Report number: LA-UR-19-31481, IFIC/19-47, DO-TH/19-24, LU TP/19-52

arXiv:1911.01359 [pdf, other]

doi 10.1007/JHEP02(2020)032

Isospin-Violating Contributions to $ε'/ε$

Authors: V. Cirigliano, H. Gisbert, A. Pich, A. Rodríguez-Sánchez

Abstract: The known isospin-breaking contributions to the $K\rightarrow ππ$ amplitudes are reanalyzed, taking into account our current understanding of the quark masses and the relevant non-perturbative inputs. We present a complete numerical reappraisal of the direct CP-violating ratio $ε'/ε$, where these corrections play a quite significant role. We obtain the Standard Model prediction $\text{Re}\left(ε'/ε\right)\, =\,\left(14\,\pm\,5\right)\cdot 10^{-4}$, which is in very good agreement with the measured ratio. The uncertainty, which has been estimated conservatively, is dominated by our current ignorance about $1/N_C$-suppressed contributions to some relevant chiral-perturbation-theory low-energy constants.

Submitted 4 November, 2019; originally announced November 2019.

Comments: 49 pages, 4 figures

Report number: LU TP/19-51, IFIC/19-46, DO-TH/19-23

arXiv:1910.04655 [pdf, ps, other]

Analytical results for hadronic contributions to the muon $g-2$

Authors: Johan Bijnens, Nils Hermansson-Truedsson, Antonio Rodriguez-Sánchez

Abstract: This talk discusses two analytical calculations relevant for the Standard Model calculation of the muon $g-2$. The first part is the recent derivation of the quark-loop as the first term in a well-defined operator-product expansion for the short-distance part of the hadronic light-by-light contribution, as well as the calculation of the next term. The second part is the calculation of finite volume effects relevant for lattice QCD calculations of the electromagnetic contribution to the lowest-order hadronic vacuum-polarization contribution and the proof they only start at $1/L^3$.

Submitted 10 October, 2019; originally announced October 2019.

Comments: 6 pages, presented by JB at European Physical Society Conference on High Energy Physics - EPS-HEP2019 - 10-17 July, 2019, Ghent, Belgium

Report number: LU TP 19-49

arXiv:1908.03331 [pdf, other]

doi 10.1016/j.physletb.2019.134994

Short-distance constraints for the HLbL contribution to the muon anomalous magnetic moment

Authors: Johan Bijnens, Nils Hermansson-Truedsson, Antonio Rodríguez-Sánchez

Abstract: We derive short-distance constraints for the hadronic light-by-light contribution (HLbL) to the anomalous magnetic moment of the muon in the kinematic region where the three virtual momenta are all large. We include the external soft photon via an external field leading to a well-defined Operator Product Expansion. We establish that the perturbative quark loop gives the leading contribution in a well-defined expansion. We compute the first nonzero power correction. It is related to the magnetic susceptibility of the QCD vacuum. The results can be used as model-independent short-distance constraints for the very many different approaches to the HLbL contribution. Numerically the power correction is found to be small.

Submitted 9 August, 2019; originally announced August 2019.

Comments: 6 pages

Report number: LU TP 19-38

arXiv:1905.08744 [pdf, other]

Limitation of capsule networks

Authors: David Peer, Sebastian Stabinger, Antonio Rodriguez-Sanchez

Abstract: A recently proposed method in deep learning groups multiple neurons to capsules such that each capsule represents an object or part of an object. Routing algorithms route the output of capsules from lower-level layers to upper-level layers. In this paper, we prove that state-of-the-art routing procedures decrease the expressivity of capsule networks. More precisely, it is shown that EM-routing and routing-by-agreement prevent capsule networks from distinguishing inputs and their negative counterpart. Therefore, only symmetric functions can be expressed by capsule networks, and it can be concluded that they are not universal approximators. We also theoretically motivate and empirically show that this limitation affects the training of deep capsule networks negatively. Therefore, we present an incremental improvement for state-of-the-art routing algorithms that solves the aforementioned limitation and stabilizes the training of capsule networks.

Submitted 19 January, 2021; v1 submitted 21 May, 2019; originally announced May 2019.

arXiv:1812.09707 [pdf, other]

Increasing the adversarial robustness and explainability of capsule networks with $γ$-capsules

Authors: David Peer, Sebastian Stabinger, Antonio Rodriguez-Sanchez

Abstract: In this paper we introduce a new inductive bias for capsule networks and call networks that use this prior $γ$-capsule networks. Our inductive bias that is inspired by TE neurons of the inferior temporal cortex increases the adversarial robustness and the explainability of capsule networks. A theoretical framework with formal definitions of $γ$-capsule networks and metrics for evaluation are also provided. Under our framework we show that common capsule networks do not necessarily make use of this inductive bias. For this reason we introduce a novel routing algorithm and use a different training algorithm to be able to implement $γ$-capsule networks. We then show experimentally that $γ$-capsule networks are indeed more transparent and more robust against adversarial attacks than regular capsule networks.

Submitted 5 December, 2019; v1 submitted 23 December, 2018; originally announced December 2018.

arXiv:1809.01161 [pdf, other]

doi 10.1103/PhysRevLett.122.221801

Hadronic tau decays as New Physics probes in the LHC era

Authors: Vincenzo Cirigliano, Adam Falkowski, Martín González-Alonso, Antonio Rodríguez-Sánchez

Abstract: We analyze the sensitivity of hadronic tau decays to non-standard interactions within the model-independent framework of the Standard Model Effective Field Theory (SMEFT). Both exclusive and inclusive decays are studied, using the latest lattice data and QCD dispersion relations. We show that there are enough theoretically clean channels to disentangle all the effective couplings contributing to these decays, with the $τ\to ππν_τ$ channel representing an unexpected powerful New Physics probe. We find that the ratios of non-standard couplings to the Fermi constant are bound at the sub-percent level. These bounds are complementary to the ones from electroweak precision observables and $p p \to τν_τ$ measurements at the LHC. The combination of tau decay and LHC data puts tighter constraints on lepton universality violation in the gauge boson-lepton vertex corrections.

Submitted 12 June, 2019; v1 submitted 4 September, 2018; originally announced September 2018.

Comments: 8 pages; v2: comments and references added, PRL version

Report number: CERN-TH-2008-171, LA-UR-18-27408, LPT-Orsay-18-82

Journal ref: Phys. Rev. Lett. 122, 221801 (2019)

arXiv:1712.02154 [pdf, other]

Guided Labeling using Convolutional Neural Networks

Authors: Sebastian Stabinger, Antonio Rodriguez-Sanchez

Abstract: Over the last couple of years, deep learning and especially convolutional neural networks have become one of the workhorses of computer vision. One limiting factor for the applicability of supervised deep learning to more areas is the need for large, manually labeled datasets. In this paper we propose an easy-to-implement method we call guided labeling, which automatically determines which samples from an unlabeled dataset should be labeled. We show that using this procedure, the number of samples that need to be labeled is reduced considerably in comparison to labeling images arbitrarily.

Submitted 6 December, 2017; originally announced December 2017.

Comments: Under review for CVPR2018

arXiv:1708.07770 [pdf, other]

Evaluation of Deep Learning on an Abstract Image Classification Dataset

Authors: Sebastian Stabinger, Antonio Rodriguez-Sanchez

Abstract: Convolutional Neural Networks have become state-of-the-art methods for image classification over the last couple of years. By now they perform better than human subjects on many of the image classification datasets. Most of these datasets are based on the notion of concrete classes (i.e. images are classified by the type of object in the image). In this paper we present a novel image classification dataset, using abstract classes, which should be easy to solve for humans, but variations of it are challenging for CNNs. The classification performance of popular CNN architectures is evaluated on this dataset and variations of the dataset that might be interesting for further research are identified.

Submitted 25 August, 2017; originally announced August 2017.

Comments: Copyright IEEE. To be published in the proceedings of MBCC at ICCV2017

arXiv:1607.08366 [pdf, other]

25 years of CNNs: Can we compare to human abstraction capabilities?

Authors: Sebastian Stabinger, Antonio Rodríguez-Sánchez, Justus Piater

Abstract: We try to determine the progress made by convolutional neural networks over the past 25 years in classifying images into abstract classes. For this purpose we compare the performance of LeNet to that of GoogLeNet at classifying randomly generated images which are differentiated by an abstract property (e.g., one class contains two objects of the same size, the other class two objects of different sizes). Our results show that there is still work to do in order to solve vision problems humans are able to solve without much difficulty.

Submitted 28 July, 2016; originally announced July 2016.

Comments: To appear in the proceedings of ICANN 2016, Springer

arXiv:1606.05506 [pdf, other]

doi 10.4108/eai.3-12-2015.2262468

Learning Abstract Classes using Deep Learning

Authors: Sebastian Stabinger, Antonio Rodriguez-Sanchez, Justus Piater

Abstract: Humans are generally good at learning abstract concepts about objects and scenes (e.g. spatial orientation, relative sizes, etc.). Over the last years convolutional neural networks have achieved almost human performance in recognizing concrete classes (i.e. specific object categories). This paper tests the performance of a current CNN (GoogLeNet) on the task of differentiating between abstract classes which are trivially differentiable for humans. We trained and tested the CNN on the two abstract classes of horizontal and vertical orientation and determined how well the network is able to transfer the learned classes to other, previously unseen objects.

Submitted 17 June, 2016; originally announced June 2016.

Comments: To be published in the proceedings of the International Conference on Bio-inspired Information and Communications Technologies 2015

arXiv:1605.06830 [pdf, other]

doi 10.1103/PhysRevD.94.034027

Determination of the QCD Coupling from ALEPH $τ$ Decay Data

Authors: Antonio Pich, Antonio Rodríguez-Sánchez

Abstract: We present a comprehensive study of the determination of the strong coupling from $τ$ decay, using the most recent release of the experimental ALEPH data. We critically review all theoretical strategies used in previous works and put forward various novel approaches which allow us to study complementary aspects of the problem. We investigate the advantages and disadvantages of the different methods, trying to uncover their potential hidden weaknesses and test the stability of the obtained results under slight variations of the assumed inputs. We perform several determinations, using different methodologies, and find a very consistent set of results. All determinations are in excellent agreement, and allow us to extract a very reliable value for $α_s(m_τ^2)$. The main uncertainty originates in the pure perturbative error from unknown higher orders. Taking into account the systematic differences between the results obtained with the CIPT and FOPT prescriptions, we find $α_{s}^{(n_f=3)}(m_τ^2) = 0.328 \pm 0.013$ which implies $α_{s}^{(n_f=5)}(M_Z^{2}) = 0.1197\pm 0.0015$.

Submitted 7 September, 2016; v1 submitted 22 May, 2016; originally announced May 2016.

Comments: 41 pages, 12 figures, Preprint numbers: IFIC/16-13 FTUV/16-0522; v.2. One reference added and some typos fixed; v.3. Published version. One reference added

Journal ref: Phys. Rev. D 94, 034027 (2016)

arXiv:1602.06112 [pdf, other]

doi 10.1103/PhysRevD.94.014017

Updated determination of chiral couplings and vacuum condensates from hadronic tau decay data

Authors: A. Rodríguez-Sánchez, M. González-Alonso, A. Pich

Abstract: We analyze the lowest spectral moments of the left-right two-point correlation function, using all known short-distance constraints and the recently updated ALEPH V-A spectral function from tau decays. This information is used to determine the low-energy couplings L10 and C87 of chiral perturbation theory and the lowest-dimensional contributions to the Operator Product Expansion of the left-right correlator. A detailed statistical analysis is implemented to assess the theoretical uncertainties, including violations of quark-hadron duality.

Submitted 19 February, 2016; originally announced February 2016.

Comments: 25 pages, 8 figures, Preprint numbers: IFIC/16-08 FTUV/16-0219

Journal ref: Phys. Rev. D 94, 014017 (2016)

arXiv:1509.08494 [pdf, other]

ChPT parameters from tau-decay data

Authors: A. Rodríguez-Sánchez, M. González-Alonso, A. Pich

Abstract: Using the updated ALEPH V-A spectral function from tau decays, we determine the lowest spectral moments of the left-right correlator and extract dynamical information on order parameters of the QCD chiral symmetry breaking. Uncertainties associated with violations of quark-hadron duality are estimated from the data, imposing all known short-distance constraints on a resonance-based parametrization. Employing proper pinched weight functions, we obtain an accurate determination of the effective chiral couplings L10 and C87 and the dimension-six and -eight contributions in the Operator Product Expansion.

Submitted 28 September, 2015; originally announced September 2015.

Comments: 5 pages, 3 figures, QCD2015 Montpellier

Report number: preprint numbers: IFIC/15-67 FTUV/15-0928

arXiv:1407.1083 [pdf, other]

doi 10.1155/2015/152394

Radiative corrections to $M_h$ from three generations of Majorana neutrinos and sneutrinos

Authors: S. Heinemeyer, J. Hernandez-Garcia, M. J. Herrero, X. Marcano, A. M. Rodriguez-Sanchez

Abstract: In this work we study the radiative corrections to the mass of the lightest Higgs boson of the MSSM from three generations of Majorana neutrinos and sneutrinos. The spectrum of the MSSM is augmented by three right handed neutrinos and their supersymmetric partners. A seesaw mechanism of type I is used to generate the physical neutrino masses and oscillations that we require to be in agreement with present neutrino data. We present a full one-loop computation of these Higgs mass corrections, and analyze in full detail their numerical size in terms of both the MSSM and the new (s)neutrino parameters. A critical discussion on the different possible renormalization schemes and their implications is included.

Submitted 31 July, 2015; v1 submitted 3 July, 2014; originally announced July 2014.

Comments: 42 pages, 39 figures, 1 appendix, version published in AHEP

Report number: IFT-UAM/CSIC-14-054, FTUAM-14-21

Journal ref: Advances in High Energy Physics, vol. 2015, Article ID 152394

arXiv:1304.1876

Proceedings of the 37th Annual Workshop of the Austrian Association for Pattern Recognition (ÖAGM/AAPR), 2013

Authors: Justus Piater, Antonio Rodríguez-Sánchez

Abstract: This volume represents the proceedings of the 37th Annual Workshop of the Austrian Association for Pattern Recognition (ÖAGM/AAPR), held May 23-24, 2013, in Innsbruck, Austria.

Submitted 28 May, 2013; v1 submitted 6 April, 2013; originally announced April 2013.

Comments: Contributed papers presented at ÖAGM/AAPR 2013

ACM Class: I.4; I.5; I.2.10

arXiv:1201.6157 [pdf, ps, other]

Mh in the MSSM-seesaw scenario with ILC precision

Authors: S. Heinemeyer, M. J. Herrero, S. Penaranda, A. M. Rodriguez-Sanchez

Abstract: We review the computation of the one-loop radiative corrections from the neutrino/sneutrino sector to the lightest Higgs boson mass, Mh, within the context of the so-called MSSM-seesaw scenario. This model introduces right handed neutrinos and their supersymmetric partners, the sneutrinos, including Majorana mass terms. We find negative and sizeable corrections to Mh, up to -5 GeV for a large Majorana scale, 10^{13}-10^{15} GeV, and for the lightest neutrino mass in a range 0.1-1 eV. The corrections to Mh are substantially larger than the anticipated ILC precision for large regions of the MSSM-seesaw parameter space.

Submitted 30 January, 2012; originally announced January 2012.

Comments: LaTeX, 6 pages, 2 figures. Proceedings of the 2011 International Workshop on Future Linear Colliders (LCWS11), Granada, Spain, 26-30 September 2011

Report number: IFT-UAM/CSIC-12-09

arXiv:1201.4770 [pdf, ps, other]

Heavy Majorana neutrino effects on MSSM-Mh

Authors: M. J. Herrero, S. Heinemeyer, S. Penaranda, A. M. Rodriguez-Sanchez

Abstract: We study the effects of heavy Majorana neutrinos on the Higgs sector of the MSSM via radiative corrections. We work within the SUSY context where the MSSM particle content is enlarged with right handed neutrinos and their corresponding SUSY partners, the sneutrinos, and where compatibility with neutrino data is required. We compute the one-loop corrections to the mass of the lightest MSSM CP-even neutral Higgs boson from Majorana neutrinos and their SUSY partners and assume a seesaw mechanism of type I for neutrino mass generation. A negative and sizeable Higgs mass correction of up to -5 GeV is found for a heavy Majorana mass of up to 10^{15} GeV. This negative correction can grow up to several tens of GeV if the soft SUSY breaking mass associated with their sneutrino partners is similarly heavy as the Majorana mass.

Submitted 23 January, 2012; originally announced January 2012.

Comments: LaTeX, 10 pages, 3 figures. Contribution to the proceedings of the 10th International Symposium on Radiative Corrections (Applications of Quantum Field Theory to Phenomenology) - RADCOR2011, September 26-30, 2011, Mamallapuram, India

Report number: IFT-UAM/CSIC-12-06

arXiv:1107.0241 [pdf, ps, other]

M_h in MSSM with Heavy Majorana Neutrinos

Authors: S. Heinemeyer, M. J. Herrero, S. Penaranda, A. M. Rodriguez-Sanchez

Abstract: We review the main results of the one-loop radiative corrections from the neutrino/sneutrino sector to the lightest Higgs boson mass, M_h, within the context of the so-called MSSM-seesaw scenario where right handed neutrinos and their supersymmetric partners are included in order to explain neutrino masses. For simplicity, we have restricted ourselves to the one generation case. We find sizable corrections to M_h, which are negative in the region where the Majorana scale is large (10^{13} - 10^{15} GeV) and the lightest neutrino mass is within a range inspired by data (0.1 - 1 eV). For some regions of the MSSM-seesaw parameter space, the corrections to M_h are substantially larger than the anticipated LHC precision.

Submitted 1 July, 2011; originally announced July 2011.

Comments: 4 pages, 1 figure, talk given by A.M.R.-S. at Moriond EW 2011

Report number: IFT-UAM/CSIC-11-48

arXiv:1007.5512 [pdf, ps, other]

doi 10.1007/JHEP05(2011)063

Higgs Boson Masses in the MSSM with Heavy Majorana Neutrinos

Authors: S. Heinemeyer, M. J. Herrero, S. Penaranda, A. M. Rodriguez-Sanchez

Abstract: We present a full diagrammatic computation of the one-loop corrections from the neutrino/sneutrino sector to the renormalized neutral CP-even Higgs boson self-energies and the lightest Higgs boson mass, Mh, within the context of the so-called MSSM-seesaw scenario. This consists of the Minimal Supersymmetric Standard Model with the addition of massive right handed Majorana neutrinos and their supersymmetric partners, and where the seesaw mechanism is used for the lightest neutrino mass generation. We explore the dependence on all the parameters involved, with particular emphasis on the role played by the heavy Majorana scale. We restrict ourselves to the case of one generation of neutrinos/sneutrinos. For the numerical part of the study, we consider a very wide range of values for all the parameters involved. We find sizeable corrections to Mh, which are negative in the region where the Majorana scale is large (10^{13}-10^{15} GeV) and the lightest neutrino mass is within a range inspired by data (0.1-1 eV). For some regions of the MSSM-seesaw parameter space, the corrections to Mh are substantially larger than the anticipated Large Hadron Collider precision.

Submitted 26 May, 2011; v1 submitted 30 July, 2010; originally announced July 2010.

Comments: Latex, 50 pages, 15 figures, 6 tables. Discussion improved. Comments and some new approximate formulae have been added. Published version on JHEP

Report number: IFT-UAM/CSIC-10-41, FTUAM-10-10

Journal ref: JHEP 1105:063,2011

arXiv:0909.0724 [pdf, ps, other]

doi 10.1063/1.3327759

Sensitivity to the Higgs sector of SUSY-seesaw models via LFV tau decays

Authors: M. Herrero, J. Portoles, A. Rodriguez-Sanchez

Abstract: Here we study and compare the sensitivity to the Higgs sector of the SUSY-seesaw models via the LFV tau decays: tau-> 3 mu, tau->K^{+}K^{-}, tau->mu eta and tau-> mu f_{0}. We emphasize that, at present, the two latter channels are the most efficient ones to test indirectly the Higgs particles.

Submitted 3 September, 2009; originally announced September 2009.

Comments: 4 pages, 3 figures, conference SUSY09 Boston (M.Herrero)

Journal ref: AIP Conf.Proc.1200:908-911,2010

Showing 1–50 of 53 results for author: Rodríguez-Sánchez, A