Task-specific experimental design for treatment effect estimation

Authors: Bethany Connolly, Kim Moore, Tobias Schwedes, Alexander Adam, Gary Willis, Ilya Feige, Christopher Frye

Abstract: Understanding causality should be a core requirement of any attempt to build real impact through AI. Due to the inherent unobservability of counterfactuals, large randomised trials (RCTs) are the standard for causal inference. But large experiments are generically expensive, and randomisation carries its own costs, e.g. when suboptimal decisions are trialed. Recent work has proposed more sample-efficient alternatives to RCTs, but these are not adaptable to the downstream application for which the causal effect is sought. In this work, we develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications. Across a range of important tasks, real-world datasets, and sample sizes, our method outperforms other benchmarks, e.g. requiring an order-of-magnitude less data to match RCT performance on targeted marketing tasks.

Submitted 8 June, 2023; originally announced June 2023.

Comments: To appear in ICML 2023; 8 pages, 7 figures, 4 appendices

arXiv:2010.12464 [pdf, other]

Representation Learning for High-Dimensional Data Collection under Local Differential Privacy

Authors: Alex Mansbridge, Gregory Barbour, Davide Piras, Michael Murray, Christopher Frye, Ilya Feige, David Barber

Abstract: The collection of individuals' data has become commonplace in many industries. Local differential privacy (LDP) offers a rigorous approach to preserving privacy whereby the individual privatises their data locally, allowing only their perturbed datum to leave their possession. LDP thus provides a provable privacy guarantee to the individual against both adversaries and database administrators. Existing LDP mechanisms have successfully been applied to low-dimensional data, but in high dimensions the privacy-inducing noise largely destroys the utility of the data. In this work, our contributions are two-fold: first, by adapting state-of-the-art techniques from representation learning, we introduce a novel approach to learning LDP mechanisms. These mechanisms add noise to powerful representations on the low-dimensional manifold underlying the data, thereby overcoming the prohibitive noise requirements of LDP in high dimensions. Second, we introduce a novel denoising approach for downstream model learning. The training of performant machine learning models using collected LDP data is a common goal for data collectors, and downstream model performance forms a proxy for the LDP data utility. Our approach significantly outperforms current state-of-the-art LDP mechanisms.

Submitted 14 May, 2022; v1 submitted 23 October, 2020; originally announced October 2020.

arXiv:2010.07389 [pdf, other]

Explainability for fair machine learning

Authors: Tom Begley, Tobias Schwedes, Christopher Frye, Ilya Feige

Abstract: As the decisions made or influenced by machine learning models increasingly impact our lives, it is crucial to detect, understand, and mitigate unfairness. But even simply determining what "unfairness" should mean in a given context is non-trivial: there are many competing definitions, and choosing between them often requires a deep understanding of the underlying task. It is thus tempting to use model explainability to gain insights into model fairness, however existing explainability tools do not reliably indicate whether a model is indeed fair. In this work we present a new approach to explaining fairness in machine learning, based on the Shapley value paradigm. Our fairness explanations attribute a model's overall unfairness to individual input features, even in cases where the model does not operate on sensitive attributes directly. Moreover, motivated by the linearity of Shapley explainability, we propose a meta algorithm for applying existing training-time fairness interventions, wherein one trains a perturbation to the original model, rather than a new model entirely. By explaining the original model, the perturbation, and the fair-corrected model, we gain insight into the accuracy-fairness trade-off that is being made by the intervention. We further show that this meta algorithm enjoys both flexibility and stability benefits with no loss in performance.

Submitted 14 October, 2020; originally announced October 2020.

Comments: 8 pages, 3 figures, 2 tables, 1 appendix

arXiv:2010.07384 [pdf, other]

Human-interpretable model explainability on high-dimensional data

Authors: Damien de Mijolla, Christopher Frye, Markus Kunesch, John Mansir, Ilya Feige

Abstract: The importance of explainability in machine learning continues to grow, as both neural-network architectures and the data they model become increasingly complex. Unique challenges arise when a model's input features become high dimensional: on one hand, principled model-agnostic approaches to explainability become too computationally expensive; on the other, more efficient explainability algorithms lack natural interpretations for general users. In this work, we introduce a framework for human-interpretable explainability on high-dimensional data, consisting of two modules. First, we apply a semantically meaningful latent representation, both to reduce the raw dimensionality of the data, and to ensure its human interpretability. These latent features can be learnt, e.g. explicitly as disentangled representations or implicitly through image-to-image translation, or they can be based on any computable quantities the user chooses. Second, we adapt the Shapley paradigm for model-agnostic explainability to operate on these latent features. This leads to interpretable model explanations that are both theoretically controlled and computationally tractable. We benchmark our approach on synthetic data and demonstrate its effectiveness on several image-classification tasks.

Submitted 20 December, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

Comments: 8 pages, 6 figures, 1 appendix

arXiv:2010.03467 [pdf, other]

Learning Deep-Latent Hierarchies by Stacking Wasserstein Autoencoders

Authors: Benoit Gaujac, Ilya Feige, David Barber

Abstract: Probabilistic models with hierarchical-latent-variable structures provide state-of-the-art results amongst non-autoregressive, unsupervised density-based models. However, the most common approach to training such models based on Variational Autoencoders (VAEs) often fails to leverage deep-latent hierarchies; successful approaches require complex inference and optimisation schemes. Optimal Transport is an alternative, non-likelihood-based framework for training generative models with appealing theoretical properties, in principle allowing easier training convergence between distributions. In this work we propose a novel approach to training models with deep-latent hierarchies based on Optimal Transport, without the need for highly bespoke models and inference networks. We show that our method enables the generative model to fully leverage its deep-latent hierarchy, avoiding the well known "latent variable collapse" issue of VAEs; therefore, providing qualitatively better sample generations as well as more interpretable latent representation than the original Wasserstein Autoencoder with Maximum Mean Discrepancy divergence.

Submitted 7 October, 2020; originally announced October 2020.

arXiv:2010.03459 [pdf, other]

Learning disentangled representations with the Wasserstein Autoencoder

Authors: Benoit Gaujac, Ilya Feige, David Barber

Abstract: Disentangled representation learning has undoubtedly benefited from objective function surgery. However, a delicate balancing act of tuning is still required in order to trade off reconstruction fidelity versus disentanglement. Building on previous successes of penalizing the total correlation in the latent variables, we propose TCWAE (Total Correlation Wasserstein Autoencoder). Working in the WAE paradigm naturally enables the separation of the total-correlation term, thus providing disentanglement control over the learned representation, while offering more flexibility in the choice of reconstruction cost. We propose two variants using different KL estimators and perform extensive quantitative comparisons on data sets with known generative factors, showing competitive results relative to state-of-the-art techniques. We further study the trade off between disentanglement and reconstruction on more-difficult data sets with unknown generative factors, where the flexibility of the WAE paradigm in the reconstruction term improves reconstructions.

Submitted 7 October, 2020; originally announced October 2020.

arXiv:2006.01272 [pdf, other]

Shapley explainability on the data manifold

Authors: Christopher Frye, Damien de Mijolla, Tom Begley, Laurence Cowton, Megan Stanley, Ilya Feige

Abstract: Explainability in AI is crucial for model development, compliance with regulation, and providing operational nuance to predictions. The Shapley framework for explainability attributes a model's predictions to its input features in a mathematically principled and model-agnostic way. However, general implementations of Shapley explainability make an untenable assumption: that the model's features are uncorrelated. In this work, we demonstrate unambiguous drawbacks of this assumption and develop two solutions to Shapley explainability that respect the data manifold. One solution, based on generative modelling, provides flexible access to data imputations; the other directly learns the Shapley value-function, providing performance and stability at the cost of flexibility. While "off-manifold" Shapley values can (i) give rise to incorrect explanations, (ii) hide implicit model dependence on sensitive attributes, and (iii) lead to unintelligible explanations in higher-dimensional data, on-manifold explainability overcomes these problems.

Submitted 20 December, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

Comments: To appear in ICLR 2021; 9 pages, 6 figures, 2 appendices

arXiv:1910.06358 [pdf, other]

Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability

Authors: Christopher Frye, Colin Rowat, Ilya Feige

Abstract: Explaining AI systems is fundamental both to the development of high performing models and to the trust placed in them by their users. The Shapley framework for explainability has strength in its general applicability combined with its precise, rigorous foundation: it provides a common, model-agnostic language for AI explainability and uniquely satisfies a set of intuitive mathematical axioms. However, Shapley values are too restrictive in one significant regard: they ignore all causal structure in the data. We introduce a less restrictive framework, Asymmetric Shapley values (ASVs), which are rigorously founded on a set of axioms, applicable to any AI system, and flexible enough to incorporate any causal structure known to be respected by the data. We demonstrate that ASVs can (i) improve model explanations by incorporating causal information, (ii) provide an unambiguous test for unfair discrimination in model predictions, (iii) enable sequentially incremental explanations in time-series models, and (iv) support feature-selection studies without the need for model retraining.

Submitted 20 December, 2021; v1 submitted 14 October, 2019; originally announced October 2019.

Comments: To appear in NeurIPS 2020; 9 pages, 2 figures, 2 appendices

arXiv:1906.10137 [pdf, other]

doi 10.1103/PhysRevLett.123.182001

Binary JUNIPR: an interpretable probabilistic model for discrimination

Authors: Anders Andreassen, Ilya Feige, Christopher Frye, Matthew D. Schwartz

Abstract: JUNIPR is an approach to unsupervised learning in particle physics that scaffolds a probabilistic model for jets around their representation as binary trees. Separate JUNIPR models can be learned for different event or jet types, then compared and explored for physical insight. The relative probabilities can also be used for discrimination. In this paper, we show how the training of the separate models can be refined in the context of classification to optimize discrimination power. We refer to this refined approach as Binary JUNIPR. Binary JUNIPR achieves state-of-the-art performance for quark/gluon discrimination and top-tagging. The trained models can then be analyzed to provide physical insight into how the classification is achieved. As examples, we explore differences between quark and gluon jets and between gluon jets generated with two different simulations.

Submitted 24 June, 2019; originally announced June 2019.

Comments: 6 pages, 3 figures

Journal ref: Phys. Rev. Lett. 123, 182001 (2019)

arXiv:1902.06766 [pdf, other]

Parenting: Safe Reinforcement Learning from Human Input

Authors: Christopher Frye, Ilya Feige

Abstract: Autonomous agents trained via reinforcement learning present numerous safety concerns: reward hacking, negative side effects, and unsafe exploration, among others. In the context of near-future autonomous agents, operating in environments where humans understand the existing dangers, human involvement in the learning process has proved a promising approach to AI Safety. Here we demonstrate that a precise framework for learning from human input, loosely inspired by the way humans parent children, solves a broad class of safety problems in this context. We show that our Parenting algorithm solves these problems in the relevant AI Safety gridworlds of Leike et al. (2017), that an agent can learn to outperform its parent as it "matures", and that policies learnt through Parenting are generalisable to new environments.

Submitted 18 February, 2019; originally announced February 2019.

Comments: 9 pages, 4 figures, 1 table

arXiv:1902.03251 [pdf, other]

Invariant-equivariant representation learning for multi-class data

Authors: Ilya Feige

Abstract: Representations learnt through deep neural networks tend to be highly informative, but opaque in terms of what information they learn to encode. We introduce an approach to probabilistic modelling that learns to represent data with two separate deep representations: an invariant representation that encodes the information of the class from which the data belongs, and an equivariant representation that encodes the symmetry transformation defining the particular data point within the class manifold (equivariant in the sense that the representation varies naturally with symmetry transformations). This approach is based primarily on the strategic routing of data through the two latent variables, and thus is conceptually transparent, easy to implement, and in-principle generally applicable to any data comprised of discrete classes of continuous distributions (e.g. objects in images, topics in language, individuals in behavioural data). We demonstrate qualitatively compelling representation learning and competitive quantitative performance, in both supervised and semi-supervised settings, versus comparable modelling approaches in the literature with little fine tuning.

Submitted 19 May, 2019; v1 submitted 8 February, 2019; originally announced February 2019.

Comments: 8 pages, 5 figures, 2 tables, 2 appendices

Journal ref: ICML 2019

arXiv:1806.04480 [pdf, other]

Improving latent variable descriptiveness with AutoGen

Authors: Alex Mansbridge, Roberto Fierimonte, Ilya Feige, David Barber

Abstract: Powerful generative models, particularly in Natural Language Modelling, are commonly trained by maximizing a variational lower bound on the data log likelihood. These models often suffer from poor use of their latent variable, with ad-hoc annealing factors used to encourage retention of information in the latent variable. We discuss an alternative and general approach to latent variable modelling, based on an objective that combines the data log likelihood as well as the likelihood of a perfect reconstruction through an autoencoder. Tying these together ensures by design that the latent variable captures information about the observations, whilst retaining the ability to generate well. Interestingly, though this approach is a priori unrelated to VAEs, the lower bound attained is identical to the standard VAE bound but with the addition of a simple pre-factor; thus, providing a formal interpretation of the commonly used, ad-hoc pre-factors in training VAEs.

Submitted 12 June, 2018; originally announced June 2018.

Comments: 8 pages, 2 figures, 5 tables

arXiv:1806.04465 [pdf, other]

Gaussian mixture models with Wasserstein distance

Authors: Benoit Gaujac, Ilya Feige, David Barber

Abstract: Generative models with both discrete and continuous latent variables are highly motivated by the structure of many real-world data sets. They present, however, subtleties in training often manifesting in the discrete latent being under leveraged. In this paper, we show that such models are more amenable to training when using the Optimal Transport framework of Wasserstein Autoencoders. We find our discrete latent variable to be fully leveraged by the model when trained, without any modifications to the objective function or significant fine tuning. Our model generates comparable samples to other approaches while using relatively simple neural networks, since the discrete latent variable carries much of the descriptive burden. Furthermore, the discrete latent provides significant control over generation.

Submitted 12 June, 2018; originally announced June 2018.

Comments: 8 pages, 5 figures

arXiv:1804.09720 [pdf, other]

doi 10.1140/epjc/s10052-019-6607-9

JUNIPR: a Framework for Unsupervised Machine Learning in Particle Physics

Authors: Anders Andreassen, Ilya Feige, Christopher Frye, Matthew D. Schwartz

Abstract: In applications of machine learning to particle physics, a persistent challenge is how to go beyond discrimination to learn about the underlying physics. To this end, a powerful tool would be a framework for unsupervised learning, where the machine learns the intricate high-dimensional contours of the data upon which it is trained, without reference to pre-established labels. In order to approach such a complex task, an unsupervised network must be structured intelligently, based on a qualitative understanding of the data. In this paper, we scaffold the neural network's architecture around a leading-order model of the physics underlying the data. In addition to making unsupervised learning tractable, this design actually alleviates existing tensions between performance and interpretability. We call the framework JUNIPR: "Jets from UNsupervised Interpretable PRobabilistic models". In this approach, the set of particle momenta composing a jet are clustered into a binary tree that the neural network examines sequentially. Training is unsupervised and unrestricted: the network could decide that the data bears little correspondence to the chosen tree structure. However, when there is a correspondence, the network's output along the tree has a direct physical interpretation. JUNIPR models can perform discrimination tasks, through the statistically optimal likelihood-ratio test, and they permit visualizations of discrimination power at each branching in a jet's tree.
Additionally, JUNIPR models provide a probability distribution from which events can be drawn, providing a data-driven Monte Carlo generator. As a third application, JUNIPR models can reweight events from one (e.g. simulated) data set to agree with distributions from another (e.g. experimental) data set.

Submitted 25 April, 2018; originally announced April 2018.

Comments: 37 pages, 24 figures

arXiv:1703.03411 [pdf, other]

doi 10.1007/JHEP11(2017)142

A Complete Basis of Helicity Operators for Subleading Factorization

Authors: Ilya Feige, Daniel W. Kolodrubetz, Ian Moult, Iain W. Stewart

Abstract: Factorization theorems underlie our ability to make predictions for many processes involving the strong interaction. Although typically formulated at leading power, the study of factorization at subleading power is of interest both for improving the precision of calculations, as well as for understanding the all orders structure of QCD. We use the SCET helicity operator formalism to construct a complete power suppressed basis of hard scattering operators for $e^+e^-\to$ dijets, $e^- p\to e^-$ jet, and constrained Drell-Yan, including the first two subleading orders in the amplitude level power expansion. We analyze the form of the hard, jet, and soft function contributions to the power suppressed cross section for $e^+e^-\to$ dijet event shapes, and give results for the lowest order matching to the contributing operators. These results will be useful for studies of power corrections both in fixed order and resummed perturbation theory.

Submitted 9 March, 2017; originally announced March 2017.

Comments: 110 pages, many figures

Report number: MIT-CTP 4597

arXiv:1507.06315 [pdf, other]

doi 10.1007/JHEP08(2016)112

Streamlining resummed QCD calculations using Monte Carlo integration

Authors: David Farhi, Ilya Feige, Marat Freytsis, Matthew D. Schwartz

Abstract: Some of the most arduous and error-prone aspects of precision resummed calculations are related to the partonic hard process, having nothing to do with the resummation. In particular, interfacing to parton-distribution functions, combining various channels, and performing the phase space integration can be limiting factors in completing calculations. Conveniently, however, most of these tasks are already automated in many Monte Carlo programs, such as MadGraph, Alpgen or Sherpa. In this paper, we show how such programs can be used to produce distributions of partonic kinematics with associated color structures representing the hard factor in a resummed distribution. These distributions can then be used to weight convolutions of jet, soft and beam functions producing a complete resummed calculation. In fact, only around 1000 unweighted events are necessary to produce precise distributions. A number of examples and checks are provided, including $e^+e^-$ two- and four-jet event shapes, $n$-jettiness and jet-mass related observables at hadron colliders. Attached code can be used to modify MadGraph to export the relevant leading-order hard functions and color structures for arbitrary processes.

Submitted 22 July, 2015; originally announced July 2015.

Comments: 30 pages, 10 figures, code included with submission

arXiv:1502.05411 [pdf, other]

doi 10.1103/PhysRevD.91.094027

Removing phase-space restrictions in factorized cross sections

Authors: Ilya Feige, Matthew D. Schwartz, Kai Yan

Abstract: Factorization in gauge theories holds at the amplitude or amplitude-squared level for states of given soft or collinear momenta. When performing phase-space integrals over such states, one would generally like to avoid putting in explicit cuts to separate soft from collinear momenta. Removing these cuts induces an overcounting of the soft-collinear region and adds new infrared-ultraviolet divergences in the collinear region. In this paper, we first present a regulator-independent subtraction algorithm for removing soft-collinear overlap at the amplitude level which may be useful in perturbative QCD. We then discuss how both the soft-collinear and infrared-ultraviolet overlap can be undone for certain observables in a way which respects factorization. Our discussion clarifies some of the subtleties in phase-space subtractions and includes a proof of the infrared finiteness of a suitably subtracted jet function. These results complete the connection between factorized QCD and Soft-Collinear Effective Theory.

Submitted 18 February, 2015; originally announced February 2015.

Comments: 32 pages, 1 figure

Journal ref: Phys. Rev. D 91, 094027 (2015)

arXiv:1403.6472 [pdf, other]

doi 10.1103/PhysRevD.90.105020

Hard-Soft-Collinear Factorization to All Orders

Authors: Ilya Feige, Matthew D. Schwartz

Abstract: We provide a precise statement of hard-soft-collinear factorization of scattering amplitudes and prove it to all orders in perturbation theory. Factorization is formulated as the equality at leading power of scattering amplitudes in QCD with other amplitudes in QCD computed from a product of operator matrix elements. The equivalence is regulator independent and gauge independent. As the formulation relates amplitudes to the same amplitudes with additional soft or collinear particles, it includes as special cases the factorization of soft currents and collinear splitting functions from generic matrix elements, both of which are shown to be process independent to all orders. We show that the overlapping soft-collinear region is naturally accounted for by vacuum matrix elements of kinked Wilson lines. Although the proof is self-contained, it combines techniques developed for the study of pinch surfaces, scattering amplitudes, and effective field theory.

Submitted 6 March, 2015; v1 submitted 25 March, 2014; originally announced March 2014.

Comments: 88 pages. Version 3 is updated to match the PRD article

Journal ref: Phys. Rev. D 90, 105020 (2014)

arXiv:1306.6341 [pdf, other]

doi 10.1103/PhysRevD.88.065021

An on-shell approach to factorization

Authors: Ilya Feige, Matthew D. Schwartz

Abstract: Factorization is possible due to the universal behavior of Yang-Mills theories in soft and collinear limits. Here, we take a small step towards a more transparent understanding of these limits by proving a form of perturbative factorization at tree-level using on-shell spinor helicity methods. We present a concrete and self-contained expression of factorization in which matrix elements in QCD are related to products of other matrix elements in QCD up to leading order in a power-counting parameter determined by the momenta of certain physical on-shell states. Our approach uses only the scaling of momenta in soft and collinear limits, avoiding any assignment of scaling behavior to unphysical (and gauge-dependent) fields. The proof of factorization exploits many advantages of helicity spinors, such as the freedom to choose different reference vectors for polarizations in different collinear sectors. An advantage of this approach is that once factorization is shown to hold in QCD, the transition to Soft-Collinear Effective Theory is effortless.

Submitted 11 July, 2013; v1 submitted 26 June, 2013; originally announced June 2013.

Comments: 48 pages

arXiv:1204.3898 [pdf, ps, other]

doi 10.1103/PhysRevLett.109.092001

Precision Jet Substructure from Boosted Event Shapes

Authors: Ilya Feige, Matthew D. Schwartz, Iain W. Stewart, Jesse Thaler

Abstract: Jet substructure has emerged as a critical tool for LHC searches, but studies so far have relied heavily on shower Monte Carlo simulations, which formally approximate QCD at leading-log level. We demonstrate that systematic higher-order QCD computations of jet substructure can be carried out by boosting global event shapes by a large momentum Q, and accounting for effects due to finite jet size, initial-state radiation (ISR), and the underlying event (UE) as 1/Q corrections. In particular, we compute the 2-subjettiness substructure distribution for boosted Z -> q qbar events at the LHC at next-to-next-to-next-to-leading-log order. The calculation is greatly simplified by recycling the known results for the thrust distribution in e+ e- collisions. The 2-subjettiness distribution quickly saturates, becoming Q independent for Q > 400 GeV. Crucially, the effects of jet contamination from ISR/UE can be subtracted out analytically at large Q, without knowing their detailed form. Amusingly, the Q=infinity and Q=0 distributions are related by a scaling by e, up to next-to-leading-log order.

Submitted 17 September, 2012; v1 submitted 17 April, 2012; originally announced April 2012.

Comments: 5 pages, 7 figures

Journal ref: Phys. Rev. Lett. 109, 092001 (2012)

Showing 1–20 of 20 results for author: Feige, I