-
Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter
Authors:
M. Aamir,
B. Acar,
G. Adamov,
T. Adams,
C. Adloff,
S. Afanasiev,
C. Agrawal,
C. Agrawal,
A. Ahmad,
H. A. Ahmed,
S. Akbar,
N. Akchurin,
B. Akgul,
B. Akgun,
R. O. Akpinar,
E. Aktas,
A. AlKadhim,
V. Alexakhin,
J. Alimena,
J. Alison,
A. Alpana,
W. Alshehri,
P. Alvarez Dominguez,
M. Alyari,
C. Amendola
, et al. (550 additional authors not shown)
Abstract:
A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles read out by SiPMs, and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadronic section. The shower reconstruction method is based on graph neural networks and makes use of a dynamic reduction network architecture. It is shown that the algorithm is able to capture and mitigate the main effects that normally hinder the reconstruction of hadronic showers using classical reconstruction methods, by compensating for fluctuations in the multiplicity, energy, and spatial distributions of the shower's constituents. The performance of the algorithm is evaluated using test beam data collected in 2018 with a prototype of the CMS HGCAL accompanied by a section of the CALICE AHCAL prototype. The capability of the method to mitigate the impact of energy leakage from the calorimeter is also demonstrated.
Submitted 30 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Effects of oxygen on the optical properties of phenyl-based scintillators during irradiation and recovery
Authors:
C. Papageorgakis,
M. Y. Aamir,
A. Belloni,
T. K. Edberg,
S. C. Eno,
B. Kronheim,
C. Palmer
Abstract:
Plastic scintillators are a versatile and inexpensive option for particle detection, which is why the largest particle physics experiments, CMS and ATLAS, use them extensively in their calorimeters. One of their challenging aspects, however, is their relatively low radiation hardness, which might be inadequate for very high luminosity future projects like the FCC-hh. In this study, results on the effects of ionizing radiation on the optical properties of plastic scintillator samples are presented. The samples are made from two different matrix materials, polystyrene and polyvinyltoluene, and have been irradiated at dose rates ranging from $2.2\,$Gy/h up to $3.4\,$kGy/h at room temperature. An internal boundary that separates two regions of different indices of refraction is visible in the samples depending on the dose rate, and it is compatible with the expected oxygen penetration depth during irradiation. The dose rate dependence of the oxygen penetration depth for the two matrix materials suggests that the oxygen penetration coefficient differs for PS and PVT. The values of the refractive index for the internal regions are elevated compared to those of the outer regions, which are compatible with the indices of unirradiated samples.
Submitted 8 December, 2023; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Reduction of light output of plastic scintillator tiles during irradiation at cold temperatures and in low-oxygen environments
Authors:
B. Kronheim,
A. Belloni,
T. K. Edberg,
S. C. Eno,
C. Howe,
C. Palmer,
C. Papageorgakis,
M. Paranjpe,
S. Sriram
Abstract:
The advent of the silicon photomultiplier has allowed the development of highly segmented calorimeters using plastic scintillator as the active medium, with photodetectors embedded in the calorimeter, in dimples in the plastic. To reduce the photodetector's dark current and radiation damage, the high granularity calorimeter designed for the high luminosity upgrade of the CMS detector at CERN's Large Hadron Collider will be operated at a temperature of about -30$^\circ$C. Due to flammability considerations, a low oxygen environment is being considered. However, the radiation damage to the plastic scintillator during irradiation in this operating environment needs to be considered. In this paper, we present measurements of the relative decrease of light output during irradiation of small plastic scintillator tiles read out by silicon photomultipliers. The irradiations were performed using a $^{60}\mathrm{Co}$ source both to produce the tiles' light and as a source of ionizing irradiation at dose rates of 0.3, 1.3, and $1.6\,$Gy/hr, temperatures of -30, -15, -5, and 0$^\circ$C, and with several different oxygen concentrations in the surrounding atmosphere. The effect of the material used to wrap the tile was also studied. Substantial temporary damage, which annealed when the sample was warmed, was seen during the low-temperature irradiations, regardless of the oxygen concentration and wrapping material. The relative light loss was largest with 3M$^{\tiny \textrm{TM}}$ Enhanced Specular Reflector Film wrapping and smallest with no wrapping, although due to the substantially higher light yield with wrapping, the final light output is largest with wrapping. The light loss was less at warmer temperatures. Damage with $3\%$ oxygen was similar to that in standard atmosphere. Evidence of a plateau in the radical density was seen for the 0$^\circ$C data.
Submitted 3 August, 2023;
originally announced August 2023.
-
Neighborhood Adaptive Estimators for Causal Inference under Network Interference
Authors:
Alexandre Belloni,
Fei Fang,
Alexander Volfovsky
Abstract:
Estimating causal effects has become an integral part of most applied fields. Solving these modern causal questions requires tackling violations of many classical causal assumptions. In this work we consider the violation of the classical no-interference assumption, meaning that the treatment of one individual might affect the outcomes of another. To make interference tractable, we consider a known network that describes how interference may travel. However, unlike previous work in this area, the radius (and intensity) of the interference experienced by a unit is unknown and can depend on different sub-networks of those treated and untreated that are connected to this unit.
We study estimators for the average direct treatment effect on the treated in such a setting. The proposed estimator builds upon a Lepski-like procedure that searches over the possible relevant radii and treatment assignment patterns. In contrast to previous work, the proposed procedure aims to approximate the relevant network interference patterns. We establish oracle inequalities and corresponding adaptive rates for the estimation of the interference function. We leverage such estimates to propose and analyze two estimators for the average direct treatment effect on the treated. We address several challenges stemming from the data-driven creation of the patterns (i.e. feature engineering) and the network dependence. In addition to rates of convergence, under mild regularity conditions, we show that one of the proposed estimators is asymptotically normal and unbiased.
Submitted 7 December, 2022;
originally announced December 2022.
-
The Future of US Particle Physics -- The Snowmass 2021 Energy Frontier Report
Authors:
Meenakshi Narain,
Laura Reina,
Alessandro Tricoli,
Michael Begel,
Alberto Belloni,
Tulika Bose,
Antonio Boveia,
Sally Dawson,
Caterina Doglioni,
Ayres Freitas,
James Hirschauer,
Stefan Hoeche,
Yen-Jie Lee,
Huey-Wen Lin,
Elliot Lipeles,
Zhen Liu,
Patrick Meade,
Swagato Mukherjee,
Pavel Nadolsky,
Isobel Ojalvo,
Simone Pagan Griso,
Christophe Royon,
Michael Schmitt,
Reinhard Schwienhorst,
Nausheen Shah
, et al. (10 additional authors not shown)
Abstract:
This report, as part of the 2021 Snowmass Process, summarizes the current status of collider physics at the Energy Frontier, the broad and exciting future prospects identified for the Energy Frontier, the challenges and needs of future experiments, and indicates high priority research areas.
Submitted 3 January, 2023; v1 submitted 20 November, 2022;
originally announced November 2022.
-
Performance of the CMS High Granularity Calorimeter prototype to charged pion beams of 20$-$300 GeV/c
Authors:
B. Acar,
G. Adamov,
C. Adloff,
S. Afanasiev,
N. Akchurin,
B. Akgün,
M. Alhusseini,
J. Alison,
J. P. Figueiredo de sa Sousa de Almeida,
P. G. Dias de Almeida,
A. Alpana,
M. Alyari,
I. Andreev,
U. Aras,
P. Aspell,
I. O. Atakisi,
O. Bach,
A. Baden,
G. Bakas,
A. Bakshi,
S. Banerjee,
P. DeBarbaro,
P. Bargassa,
D. Barney,
F. Beaudette
, et al. (435 additional authors not shown)
Abstract:
The upgrade of the CMS experiment for the high luminosity operation of the LHC comprises the replacement of the current endcap calorimeter by a high granularity sampling calorimeter (HGCAL). The electromagnetic section of the HGCAL is based on silicon sensors interspersed between lead and copper (or copper tungsten) absorbers. The hadronic section uses layers of stainless steel as an absorbing medium and silicon sensors as an active medium in the regions of high radiation exposure, and scintillator tiles directly read out by silicon photomultipliers in the remaining regions. As part of the development of the detector and its readout electronic components, a section of a silicon-based HGCAL prototype detector along with a section of the CALICE AHCAL prototype was exposed to muons, electrons and charged pions in beam test experiments at the H2 beamline at the CERN SPS in October 2018. The AHCAL uses the same technology as foreseen for the HGCAL but with much finer longitudinal segmentation. The performance of the calorimeters in terms of energy response and resolution, and of longitudinal and transverse shower profiles, is studied using negatively charged pions and is compared to GEANT4 predictions. This is the first report summarizing results of hadronic showers measured by the HGCAL prototype using beam test data.
Submitted 27 May, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Report of the Topical Group on Electroweak Precision Physics and Constraining New Physics for Snowmass 2021
Authors:
Alberto Belloni,
Ayres Freitas,
Junping Tian,
Juan Alcaraz Maestre,
Aram Apyan,
Bianca Azartash-Namin,
Paolo Azzurri,
Swagato Banerjee,
Jakob Beyer,
Saptaparna Bhattacharya,
Jorge de Blas,
Alain Blondel,
Daniel Britzger,
Mogens Dam,
Yong Du,
David d'Enterria,
Keisuke Fujii,
Christophe Grojean,
Jiayin Gu,
Tao Han,
Michael Hildreth,
Adrián Irles,
Patrick Janot,
Daniel Jeans,
Mayuri Kawale,
Elham E Khoda
, et al. (43 additional authors not shown)
Abstract:
The precise measurement of physics observables and the test of their consistency within the standard model (SM) are an invaluable approach, complemented by direct searches for new particles, to determine the existence of physics beyond the standard model (BSM). Studies of massive electroweak gauge bosons (W and Z bosons) are a promising target for indirect BSM searches, since the interactions of photons and gluons are strongly constrained by the unbroken gauge symmetries. They can be divided into two categories: (a) Fermion scattering processes mediated by s- or t-channel W/Z bosons, also known as electroweak precision measurements; and (b) multi-boson processes, which include production of two or more vector bosons in fermion-antifermion annihilation, as well as vector boson scattering (VBS) processes. The latter categories can test modifications of gauge-boson self-interactions, and the sensitivity is typically improved with increased collision energy.
This report evaluates the achievable precision of a range of future experiments, which depend on the statistics of the collected data sample, the experimental and theoretical systematic uncertainties, and their correlations. In addition it presents a combined interpretation of these results, together with similar studies in the Higgs and top sector, in the Standard Model effective field theory (SMEFT) framework. This framework provides a model-independent prescription to put generic constraints on new physics and to study and combine large sets of experimental observables, assuming that the new physics scales are significantly higher than the EW scale.
Submitted 28 November, 2022; v1 submitted 16 September, 2022;
originally announced September 2022.
-
Dose rate effects in radiation-induced changes to phenyl-based polymeric scintillators
Authors:
Christos Papageorgakis,
Mohamad Al-Sheikhly,
Alberto Belloni,
Timothy K. Edberg,
Sarah C. Eno,
Yongbin Feng,
Geng-Yuan Jeng,
Abraham Kahn,
Yihui Lai,
Tyler McDonnell,
Christopher Palmer,
Ruhi Perez-Gokhale,
Francesca Ricci-Tam,
Yao Yao,
Zishuo Yang
Abstract:
Results on the effects of ionizing radiation on the signal produced by plastic scintillating rods manufactured by Eljen Technology company are presented for various matrix materials, dopant concentrations, fluors (EJ-200 and EJ-260), anti-oxidant concentrations, scintillator thickness, doses, and dose rates. The light output before and after irradiation is measured using an alpha source and a photomultiplier tube, and the light transmission by a spectrophotometer. Assuming an exponential decrease in the light output with dose, the change in light output is quantified using the exponential dose constant $D$. The $D$ values are similar for primary and secondary doping concentrations of 1 and 2 times, and for antioxidant concentrations of 0, 1, and 2 times, the default manufacturer's concentration. The $D$ value depends approximately linearly on the logarithm of the dose rate for dose rates between 2.2 Gy/hr and 70 Gy/hr for all materials. For EJ-200 polyvinyltoluene-based (PVT) scintillator, the dose constant is approximately linear in the logarithm of the dose rate up to 3400 Gy/hr, while for polystyrene-based (PS) scintillator or for both materials with EJ-260 fluors, it remains constant or decreases (depending on doping concentration) above about 100 Gy/hr. The results from rods of varying thickness and from the different fluors suggest damage to the initial light output is a larger effect than color center formation for scintillator thickness $\leq1$ cm. For the blue scintillator (EJ-200), the transmission measurements indicate damage to the fluors. We also find that while PVT is more resistant to radiation damage than PS at dose rates higher than about 100 Gy/hr for EJ-200 fluors, they show similar damage at lower dose rates and for EJ-260 fluors.
Submitted 8 August, 2023; v1 submitted 29 March, 2022;
originally announced March 2022.
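The dose-constant definition in the abstract above can be illustrated with a minimal sketch. It assumes the common parametrization $L(d) = L_0 \exp(-d/D)$, which is one reading of "an exponential decrease in the light output with dose"; the abstract itself does not spell out the formula. The function names `dose_constant` and `light_output` are hypothetical and are not taken from the paper.

```python
import math


def dose_constant(light_before, light_after, dose):
    """Return D, assuming L(dose) = L0 * exp(-dose / D).

    D comes out in the same units as `dose`. This parametrization is an
    assumption made for illustration, not a quote from the paper.
    """
    if light_before <= 0 or light_after <= 0 or dose <= 0:
        raise ValueError("light outputs and dose must be positive")
    if light_after >= light_before:
        raise ValueError("no light loss observed; D is undefined")
    return -dose / math.log(light_after / light_before)


def light_output(light_before, dose, d_const):
    """Predict the light output after `dose` for a given dose constant."""
    return light_before * math.exp(-dose / d_const)
```

Under this assumed form, a larger $D$ means the scintillator loses light more slowly with accumulated dose, which is why $D$ serves as a single-number figure of merit when comparing materials and dose rates.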
-
Dual-Readout Calorimetry for Future Experiments Probing Fundamental Physics
Authors:
I. Pezzotti,
Harvey Newman,
J. Freeman,
J. Hirschauer,
R. Ferrari,
G. Gaudio,
G. Polesello,
R. Santoro,
M. Lucchini,
S. Giagu,
F. Bedeschi,
Sehwook Lee,
P. Harris,
C. Tully,
A. Jung,
Nural Akchurin,
A. Belloni,
S. Eno,
J. Qian,
B. Zhou,
J. Zhu,
Jason Sang Hun Lee,
I. Vivarelli,
R. Hirosky,
Hwidong Yoo
Abstract:
In this White Paper for the 2021 Snowmass process, we detail the status and prospects for dual-readout calorimetry. While all calorimeters allow estimation of energy depositions in their active material, dual-readout calorimeters aim to provide additional information on the light produced in the sensitive media via, for example, wavelength and polarization, and/or precision timing measurements, allowing an estimation of the shower-by-shower particle content. Utilizing this knowledge of the shower particle content may allow unprecedented energy resolution for hadronic particles and jets and new types of particle flow algorithms. We also discuss the impact that continued development of this kind of calorimetry could have on the precision of Higgs boson property measurements at future colliders.
Submitted 4 May, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
Response of a CMS HGCAL silicon-pad electromagnetic calorimeter prototype to 20-300 GeV positrons
Authors:
B. Acar,
G. Adamov,
C. Adloff,
S. Afanasiev,
N. Akchurin,
B. Akgün,
F. Alam Khan,
M. Alhusseini,
J. Alison,
A. Alpana,
G. Altopp,
M. Alyari,
S. An,
S. Anagul,
I. Andreev,
P. Aspell,
I. O. Atakisi,
O. Bach,
A. Baden,
G. Bakas,
A. Bakshi,
S. Bannerjee,
P. Bargassa,
D. Barney,
F. Beaudette
, et al. (364 additional authors not shown)
Abstract:
The Compact Muon Solenoid Collaboration is designing a new high-granularity endcap calorimeter, HGCAL, to be installed later this decade. As part of this development work, a prototype system was built, with an electromagnetic section consisting of 14 double-sided structures, providing 28 sampling layers. Each sampling layer has a hexagonal module, where a multipad large-area silicon sensor is glued between an electronics circuit board and a metal baseplate. The sensor pads of approximately 1 cm$^2$ are wire-bonded to the circuit board and are read out by custom integrated circuits. The prototype was extensively tested with beams at CERN's Super Proton Synchrotron in 2018. Based on the data collected with beams of positrons, with energies ranging from 20 to 300 GeV, measurements of the energy resolution and linearity, the position and angular resolutions, and the shower shapes are presented and compared to a detailed Geant4 simulation.
Submitted 31 March, 2022; v1 submitted 12 November, 2021;
originally announced November 2021.
-
Test Beam Study of SiPM-on-Tile Configurations
Authors:
A. Belloni,
Y. M. Chen,
A. Dyshkant,
T. K. Edberg,
S. Eno,
V. Zutshi,
J. Freeman,
M. Krohn,
Y. Lai,
D. Lincoln,
S. Los,
J. Mans,
G. Reichenbach,
L. Uplegger,
S. A. Uzunyan
Abstract:
Light yield and spatial uniformity for a large variety of configurations of scintillator tiles were studied. The light from each scintillator was collected by a Silicon Photomultiplier (SiPM) directly viewing the produced scintillation light (SiPM-on-tile technique). The varied parameters included tile transverse size, tile thickness, tile wrapping material, scintillator composition, and SiPM model. These studies were performed using 120 GeV protons at the Fermilab Test Beam Facility. External tracking allowed the position of each proton penetrating a tile to be measured. The results were compared to a GEANT4 simulation of each configuration of scintillator, wrapping, and SiPM.
Submitted 17 May, 2021; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Construction and commissioning of CMS CE prototype silicon modules
Authors:
B. Acar,
G. Adamov,
C. Adloff,
S. Afanasiev,
N. Akchurin,
B. Akgün,
M. Alhusseini,
J. Alison,
G. Altopp,
M. Alyari,
S. An,
S. Anagul,
I. Andreev,
M. Andrews,
P. Aspell,
I. A. Atakisi,
O. Bach,
A. Baden,
G. Bakas,
A. Bakshi,
P. Bargassa,
D. Barney,
E. Becheva,
P. Behera,
A. Belloni
, et al. (307 additional authors not shown)
Abstract:
As part of its HL-LHC upgrade program, the CMS Collaboration is developing a High Granularity Calorimeter (CE) to replace the existing endcap calorimeters. The CE is a sampling calorimeter with unprecedented transverse and longitudinal readout for both electromagnetic (CE-E) and hadronic (CE-H) compartments. The calorimeter will be built with $\sim$30,000 hexagonal silicon modules. Prototype modules have been constructed with 6-inch hexagonal silicon sensors with cell areas of 1.1 cm$^2$, and the SKIROC2-CMS readout ASIC. Beam tests of different sampling configurations were conducted with the prototype modules at DESY and CERN in 2017 and 2018. This paper describes the construction and commissioning of the CE calorimeter prototype, the silicon modules used in the construction, their basic performance, and the methods used for their calibration.
Submitted 10 December, 2020;
originally announced December 2020.
-
The DAQ system of the 12,000 Channel CMS High Granularity Calorimeter Prototype
Authors:
B. Acar,
G. Adamov,
C. Adloff,
S. Afanasiev,
N. Akchurin,
B. Akgün,
M. Alhusseini,
J. Alison,
G. Altopp,
M. Alyari,
S. An,
S. Anagul,
I. Andreev,
M. Andrews,
P. Aspell,
I. A. Atakisi,
O. Bach,
A. Baden,
G. Bakas,
A. Bakshi,
P. Bargassa,
D. Barney,
E. Becheva,
P. Behera,
A. Belloni
, et al. (307 additional authors not shown)
Abstract:
The CMS experiment at the CERN LHC will be upgraded to accommodate the 5-fold increase in the instantaneous luminosity expected at the High-Luminosity LHC (HL-LHC). Concomitant with this increase will be an increase in the number of interactions in each bunch crossing and a significant increase in the total ionising dose and fluence. One part of this upgrade is the replacement of the current endcap calorimeters with a high granularity sampling calorimeter equipped with silicon sensors, designed to manage the high collision rates. As part of the development of this calorimeter, a series of beam tests have been conducted with different sampling configurations using prototype segmented silicon detectors. In the most recent of these tests, conducted in late 2018 at the CERN SPS, the performance of a prototype calorimeter equipped with ${\approx}12,000\rm{~channels}$ of silicon sensors was studied with beams of high-energy electrons, pions and muons. This paper describes the custom-built scalable data acquisition system, which was assembled from readily available FPGA mezzanines and low-cost Raspberry Pi computers.
Submitted 8 December, 2020; v1 submitted 7 December, 2020;
originally announced December 2020.
-
High Dimensional Latent Panel Quantile Regression with an Application to Asset Pricing
Authors:
Alexandre Belloni,
Mingli Chen,
Oscar Hernan Madrid Padilla,
Zixuan Wang
Abstract:
We propose a generalization of the linear panel quantile regression model to accommodate both \textit{sparse} and \textit{dense} parts: sparse means that while the number of covariates available is large, potentially only a much smaller number of them have a nonzero impact on each conditional quantile of the response variable; the dense part is represented by a low-rank matrix that can be approximated by latent factors and their loadings. Such a structure poses problems for traditional sparse estimators, such as the $\ell_1$-penalised Quantile Regression, and for traditional latent factor estimators, such as PCA. We propose a new estimation procedure, based on the ADMM algorithm, that combines the quantile loss function with $\ell_1$ \textit{and} nuclear norm regularization. We show, under general conditions, that our estimator can consistently estimate both the nonzero coefficients of the covariates and the latent low-rank matrix.
Our proposed model has a "Characteristics + Latent Factors" Asset Pricing Model interpretation: we apply our model and estimator to a large-dimensional panel of financial data and find that (i) characteristics have sparser predictive power once latent factors are controlled for, and (ii) the factors and coefficients at upper and lower quantiles are different from the median.
Submitted 23 August, 2022; v1 submitted 4 December, 2019;
originally announced December 2019.
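The combination of a quantile loss with $\ell_1$ and nuclear norm penalties described in the abstract above can be sketched as an objective function. This is only an illustration under stated assumptions: the array shapes, the averaging over observations, the names `check_loss` and `penalized_objective`, and the tuning parameters `lam_l1` and `lam_nuc` are hypothetical and are not taken from the paper. The ADMM solver that the paper uses to minimize such an objective is not shown.

```python
import numpy as np


def check_loss(residuals, tau):
    """Summed quantile (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    residuals = np.asarray(residuals, dtype=float)
    return float(np.sum(residuals * (tau - (residuals < 0))))


def penalized_objective(y, x, beta, low_rank, tau, lam_l1, lam_nuc):
    """Quantile loss plus l1 and nuclear norm penalties (illustrative only).

    Assumed shapes: y is (n, T), x is (n, T, p), beta is (p,),
    and low_rank is the (n, T) matrix standing in for the dense part.
    """
    fitted = x @ beta + low_rank                 # sparse part plus dense part
    loss = check_loss(y - fitted, tau) / y.size  # average check loss
    l1_penalty = lam_l1 * np.sum(np.abs(beta))
    nuclear_penalty = lam_nuc * np.linalg.norm(low_rank, ord="nuc")
    return float(loss + l1_penalty + nuclear_penalty)
```

The $\ell_1$ term encourages few nonzero covariate coefficients, while the nuclear norm (the sum of singular values) encourages the dense part to have low rank; both are convex surrogates, which is what makes a splitting method such as ADMM a natural choice.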
-
A high dimensional Central Limit Theorem for martingales, with applications to context tree models
Authors:
Alexandre Belloni,
Roberto I. Oliveira
Abstract:
We establish a central limit theorem for (a sequence of) multivariate martingales whose dimension potentially grows with the length $n$ of the martingale. Consequences of the results are Gaussian couplings and a multiplier bootstrap for the maximum of a multivariate martingale whose dimensionality $d$ can be as large as $e^{n^c}$ for some $c>0$. We also develop new anti-concentration bounds for the maximum component of a high-dimensional Gaussian vector, which we believe are of independent interest.
The results are applicable to a variety of settings. We fully develop their use in the estimation of context tree models (or variable length Markov chains) for discrete stationary time series. Specifically, we provide a bootstrap-based rule to tune several regularization parameters in a theoretically valid Lepski-type method. Such a bootstrap-based approach accounts for the correlation structure and leads to potentially smaller penalty choices, which in turn improve the estimation of the transition probabilities.
Submitted 7 September, 2018;
originally announced September 2018.
-
Latent Agents in Networks: Estimation and Targeting
Authors:
Baris Ata,
Alexandre Belloni,
Ozan Candogan
Abstract:
We consider a network of agents. Associated with each agent are her covariate and outcome. Agents influence each other's outcomes according to a certain connection/influence structure. A subset of the agents participate on a platform, and hence, are observable to it. The rest are not observable to the platform and are called the latent agents. The platform does not know the influence structure of the observable or the latent parts of the network. It only observes the data on past covariates and decisions of the observable agents. Observable agents influence each other both directly and indirectly through the influence they exert on the latent agents.
We investigate how the platform can estimate the dependence of the observable agents' outcomes on their covariates, taking the latent agents into account. First, we show that this relationship can be succinctly captured by a matrix and provide an algorithm for estimating it under a suitable approximate sparsity condition using historical data of covariates and outcomes for the observable agents. We also obtain convergence rates for the proposed estimator despite the high dimensionality that allows more agents than observations. Second, we show that the approximate sparsity condition holds under the standard conditions used in the literature. Hence, our results apply to a large class of networks. Finally, we apply our results to two practical settings: targeted advertising and promotional pricing. We show that by using the available historical data with our estimator, it is possible to obtain asymptotically optimal advertising/pricing decisions, despite the presence of latent agents.
Submitted 26 January, 2022; v1 submitted 14 August, 2018;
originally announced August 2018.
-
Subvector Inference in Partially Identified Models with Many Moment Inequalities
Authors:
Alexandre Belloni,
Federico Bugni,
Victor Chernozhukov
Abstract:
This paper considers inference for a function of a parameter vector in a partially identified model with many moment inequalities. This framework allows the number of moment conditions to grow with the sample size, possibly at exponential rates. Our main motivating application is subvector inference, i.e., inference on a single component of the partially identified parameter vector associated with a treatment effect or a policy variable of interest.
Our inference method compares a MinMax test statistic (minimum over parameters satisfying $H_0$ and maximum over moment inequalities) against critical values that are based on bootstrap approximations or analytical bounds. We show that this method controls asymptotic size uniformly over a large class of data generating processes despite the partially identified many moment inequality setting. The finite sample analysis allows us to obtain explicit rates of convergence on the size control. Our results are based on combining non-asymptotic approximations and new high-dimensional central limit theorems for the MinMax of the components of random matrices. Unlike the previous literature on functional inference in partially identified models, our results do not rely on weak convergence results based on Donsker's class assumptions and, in fact, our test statistic may not even converge in distribution. Our bootstrap approximation requires the choice of a tuning parameter sequence that can avoid the excessive concentration of our test statistic. To this end, we propose an asymptotically valid data-driven method to select this tuning parameter sequence. This method generalizes the selection of tuning parameter sequences to problems outside the Donsker's class assumptions and may also be of independent interest. Our procedures based on self-normalized moderate deviation bounds are relatively more conservative but easier to implement.
Submitted 29 June, 2018;
originally announced June 2018.
-
High-Dimensional Econometrics and Regularized GMM
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Denis Chetverikov,
Christian Hansen,
Kengo Kato
Abstract:
This chapter presents key concepts and theoretical results for analyzing estimation and inference in high-dimensional models. High-dimensional models are characterized by having a number of unknown parameters that is not vanishingly small relative to the sample size. We first present results in a framework where estimators of parameters of interest may be represented directly as approximate means. Within this context, we review fundamental results including high-dimensional central limit theorems, bootstrap approximation of high-dimensional limit distributions, and moderate deviation theory. We also review key concepts underlying inference when many parameters are of interest such as multiple testing with family-wise error rate or false discovery rate control. We then turn to a general high-dimensional minimum distance framework with a special focus on generalized method of moments problems where we present results for estimation and inference about model parameters. The presented results cover a wide array of econometric applications, and we discuss several leading special cases including high-dimensional linear regression and linear instrumental variables models to illustrate the general results.
Submitted 10 June, 2018; v1 submitted 5 June, 2018;
originally announced June 2018.
-
Simultaneous Confidence Intervals for High-dimensional Linear Models with Many Endogenous Variables
Authors:
Alexandre Belloni,
Christian Hansen,
Whitney Newey
Abstract:
High-dimensional linear models with endogenous variables play an increasingly important role in the recent econometric literature. In this work we allow for models with many endogenous variables and many instrumental variables to achieve identification. Because of the high dimensionality in the second stage, constructing honest confidence regions with asymptotically correct coverage is non-trivial. Our main contribution is to propose estimators and confidence regions that achieve this. The approach relies on moment conditions that have an additional orthogonality property with respect to nuisance parameters. Moreover, estimation of high-dimensional nuisance parameters is carried out via new pivotal procedures. In order to achieve simultaneously valid confidence regions we use a multiplier bootstrap procedure to compute critical values and establish its validity.
Submitted 28 August, 2019; v1 submitted 21 December, 2017;
originally announced December 2017.
-
Pivotal Estimation via Self-Normalization for High-Dimensional Linear Models with Error in Variables
Authors:
Alexandre Belloni,
Abhishek Kaul,
Mathieu Rosenbaum
Abstract:
We propose a new estimator for the high-dimensional linear regression model with observation error in the design where the number of coefficients is potentially larger than the sample size. The main novelty of our procedure is that the choice of penalty parameters is pivotal. The estimator is based on applying a self-normalization to the constraints that characterize the estimator. Importantly, we show how to cast the computation of the estimator as the solution of a convex program with second order cone constraints. This allows the use of algorithms with theoretical guarantees and reliable implementation. Under sparsity assumptions, we derive $\ell_q$-rates of convergence and show that consistency can be achieved even if the number of regressors exceeds the sample size. We further provide a simple-to-implement rule to threshold the estimator that yields a provably sparse estimator with similar $\ell_2$ and $\ell_1$-rates of convergence. The thresholds are data-driven and component-dependent. Finally, we also study the rates of convergence of estimators that refit the data based on a selected support with possible model selection mistakes. In addition to our finite sample theoretical results that allow for non-i.i.d. data, we also present simulations to compare the performance of the proposed estimators.
Submitted 6 September, 2019; v1 submitted 28 August, 2017;
originally announced August 2017.
-
US Cosmic Visions: New Ideas in Dark Matter 2017: Community Report
Authors:
Marco Battaglieri,
Alberto Belloni,
Aaron Chou,
Priscilla Cushman,
Bertrand Echenard,
Rouven Essig,
Juan Estrada,
Jonathan L. Feng,
Brenna Flaugher,
Patrick J. Fox,
Peter Graham,
Carter Hall,
Roni Harnik,
JoAnne Hewett,
Joseph Incandela,
Eder Izaguirre,
Daniel McKinsey,
Matthew Pyle,
Natalie Roe,
Gray Rybka,
Pierre Sikivie,
Tim M. P. Tait,
Natalia Toro,
Richard Van De Water,
Neal Weiner
, et al. (226 additional authors not shown)
Abstract:
This white paper summarizes the workshop "U.S. Cosmic Visions: New Ideas in Dark Matter" held at University of Maryland on March 23-25, 2017.
Submitted 14 July, 2017;
originally announced July 2017.
-
Confidence Bands for Coefficients in High Dimensional Linear Models with Error-in-variables
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Abhishek Kaul
Abstract:
We study high-dimensional linear models with error-in-variables. Such models are motivated by various applications in econometrics, finance and genetics. These models are challenging because of the need to account for measurement errors to avoid non-vanishing biases in addition to handling the high dimensionality of the parameters. A recent growing literature has proposed various estimators that achieve good rates of convergence. Our main contribution complements this literature with the construction of simultaneous confidence regions for the parameters of interest in such high-dimensional linear models with error-in-variables.
These confidence regions are based on the construction of moment conditions that have an additional orthogonality property with respect to nuisance parameters. We provide a construction that requires us to estimate an additional high-dimensional linear model with error-in-variables for each component of interest. We use a multiplier bootstrap to compute critical values for simultaneous confidence intervals for a subset $S$ of the components. We show its validity despite possible model selection mistakes, and allowing for the cardinality of $S$ to be larger than the sample size.
We apply and discuss the implications of our results to two examples and conduct Monte Carlo simulations to illustrate the performance of the proposed procedure.
Submitted 1 March, 2017;
originally announced March 2017.
-
quantreg.nonpar: An R Package for Performing Nonparametric Series Quantile Regression
Authors:
Michael Lipsitz,
Alexandre Belloni,
Victor Chernozhukov,
Iván Fernández-Val
Abstract:
The R package quantreg.nonpar implements nonparametric quantile regression methods to estimate and make inference on partially linear quantile models. quantreg.nonpar obtains point estimates of the conditional quantile function and its derivatives based on series approximations to the nonparametric part of the model. It also provides pointwise and uniform confidence intervals over a region of covariate values and/or quantile indices for the same functions using analytical and resampling methods. This paper serves as an introduction to the package and displays basic functionality of the functions contained within.
Submitted 26 October, 2016;
originally announced October 2016.
-
Quantile Graphical Models: Prediction and Conditional Independence with Applications to Systemic Risk
Authors:
Alexandre Belloni,
Mingli Chen,
Victor Chernozhukov
Abstract:
We propose two types of Quantile Graphical Models (QGMs) --- Conditional Independence Quantile Graphical Models (CIQGMs) and Prediction Quantile Graphical Models (PQGMs). CIQGMs characterize the conditional independence of distributions by evaluating the distributional dependence structure at each quantile index. As such, CIQGMs can be used for validation of the graph structure in the causal graphical models (\cite{pearl2009causality, robins1986new, heckman2015causal}). One main advantage of these models is that we can apply them to large collections of variables driven by non-Gaussian and non-separable shocks. PQGMs characterize the statistical dependencies through the graphs of the best linear predictors under asymmetric loss functions. PQGMs make weaker assumptions than CIQGMs as they allow for misspecification. Because of QGMs' ability to handle large collections of variables and focus on specific parts of the distributions, we could apply them to quantify tail interdependence. The resulting tail risk network can be used for measuring systemic risk contributions that help make inroads in understanding international financial contagion and dependence structures of returns under downside market movements.
We develop estimation and inference methods for QGMs focusing on the high-dimensional case, where the number of variables in the graph is large compared to the number of observations. For CIQGMs, these methods and results include valid simultaneous choices of penalty functions, uniform rates of convergence, and confidence regions that are simultaneously valid. We also derive analogous results for PQGMs, which include new results for penalized quantile regressions in high-dimensional settings to handle misspecification, many controls, and a continuum of additional conditioning events.
Submitted 28 October, 2019; v1 submitted 1 July, 2016;
originally announced July 2016.
-
Uniformly Valid Post-Regularization Confidence Regions for Many Functional Parameters in Z-Estimation Framework
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Denis Chetverikov,
Ying Wei
Abstract:
In this paper we develop procedures to construct simultaneous confidence bands for $\tilde p$ potentially infinite-dimensional parameters after model selection for general moment condition models where $\tilde p$ is potentially much larger than the sample size of available data, $n$. This allows us to cover settings with functional response data where each of the $\tilde p$ parameters is a function. The procedure is based on the construction of score functions that satisfy a certain orthogonality condition. The proposed simultaneous confidence bands rely on uniform central limit theorems for high-dimensional vectors (and not on Donsker arguments as we allow for $\tilde p \gg n$). To construct the bands, we employ a multiplier bootstrap procedure which is computationally efficient as it only involves resampling the estimated score functions (and does not require re-solving the high-dimensional optimization problems). We formally apply the general theory to inference on the regression coefficient process in the distribution regression model with a logistic link, where two implementations are analyzed in detail. Simulations and an application to real data are provided to help illustrate the applicability of the results.
Submitted 3 February, 2019; v1 submitted 23 December, 2015;
originally announced December 2015.
-
Escaping the Local Minima via Simulated Annealing: Optimization of Approximately Convex Functions
Authors:
Alexandre Belloni,
Tengyuan Liang,
Hariharan Narayanan,
Alexander Rakhlin
Abstract:
We consider the problem of optimizing an approximately convex function over a bounded convex set in $\mathbb{R}^n$ using only function evaluations. The problem is reduced to sampling from an \emph{approximately} log-concave distribution using the Hit-and-Run method, which is shown to have the same $\mathcal{O}^*$ complexity as sampling from log-concave distributions. In addition to extending the analysis for log-concave distributions to approximately log-concave distributions, the implementation of the 1-dimensional sampler of the Hit-and-Run walk requires new methods and analysis. The algorithm is then based on simulated annealing, which does not rely on first order conditions and is therefore essentially immune to local minima.
We then apply the method to different motivating problems. In the context of zeroth order stochastic convex optimization, the proposed method produces an $ε$-minimizer after $\mathcal{O}^*(n^{7.5}ε^{-2})$ noisy function evaluations by inducing a $\mathcal{O}(ε/n)$-approximately log concave distribution. We also consider in detail the case when the "amount of non-convexity" decays towards the optimum of the function. Other applications of the method discussed in this work include private computation of empirical risk minimizers, two-stage stochastic programming, and approximate dynamic programming for online learning.
Submitted 15 June, 2015; v1 submitted 28 January, 2015;
originally announced January 2015.
-
An $\{l_1,l_2,l_{\infty}\}$-Regularization Approach to High-Dimensional Errors-in-variables Models
Authors:
Alexandre Belloni,
Mathieu Rosenbaum,
Alexandre B. Tsybakov
Abstract:
Several new estimation methods have been recently proposed for the linear regression model with observation error in the design. Different assumptions on the data generating process have motivated different estimators and analysis. In particular, the literature considered (1) observation errors in the design uniformly bounded by some $\bar δ$, and (2) zero mean independent observation errors. Under the first assumption, the rates of convergence of the proposed estimators depend explicitly on $\bar δ$, while the second assumption has been applied when an estimator for the second moment of the observational error is available. This work proposes and studies two new estimators which, compared to other procedures for regression models with errors in the design, exploit an additional $l_{\infty}$-norm regularization. The first estimator is applicable when both (1) and (2) hold but does not require an estimator for the second moment of the observational error. The second estimator is applicable under (2) and requires an estimator for the second moment of the observation error. Importantly, we impose no assumption on the accuracy of this pilot estimator, in contrast to the previously known procedures. As in the recent proposals, we allow the number of covariates to be much larger than the sample size. We establish the rates of convergence of the estimators and compare them with the bounds obtained for related estimators in the literature. These comparisons yield interesting insights into the interplay of the assumptions and the achievable rates of convergence.
△ Less
Submitted 22 December, 2014;
originally announced December 2014.
-
Inference in High Dimensional Panel Models with an Application to Gun Control
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Christian Hansen,
Damian Kozbur
Abstract:
We consider estimation and inference in panel data models with additive unobserved individual specific heterogeneity in a high dimensional setting. The setting allows the number of time varying regressors to be larger than the sample size. To make informative estimation and inference feasible, we require that the overall contribution of the time varying variables after eliminating the individual specific heterogeneity can be captured by a relatively small number of the available variables whose identities are unknown. This restriction allows the problem of estimation to proceed as a variable selection problem. Importantly, we treat the individual specific heterogeneity as fixed effects which allows this heterogeneity to be related to the observed time varying variables in an unspecified way and allows that this heterogeneity may be non-zero for all individuals. Within this framework, we provide procedures that give uniformly valid inference over a fixed subset of parameters in the canonical linear fixed effects model and over coefficients on a fixed vector of endogenous variables in panel data instrumental variables models with fixed effects and many instruments. An input to developing the properties of our proposed procedures is the use of a variant of the Lasso estimator that allows for a grouped data structure where data across groups are independent and dependence within groups is unrestricted. We provide formal conditions within this structure under which the proposed Lasso variant selects a sparse model with good approximation properties. We present simulation results in support of the theoretical developments and illustrate the use of the methods in an application aimed at estimating the effect of gun prevalence on crime rates.
Submitted 24 November, 2014;
originally announced November 2014.
-
Observation of the rare $B^0_s\toμ^+μ^-$ decay from the combined analysis of CMS and LHCb data
Authors:
The CMS,
LHCb Collaborations,
V. Khachatryan,
A. M. Sirunyan,
A. Tumasyan,
W. Adam,
T. Bergauer,
M. Dragicevic,
J. Erö,
M. Friedl,
R. Frühwirth,
V. M. Ghete,
C. Hartl,
N. Hörmann,
J. Hrubec,
M. Jeitler,
W. Kiesenhofer,
V. Knünz,
M. Krammer,
I. Krätschmer,
D. Liko,
I. Mikulec,
D. Rabady,
B. Rahbaran
, et al. (2807 additional authors not shown)
Abstract:
A joint measurement is presented of the branching fractions $B^0_s\toμ^+μ^-$ and $B^0\toμ^+μ^-$ in proton-proton collisions at the LHC by the CMS and LHCb experiments. The data samples were collected in 2011 at a centre-of-mass energy of 7 TeV, and in 2012 at 8 TeV. The combined analysis produces the first observation of the $B^0_s\toμ^+μ^-$ decay, with a statistical significance exceeding six standard deviations, and the best measurement of its branching fraction so far. Furthermore, evidence for the $B^0\toμ^+μ^-$ decay is obtained with a statistical significance of three standard deviations. The branching fraction measurements are statistically compatible with SM predictions and impose stringent constraints on several theories beyond the SM.
Submitted 17 August, 2015; v1 submitted 17 November, 2014;
originally announced November 2014.
-
Linear and Conic Programming Estimators in High-Dimensional Errors-in-variables Models
Authors:
Alexandre Belloni,
Mathieu Rosenbaum,
Alexandre Tsybakov
Abstract:
We consider the linear regression model with observation error in the design. In this setting, we allow the number of covariates to be much larger than the sample size. Several new estimation methods have been recently introduced for this model. Indeed, the standard Lasso estimator or Dantzig selector turn out to become unreliable when only noisy regressors are available, which is quite common in practice. We show in this work that under suitable sparsity assumptions, the procedure introduced in Rosenbaum and Tsybakov (2013) is almost optimal in a minimax sense and, despite non-convexities, can be efficiently computed by a single linear programming problem. Furthermore, we provide an estimator attaining the minimax efficiency bound. This estimator is written as a second order cone programming minimisation problem which can be solved numerically in polynomial time.
Submitted 3 July, 2016; v1 submitted 1 August, 2014;
originally announced August 2014.
-
Valid Post-Selection Inference in High-Dimensional Approximately Sparse Quantile Regression Models
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Kengo Kato
Abstract:
This work proposes new inference methods for a regression coefficient of interest in a (heterogeneous) quantile regression model. We consider a high-dimensional model where the number of regressors potentially exceeds the sample size but a subset of them suffices to construct a reasonable approximation to the conditional quantile function. The proposed methods are (explicitly or implicitly) based on orthogonal score functions that protect against moderate model selection mistakes, which are often inevitable in the approximately sparse model considered in the present paper. We establish the uniform validity of the proposed confidence regions for the quantile regression coefficient. Importantly, these methods directly apply to more than one variable and a continuum of quantile indices. In addition, the performance of the proposed methods is illustrated through Monte-Carlo experiments and an empirical example, dealing with risk factors in childhood malnutrition.
Submitted 23 June, 2016; v1 submitted 26 December, 2013;
originally announced December 2013.
-
Program Evaluation and Causal Inference with High-Dimensional Data
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Ivan Fernández-Val,
Christian Hansen
Abstract:
In this paper, we provide efficient estimators and honest confidence bands for a variety of treatment effects including local average (LATE) and local quantile treatment effects (LQTE) in data-rich environments. We can handle very many control variables, endogenous receipt of treatment, heterogeneous treatment effects, and function-valued outcomes. Our framework covers the special case of exogenous receipt of treatment, either conditional on controls or unconditionally as in randomized control trials. In the latter case, our approach produces efficient estimators and honest bands for (functional) average treatment effects (ATE) and quantile treatment effects (QTE). To make informative inference possible, we assume that key reduced form predictive relationships are approximately sparse. This assumption allows the use of regularization and selection methods to estimate those relations, and we provide methods for post-regularization and post-selection inference that are uniformly valid (honest) across a wide-range of models. We show that a key ingredient enabling honest inference is the use of orthogonal or doubly robust moment conditions in estimating certain reduced form functional parameters. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) eligibility and participation on accumulated assets.
Submitted 5 January, 2018; v1 submitted 11 November, 2013;
originally announced November 2013.
-
Supplementary Appendix for "Inference on Treatment Effects After Selection Amongst High-Dimensional Controls"
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Christian Hansen
Abstract:
In this supplementary appendix we provide additional results, omitted proofs and extensive simulations that complement the analysis of the main text (arXiv:1201.0224).
Submitted 20 June, 2013; v1 submitted 26 May, 2013;
originally announced May 2013.
-
Post-Selection Inference for Generalized Linear Models with Many Controls
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Ying Wei
Abstract:
This paper considers generalized linear models in the presence of many controls. We lay out a general methodology to estimate an effect of interest based on the construction of an instrument that immunizes against model selection mistakes and apply it to the case of the logistic binary choice model. More specifically, we propose new methods for estimating and constructing confidence regions for a regression parameter of primary interest $α_0$, a parameter in front of the regressor of interest, such as the treatment variable or a policy variable. These methods allow estimation of $α_0$ at the root-$n$ rate when the total number $p$ of other regressors, called controls, potentially exceeds the sample size $n$, using sparsity assumptions. The sparsity assumption means that there is a subset of $s<n$ controls which suffices to accurately approximate the nuisance part of the regression function. Importantly, the estimators and the resulting confidence regions are valid uniformly over $s$-sparse models satisfying $s^2\log^2 p = o(n)$ and other technical conditions. These procedures do not rely on traditional consistent model selection arguments for their validity. In fact, they are robust with respect to moderate model selection mistakes in variable selection. Under suitable conditions, the estimators are semi-parametrically efficient in the sense of attaining the semi-parametric efficiency bounds for the class of models in this paper.
Submitted 21 March, 2016; v1 submitted 14 April, 2013;
originally announced April 2013.
-
Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Kengo Kato
Abstract:
We develop uniformly valid confidence regions for regression coefficients in a high-dimensional sparse median regression model with homoscedastic errors. Our methods are based on a moment equation that is immunized against non-regular estimation of the nuisance part of the median regression function by using Neyman's orthogonalization. We establish that the resulting instrumental median regression estimator of a target regression coefficient is asymptotically normally distributed uniformly with respect to the underlying sparse model and is semi-parametrically efficient. We also generalize our method to a general non-smooth Z-estimation framework with the number of target parameters $p_1$ being possibly much larger than the sample size $n$. We extend Huber's results on asymptotic normality to this setting, demonstrating uniform asymptotic normality of the proposed estimators over $p_1$-dimensional rectangles, constructing simultaneous confidence bands on all of the $p_1$ target parameters, and establishing asymptotic validity of the bands uniformly over underlying approximately sparse models.
Keywords: Instrument; Post-selection inference; Sparsity; Neyman's Orthogonal Score test; Uniformly valid inference; Z-estimation.
Submitted 18 October, 2020; v1 submitted 31 March, 2013;
originally announced April 2013.
-
Some New Asymptotic Theory for Least Squares Series: Pointwise and Uniform Results
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Denis Chetverikov,
Kengo Kato
Abstract:
In applications it is common that the exact form of a conditional expectation is unknown, and having flexible functional forms can lead to improvements. The series method offers this flexibility by approximating the unknown function with $k$ basis functions, where $k$ is allowed to grow with the sample size $n$. We consider series estimators for the conditional mean in light of: (i) sharp LLNs for matrices derived from the noncommutative Khinchin inequalities, (ii) bounds on the Lebesgue factor that controls the ratio between the $L^\infty$ and $L_2$-norms of approximation errors, (iii) maximal inequalities for processes whose entropy integrals diverge, and (iv) strong approximations to series-type processes.
These technical tools allow us to contribute to the series literature, specifically the seminal work of Newey (1997), as follows. First, we weaken the condition on the number $k$ of approximating functions used in series estimation from the typical $k^2/n \to 0$ to $k/n \to 0$, up to log factors, which was available only for spline series before. Second, we derive $L_2$ rates and pointwise central limit theorem results when the approximation error vanishes. Under an incorrectly specified model, i.e. when the approximation error does not vanish, analogous results are also shown. Third, under stronger conditions we derive uniform rates and functional central limit theorems that hold whether or not the approximation error vanishes. That is, we derive the strong approximation for the entire estimate of the nonparametric function.
We derive uniform rates, Gaussian approximations, and uniform confidence bands for a wide collection of linear functionals of the conditional expectation function.
Submitted 17 June, 2015; v1 submitted 3 December, 2012;
originally announced December 2012.
-
Inference on Treatment Effects After Selection Amongst High-Dimensional Controls
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Christian Hansen
Abstract:
We propose robust methods for inference on the effect of a treatment variable on a scalar outcome in the presence of very many controls. Our setting is a partially linear model with possibly non-Gaussian and heteroscedastic disturbances. Our analysis allows the number of controls to be much larger than the sample size. To make informative inference feasible, we require the model to be approximately sparse; that is, we require that the effect of confounding factors can be controlled for up to a small approximation error by conditioning on a relatively small number of controls whose identities are unknown. The latter condition makes it possible to estimate the treatment effect by selecting approximately the right set of controls. We develop a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the "post-double-selection" method. Our results apply to Lasso-type methods used for covariate selection as well as to any other model selection method that is able to find a sparse model with good approximation properties.
The main attractive feature of our method is that it allows for imperfect selection of the controls and provides confidence intervals that are valid uniformly across a large class of models. In contrast, standard post-model selection estimators fail to provide uniform inference even in simple cases with a small, fixed number of controls. Thus our method resolves the problem of uniform inference after model selection for a large, interesting class of models. We illustrate the use of the developed methods with numerical simulations and an application to the effect of abortion on crime rates.
Submitted 9 May, 2012; v1 submitted 30 December, 2011;
originally announced January 2012.
-
Inference for High-Dimensional Sparse Econometric Models
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Christian Hansen
Abstract:
This article is about estimation and inference methods for high dimensional sparse (HDS) regression models in econometrics. High dimensional sparse models arise in situations where many regressors (or series terms) are available and the regression function is well-approximated by a parsimonious, yet unknown set of regressors. The latter condition makes it possible to estimate the entire regression function effectively by searching for approximately the right set of regressors. We discuss methods for identifying this set of regressors and estimating their coefficients based on $\ell_1$-penalization and describe key theoretical results. In order to capture realistic practical situations, we expressly allow for imperfect selection of regressors and study the impact of this imperfect selection on estimation and inference results. We focus the main part of the article on the use of HDS models and methods in the instrumental variables model and the partially linear model. We present a set of novel inference results for these models and illustrate their use with applications to returns to schooling and growth regression.
Submitted 30 December, 2011;
originally announced January 2012.
-
Approximate group context tree
Authors:
Alexandre Belloni,
Roberto I. Oliveira
Abstract:
We study a variable length Markov chain model associated with a group of stationary processes that share the same context tree but each process has potentially different conditional probabilities. We propose a new model selection and estimation method which is computationally efficient. We develop oracle and adaptivity inequalities, as well as model selection properties, that hold under continuity of the transition probabilities and polynomial $β$-mixing. In particular, model misspecification is allowed.
These results are applied to interesting families of processes. For Markov processes, we obtain uniform rate of convergence for the estimation error of transition probabilities as well as perfect model selection results. For chains of infinite order with complete connections, we obtain explicit uniform rates of convergence on the estimation of conditional probabilities, which have an explicit dependence on the processes' continuity rates. Similar guarantees are also derived for renewal processes.
Our results are shown to be applicable to discrete stochastic dynamic programming problems and to dynamic discrete choice models. We also apply our estimator to a linguistic study, based on recent work, by Galves et al (2012), of the rhythmic differences between Brazilian and European Portuguese.
Submitted 30 December, 2015; v1 submitted 1 July, 2011;
originally announced July 2011.
-
High Dimensional Sparse Econometric Models: An Introduction
Authors:
Alexandre Belloni,
Victor Chernozhukov
Abstract:
In this chapter we discuss conceptually high dimensional sparse econometric models as well as estimation of these models using L1-penalization and post-L1-penalization methods. Focusing on linear and nonparametric regression frameworks, we discuss various econometric examples, present basic theoretical results, and illustrate the concepts and methods with Monte Carlo simulations and an empirical application. In the application, we examine and confirm the empirical validity of the Solow-Swan model for international economic growth.
Submitted 1 September, 2011; v1 submitted 26 June, 2011;
originally announced June 2011.
-
Conditional Quantile Processes based on Series or Many Regressors
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Denis Chetverikov,
Iván Fernández-Val
Abstract:
Quantile regression (QR) is a principal regression method for analyzing the impact of covariates on outcomes. The impact is described by the conditional quantile function and its functionals. In this paper we develop the nonparametric QR-series framework, covering many regressors as a special case, for performing inference on the entire conditional quantile function and its linear functionals. In this framework, we approximate the entire conditional quantile function by a linear combination of series terms with quantile-specific coefficients and estimate the function-valued coefficients from the data. We develop large sample theory for the QR-series coefficient process, namely we obtain uniform strong approximations to the QR-series coefficient process by conditionally pivotal and Gaussian processes. Based on these strong approximations, or couplings, we develop four resampling methods (pivotal, gradient bootstrap, Gaussian, and weighted bootstrap) that can be used for inference on the entire QR-series coefficient function.
We apply these results to obtain estimation and inference methods for linear functionals of the conditional quantile function, such as the conditional quantile function itself, its partial derivatives, average partial derivatives, and conditional average partial derivatives. Specifically, we obtain uniform rates of convergence and show how to use the four resampling methods mentioned above for inference on the functionals. All of the above results are for function-valued parameters, holding uniformly in both the quantile index and the covariate value, and covering the pointwise case as a by-product. We demonstrate the practical utility of these results with an example, where we estimate the price elasticity function and test the Slutsky condition of the individual demand for gasoline, as indexed by the individual unobserved propensity for gasoline consumption.
Submitted 9 August, 2018; v1 submitted 30 May, 2011;
originally announced May 2011.
-
Pivotal estimation via square-root Lasso in nonparametric regression
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Lie Wang
Abstract:
We propose a self-tuning $\sqrt{\mathrm {Lasso}}$ method that simultaneously resolves three important practical problems in high-dimensional regression analysis, namely it handles the unknown scale, heteroscedasticity and (drastic) non-Gaussianity of the noise. In addition, our analysis allows for badly behaved designs, for example, perfectly collinear regressors, and generates sharp bounds even in extreme cases, such as the infinite variance case and the noiseless case, in contrast to Lasso. We establish various nonasymptotic bounds for $\sqrt{\mathrm {Lasso}}$ including prediction norm rate and sparsity. Our analysis is based on new impact factors that are tailored for bounding prediction norm. In order to cover heteroscedastic non-Gaussian noise, we rely on moderate deviation theory for self-normalized sums to achieve Gaussian-like results under weak conditions. Moreover, we derive bounds on the performance of ordinary least squares (OLS) applied to the model selected by $\sqrt{\mathrm {Lasso}}$ accounting for possible misspecification of the selected model. Under mild conditions, the rate of convergence of OLS post $\sqrt{\mathrm {Lasso}}$ is as good as $\sqrt{\mathrm {Lasso}}$'s rate. As an application, we consider the use of $\sqrt{\mathrm {Lasso}}$ and OLS post $\sqrt{\mathrm {Lasso}}$ as estimators of nuisance parameters in a generic semiparametric problem (nonlinear moment condition or $Z$-problem), resulting in a construction of $\sqrt{n}$-consistent and asymptotically normal estimators of the main parameters.
Submitted 26 May, 2014; v1 submitted 7 May, 2011;
originally announced May 2011.
-
LASSO Methods for Gaussian Instrumental Variables Models
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Christian Hansen
Abstract:
In this note, we propose to use sparse methods (e.g. LASSO, Post-LASSO, sqrt-LASSO, and Post-sqrt-LASSO) to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments in the canonical Gaussian case. The methods apply even when the number of instruments is much larger than the sample size. We derive asymptotic distributions for the resulting IV estimators and provide conditions under which these sparsity-based IV estimators are asymptotically oracle-efficient. In simulation experiments, a sparsity-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. We illustrate the procedure in an empirical example using the Angrist and Krueger (1991) schooling data.
Submitted 23 February, 2011; v1 submitted 6 December, 2010;
originally announced December 2010.
-
Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain
Authors:
Alexandre Belloni,
Daniel Chen,
Victor Chernozhukov,
Christian Hansen
Abstract:
We develop results for the use of Lasso and Post-Lasso methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, $p$. Our results apply even when $p$ is much larger than the sample size, $n$. We show that the IV estimator based on using Lasso or Post-Lasso in the first stage is root-n consistent and asymptotically normal when the first-stage is approximately sparse; i.e. when the conditional expectation of the endogenous variables given the instruments can be well-approximated by a relatively small set of variables whose identities may be unknown. We also show the estimator is semi-parametrically efficient when the structural error is homoscedastic. Notably our results allow for imperfect model selection, and do not rely upon the unrealistic "beta-min" conditions that are widely used to establish validity of inference following model selection. In simulation experiments, the Lasso-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the Lasso-based IV estimator outperforms an intuitive benchmark.
In developing the IV results, we establish a series of new results for Lasso and Post-Lasso estimators of nonparametric conditional expectation functions which are of independent theoretical and practical interest. We construct a modification of Lasso designed to deal with non-Gaussian, heteroscedastic disturbances which uses a data-weighted $\ell_1$-penalty function. Using moderate deviation theory for self-normalized sums, we provide convergence rates for the resulting Lasso and Post-Lasso estimators that are as sharp as the corresponding rates in the homoscedastic Gaussian case under the condition that $\log p = o(n^{1/3})$.
Submitted 19 April, 2015; v1 submitted 20 October, 2010;
originally announced October 2010.
-
Square-Root Lasso: Pivotal Recovery of Sparse Signals via Conic Programming
Authors:
Alexandre Belloni,
Victor Chernozhukov,
Lie Wang
Abstract:
We propose a pivotal method for estimating high-dimensional sparse linear regression models, where the overall number of regressors $p$ is large, possibly much larger than $n$, but only $s$ regressors are significant. The method is a modification of the lasso, called the square-root lasso. The method is pivotal in that it neither relies on the knowledge of the standard deviation $σ$ nor does it need to pre-estimate $σ$. Moreover, the method does not rely on normality or sub-Gaussianity of noise. It achieves near-oracle performance, attaining the convergence rate $σ\{(s/n)\log p\}^{1/2}$ in the prediction norm, and thus matching the performance of the lasso with known $σ$. These performance results are valid for both Gaussian and non-Gaussian errors, under some mild moment restrictions. We formulate the square-root lasso as a solution to a convex conic programming problem, which allows us to implement the estimator using efficient algorithmic methods, such as interior-point and first-order methods.
Submitted 18 December, 2011; v1 submitted 28 September, 2010;
originally announced September 2010.
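For orientation, the estimator described in this abstract can be written out as a short sketch. The symbols $\hat Q$, $λ$, $t$, $y$ and $X$ below are notation assumed here for illustration; they do not appear in the abstract itself. The first line is the square-root lasso criterion (the lasso criterion with the average squared residual replaced by its square root), and the second is an equivalent second-order cone form of the kind the abstract refers to as conic programming.

```latex
% Square-root lasso: penalize the root of the average squared residual
\hat\beta \in \arg\min_{\beta \in \mathbb{R}^p}
  \sqrt{\hat Q(\beta)} + \frac{\lambda}{n}\,\|\beta\|_1,
\qquad
\hat Q(\beta) = \frac{1}{n}\sum_{i=1}^{n} (y_i - x_i'\beta)^2 .

% Equivalent conic form, using \sqrt{\hat Q(\beta)} = \|y - X\beta\|_2 / \sqrt{n}
\min_{t,\,\beta}\; \frac{t}{\sqrt{n}} + \frac{\lambda}{n}\,\|\beta\|_1
\quad \text{subject to} \quad \|y - X\beta\|_2 \le t .
```

One way to read the "pivotal" claim: the gradient of $\sqrt{\hat Q(\beta)}$ at the true coefficient is the score $X'\varepsilon/n$ divided by $\sqrt{\varepsilon'\varepsilon/n}$, so the scale $σ$ of the noise cancels and the penalty level $λ$ can be chosen without knowing or estimating $σ$.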
-
Least squares after model selection in high-dimensional sparse models
Authors:
Alexandre Belloni,
Victor Chernozhukov
Abstract:
In this article we study post-model selection estimators that apply ordinary least squares (OLS) to the model selected by first-step penalized estimators, typically Lasso. It is well known that Lasso can estimate the nonparametric regression function at nearly the oracle rate, and is thus hard to improve upon. We show that the OLS post-Lasso estimator performs at least as well as Lasso in terms of the rate of convergence, and has the advantage of a smaller bias. Remarkably, this performance occurs even if the Lasso-based model selection "fails" in the sense of missing some components of the "true" regression model. By the "true" model, we mean the best s-dimensional approximation to the nonparametric regression function chosen by the oracle. Furthermore, OLS post-Lasso estimator can perform strictly better than Lasso, in the sense of a strictly faster rate of convergence, if the Lasso-based model selection correctly includes all components of the "true" model as a subset and also achieves sufficient sparsity. In the extreme case, when Lasso perfectly selects the "true" model, the OLS post-Lasso estimator becomes the oracle estimator. An important ingredient in our analysis is a new sparsity bound on the dimension of the model selected by Lasso, which guarantees that this dimension is at most of the same order as the dimension of the "true" model. Our rate results are nonasymptotic and hold in both parametric and nonparametric models. Moreover, our analysis is not limited to the Lasso estimator acting as a selector in the first step, but also applies to any other estimator, for example, various forms of thresholded Lasso, with good rates and good sparsity properties. Our analysis covers both traditional thresholding and a new practical, data-driven thresholding scheme that induces additional sparsity subject to maintaining a certain goodness of fit. 
The latter scheme has theoretical guarantees similar to those of Lasso or OLS post-Lasso, but it dominates those procedures as well as traditional thresholding in a wide variety of experiments.
Submitted 20 March, 2013; v1 submitted 31 December, 2009;
originally announced January 2010.
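A minimal sketch of the two-step idea in this abstract, assuming a plain coordinate-descent Lasso followed by an OLS refit on the selected columns. Everything here is illustrative: the function names `lasso_cd` and `ols_on`, the simulated design, and the fixed penalty `lam=0.5` are assumptions of this sketch, not the authors' implementation or their data-driven penalty choice. The data are generated without an intercept, so none is fitted.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, kept up to date after each update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new
    return b


def ols_on(X, y, cols):
    """OLS restricted to `cols`, via normal equations and Gauss-Jordan elimination."""
    n, k = len(X), len(cols)
    # Augmented matrix [X_S'X_S | X_S'y]
    A = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in cols]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in cols]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                for m in range(c, k + 1):
                    A[r][m] -= f * A[c][m]
    return [A[r][k] / A[r][r] for r in range(k)]


# Simulated sparse model (hypothetical): only columns 0 and 1 matter.
random.seed(0)
n, p = 200, 10
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 2.0 * row[1] + random.gauss(0.0, 1.0) for row in X]

beta_lasso = lasso_cd(X, y, lam=0.5)                          # step 1: select
support = [j for j, b in enumerate(beta_lasso) if b != 0.0]
beta_post = dict(zip(support, ols_on(X, y, support)))         # step 2: refit

print("selected columns:", support)
print("Lasso coefficients on support:", [round(beta_lasso[j], 3) for j in support])
print("OLS post-Lasso coefficients:", {j: round(v, 3) for j, v in beta_post.items()})
```

The refit removes the shrinkage that the $\ell_1$ penalty imposes on the selected coefficients, which is the "smaller bias" the abstract mentions; the selection step itself is left unchanged.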
-
On multivariate quantiles under partial orders
Authors:
Alexandre Belloni,
Robert L. Winkler
Abstract:
This paper focuses on generalizing quantiles from the ordering point of view. We propose the concept of partial quantiles, which are based on a given partial order. We establish that partial quantiles are equivariant under order-preserving transformations of the data, robust to outliers, characterize the probability distribution if the partial order is sufficiently rich, generalize the concept of efficient frontier, and can measure dispersion from the partial order perspective. We also study several statistical aspects of partial quantiles. We provide estimators, associated rates of convergence, and asymptotic distributions that hold uniformly over a continuum of quantile indices. Furthermore, we provide procedures that can restore monotonicity properties that might have been disturbed by estimation error, establish computational complexity bounds, and point out a concentration of measure phenomenon (the latter under independence and the componentwise natural order). Finally, we illustrate the concepts by discussing several theoretical examples and simulations. Empirical applications to compare intake nutrients within diets, to evaluate the performance of investment funds, and to study the impact of policies on tobacco awareness are also presented to illustrate the concepts and their use.
Submitted 30 May, 2011; v1 submitted 29 December, 2009;
originally announced December 2009.
-
Posterior Inference in Curved Exponential Families under Increasing Dimensions
Authors:
Alexandre Belloni,
Victor Chernozhukov
Abstract:
This work studies the large sample properties of the posterior-based inference in the curved exponential family under increasing dimension. The curved structure arises from the imposition of various restrictions on the model, such as moment restrictions, and plays a fundamental role in econometrics and other branches of data analysis. We establish conditions under which the posterior distribution is approximately normal, which in turn implies various good properties of estimation and inference procedures based on the posterior. In the process we also revisit and improve upon previous results for the exponential family under increasing dimension by making use of concentration of measure. We also discuss a variety of applications to high-dimensional versions of the classical econometric models including the multinomial model with moment restrictions, seemingly unrelated regression equations, and single structural equation models. In our analysis, both the parameter dimension and the number of moments are increasing with the sample size.
Submitted 22 April, 2014; v1 submitted 20 April, 2009;
originally announced April 2009.
-
L1-Penalized Quantile Regression in High-Dimensional Sparse Models
Authors:
Alexandre Belloni,
Victor Chernozhukov
Abstract:
We consider median regression and, more generally, a possibly infinite collection of quantile regressions in high-dimensional sparse models. In these models the overall number of regressors $p$ is very large, possibly larger than the sample size $n$, but only $s$ of these regressors have non-zero impact on the conditional quantile of the response variable, where $s$ grows slower than $n$. We consider quantile regression penalized by the $\ell_1$-norm of coefficients ($\ell_1$-QR). First, we show that $\ell_1$-QR is consistent at the rate $\sqrt{s/n} \sqrt{\log p}$. The overall number of regressors $p$ affects the rate only through the $\log p$ factor, thus allowing nearly exponential growth in the number of zero-impact regressors. The rate result holds under relatively weak conditions, requiring that $s/n$ converges to zero at a super-logarithmic speed and that the regularization parameter satisfies certain theoretical constraints. Second, we propose a pivotal, data-driven choice of the regularization parameter and show that it satisfies these theoretical constraints. Third, we show that $\ell_1$-QR correctly selects the true minimal model as a valid submodel, when the non-zero coefficients of the true model are well separated from zero. We also show that the number of non-zero coefficients in $\ell_1$-QR is of the same stochastic order as $s$. Fourth, we analyze the rate of convergence of a two-step estimator that applies ordinary quantile regression to the selected model. Fifth, we evaluate the performance of $\ell_1$-QR in a Monte-Carlo experiment, and illustrate its use on an international economic growth application.
Submitted 26 September, 2019; v1 submitted 19 April, 2009;
originally announced April 2009.
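As a rough guide to what "quantile regression penalized by the $\ell_1$-norm" means, the criterion can be sketched as follows. The notation ($\rho_τ$, $τ$, $λ$) is assumed here for illustration, and the penalty is written in a simplified unweighted form; the paper's exact normalization and weighting of the penalty may differ.

```latex
% Check (pinball) loss at quantile index \tau \in (0,1)
\rho_\tau(u) = \bigl(\tau - \mathbf{1}\{u \le 0\}\bigr)\,u

% l1-penalized quantile regression (simplified, unweighted penalty)
\hat\beta(\tau) \in \arg\min_{\beta \in \mathbb{R}^p}
  \frac{1}{n}\sum_{i=1}^{n} \rho_\tau\bigl(y_i - x_i'\beta\bigr)
  + \frac{\lambda}{n}\,\|\beta\|_1
```

Setting $τ = 1/2$ gives $\rho_{1/2}(u) = |u|/2$, so the median regression case mentioned in the abstract is $\ell_1$-penalized least absolute deviations up to a factor of two in the loss.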
-
Expected Performance of the ATLAS Experiment - Detector, Trigger and Physics
Authors:
The ATLAS Collaboration,
G. Aad,
E. Abat,
B. Abbott,
J. Abdallah,
A. A. Abdelalim,
A. Abdesselam,
O. Abdinov,
B. Abi,
M. Abolins,
H. Abramowicz,
B. S. Acharya,
D. L. Adams,
T. N. Addy,
C. Adorisio,
P. Adragna,
T. Adye,
J. A. Aguilar-Saavedra,
M. Aharrouche,
S. P. Ahlen,
F. Ahles,
A. Ahmad,
H. Ahmed,
G. Aielli,
T. Akdogan
, et al. (2587 additional authors not shown)
Abstract:
A detailed study is presented of the expected performance of the ATLAS detector. The reconstruction of tracks, leptons, photons, missing energy and jets is investigated, together with the performance of b-tagging and the trigger. The physics potential for a variety of interesting physics processes, within the Standard Model and beyond, is examined. The study comprises a series of notes based on simulations of the detector and physics processes, with particular emphasis given to the data expected from the first years of operation of the LHC at CERN.
Submitted 14 August, 2009; v1 submitted 28 December, 2008;
originally announced January 2009.