-
Nested Nonparametric Instrumental Variable Regression: Long Term, Mediated, and Time Varying Treatment Effects
Authors:
Isaac Meza,
Rahul Singh
Abstract:
Several causal parameters in short panel data models are scalar summaries of a function called a nested nonparametric instrumental variable regression (nested NPIV). Examples include long term, mediated, and time varying treatment effects identified using proxy variables. However, it appears that no prior estimators or guarantees for nested NPIV exist, preventing flexible estimation and inference for these causal parameters. A major challenge is compounding ill posedness due to the nested inverse problems. We analyze adversarial estimators of nested NPIV, and provide sufficient conditions for efficient inference on the causal parameter. Our nonasymptotic analysis has three salient features: (i) introducing techniques that limit how ill posedness compounds; (ii) accommodating neural networks, random forests, and reproducing kernel Hilbert spaces; and (iii) extending to causal functions, e.g. long term heterogeneous treatment effects. We measure long term heterogeneous treatment effects of Project STAR and mediated proximal treatment effects of the Job Corps.
Submitted 10 March, 2024; v1 submitted 28 December, 2021;
originally announced December 2021.
-
Triplet loss based embeddings for forensic speaker identification in Spanish
Authors:
Emmanuel Maqueda,
Javier Alvarez-Jimenez,
Carlos Mena,
Ivan Meza
Abstract:
With the advent of digital technology, it is more common that committed crimes or legal disputes involve some form of speech recording where the identity of a speaker is questioned [1]. In the face of this situation, the field of forensic speaker identification has been looking to shed light on the problem by quantifying how much a speech recording belongs to a particular person in relation to a population. In this work, we explore the use of speech embeddings obtained by training a CNN using the triplet loss. In particular, we focus on the Spanish language, which has not been extensively studied. We propose extracting the embeddings from speech spectrogram samples, then explore several configurations of such spectrograms, and finally quantify the quality of the embeddings. We also show some limitations of our data setting, which is predominantly composed of male speakers. Finally, we propose two approaches to calculate the Likelihood Ratio given our speech embeddings, and we show that triplet loss is a good alternative for creating speech embeddings for forensic speaker identification.
Submitted 13 September, 2021; v1 submitted 24 February, 2021;
originally announced February 2021.
-
Hacia los Comités de Ética en Inteligencia Artificial
Authors:
Sofía Trejo,
Ivan Meza,
Fernanda López-Escobedo
Abstract:
The goal of Artificial Intelligence based systems is to make decisions that have an effect on their environment and impact society. This points to the need for mechanisms that regulate the impact of this type of system on society. For this reason, it is a priority to create rules and specialized organizations that can oversee compliance with such rules, particularly human rights precepts at the local and international levels. This work proposes the creation, at universities, of Ethics Committees or Commissions specialized in Artificial Intelligence that would be in charge of defining the principles and guaranteeing adherence to good practices in the field of Artificial Intelligence.
Submitted 11 February, 2020;
originally announced February 2020.
-
Lost in Translation: Analysis of Information Loss During Machine Translation Between Polysynthetic and Fusional Languages
Authors:
Manuel Mager,
Elisabeth Mager,
Alfonso Medina-Urrea,
Ivan Meza,
Katharina Kann
Abstract:
Machine translation from polysynthetic to fusional languages is a challenging task, which gets further complicated by the limited amount of parallel text available. Thus, translation performance is far from the state of the art for high-resource and more intensively studied language pairs. To shed light on the phenomena which hamper automatic translation to and from polysynthetic languages, we study translations from three low-resource, polysynthetic languages (Nahuatl, Wixarika and Yorem Nokki) into Spanish and vice versa. In doing so, we find that in a morpheme-to-morpheme alignment a substantial amount of the information contained in polysynthetic morphemes has no Spanish counterpart, and its translation is often omitted. We further conduct a qualitative analysis and thereby identify morpheme types that are commonly hard to align or are ignored in the translation process.
Submitted 1 July, 2018;
originally announced July 2018.
-
Challenges of language technologies for the indigenous languages of the Americas
Authors:
Manuel Mager,
Ximena Gutierrez-Vasques,
Gerardo Sierra,
Ivan Meza
Abstract:
Indigenous languages of the American continent are highly diverse. However, they have received little attention from the technological perspective. In this paper, we review the research, the digital resources and the available NLP systems that focus on these languages. We present the main challenges and research questions that arise when distant languages and low-resource scenarios are faced. We would like to encourage NLP research in linguistically rich and diverse areas like the Americas.
Submitted 11 June, 2018;
originally announced June 2018.