-
Identification and Inference with Min-over-max Estimators for the Measurement of Labor Market Fairness
Authors:
Karthik Rajkumar
Abstract:
These notes show how to do inference on the Demographic Parity (DP) metric. Although the metric is a complex statistic involving min and max computations, we propose a smooth approximation of those functions and derive its asymptotic distribution. These approximations and their gradients converge in the limit to those of the true max and min functions, wherever they exist. More importantly, when the true max and min functions are not differentiable, the approximations still are, and they provide valid asymptotic inference everywhere in the domain. We conclude with some directions on how to compute confidence intervals for DP, how to test if it is under 0.8 (the U.S. Equal Employment Opportunity Commission fairness threshold), and how to do inference in an A/B test.
Submitted 27 July, 2022;
originally announced July 2022.
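The idea in the abstract above can be illustrated with a minimal sketch. It assumes that DP is the ratio of the smallest to the largest group selection rate, and it uses a log-sum-exp smoothing with a hypothetical temperature parameter `beta`; the notes may well use a different smooth approximation, and the selection rates below are made up for illustration.

```python
import math

def smooth_max(xs, beta=50.0):
    """Log-sum-exp approximation to max(xs); tightens as beta grows."""
    m = max(xs)  # subtract the max for numerical stability
    return m + math.log(sum(math.exp(beta * (x - m)) for x in xs)) / beta

def smooth_min(xs, beta=50.0):
    """Smooth approximation to min(xs), obtained by negating smooth_max."""
    return -smooth_max([-x for x in xs], beta)

def demographic_parity(rates, beta=None):
    """Min-over-max ratio of group rates; smoothed when beta is given."""
    if beta is None:
        return min(rates) / max(rates)
    return smooth_min(rates, beta) / smooth_max(rates, beta)

# Illustrative (made-up) selection rates for three groups.
rates = [0.30, 0.24, 0.27]
exact = demographic_parity(rates)
approx = demographic_parity(rates, beta=200.0)
print(f"exact DP = {exact:.4f}, smoothed DP = {approx:.4f}")
```

Unlike the exact ratio, the smoothed version is differentiable everywhere, including at ties between groups, which is the property a delta-method argument needs.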
-
Causal Estimation of Position Bias in Recommender Systems Using Marketplace Instruments
Authors:
Rina Friedberg,
Karthik Rajkumar,
Jialiang Mao,
Qian Yao,
YinYin Yu,
Min Liu
Abstract:
Information retrieval systems, such as online marketplaces, news feeds, and search engines, are ubiquitous in today's digital society. They facilitate information discovery by ranking retrieved items on predicted relevance, i.e., the likelihood of interaction (click, share) between users and items. Typically modeled using past interactions, such rankings have a major drawback: interaction depends on the attention items receive. A highly relevant item placed outside a user's attention could receive little interaction. This discrepancy between observed interaction and true relevance is termed position bias. Position bias degrades relevance estimation and, when it compounds over time, can silo users into falsely relevant items, causing marketplace inefficiencies. Position bias may be identified with randomized experiments, but such an approach can be prohibitive in cost and feasibility. Past research has also suggested propensity score methods, which do not adequately address unobserved confounding, and regression discontinuity designs, which have poor external validity. In this work, we address these concerns by leveraging the abundance of A/B tests in ranking evaluations as instrumental variables. Historical A/B tests allow us to access exogenous variation in rankings without manually introducing it, which would harm user experience and platform revenue. We demonstrate our methodology in two distinct applications at LinkedIn: feed ads and the People-You-May-Know (PYMK) recommender. The marketplaces comprise users and campaigns on the ads side, and invite senders and recipients on PYMK. By leveraging prior experimentation, we obtain quasi-experimental variation in item rankings that is orthogonal to user relevance. Our method provides robust position effect estimates that handle unobserved confounding well, offers greater generalizability, and extends easily to other information retrieval systems.
Submitted 12 May, 2022;
originally announced May 2022.
-
Telescoping continued fractions for the error term in Stirling's formula
Authors:
Gaurav Bhatnagar,
Krishnan Rajkumar
Abstract:
In this paper, we introduce telescoping continued fractions to find lower bounds for the error term $r_n$ in Stirling's approximation $\displaystyle n! = \sqrt{2\pi}n^{n+1/2}e^{-n}e^{r_n}.$ This improves lower bounds given earlier by Cesàro (1922), Robbins (1955), Nanjundiah (1959), Maria (1965) and Popov (2017). The expression is in terms of a continued fraction, together with an algorithm to find successive terms of this continued fraction. The technique we introduce allows us to experimentally obtain upper and lower bounds for a sequence of convergents of a continued fraction in terms of a difference of two continued fractions.
Submitted 30 June, 2023; v1 submitted 2 April, 2022;
originally announced April 2022.
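As a small numerical illustration of the error term $r_n$ defined in the abstract above, the sketch below computes $r_n$ from the log-gamma function and checks it against the classical Robbins (1955) bounds $1/(12n+1) < r_n < 1/(12n)$. It does not implement the paper's telescoping continued fractions; the function names are hypothetical and chosen only for this example.

```python
import math

def stirling_error(n: int) -> float:
    """r_n defined by n! = sqrt(2*pi) * n**(n + 1/2) * exp(-n) * exp(r_n)."""
    log_factorial = math.lgamma(n + 1)  # ln(n!)
    log_stirling = 0.5 * math.log(2 * math.pi) + (n + 0.5) * math.log(n) - n
    return log_factorial - log_stirling

def robbins_bounds(n: int) -> tuple[float, float]:
    """Classical Robbins bounds: 1/(12n+1) < r_n < 1/(12n)."""
    return 1.0 / (12 * n + 1), 1.0 / (12 * n)

for n in (1, 2, 5, 10, 50):
    lower, upper = robbins_bounds(n)
    r_n = stirling_error(n)
    print(f"n={n:3d}  {lower:.8f} < r_n={r_n:.8f} < {upper:.8f}")
```

The loop is kept to small $n$ because the gap between the two bounds shrinks like $1/n^2$, so floating-point error in `math.lgamma` eventually becomes comparable to it.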
-
Donor's Deferral and Return Behavior: Partial Identification from a Regression Discontinuity Design with Manipulation
Authors:
Evan Rosenman,
Karthik Rajkumar,
Romain Gauriot,
Robert Slonim
Abstract:
Volunteer labor can temporarily yield lower benefits to charities than its costs. In such instances, organizations may wish to defer volunteer donations to a later date. Exploiting a discontinuity in blood donations' eligibility criteria, we show that deferring donors reduces their future volunteerism. In our setting, medical staff manipulates donors' reported hemoglobin levels over a threshold to facilitate donation. Such manipulation invalidates standard regression discontinuity design. To circumvent this issue, we propose a procedure for obtaining partial identification bounds where manipulation is present. Our procedure is applicable in various regression discontinuity settings where the running variable is manipulated and discrete.
Submitted 5 April, 2024; v1 submitted 4 October, 2019;
originally announced October 2019.
-
Coexistence of 11.2Tb/s Carrier-Grade Classical Channels and a DV-QKD Channel over a 7-Core Multicore Fibre
Authors:
Emilio Hugues-Salas,
Qibing Wang,
Rui Wang,
Kalyani Rajkumar,
George T. Kanellos,
Reza Nejabati,
Dimitra Simeonidou
Abstract:
We successfully demonstrate coexistence of record-high 11.2 Tb/s (56x200Gb/s) classical channels with a discrete-variable-QKD channel over a multicore fibre. Continuous secret key generation is confirmed together with classical channel performance below the SDFEC limit and a minimum quantum channel spacing of 17nm in the C-band.
Submitted 2 July, 2019;
originally announced July 2019.
-
Genetic Algorithm Based Resource Minimization in Network Code Based Peer-to-Peer Network
Authors:
M. Anandaraj,
K. Selvaraj,
P. Ganeshkumar,
K. Rajkumar,
S. Sriram
Abstract:
Block scheduling is difficult to implement in a P2P network because there is no central coordinator. This problem can be solved with network coding, which allows intermediate nodes to code the received data instead of performing conventional store-and-forward. Research in this area has so far generally assumed that a target download rate is always attainable at every peer as long as coding is performed at all nodes in the network. An interesting observation is that the maximum download rate can be attained by performing the coding operation at a relatively small portion of the network. The problem of finding the minimal set of nodes to perform the coding operation and links to carry the coded data is called the network code minimization problem (NCMP). It has been proved to be NP-hard. It can be addressed with a genetic algorithm (GA), because GAs are applicable to diverse NP-hard problems. A new NCMP model is proposed that considers both minimizing the resources needed for coding and dynamic changes in network topology due to disconnection. Based on this new NCMP model, an effective and novel GA is proposed by implementing problem-specific GA operators in the evolutionary process. Different compositions and several options of GA elements that have worked well in many other problems are implemented, and the one that works best for this resource minimization problem is selected. Our simulation results show that the proposed system outperforms the random-selection and coding-at-all-possible-nodes mechanisms in terms of both download time and system throughput.
Submitted 1 August, 2020; v1 submitted 15 June, 2019;
originally announced June 2019.
-
Ridge regularization for Mean Squared Error Reduction in Regression with Weak Instruments
Authors:
Karthik Rajkumar
Abstract:
In this paper, I show that classic two-stage least squares (2SLS) estimates are highly unstable with weak instruments. I propose a ridge estimator (ridge IV) and show that it is asymptotically normal even with weak instruments, whereas 2SLS is severely distorted and unbounded. I motivate the ridge IV estimator as a convex optimization problem with a GMM objective function and an L2 penalty. I show that ridge IV leads to sizable mean squared error reductions theoretically and validate these results in a simulation study inspired by data designs of papers published in the American Economic Review.
Submitted 17 April, 2019;
originally announced April 2019.
-
Fixed Divisor of a Multivariate Polynomial and Generalized Factorials in Several Variables
Authors:
Devendra Prasad,
Krishnan Rajkumar,
A. Satyanarayana Reddy
Abstract:
We define new generalized factorials in several variables over an arbitrary subset $\underline{S} \subseteq R^n,$ where $R$ is a Dedekind domain and $n$ is a positive integer. We then study the properties of the fixed divisor $d(\underline{S},f)$ of a multivariate polynomial $f \in R[x_1,x_2, \ldots, x_n]$. We generalize the results of Polya, Bhargava, Gunji & McQuillan and strengthen that of Evrard, all of which relate the fixed divisor to generalized factorials of $\underline{S}$. We also express $d(\underline{S},f)$ in terms of the images $f(\underline{a})$ of finitely many elements $\underline{a} \in R^n$, generalizing a result of Hensel, and in terms of the coefficients of $f$ under explicit bases.
Submitted 15 March, 2018;
originally announced March 2018.
-
A Survey on Fixed Divisors
Authors:
Devendra Prasad,
Krishnan Rajkumar,
A. Satyanarayana Reddy
Abstract:
In this article, we compile the work done by various mathematicians on the topic of the fixed divisor of a polynomial. This article explains most of the results concisely and is intended to be an exhaustive survey. We present the results on fixed divisors in various algebraic settings as well as the applications of fixed divisors to various algebraic and number theoretic problems. The work is presented in an orderly fashion so as to start from the simplest case of $\mathbb{Z}$, progressively leading up to the case of Dedekind domains. We also ask a few open questions according to their context, which may give impetus to the reader to work further in this direction. We describe various bounds for fixed divisors as well as the connection of fixed divisors with different notions in the ring of integer-valued polynomials. Finally, we suggest how the generalization of the ring of integer-valued polynomials in the case of the ring of $n \times n$ matrices over $\mathbb{Z}$ (or a Dedekind domain) could lead to the generalization of fixed divisors in that setting.
Submitted 8 June, 2019; v1 submitted 23 September, 2017;
originally announced September 2017.
-
A simplification of Apéry's proof of the irrationality of ζ(3)
Authors:
Krishnan Rajkumar
Abstract:
A simplification of Apéry's proof of the irrationality of ζ(3) is presented. The construction of approximations is motivated from the viewpoint of 2-dimensional recurrence relations which simplifies many of the details of the proof. Conclusive evidence is also presented that these constructions arise from a continued fraction due to Ramanujan.
Submitted 24 December, 2012;
originally announced December 2012.
-
On the zeros of the Epstein zeta function
Authors:
Anirban Mukhopadhyay,
Krishnan Rajkumar,
Kotyada Srinivas
Abstract:
In this article, we count the number of consecutive zeros of the Epstein zeta-function, associated to a certain quadratic form, on the critical line with ordinates lying in $[0,T]$, with $T$ sufficiently large, that are separated by a given positive number $V$.
Submitted 2 February, 2011;
originally announced February 2011.
-
On the zeros of functions in the Selberg class
Authors:
Anirban Mukhopadhyay,
Kotyada Srinivas,
Krishnan Rajkumar
Abstract:
It is proved that under some suitable conditions, the degree two functions in the Selberg class have infinitely many zeros on the critical line.
Submitted 5 February, 2011; v1 submitted 4 April, 2008;
originally announced April 2008.