CORRECTING SELECTION BIAS IN STANDARDIZED TEST SCORES COMPARISONS

Onil Boussim
Abstract.

This paper addresses sample selection bias in cross-country comparisons based on international assessments such as PISA (Program for International Student Assessment). Although they are widely used, PISA rankings may be biased because attrition patterns differ across countries, which leads to inaccurate comparisons. This study proposes a methodology to correct for sample selection bias using a quantile selection model. Applying the method to PISA 2018 data, I find that correcting for selection bias significantly changes the rankings (based on the mean) of countries' educational performance. My results highlight the importance of accounting for sample selection bias in international educational comparisons.

Keywords: Quantiles, Sample selection, International student achievement assessments, PISA.

JEL codes: C34, C83, I20

I thank Marc Henry, Andres Aradillas-Lopez, Michael Gechter, Ismael Mourifie and all the participants of the African Econometric Society 2024 for helpful discussions and comments. All errors are mine (Onil Boussim: [email protected])

1. Introduction

International comparisons of educational achievement, typically derived from standardized assessment scores, have become a crucial tool for evaluating and guiding educational policies worldwide. These comparisons not only facilitate cross-national evaluations but also influence educational reforms, with rankings often serving as a motivation for program adjustments or as a justification for adopting foreign reforms (see Nagy (1996), Martin et al. (2000), McEwan and Marshall (2004), Cromley (2009), Tienken (2008), McGaw (2008), and Jakubowski and Pokropek (2015)). The best known of these assessments is the Program for International Student Assessment (PISA).

PISA is a triennial international survey that aims to evaluate education systems worldwide by assessing the skills and knowledge of 15-year-old students. PISA focuses on three main domains: reading, mathematics, and science. To participate in PISA, students must be attending school at age 15 and have completed at least six years of education. Some countries may also exclude certain students from the sample, such as those living in remote areas or on reserves. Therefore, the coverage rate (the percentage of 15-year-olds covered by PISA) is generally below 100%. As a result, if a given country A has a coverage rate close to 100% and country B has a low coverage rate, one may end up comparing nearly everyone in country A against the "best" in country B (assuming that those who managed to stay in the school system up to the required age are the best of the system). This problem also applies to every other international standardized assessment.

For this reason, many critics suggest that rankings and comparisons may be inaccurate because of sample selection bias arising from the exclusion of a substantial portion of the target population (see Rotberg (1995), Berliner (1993), Ferreira and Gignoux (2014)). One cannot ignore a low coverage rate, because staying in the school system is correlated with student characteristics that also affect assessment scores, such as socioeconomic background and unobserved factors (talent, motivation, …). Adjusting for sample selection bias before ranking countries is a more accurate way to compare them.

Solving this selection problem is an interesting econometric question because we face two main challenges. The first challenge is the absence of any information about non-enrolled individuals in the data. In that sense, it differs from the context of correcting sample selection in wages based on surveys that contain information on both labor force participants and non-participants (see Heckman (1974), Arellano and Bonhomme (2017), Chernozhukov et al. (2023)…). The second challenge is that most methods rely on the existence of a valid instrument, while in our case there is no justifiable instrument (at least to my knowledge). Because of these two challenges, the identification power is limited.

I propose a sample selection correction method based on a quantile selection model (see Huber and Melly (2015), Arellano and Bonhomme (2016), Arellano and Bonhomme (2017)). I show that selection-corrected quantile estimates can be obtained by suitably shifting the percentile levels of the observed quantile function, as in Arellano and Bonhomme (2017). I obtain non-parametric partial identification of the selection-corrected quantiles under a stochastic dominance assumption, and I rely on a parametric assumption for point identification of those quantiles. The methodology is applied to PISA 2018, and the findings reveal that when selection biases are corrected, the rankings and comparisons of countries' educational performance can change significantly. This underscores the importance of accounting for sample selection bias to ensure fair and accurate international educational comparisons.

The rest of the paper is organized as follows. In Section 2, I present the model and the identification results. In Section 3, I present the application, and I conclude in Section 4. All the detailed proofs can be found in the appendix.

2. Model and Identification

Fix a given country. Let $Y^{\ast}$ be the hypothetical score an individual would achieve if they had completed their education up to the age required for the assessment (for example, age 15 for the PISA assessment). We refer to it as the potential assessment score. Let $S$ be a binary variable that takes the value $1$ if the individual meets the requirement to be part of the assessment and $0$ otherwise. Notice that if the probability of $S=1$ is $1$ (that is, $\mathbb{P}(S=1)=1$), we can observe $Y^{\ast}$ directly in the data for the whole target population. But if that probability is strictly less than $1$, we do not observe $Y^{\ast}$ for the subpopulation for which $S=0$. Let $Y$ be the observed assessment score the researcher has access to. $Y$ and $Y^{\ast}$ coincide only when $S=1$.

We already explained that country rankings should be based on the mean of $Y^{\ast}$ and not the mean of $Y$. Here, however, we focus on the whole distribution of $Y^{\ast}$, since in some cases the researcher might be interested in it. Since $Y^{\ast}$ is a continuous random variable, it admits a Skorohod quantile representation. Let $U \sim \mathcal{U}[0,1]$ be the rank of $Y^{\ast}$ (the rank of a continuous random variable $X$ is a function that assigns a numerical value, typically between 0 and 1, indicating the proportion of the distribution that is less than or equal to a given value $x$). I consider the following sample selection model:

$$Y^{\ast} = q_{Y^{\ast}}(U)$$
$$Y = Y^{\ast} \text{ if } S = 1$$

The first equation is simply the Skorohod quantile representation of $Y^{\ast}$. The objective here is to identify the quantiles $q_{Y^{\ast}}(u)$ for all $u \in [0,1]$. From the quantiles we can derive the CDF, the mean, and any inequality measure we might be interested in:

$$F_{Y^{\ast}}(y) = \int_{0}^{1} \mathbb{1}\{q_{Y^{\ast}}(u) \leq y\}\,du$$
$$E(Y^{\ast}) = \int_{0}^{1} q_{Y^{\ast}}(u)\,du$$
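As a minimal numerical sketch of these two formulas, the snippet below approximates the CDF and the mean from a quantile function with a midpoint rule. The linear quantile function `q_uniform` (scores uniform on [300, 700]) and the function names are illustrative assumptions, not part of the paper.

```python
def mean_from_quantiles(q, n_grid=10_000):
    """Approximate E(Y*) = integral over [0, 1] of q(u) du (midpoint rule)."""
    return sum(q((i + 0.5) / n_grid) for i in range(n_grid)) / n_grid


def cdf_from_quantiles(q, y, n_grid=10_000):
    """Approximate F(y) = integral over [0, 1] of 1{q(u) <= y} du."""
    return sum(q((i + 0.5) / n_grid) <= y for i in range(n_grid)) / n_grid


def q_uniform(u):
    """Hypothetical quantile function: scores uniform on [300, 700]."""
    return 300.0 + 400.0 * u


mean_val = mean_from_quantiles(q_uniform)       # close to 500
cdf_val = cdf_from_quantiles(q_uniform, 400.0)  # close to 0.25
print(mean_val, cdf_val)
```

Any other functional of the distribution (for instance a quantile ratio used as an inequality measure) could be computed from `q` in the same way.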

For a given rank $u \in [0,1]$, define $\tilde{u} \equiv \mathbb{P}(U \leq u \mid S=1)$ as the selection-corrected rank. The first important result of this paper is the following lemma:

Lemma 1.
For all $u \in [0,1]$, we have:
$$q_{Y^{\ast}}(u) = q_{Y|S=1}(\tilde{u})$$

From the data, we already know the quantile function $q_{Y|S=1}(u)$, $u \in [0,1]$. By this lemma, it is possible to recover the quantiles of $Y^{\ast}$ by suitably shifting the percentile levels in the quantile function of $Y \mid S=1$. Since we do not know $\tilde{u}$, Lemma 1 is not enough to identify $q_{Y^{\ast}}(u)$. But it tells us what to look for.
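The percentile shift in Lemma 1 can be illustrated with a small simulation. The sketch below is only an illustration under assumptions that are not in the paper: a linear latent quantile function `q_latent`, and an arbitrary selection rule in which an individual with rank `r` is observed with probability `sqrt(r)`. In a simulation the ranks are known, so the selection-corrected rank can be computed directly, which is not possible with real data.

```python
import math
import random

random.seed(0)


def q_latent(u):
    """Hypothetical latent quantile function of Y*."""
    return 300.0 + 400.0 * u


def empirical_quantile(sorted_values, t):
    """Left-continuous empirical quantile at level t in [0, 1]."""
    n = len(sorted_values)
    idx = min(max(math.ceil(t * n) - 1, 0), n - 1)
    return sorted_values[idx]


n = 20_000
ranks = [random.random() for _ in range(n)]
# Assumed selection rule: higher ranks are more likely to be observed.
selected = [r for r in ranks if random.random() < math.sqrt(r)]
observed_scores = sorted(q_latent(r) for r in selected)

u = 0.5
u_tilde = sum(r <= u for r in selected) / len(selected)  # P(U <= u | S = 1)

recovered = empirical_quantile(observed_scores, u_tilde)  # Lemma 1 shift
naive = empirical_quantile(observed_scores, u)            # no correction
print(q_latent(u), recovered, naive)
```

In this simulated design the shifted observed quantile lands close to the latent median, while the uncorrected observed median is visibly higher.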

To be able to discuss identification results, let’s make some assumptions. Since the data we have is by nature truncated, we make the following assumption concerning the coverage rate.

Assumption 1 (Identification of coverage rate).

$p \equiv \mathbb{P}(S=1)$ can be identified.

Assumption 1 is trivially satisfied with PISA data since we always have access to the coverage rate, that is, the proportion of individuals in the target population included in the assessment.

In the absence of more restrictive assumptions, and because of the data limitations, the exact value of the quantiles cannot be pinned down; instead, we obtain a range of possible values. In other words, without further assumptions the quantiles are only partially identified. Here is the first partial identification result:

Lemma 2.
Under Assumption 1, for all $u \in [0,1]$,
$$q_{Y|S=1}\left(\frac{\max\{u+p-1,0\}}{p}\right) \leq q_{Y^{\ast}}(u) \leq q_{Y|S=1}\left(\frac{\min\{u,p\}}{p}\right)$$

This lemma follows from applying the Fréchet–Hoeffding inequality to the probability of a joint event. Constructing the bounds only requires knowledge of the propensity score $p$. Since Assumption 1 makes $p$ available, I can compute these bounds. However, one can make the bounds more informative by imposing a reasonable structure on the unobservables.
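The percentile levels entering the two bounds of Lemma 2 are simple functions of $u$ and $p$. A minimal sketch (the function name `rank_bounds` and the example values are illustrative assumptions):

```python
def rank_bounds(u, p):
    """Bounds on the selection-corrected rank P(U <= u | S = 1) from Lemma 2."""
    lower = max(u + p - 1.0, 0.0) / p
    upper = min(u, p) / p
    return lower, upper


# Median rank with a hypothetical coverage rate of 80%.
low_80, high_80 = rank_bounds(0.5, 0.8)    # about (0.375, 0.625)
# With full coverage the interval collapses to the rank itself.
low_100, high_100 = rank_bounds(0.5, 1.0)  # (0.5, 0.5)
print(low_80, high_80, low_100, high_100)
```

Evaluating the observed quantile function at these two levels gives the interval for the latent quantile.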

Assumption 2 (Stochastic Dominance).

$$\forall u \in [0,1], \quad \mathbb{P}(U \leq u \mid S=1) \leq \mathbb{P}(U \leq u \mid S=0)$$

The assumption is that the distribution of $U \mid S=1$ stochastically dominates the distribution of $U \mid S=0$. This implies that values of $U$ tend to be higher when $S=1$ than when $S=0$. The assumption conveys the idea that individuals who meet the requirement for the assessment ($S=1$) generally have higher latent abilities than those who do not ($S=0$). With this assumption, we can derive the following theorem.

Theorem 1 (Partial Identification).
Under Assumptions 1 and 2, the following bounds are valid and sharp: for all $u \in [0,1]$,

$$q_{Y|S=1}\left(\frac{\max\{u+p-1,0\}}{p}\right) \leq q_{Y^{\ast}}(u) \leq q_{Y|S=1}(u)$$

We also have:

$$\int_{0}^{1} q_{Y|S=1}\left(\frac{\max\{u+p-1,0\}}{p}\right) du \leq E(Y^{\ast}) \leq E(Y \mid S=1)$$

This theorem states that, given the model and the assumptions, the latent quantile function at a given $u \in [0,1]$ lies in the above interval, whose bounds cannot be improved upon without further assumptions. In other words, the lower and upper bounds are the best lower and upper bounds consistent with the data and the assumptions. This interval is narrower than the one derived in Lemma 2, since its upper bound is smaller. This is due to the introduction of Assumption 2.
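The mean bounds in Theorem 1 can be computed from a sample of observed scores and a coverage rate. The sketch below uses a step-function empirical quantile and a midpoint rule; the seven scores and the value $p = 0.7$ are made-up illustrative inputs. As a side check that is not in the paper, the change of variables $t = (u+p-1)/p$ suggests that the lower bound equals $(1-p)\,q_{Y|S=1}(0) + p\,E(Y \mid S=1)$, that is, the case where every excluded individual is assigned the lowest observed score.

```python
import math


def empirical_quantile(sorted_scores, t):
    """Left-continuous empirical quantile at level t in [0, 1]."""
    n = len(sorted_scores)
    idx = min(max(math.ceil(t * n) - 1, 0), n - 1)
    return sorted_scores[idx]


def mean_bounds(scores, p, n_grid=100_000):
    """Lower and upper bounds on E(Y*) from Theorem 1 (midpoint rule)."""
    ordered = sorted(scores)
    total = 0.0
    for i in range(n_grid):
        u = (i + 0.5) / n_grid
        shifted = max(u + p - 1.0, 0.0) / p  # lower-bound percentile level
        total += empirical_quantile(ordered, shifted)
    lower = total / n_grid
    upper = sum(ordered) / len(ordered)      # observed mean E(Y | S = 1)
    return lower, upper


scores = [350.0, 400.0, 450.0, 500.0, 550.0, 600.0, 650.0]  # hypothetical
p = 0.7                                                     # hypothetical
lower, upper = mean_bounds(scores, p)
closed_form_lower = (1.0 - p) * min(scores) + p * upper
print(lower, closed_form_lower, upper)
```

The numerical and closed-form lower bounds agree up to discretization error, which gives a quick sanity check on any implementation.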

Notice that if $p=1$, the bounds in Lemma 2 and Theorem 1 shrink to exactly one point (the latent quantile becomes exactly the observed quantile). This is because the sample selection vanishes.

- The upper bound in the theorem is exactly the observed quantile function. This shows that analyses or comparisons based on observed quantiles can suffer from a positive bias, since they use the highest possible value for the quantile (and therefore for the mean). This upper bound is attained only when the selection is random (independent of the evaluation). This means that the official ranking implicitly assumes that people leave the school system at random.

- The lower bound is attained when individuals meet the requirements of the assessment if and only if their rank $U$ is at least the proportion of those who do not satisfy the requirement, $1-p$. This leads to the structure $S = \mathbb{1}\{U \geq 1-p\}$. It means that we observe the outcome $Y$ only for the proportion $p$ of individuals at the top of the distribution of $U$. This corresponds to the scenario where the individuals with the lowest values of $U$ (the least talented ones) are the ones effectively excluded from the assessment.
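The two polar cases can be checked by simulation. The sketch below draws uniform ranks and estimates the selection-corrected rank $\mathbb{P}(U \leq u \mid S=1)$ under random selection and under truncation from below; the sample size, seed, and the values $p = 0.8$ and $u = 0.5$ are arbitrary illustrative choices.

```python
import random

random.seed(1)
p = 0.8   # hypothetical coverage rate
u = 0.5   # rank at which the shift is evaluated
n = 50_000
ranks = [random.random() for _ in range(n)]

# Case 1: selection independent of the rank (upper-bound scenario).
random_selected = [r for r in ranks if random.random() < p]
u_tilde_random = sum(r <= u for r in random_selected) / len(random_selected)

# Case 2: S = 1{U >= 1 - p}, the bottom 1 - p is excluded (lower-bound scenario).
top_selected = [r for r in ranks if r >= 1.0 - p]
u_tilde_truncated = sum(r <= u for r in top_selected) / len(top_selected)

print(u_tilde_random)     # close to u = 0.5
print(u_tilde_truncated)  # close to (u + p - 1) / p = 0.375
```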

One can compare countries based on the bounds of the average by assigning to each country the middle of those bounds. Another way is also to just rank them based on the lower bound, which is the most conservative way to rank them.

On top of that, I propose a parametric solution to have point identification. In fact, in this problem, we cannot get point identification without adding more structure to the selection mechanism. Let’s consider the following :

Assumption 3 (Structure on S).

$$S = \mathbb{1}\{U \geq V\}, \quad V \sim \beta(1, \theta_0)$$

The variables $U$ and $V$ are unobserved; $U$ can be seen as a return and $V$ as a cost. The assumption therefore says that children are enrolled in school when their return to schooling is higher than their cost. $U$ and $V$ can be correlated, but we do not specify their dependence. $V$ mainly captures the differences in enrollment patterns between countries. The choice of the beta distribution is motivated by the limited identification power and by some economic intuition.

Here are the technical reasons. Since the return $U$ has support $[0,1]$, the cost needs to have the same support for the comparison to be meaningful. I fix the first parameter to $1$ (I could also fix the second) because I have only one moment available, $p$, so I can identify only one unknown parameter in the parametric family.

The economic reasons are the following. First, since $V$ captures the cost faced by individuals, it is reasonable to expect the density of $V$ to be a decreasing function: in every country, many more people face relatively low costs than relatively high costs. Therefore, as the cost increases, the proportion of people facing it should decrease.

Second, since $V$ captures differences between countries, it is also desirable that, for countries with larger values of $p$, smaller values of $V$ occur more frequently than larger ones. The distribution of $V$ should place more of its mass toward the left (near zero) as $p$ increases. It simply means that one is more likely to face a lower cost in a country with a higher $p$.

Given all these constraints, it appears natural to choose the parametric family $\beta(1,\theta_0)$, which satisfies all the conditions when $p > 0.5$, as we can see in Figure 1. The condition $p > 0.5$ is generally satisfied for all the countries in PISA.
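The role of the condition $p > 0.5$ can be checked directly. The sketch below uses the mapping $\theta_0 = \frac{1}{1-p} - 1$ stated in Theorem 2 and the standard $\beta(1,\theta)$ density $\theta(1-v)^{\theta-1}$; the helper names and the example values of $p$ are illustrative assumptions.

```python
def theta_from_p(p):
    """Mapping from the coverage rate to the beta parameter (Theorem 2)."""
    return 1.0 / (1.0 - p) - 1.0


def beta1_pdf(v, theta):
    """Density of a Beta(1, theta) variable at v in (0, 1)."""
    return theta * (1.0 - v) ** (theta - 1.0)


def density_is_decreasing(p, n_grid=99):
    """Check on an interior grid that the cost density never increases."""
    theta = theta_from_p(p)
    values = [beta1_pdf((i + 1) / (n_grid + 1), theta) for i in range(n_grid)]
    return all(a >= b for a, b in zip(values, values[1:]))


print(theta_from_p(0.8))           # about 4
print(density_is_decreasing(0.8))  # True: theta > 1 when p > 0.5
print(density_is_decreasing(0.4))  # False: theta < 1 when p < 0.5
```

When $p > 0.5$ the parameter exceeds 1, so the density is decreasing, and it concentrates more mass near zero as $p$ grows.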

Figure 1. PDF of $V \sim \beta(1, \theta(p))$ for different values of $p$.

Theorem 2 (Identification).

Under Assumptions 1 and 3, $q_{Y^{\ast}}$ is identified and we have:

$$q_{Y^{\ast}}(u) = q_{Y|S=1}\left(\frac{1}{p}\left(u F_{V,\theta_0}(u) - \int_{0}^{u} v\,dF_{V,\theta_0}(v)\right)\right)$$
$$\mathbb{E}(Y^{\ast}) = \int_{0}^{1} q_{Y|S=1}\left(\frac{1}{p}\left(u F_{V,\theta_0}(u) - \int_{0}^{u} v\,dF_{V,\theta_0}(v)\right)\right) du$$

where $\theta_0 = \frac{1}{1-p} - 1$.
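As a worked step that is not spelled out in the text, the percentile shift in Theorem 2 appears to admit a closed form once the $\beta(1,\theta_0)$ CDF, $F_{V,\theta_0}(v) = 1 - (1-v)^{\theta_0}$, is substituted. Integrating by parts and using $\theta_0 + 1 = 1/(1-p)$:

$$\begin{aligned}
u F_{V,\theta_0}(u) - \int_{0}^{u} v\,dF_{V,\theta_0}(v) &= \int_{0}^{u} F_{V,\theta_0}(v)\,dv \\
&= u - \frac{1 - (1-u)^{\theta_0+1}}{\theta_0+1} \\
&= u - (1-p)\left(1 - (1-u)^{1/(1-p)}\right)
\end{aligned}$$

so the shifted percentile level is $\frac{1}{p}\left[u - (1-p)\left(1 - (1-u)^{1/(1-p)}\right)\right]$, which equals $0$ at $u=0$ and $1$ at $u=1$.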

This theorem gives us a practical and very simple way to correct the sample selection problem in the rankings. In the next section, we see how this actually can make a difference.
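A minimal sketch of how the correction in Theorem 2 could be computed from a sample of observed scores is given below. It evaluates the expression in the theorem by numerical integration with the $\beta(1,\theta_0)$ density and CDF, then averages the shifted observed quantiles. The scores, the value $p = 0.7$, the grid sizes, and all function names are illustrative assumptions; this is not the author's estimation code.

```python
import math


def empirical_quantile(sorted_scores, t):
    """Left-continuous empirical quantile at level t (clamped to [0, 1])."""
    n = len(sorted_scores)
    idx = min(max(math.ceil(t * n) - 1, 0), n - 1)
    return sorted_scores[idx]


def beta1_cdf(v, theta):
    return 1.0 - (1.0 - v) ** theta


def beta1_pdf(v, theta):
    return theta * (1.0 - v) ** (theta - 1.0)


def shifted_rank(u, p, n_grid=1_000):
    """(1/p) * (u F_V(u) - integral_0^u v dF_V(v)) with theta0 = 1/(1-p) - 1."""
    theta0 = 1.0 / (1.0 - p) - 1.0
    h = u / n_grid
    integral = 0.0
    for i in range(n_grid):
        v = (i + 0.5) * h                  # midpoint of the i-th cell
        integral += v * beta1_pdf(v, theta0) * h
    return (u * beta1_cdf(u, theta0) - integral) / p


def corrected_mean(scores, p, n_outer=400):
    """Approximate E(Y*) by averaging shifted observed quantiles."""
    ordered = sorted(scores)
    total = 0.0
    for i in range(n_outer):
        u = (i + 0.5) / n_outer
        total += empirical_quantile(ordered, shifted_rank(u, p))
    return total / n_outer


scores = [350.0, 400.0, 450.0, 500.0, 550.0, 600.0, 650.0]  # hypothetical
p = 0.7                                                     # hypothetical
observed_mean = sum(scores) / len(scores)
corrected = corrected_mean(scores, p)
print(shifted_rank(0.5, p), corrected, observed_mean)
```

With these made-up inputs the corrected mean falls below the observed mean, consistent with the upward bias discussed in the text.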

3. Application

In this section, I compute estimates of the selection-corrected means to make educational achievement comparisons using PISA 2018.

Table 1 and Table 2, available in the appendix, summarize the results for mathematics and reading, respectively. The corrected means are all below the observed ones, evidence that selection creates an upward bias in the mean. Since selection does not affect all countries in the same way, the correction also affects them differently (higher values of $p$ correspond to a smaller change in the mean than lower values of $p$). Ranks change as a result, as we can observe in the last two columns of the table. The tables read as follows. The column Countries lists the countries participating in the PISA 2018 assessment, and $p$ is the coverage rate, that is, the proportion of the target population covered by the PISA sample in each country. Lbound and Ubound are the lower and upper bounds derived in Theorem 1. Mean($Y^{\ast}$) is the corrected mean for each country. Rank 1 is the official ranking of countries based on their reported mean scores in PISA 2018 (it is also the rank based on Ubound). Rank 2 is the corrected ranking obtained using Theorem 2, and Rank 3 is the rank based on the lower bound in Theorem 1.
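The rank columns follow mechanically from the mean columns. The sketch below shows one way to derive rankings and rank shifts; the three countries and all numbers are entirely made-up placeholders, not values from the paper's tables, and ties are not handled.

```python
def rank_countries(mean_by_country):
    """Rank countries from highest to lowest mean (1 = best)."""
    ordered = sorted(mean_by_country, key=mean_by_country.get, reverse=True)
    return {country: position for position, country in enumerate(ordered, start=1)}


# Placeholder inputs for illustration only.
observed_mean = {"A": 520.0, "B": 515.0, "C": 500.0}
corrected_mean = {"A": 480.0, "B": 505.0, "C": 490.0}

rank_1 = rank_countries(observed_mean)   # analogue of the official ranking
rank_2 = rank_countries(corrected_mean)  # analogue of the corrected ranking

# Positive shift = moved up after the correction, negative = moved down.
shift = {country: rank_1[country] - rank_2[country] for country in observed_mean}
print(rank_1, rank_2, shift)
```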

Here are some of the substantial shifts in rankings between Rank 1 (official ranking) and Rank 2 (corrected ranking) after adjusting for sample selection in mathematics. Canada (from 12 to 19): Moved down 7 places. North Macedonia (from 67 to 58): Moved up 9 places. Slovenia (from 14 to 8): Moved up 6 places. Poland (from 10 to 13): Moved down 3 places. United Kingdom (from 19 to 28): Moved down 9 places. Germany (from 20 to 12): Moved up 8 places. Finland (from 16 to 10): Moved up 6 places. Ireland (from 22 to 16): Moved up 6 places. Jordan (from 63 to 72): Moved down 9 places.

We can also see that in Rank 3, the decrease in rank is more severe for countries with lower values of $p$. In both cases, Rank 2 and Rank 3, accounting for sample selection clearly affects the ranking. The same holds for reading.

For the table presenting the PISA 2018 Reading results, here are some of the more substantial shifts in rankings. Slovenia (from 21 to 11): Moved up 10 places. Germany (from 20 to 8): Moved up 12 places. Czech Republic (from 25 to 16): Moved up 9 places. Belgium (from 23 to 19): Moved up 4 places. Ireland (from 8 to 4): Moved up 4 places. Sweden (from 11 to 23): Moved down 12 places. United Kingdom (from 14 to 22): Moved down 8 places. United States (from 13 to 21): Moved down 8 places. North Macedonia (from 67 to 55): Moved up 12 places. Jordan (from 55 to 69): Moved down 14 places.

Similarly to the mathematics results, these substantial shifts in rankings for the reading assessment underscore the importance of accounting for sample selection in international assessments to ensure a fair and accurate comparison of educational outcomes across countries.

4. Conclusion

In this paper, I have introduced a method to correct sample selection in country comparisons using international assessment scores data like PISA. The correction is done using a quantile selection model and under different assumptions, I explain how we can partially identify and point identify the latent quantiles. The results of my application on PISA 2018 suggest that the observed quantiles are upward biased and rankings are not accurate.

Appendix A Proofs of the results in the main text

A.1. Proof of Lemma 1

$$\begin{aligned}
\tilde{u} &= \mathbb{P}(U \leq u \mid S=1) \\
&= \mathbb{P}(Y^{\ast} \leq q_{Y^{\ast}}(u) \mid S=1) \\
&= F_{Y^{\ast}|S=1}(q_{Y^{\ast}}(u)) \\
&= F_{Y|S=1}(q_{Y^{\ast}}(u))
\end{aligned}$$

From there, we have that:

$$q_{Y^{\ast}}(u) = q_{Y|S=1}(\tilde{u})$$

A.2. Proof of Lemma 2

$$\begin{aligned}
\tilde{u} &= \mathbb{P}(U \leq u \mid S=1) \\
&= \frac{\mathbb{P}(U \leq u, S=1)}{p}
\end{aligned}$$

Now we apply the Fréchet bounds to the joint probability, and we obtain:

$$\frac{\max\{u+p-1,0\}}{p} \leq \tilde{u} \leq \frac{\min\{u,p\}}{p}$$
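For reference, the Fréchet bounds invoked here are the standard inequalities for the probability of a joint event. Taking $A = \{U \leq u\}$ and $B = \{S = 1\}$, with $\mathbb{P}(A) = u$ because $U$ is uniform on $[0,1]$ and $\mathbb{P}(B) = p$, they read:

$$\max\{\mathbb{P}(A) + \mathbb{P}(B) - 1, 0\} \leq \mathbb{P}(A \cap B) \leq \min\{\mathbb{P}(A), \mathbb{P}(B)\}$$
$$\max\{u + p - 1, 0\} \leq \mathbb{P}(U \leq u, S = 1) \leq \min\{u, p\}$$

Dividing through by $p$ gives the bounds on $\tilde{u}$ stated above.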

Now, using the monotonicity of $q_{Y}$ (the quantile function of the observed score, written $q_{Y|S=1}$ in the main text), we finally obtain:

$$q_{Y}\left(\frac{\max\{u+p-1,0\}}{p}\right) \leq q_{Y^{\ast}}(u) \leq q_{Y}\left(\frac{\min\{u,p\}}{p}\right)$$

A.3. Proof of Theorem 1


STEP 1 : Validity

First, we need to prove the validity of the inequalities. By Lemma 2, we already know that:

$$q_{Y^{\ast}}(u) \geq q_{Y}\left(\frac{\max\{u+p-1,0\}}{p}\right)$$

Now consider the fact that:

$$u = \mathbb{P}(U \leq u \mid S=1)\,p + \mathbb{P}(U \leq u \mid S=0)\,(1-p)$$

Using the stochastic dominance assumption, we get:

$$\begin{aligned}
\mathbb{P}(U \leq u \mid S=1)\,p + \mathbb{P}(U \leq u \mid S=1)\,(1-p) &\leq \mathbb{P}(U \leq u \mid S=1)\,p + \mathbb{P}(U \leq u \mid S=0)\,(1-p) \\
&= u
\end{aligned}$$

which is simply:

$$\mathbb{P}(U\leq u\mid S=1)\leq u$$

Using this probability bound and the monotonicity of $q_{Y}$, we have that:

$$q_{Y^{\ast}}(u)\leq q_{Y}(u)$$
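
As a numerical illustration of the validity step, the following is a minimal Python sketch, not the implementation used in the paper. The function names `lemma2_levels` and `theorem1_levels` are hypothetical, and the value p = 0.812 is the coverage rate reported for China in Table 1. The sketch computes the levels at which the observed quantile function $q_{Y}$ is evaluated to bound $q_{Y^{\ast}}(u)$, and shows that the upper level $u$ obtained under stochastic dominance never exceeds the level $\min\{u,p\}/p$ from Lemma 2.

```python
def _check(u: float, p: float) -> None:
    """Validate that u is a quantile level and p a coverage rate."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")


def lemma2_levels(u: float, p: float) -> tuple[float, float]:
    """Levels (lower, upper) at which q_Y is evaluated in the Lemma 2 bounds."""
    _check(u, p)
    return max(u + p - 1.0, 0.0) / p, min(u, p) / p


def theorem1_levels(u: float, p: float) -> tuple[float, float]:
    """Levels (lower, upper) at which q_Y is evaluated in the Theorem 1 bounds.

    The lower level is the same as in Lemma 2; the upper level is u itself,
    which follows from P(U <= u | S = 1) <= u.
    """
    _check(u, p)
    return max(u + p - 1.0, 0.0) / p, u


if __name__ == "__main__":
    p = 0.812  # coverage rate reported for China in Table 1
    for u in (0.10, 0.25, 0.50, 0.75, 0.90):
        lower, upper_lemma2 = lemma2_levels(u, p)
        _, upper_theorem1 = theorem1_levels(u, p)
        # Stochastic dominance can only tighten the upper level.
        assert upper_theorem1 <= upper_lemma2
        print(f"u={u:.2f}  lower={lower:.4f}  "
              f"upper(Lemma 2)={upper_lemma2:.4f}  upper(Theorem 1)={upper_theorem1:.4f}")
```

Applying an estimate of $q_{Y}$ to these levels would give bounds on $q_{Y^{\ast}}(u)$; how $q_{Y}$ is estimated is left out of the sketch.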

STEP 2: Sharpness

For the lower bound, consider $S=\mathbb{1}\{U\geq 1-p\}$. We have:

$$\mathbb{P}(U\geq 1-p)=p$$

In that case, we have:

$$\tilde{u}=\frac{\mathbb{P}(U\leq u,\,U\geq 1-p)}{p}=\frac{\max\{u+p-1,0\}}{p}$$

And the stochastic dominance assumption also holds:

$$\mathbb{P}(U\leq u\mid S=1)=\frac{\max\{u+p-1,0\}}{p}\leq\mathbb{P}(U\leq u\mid S=0)=\mathbb{P}(U\leq u\mid U<1-p)=\frac{\min\{u,1-p\}}{1-p}$$

The lower bound is therefore sharp.

For the upper bound, consider $S$ independent of $U$. We then have:

$$\tilde{u}=\mathbb{P}(U\leq u\mid S=1)=\mathbb{P}(U\leq u\mid S=0)=u$$

The upper bound is therefore also sharp.
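
The selection rule used for the lower bound can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: the function name `conditional_cdfs` is hypothetical, and an evenly spaced midpoint grid on [0, 1] stands in for a uniformly distributed $U$. With $S=\mathbb{1}\{U\geq 1-p\}$, it computes the two conditional distribution functions by counting grid points and compares them with the closed forms above.

```python
def conditional_cdfs(u: float, p: float, n: int = 10_000) -> tuple[float, float]:
    """Return (P(U <= u | S = 1), P(U <= u | S = 0)) for S = 1{U >= 1 - p}.

    U is represented by n midpoints of an even grid on [0, 1], an assumed
    deterministic stand-in for a Uniform(0, 1) variable.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    grid = [(i + 0.5) / n for i in range(n)]
    selected = [x for x in grid if x >= 1.0 - p]   # S = 1
    excluded = [x for x in grid if x < 1.0 - p]    # S = 0
    if not selected or not excluded:
        raise ValueError("n is too small for this value of p")
    cdf_selected = sum(x <= u for x in selected) / len(selected)
    cdf_excluded = sum(x <= u for x in excluded) / len(excluded)
    return cdf_selected, cdf_excluded


if __name__ == "__main__":
    p = 0.8  # illustrative coverage rate
    for u in (0.1, 0.3, 0.5, 0.9):
        cdf_sel, cdf_exc = conditional_cdfs(u, p)
        closed_sel = max(u + p - 1.0, 0.0) / p
        closed_exc = min(u, 1.0 - p) / (1.0 - p)
        # The grid values match the closed forms up to discretization error,
        # and the selected group is stochastically dominant.
        assert abs(cdf_sel - closed_sel) < 1e-3
        assert abs(cdf_exc - closed_exc) < 1e-3
        assert cdf_sel <= cdf_exc
        print(f"u={u:.1f}  P(U<=u|S=1)={cdf_sel:.4f}  P(U<=u|S=0)={cdf_exc:.4f}")
```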

A.4. Proof of Theorem 2

We have that:

$$\begin{aligned}
\tilde{u} &= \mathbb{P}(U\leq u\mid S=1)\\
&= \frac{1}{p}\,\mathbb{P}(U\leq u,\,S=1)\\
&= \frac{1}{p}\,\mathbb{P}(U\leq u,\,U\geq V)\\
&= \frac{1}{p}\int\mathbb{P}(U\leq u,\,U\geq v)\,dF_{V,\theta_{0}}(v)\\
&= \frac{1}{p}\int_{0}^{u}(u-v)\,dF_{V,\theta_{0}}(v)\\
&= \frac{1}{p}\left(uF_{V,\theta_{0}}(u)-\int_{0}^{u}v\,dF_{V,\theta_{0}}(v)\right)
\end{aligned}$$

Now we also have that:

$$\begin{aligned}
p &= \mathbb{P}(U\geq V)\\
&= \int\mathbb{P}(U\geq v)\,dF_{V,\theta_{0}}(v)\\
&= \int(1-v)\,dF_{V,\theta_{0}}(v)\\
&= 1-\mathbb{E}_{\theta_{0}}(V)
\end{aligned}$$

From there, we have: $\mathbb{E}_{\theta_{0}}(V)=1-p$.
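
Both identities in this proof can be checked on a simple example. The sketch below is illustrative only: it assumes $U$ is uniform on [0, 1] and independent of $V$, as the derivation uses, and it takes $V$ uniform on $[0, 2(1-p)]$, a hypothetical choice made so that $\mathbb{E}(V)=1-p$ (this is not necessarily the parametric family $F_{V,\theta_{0}}$ used in the paper, and it requires $p\geq 1/2$). The function names are also hypothetical. The closed form of $\tilde{u}$ implied by the last line of the derivation is compared with a direct midpoint-rule evaluation of the two integrals.

```python
def u_tilde_closed_form(u: float, p: float) -> float:
    """(1/p) * (u F_V(u) - int_0^u v dF_V(v)) for V ~ Uniform[0, c], c = 2(1 - p)."""
    if not 0.5 <= p < 1.0:
        raise ValueError("this example needs 0.5 <= p < 1 so that c lies in (0, 1]")
    c = 2.0 * (1.0 - p)
    if u <= c:
        # F_V(u) = u / c and int_0^u v dF_V(v) = u^2 / (2c)
        return u * u / (2.0 * c * p)
    # F_V(u) = 1 and int_0^u v dF_V(v) = E(V) = 1 - p
    return (u - (1.0 - p)) / p


def u_tilde_numeric(u: float, p: float, n: int = 20_000) -> tuple[float, float]:
    """Midpoint-rule evaluation of the integrals; returns (u_tilde, implied p)."""
    if not 0.5 <= p < 1.0:
        raise ValueError("this example needs 0.5 <= p < 1 so that c lies in (0, 1]")
    c = 2.0 * (1.0 - p)
    v_grid = [c * (j + 0.5) / n for j in range(n)]
    # int P(U <= u, U >= v) dF_V(v), with P(U <= u, U >= v) = max(u - v, 0)
    joint = sum(max(u - v, 0.0) for v in v_grid) / n
    # int (1 - v) dF_V(v) = 1 - E(V), which should equal p
    coverage = sum(1.0 - v for v in v_grid) / n
    return joint / coverage, coverage


if __name__ == "__main__":
    p = 0.8  # illustrative coverage rate
    for u in (0.1, 0.3, 0.5, 1.0):
        numeric, coverage = u_tilde_numeric(u, p)
        closed = u_tilde_closed_form(u, p)
        assert abs(coverage - p) < 1e-9
        assert abs(numeric - closed) < 1e-6
        print(f"u={u:.1f}  u_tilde={closed:.6f}  implied p={coverage:.6f}")
```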

Appendix B Tables

Table 1. PISA 2018 MATHS
Countries p Lbound Ubound Mean($Y^{\ast}$) Rank 1 Rank 2 Rank 3
China 0.812 524.98 591.13 560.66 1 1 3
Singapore 0.953 550.87 566.76 555.87 2 2 1
Macau 0.883 512.48 556.64 536.15 3 4 4
Hong Kong 0.984 547.15 551.78 547.83 4 3 2
Taiwan 0.921 501.3 531.97 513.22 5 5 7
Japan 0.909 501.89 528.62 511.54 6 6 6
Korea 0.881 482.78 525.15 500.36 7 11 9
Estonia 0.931 502.42 522.99 510.64 8 7 5
Netherlands 0.912 490.21 520.53 502.7 9 9 8
Poland 0.9 486.89 516.44 497.8 10 13 10
Switzerland 0.889 475.49 515.39 493.65 11 15 11
Canada 0.863 468.6 512.03 487.85 12 19 15
Denmark 0.878 477.75 510.21 490.14 13 17 12
Slovenia 0.979 501 508.9 503.53 14 8 7
Belgium 0.936 488.35 508.28 494.88 15 14 9
Finland 0.963 497.06 507.84 500.46 16 10 8
Norway 0.911 471.94 502.64 484.58 17 20 13
Sweden 0.857 460.94 502.55 477.12 18 25 18
United Kingdom 0.848 444.36 502.2 473.86 19 28 23
Germany 0.993 498.61 499.96 498.71 20 12 6
Austria 0.889 465.62 499.47 478.55 21 23 16
Ireland 0.962 489.24 499.25 492.04 22 16 10
Czech Republic 0.954 485.91 498.94 489.48 23 18 11
Latvia 0.886 469.22 497.19 479.33 24 21 17
France 0.913 470.2 495.58 478.5 25 24 14
Iceland 0.916 471.74 495.07 479.11 26 22 13
New Zealand 0.888 463.03 494.61 474.69 27 27 19
Portugal 0.873 450.56 492.99 468.59 28 30 22
Australia 0.894 457.54 492.15 472.69 29 29 20
Russia 0.936 469.73 487.92 475.68 30 26 12
Slovak Republic 0.862 436.41 487.6 460.11 31 35 28
Italy 0.846 438.56 486.28 458.26 32 37 27
Lithuania 0.903 456.12 484.15 466.53 33 31 21
Luxembourg 0.871 441.17 483.5 459.76 34 36 25
Hungary 0.896 444.36 482.26 462.28 35 34 23
United States 0.861 435.59 477.92 453.76 37 38 30
Belarus 0.876 431.22 471.94 448.89 38 39 33
Malta 0.972 461.7 470.86 464.08 39 33 24
Croatia 0.891 433.32 464.43 445.81 40 40 32
Israel 0.809 395.24 462.21 425.38 41 44 38
Turkey 0.726 372.29 453.63 415.28 42 48 41
Ukraine 0.867 409.65 453.17 428.52 43 42 37
Greece 0.927 429.6 450.96 437.22 44 41 35
Serbia 0.885 413.42 448.25 427.28 45 43 39
Malaysia 0.723 361.52 440.57 403.94 46 51 44
Romania 0.726 339.93 430.68 388.54 50 56 50
United Arab Emirates 0.918 413.32 437.08 421.29 48 46 39
Albania 0.757 363.13 436.71 403.25 49 52 43
Bosnia and Herzegovina 0.823 362.94 406.85 381.56 61 59 46
Mexico 0.664 294.5 408.64 367.08 60 63 52
Georgia 0.826 349.84 398.7 372.02 65 61 48
Peru 0.731 324 399.3 363.14 64 64 51
North Macedonia 0.947 377.12 393.2 382.21 67 58 45
Colombia 0.619 298.78 391.13 347.1 68 70 53
Brazil 0.65 297.08 382.82 341.86 69 73 54
Argentina 0.806 331.35 379.69 351.95 70 69 47
Indonesia 0.849 335.57 378.05 356.25 71 67 49
Saudi Arabia 0.845 336.42 374.13 352.68 72 68 50
Morocco 0.643 284.42 369.02 330.42 73 74 55
Kosovo 0.844 322.43 364.91 342.98 74 71 51
Panama 0.535 242.69 352.43 304.35 75 76 56
Philippines 0.679 267.1 352.39 313.8 76 75 57
Dominican Republic 0.73 261.21 324.53 295.11 77 77 58

Table 3. PISA 2018 READING
Countries p Lbound Ubound Mean($Y^{\ast}$) Rank 1 Rank 2 Rank 3
China 0.812 493.82 555.25 524.18 1 2 3
Singapore 0.953 532.06 549.5 537.08 2 1 1
Macau 0.883 484.2 525.13 501.71 3 7 4
Hong Kong 0.984 519.11 524.32 519.84 4 3 2
Estonia 0.931 503.73 523.37 510.01 5 6 5
Canada 0.863 475.63 520.12 493.5 6 9 8
Finland 0.963 506.4 519.66 510.08 7 5 6
Ireland 0.962 507.04 518.41 510.21 8 4 7
Korea 0.881 475.92 513.84 488.86 9 12 9
Poland 0.9 478.76 512.16 491.81 10 10 9
Sweden 0.857 453.82 505.74 475.14 11 23 13
New Zealand 0.888 470.81 505.29 482.29 12 15 10
United States 0.861 459.64 505 476.99 13 21 15
United Kingdom 0.848 460.17 504.34 476.6 14 22 14
Japan 0.909 471.65 503.3 484.4 15 14 11
Taiwan 0.921 475.98 502.37 484.59 16 13 11
Australia 0.894 465.19 502.27 479.2 17 17 13
Denmark 0.878 464.52 501.88 479.12 18 18 13
Norway 0.911 468.84 499.11 478.52 19 20 13
Germany 0.993 497.36 498.94 497.48 20 8 8
Slovenia 0.979 490.11 495.64 491.11 21 11 9
France 0.913 464.55 493.03 474.67 22 24 13
Belgium 0.936 472.32 492.87 478.62 23 19 13
Portugal 0.873 451.54 491.95 467.59 24 25 15
Czech Republic 0.954 477.09 490.42 480.71 25 16 10
Netherlands 0.912 454.31 484.58 465.34 26 27 15
Austria 0.889 448.66 483.53 461.95 27 28 15
Switzerland 0.889 448.99 483.5 461.44 28 29 15
Croatia 0.891 446.23 479.08 459.51 29 31 16
Russia 0.936 460.2 478.71 465.92 30 26 15
Latvia 0.886 447.13 478.05 458.59 31 32 16
Spain 0.918 452.01 476.58 460.74 32 30 15
Italy 0.846 426.62 476.17 447.16 33 37 18
Hungary 0.896 445.16 476.1 456.29 34 34 17
Lithuania 0.903 448.38 475.75 457.53 35 33 16
Belarus 0.876 438.37 474.12 452.34 36 36 17
Iceland 0.916 448.49 473.8 455.57 37 35 16
Israel 0.809 404.81 469.99 428.79 38 44 21
Luxembourg 0.871 430.47 469.69 444.31 39 38 19
Ukraine 0.867 422.09 465.42 440.44 40 41 20
Turkey 0.726 387.5 465.23 427.44 41 45 21
Slovak Republic 0.862 419.95 457.78 433.03 42 43 20
Greece 0.927 436.51 457.3 442.82 43 39 20
Chile 0.893 420.92 452.76 433.35 44 42 20
Malta 0.972 439.68 448.28 441.35 45 40 20
Serbia 0.885 409.2 439.25 419.16 46 46 21
United Arab Emirates 0.918 407.66 431.05 414.22 47 47 21
Costa Rica 0.628 337.61 427.14 383.24 48 53 23
Romania 0.726 323.94 427.05 381.25 49 54 23
Uruguay 0.78 362.02 426.45 390.95 50 52 22
Moldova 0.951 409.71 423.94 413.97 51 48 21
Montenegro 0.947 407.86 420.95 411.55 52 49 21
Mexico 0.664 339.31 420.6 379.68 53 56 23
Bulgaria 0.72 348.51 419.9 378.74 54 58 23
Jordan 0.54 262.1 419.01 356.46 55 69 25
Malaysia 0.723 348.04 415.01 377.72 56 59 23
Brazil 0.65 312.14 413.13 364.76 57 65 24
Colombia 0.619 313.85 412.12 365.17 58 64 24
Brunei 0.974 403.21 409.04 404.41 59 50 21
Qatar 0.923 386.21 407.09 392.04 60 51 22
Albania 0.757 344.93 405.35 374.52 61 61 23
Bosnia and Herzegovina 0.823 361.45 402.91 378.86 62 57 22
Argentina 0.806 338.8 401.19 367.98 63 63 24
Peru 0.731 331.68 400.32 362.93 64 66 24
Saudi Arabia 0.845 356.32 398.93 374.45 65 62 23
Thailand 0.724 327.11 393.27 361.47 66 67 24
North Macedonia 0.947 374.82 392.09 380.77 67 55 22
Azerbaijan 0.463 264.3 389.46 339.62 68 71 25
Kazakhstan 0.92 369.82 386.67 376.25 69 60 22
Georgia 0.826 338.46 379.69 356.49 70 68 24
Panama 0.535 259.74 378.23 324.81 71 74 25
Indonesia 0.849 341.16 370.96 352.43 72 70 24
Morocco 0.643 286.39 359.63 323.34 73 75 25
Lebanon 0.867 306.62 352.8 326.22 74 73 25
Kosovo 0.844 321.98 352.5 334.6 75 72 25
Dominican Republic 0.73 279.38 341.09 310.67 76 76 25
Philippines 0.679 282.68 339.47 307.82 77 77 25

References

  • Arellano and Bonhomme (2016) M. Arellano and S. Bonhomme. Sample selection in quantile regression: A survey. In Handbook of quantile regression, pages 209–224. Chapman and Hall/CRC, 2016.
  • Arellano and Bonhomme (2017) M. Arellano and S. Bonhomme. Quantile selection models with an application to understanding changes in wage inequality. Econometrica, 85(1):1–28, 2017.
  • Berliner (1993) D. C. Berliner. International comparisons of student achievement. In National Forum, volume 73, page 25, 1993.
  • Chernozhukov et al. (2023) V. Chernozhukov, I. Fernández-Val, and S. Luo. Distribution regression with sample selection and UK wage decomposition. arXiv preprint arXiv:1811.11603, 2023.
  • Cromley (2009) J. G. Cromley. Reading achievement and science proficiency: International comparisons from the programme on international student assessment. Reading Psychology, 30(2):89–118, 2009.
  • Ferreira and Gignoux (2014) F. H. Ferreira and J. Gignoux. The measurement of educational inequality: Achievement and opportunity. The World Bank Economic Review, 28(2):210–246, 2014.
  • Heckman (1974) J. Heckman. Shadow prices, market wages, and labor supply. Econometrica: Journal of the Econometric Society, pages 679–694, 1974.
  • Huber and Melly (2015) M. Huber and B. Melly. A test of the conditional independence assumption in sample selection models. Journal of Applied Econometrics, 30(7):1144–1168, 2015.
  • Jakubowski and Pokropek (2015) M. Jakubowski and A. Pokropek. Reading achievement progress across countries. International Journal of Educational Development, 45:77–88, 2015.
  • Martin et al. (2000) M. O. Martin et al. International comparisons of student achievement. In Learning from others, pages 29–47. Springer, 2000.
  • McEwan and Marshall (2004) P. J. McEwan and J. H. Marshall. Why does academic achievement vary across countries? Evidence from Cuba and Mexico. Education Economics, 12(3):205–217, 2004.
  • McGaw (2008) B. McGaw. The role of the OECD in international comparative studies of achievement. Assessment in Education: Principles, Policy & Practice, 15(3):223–243, 2008.
  • Nagy (1996) P. Nagy. International comparisons of student achievement in mathematics and science: A Canadian perspective. Canadian Journal of Education/Revue canadienne de l’éducation, pages 396–413, 1996.
  • Rotberg (1995) I. C. Rotberg. Myths about test score comparisons. Science, 270(5241):1446–1448, 1995.
  • Tienken (2008) C. H. Tienken. Rankings of international achievement test performance and economic strength: Correlation or conjecture? International Journal of Education Policy and Leadership, 3(4):1–15, 2008.