CORRECTING SELECTION BIAS IN STANDARDIZED TEST SCORES COMPARISONS

Onil Boussim

Abstract.

This paper addresses the issue of sample selection bias when comparing countries using international assessments like PISA (Program for International Student Assessment). Despite its widespread use, PISA rankings may be biased due to different attrition patterns in different countries, leading to inaccurate comparisons. This study proposes a methodology to correct for sample selection bias using a quantile selection model. Applying the method to PISA 2018 data, I find that correcting for selection bias significantly changes the rankings (based on the mean) of countries’ educational performances. My results highlight the importance of accounting for sample selection bias in international educational comparisons.

Keywords: Quantiles, Sample selection, International student achievement assessments, PISA.

JEL codes: C34, C83, I20

I thank Marc Henry, Andres Aradillas-Lopez, Michael Gechter, Ismael Mourifie and all the participants of the African Econometric Society 2024 for helpful discussions and comments. All errors are mine.

1. Introduction

International comparisons of educational achievement, typically derived from standardized assessment scores, have become a crucial tool for evaluating and guiding educational policies worldwide. These comparisons not only facilitate cross-national evaluations but also influence educational reforms, with rankings often serving as a motivation for program adjustments or a justification for adopting foreign reforms (see Nagy (1996), Martin et al. (2000), McEwan and Marshall (2004), Cromley (2009), Tienken (2008), McGaw (2008), and Jakubowski and Pokropek (2015)). The most famous of these assessments is the Program for International Student Assessment (PISA).

PISA is a triennial international survey that aims to evaluate education systems worldwide by assessing the skills and knowledge of 15-year-old students. PISA focuses on three main domains: reading, mathematics, and science. To participate in PISA, students must be attending school at age 15 and have completed at least six years of education. Some countries may also exclude certain students from the sample, such as those living in remote areas or on reserves. Therefore, the coverage rate (the percentage of 15-year-olds covered by PISA) is generally below 100%. As a result, if a given country A has a coverage rate close to 100% and country B has a low coverage rate, one may end up comparing nearly everyone in country A against the "best" in country B (assuming that those who managed to stay in the school system up to the required age are the best of the system). This problem also applies to every other international standardized assessment.

For this reason, many critics suggest that the rankings and comparisons may not be accurate because of sample selection bias due to the exclusion of an important portion of the target population (see Rotberg (1995), Berliner (1993), Ferreira and Gignoux (2014)). One cannot ignore a low coverage rate because staying in the school system is correlated with student characteristics that also affect assessment scores, such as socioeconomic background and unobserved factors (talent, motivation, etc.). Adjusting for sample selection bias before ranking countries is a more accurate way to compare them.

Solving this selection problem is an interesting econometric question because we face two main challenges. The first challenge is the absence of any information about non-enrolled individuals in the data. In that sense, it differs from the context of correcting sample selection in wages based on surveys that contain information on both labor force participants and non-participants (see Heckman (1974), Arellano and Bonhomme (2017), Chernozhukov et al. (2023)). The second challenge is that most methods rely on the existence of a valid instrument, while in our case there is no justifiable instrument (at least to my knowledge). Because of these two challenges, the identification power is limited.

I propose a sample selection correction method based on a quantile selection model (see Huber and Melly (2015), Arellano and Bonhomme (2016), Arellano and Bonhomme (2017)). I show that selection-corrected quantile estimates can be obtained by suitably shifting the percentile levels of the observed quantile function, as in Arellano and Bonhomme (2017). I obtain non-parametric partial identification of the selection-corrected quantiles under a stochastic dominance assumption, and I rely on a parametric assumption for point identification of those quantiles. The methodology is applied to PISA 2018, and the findings reveal that when selection biases are corrected, the rankings and comparisons of countries’ educational performances can change significantly. This underscores the importance of accounting for sample selection biases to ensure fair and accurate international educational comparisons.

The rest of the paper is organized as follows. In Section 2, I present the model and the identification results. In Section 3, I present the application, and I conclude in Section 4. All the detailed proofs can be found in the appendix.

2. Model and Identification

Fix a given country. Let $\displaystyle Y^{\ast}$ be the hypothetical score an individual would achieve if they had completed their education up to the age required for the assessment (for example, age 15 for the PISA assessment). We refer to it as the potential assessment score. Let $\displaystyle S$ be a binary variable that takes the value $\displaystyle 1$ if the individual meets the requirement to be part of the assessment and $\displaystyle 0$ otherwise. Notice that if the probability of $\displaystyle S=1$ is $\displaystyle 1$ ( $\displaystyle\mathbb{P}(S=1)=1$ ), we can observe $\displaystyle Y^{\ast}$ directly in the data for the whole target population. But if that probability is strictly less than 1, we do not observe $\displaystyle Y^{\ast}$ for the subpopulation for which $\displaystyle S=0$ . Let $\displaystyle Y$ be the observed assessment score the researcher has access to. $\displaystyle Y$ and $\displaystyle Y^{\ast}$ coincide only when $\displaystyle S=1$ .

We already explained that country rankings should be based on the mean of $\displaystyle Y^{\ast}$ and not the mean of $\displaystyle Y$ . Here, however, we focus on the whole distribution of $\displaystyle Y^{\ast}$ , since in some cases the researcher might be interested in it. Since $\displaystyle Y^{\ast}$ is a continuous random variable, it admits a Skorohod quantile representation. Let $\displaystyle U\sim\mathcal{U}[0,1]$ be the rank of $\displaystyle Y^{\ast}$ (the rank of a continuous random variable $\displaystyle X$ is a function that assigns a numerical value, typically between 0 and 1, indicating the proportion of the distribution that is less than or equal to a given value $\displaystyle x$ ). I consider the following sample selection model:

	$\displaystyle\displaystyle Y^{\ast}=q_{Y^{\ast}}(U)$
	$\displaystyle\displaystyle Y=Y^{\ast}\textit{ if }S=1$

The first equation is simply the Skorohod quantile representation of $\displaystyle Y^{\ast}$ . The objective here is to identify the quantiles $\displaystyle q_{Y^{\ast}}(u)$ for all $\displaystyle u\in[0,1]$ . From the quantiles we can derive the CDF, the mean, and any inequality measure we might be interested in:

\displaystyle F_{Y^{\ast}}(y)=\int_{0}^{1}\mathbb{1}\{q_{Y^{\ast}}(u)\leq y\}du

\displaystyle E(Y^{\ast})=\int_{0}^{1}q_{Y^{\ast}}(u)du
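As an illustration of these two formulas, here is a minimal sketch that approximates the integrals with a midpoint rule. The function names and the uniform example quantile function are hypothetical and are not taken from the PISA application.

```python
def mean_from_quantile(q, n=10_000):
    """Approximate E[Y*] = integral over [0, 1] of q(u) du (midpoint rule)."""
    return sum(q((i + 0.5) / n) for i in range(n)) / n


def cdf_from_quantile(q, y, n=10_000):
    """Approximate F(y) = integral over [0, 1] of 1{q(u) <= y} du (midpoint rule)."""
    return sum(1 for i in range(n) if q((i + 0.5) / n) <= y) / n


# Hypothetical example: scores uniform on [300, 700], so q(u) = 300 + 400 * u.
def q_uniform(u):
    return 300.0 + 400.0 * u


mean_uniform = mean_from_quantile(q_uniform)      # close to 500
cdf_at_400 = cdf_from_quantile(q_uniform, 400.0)  # close to 0.25
```

Any other functional of the distribution (for example an inequality measure) could be approximated on the same grid of ranks.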

For a given rank $\displaystyle u\in[0,1]$ , define $\displaystyle\tilde{u}\equiv\mathbb{P}(U\leq u|S=1)$ as the selection corrected rank. The first important result of this paper is the following lemma :

Lemma 1.

\displaystyle\forall u\in[0,1],\textit{ we have :}

\displaystyle q_{Y^{\ast}}(u)=q_{Y|S=1}(\tilde{u})

From the data, we already know the quantile function $\displaystyle q_{Y|S=1}(u),u\in[0,1]$ . By this lemma, we can recover the quantiles of $\displaystyle Y^{\ast}$ by suitably shifting the percentile levels in the quantile function of $\displaystyle Y|S=1$ . Since we do not know $\displaystyle\tilde{u}$ , Lemma 1 alone is not enough to identify $\displaystyle q_{Y^{\ast}}(u)$ , but it tells us what to look for.
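The rank-shifting idea in Lemma 1 can be checked on simulated data. The sketch below assumes a hypothetical latent quantile function and a hypothetical selection rule in which the probability of being selected equals the rank, so that $\tilde{u}=u^{2}$; neither choice comes from the paper.

```python
import random

random.seed(0)


def q_star(u):
    """Hypothetical latent quantile function of Y* (uniform on [300, 700])."""
    return 300.0 + 400.0 * u


# Hypothetical selection rule: P(S = 1 | U = u) = u, so higher ranks are
# more likely to be observed. Then P(U <= u | S = 1) = u ** 2.
selected = []
for _ in range(200_000):
    rank = random.random()
    if random.random() < rank:
        selected.append(q_star(rank))
selected.sort()


def q_selected(level):
    """Empirical quantile of the observed scores Y | S = 1."""
    idx = min(int(level * len(selected)), len(selected) - 1)
    return selected[idx]


u = 0.5
u_tilde = u ** 2  # selection-corrected rank under the rule above
lemma1_gap = abs(q_selected(u_tilde) - q_star(u))  # small if Lemma 1 holds
```

In this simulation the corrected rank is known by construction; in real data it is not, which is why further assumptions are needed.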

To be able to discuss identification results, let’s make some assumptions. Since the data we have is by nature truncated, we make the following assumption concerning the coverage rate.

Assumption 1 (Identification of coverage rate).

$\displaystyle p\equiv\mathbb{P}(S=1)$ can be identified.

Assumption 1 is trivially satisfied with PISA data since we always have access to the coverage rate : the proportion of individuals in the target population included in the assessment.

In the absence of more restrictive assumptions, and because of the data limitations, the exact value of the quantiles cannot be pinned down; instead, we obtain a range of possible values. In other words, without further assumptions the quantiles are only partially identified. Here is the first partial identification result:

Lemma 2.

\displaystyle\textit{Under Assumption 1, }\forall u\in[0,1],

\displaystyle q_{Y|S=1}\left(\frac{\max\{u+p-1,0\}}{p}\right)\leq q_{Y^{\ast}}(u)\leq q_{Y|S=1}\left(\frac{\min\{u,p\}}{p}\right)

This lemma follows from an application of the Fréchet–Hoeffding inequality to the probability of a joint event. Constructing the bounds only requires knowledge of the propensity score $\displaystyle p$ , which is identified under Assumption 1, so the bounds can be computed directly. However, one can make the bounds more informative by adding a reasonable structure to the unobservables.
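The bounds on the corrected rank that underlie Lemma 2 are simple to compute. A minimal sketch follows; the function name `rank_bounds` is hypothetical.

```python
def rank_bounds(u, p):
    """Frechet-Hoeffding bounds on u_tilde = P(U <= u | S = 1).

    u is a rank in [0, 1] and p = P(S = 1) is the coverage rate in (0, 1].
    """
    lower = max(u + p - 1.0, 0.0) / p
    upper = min(u, p) / p
    return lower, upper


# Example with a coverage rate of 80%: the median of Y* lies between the
# 37.5th and the 62.5th percentile of the observed distribution.
lower_rank, upper_rank = rank_bounds(0.5, 0.8)

# With full coverage the bounds collapse to the rank itself.
full_coverage = rank_bounds(0.5, 1.0)
```

Evaluating the observed quantile function at these two levels gives the interval in Lemma 2.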

Assumption 2.

[Stochastic Dominance]

\displaystyle\forall u\in[0,1],\ \mathbb{P}(U\leq u|S=1)\leq\mathbb{P}(U\leq u|S=0)

The assumption is that the distribution of $\displaystyle U|S=1$ stochastically dominates the distribution of $\displaystyle U|S=0$ . This implies that $\displaystyle U$ values tend to be higher when $\displaystyle S=1$ than when $\displaystyle S=0$ . The assumption conveys the idea that individuals who meet the requirement for the assessment ( $\displaystyle S=1$ ) generally have higher latent abilities than those who do not ( $\displaystyle S=0$ ). With this assumption, we can derive the following theorem.

Theorem 1 (Partial Identification).

Under Assumptions 1 and 2, the following bounds are valid and sharp

$\displaystyle\forall u\in[0,1]$ ,

\displaystyle q_{Y|S=1}\left(\frac{\max\{u+p-1,0\}}{p}\right)\leq q_{Y^{\ast}}(u)\leq q_{Y|S=1}(u)

We also have :

\displaystyle\int_{0}^{1}q_{Y|S=1}\left(\frac{\max\{u+p-1,0\}}{p}\right)du\leq E(Y^{\ast})\leq E(Y|S=1)

This theorem states that, given the model and the assumptions, the latent quantile function at a given $\displaystyle u\in[0,1]$ lies in the above interval, whose bounds cannot be improved upon without further assumptions. In other words, the lower and upper bounds are the best lower and upper bounds consistent with the data and the assumptions. This interval is weakly smaller than the one derived in Lemma 2, since its upper bound is weakly smaller ( $\displaystyle u\leq\min\{u,p\}/p$ ). This is due to the introduction of Assumption 2.
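The lower bound on the mean can be simplified by splitting the integral at $\displaystyle u=1-p$ and substituting $\displaystyle t=(u+p-1)/p$ on the upper piece. The following is a sketch of that calculation, reading $\displaystyle q_{Y|S=1}(0)$ as the lowest score in the support of the observed distribution (an interpretation assumed here, not stated in the text):

\displaystyle\int_{0}^{1}q_{Y|S=1}\left(\frac{\max\{u+p-1,0\}}{p}\right)du=\int_{0}^{1-p}q_{Y|S=1}(0)du+\int_{1-p}^{1}q_{Y|S=1}\left(\frac{u+p-1}{p}\right)du

\displaystyle=(1-p)\,q_{Y|S=1}(0)+p\int_{0}^{1}q_{Y|S=1}(t)dt=(1-p)\,q_{Y|S=1}(0)+p\,E(Y|S=1)

Under this reading, the lower bound on the mean is a weighted average of the lowest observed score and the observed mean, with weights $\displaystyle 1-p$ and $\displaystyle p$ .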

Notice that if $\displaystyle p=1$ , the bounds in Lemma 2 and Theorem 1 shrink to exactly one point (the latent quantile becomes exactly the observed quantile). This is because the sample selection vanishes.

-The upper bound in the theorem is exactly the observed quantile function. This shows that analyses or comparisons based on observed quantiles can suffer from a positive bias, since they use the highest possible value for the quantile (and therefore for the mean). This upper bound is attained only when the selection is random (independent of the evaluation). This means that the official ranking is implicitly based on the assumption that people leave the school system randomly.

-The lower bound is attained when individuals meet the requirements of the assessment if their rank $\displaystyle U$ is greater than the proportion of those who do not satisfy the requirement, $\displaystyle(1-p)$ . This leads to the structure $\displaystyle S=\mathbb{1}\{U\geq 1-p\}$ . It means that we observe the outcome $\displaystyle Y$ only for the proportion $\displaystyle p$ of individuals at the top of the distribution of $\displaystyle U$ . This corresponds to the scenario where individuals with the lowest values of $\displaystyle U$ (the least talented ones) are the ones effectively excluded from the assessment.

One can compare countries based on the bounds of the average by assigning to each country the middle of those bounds. Another way is also to just rank them based on the lower bound, which is the most conservative way to rank them.
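The bounds on the mean in Theorem 1 and the midpoint used for ranking can be computed from a sample of observed scores. The sketch below uses a simple empirical quantile and a midpoint rule for the integral; the function names and the five example scores are hypothetical.

```python
def empirical_quantile(sorted_scores, level):
    """Empirical quantile of the observed scores at a level in [0, 1]."""
    idx = min(int(level * len(sorted_scores)), len(sorted_scores) - 1)
    return sorted_scores[idx]


def mean_bounds(scores, p, n=10_000):
    """Lower and upper bounds on E[Y*] from Theorem 1 (midpoint rule)."""
    s = sorted(scores)
    lower = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        shifted = max(u + p - 1.0, 0.0) / p  # lower bound on the corrected rank
        lower += empirical_quantile(s, shifted)
    lower /= n
    upper = sum(s) / len(s)  # observed mean E[Y | S = 1]
    return lower, upper


# Hypothetical observed scores and a coverage rate of 80%.
lower_mean, upper_mean = mean_bounds([400, 450, 500, 550, 600], p=0.8)
midpoint = 0.5 * (lower_mean + upper_mean)
```

Ranking on `lower_mean` corresponds to the conservative ranking, and ranking on `midpoint` to the middle-of-the-bounds ranking described above.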

On top of that, I propose a parametric solution to have point identification. In fact, in this problem, we cannot get point identification without adding more structure to the selection mechanism. Let’s consider the following :

Assumption 3.

[Structure on S] .

\displaystyle S=\mathbb{1}\{U\geq V\},V\sim\beta(1,\theta_{0})

The variables $\displaystyle U$ and $\displaystyle V$ are unobserved. $\displaystyle U$ can be seen as a return and $\displaystyle V$ as a cost, so the assumption says that children are enrolled in school when their return to schooling is at least as high as their cost. $\displaystyle U$ and $\displaystyle V$ can be correlated but we do not specify their dependence. $\displaystyle V$ mainly captures the differences in enrollment patterns between countries. The choice of the beta distribution is motivated by the limited identification power and some economic intuition.

The technical reasons are as follows. Since the return $\displaystyle U$ has support $\displaystyle[0,1]$ , the cost needs to have the same support to make the comparison meaningful. Moreover, since only one moment, $\displaystyle p$ , is available, I can identify only one unknown parameter in the parametric family. I therefore fix the first parameter to $\displaystyle 1$ (one could instead fix the second).

The economic reasons are as follows. First, since $\displaystyle V$ captures the cost faced by individuals, it is reasonable to expect the density of $\displaystyle V$ to be a decreasing function: in every country, many more people face relatively low costs than relatively high costs. Therefore, as the cost increases, the proportion of people facing it should decrease.

Second, since $\displaystyle V$ captures differences between countries, it is also desirable that, for countries with larger values of $\displaystyle p$ , smaller values of $\displaystyle V$ occur more frequently than larger ones. The distribution of $\displaystyle V$ should concentrate more mass near zero as $\displaystyle p$ increases. It simply means that one is more likely to face a lower cost in a country with a higher $\displaystyle p$ .

With all these constraints, it appears natural to choose the parametric family $\displaystyle\beta(1,\theta_{0})$ , which satisfies all the conditions when $\displaystyle p>0.5$ , as we can see in Figure 1. The condition $\displaystyle p>0.5$ holds for nearly all the countries in PISA.

Figure 1. PDF of $\displaystyle V\sim\beta(1,\theta(p))$ for different values of $\displaystyle p$
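To make the role of the condition $\displaystyle p>0.5$ concrete, the following sketch writes out the standard density, distribution function and mean of a $\displaystyle\beta(1,\theta)$ variable, and then uses the value $\displaystyle\theta_{0}=\frac{1}{1-p}-1$ given in Theorem 2:

\displaystyle f_{V,\theta}(v)=\theta(1-v)^{\theta-1},\quad F_{V,\theta}(v)=1-(1-v)^{\theta},\quad E_{\theta}(V)=\frac{1}{1+\theta},\quad v\in[0,1]

\displaystyle f^{\prime}_{V,\theta}(v)=-\theta(\theta-1)(1-v)^{\theta-2}<0\textit{ for }v\in(0,1)\iff\theta>1

\displaystyle\theta_{0}=\frac{1}{1-p}-1=\frac{p}{1-p}>1\iff p>\frac{1}{2}

The density is therefore strictly decreasing exactly when $\displaystyle p>0.5$ , and since $\displaystyle\theta_{0}$ increases with $\displaystyle p$ , a higher coverage rate pushes more of the mass of $\displaystyle V$ toward zero.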

Theorem 2 (Identification).

Under Assumptions 1 and 3, $\displaystyle q_{Y^{\ast}}$ is identified and we have :

\displaystyle q_{Y^{\ast}}(u)=q_{Y|S=1}\left(\frac{1}{p}\left(uF_{V,\theta_{0}}(u)-\int_{0}^{u}vdF_{V,\theta_{0}}(v)\right)\right)

\displaystyle\mathbb{E}(Y^{\ast})=\int_{0}^{1}q_{Y|S=1}\left(\frac{1}{p}\left(uF_{V,\theta_{0}}(u)-\int_{0}^{u}vdF_{V,\theta_{0}}(v)\right)\right)du

$\displaystyle\textit{where }\theta_{0}=\frac{1}{1-p}-1$

This theorem gives us a practical and very simple way to correct the sample selection problem in the rankings. In the next section, we see how this actually can make a difference.
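For the $\beta(1,\theta_{0})$ family, $F_{V,\theta_{0}}(v)=1-(1-v)^{\theta_{0}}$ and integration by parts gives $uF_{V,\theta_{0}}(u)-\int_{0}^{u}vdF_{V,\theta_{0}}(v)=\int_{0}^{u}F_{V,\theta_{0}}(v)dv$, so the corrected rank in Theorem 2 reduces to $\tilde{u}=\frac{1}{p}\left(u-(1-p)\left(1-(1-u)^{1/(1-p)}\right)\right)$. The sketch below uses this closed form to compute a corrected mean from observed scores. It is an illustration under these assumptions rather than the estimation code behind the tables; the function names and the example scores are hypothetical.

```python
def corrected_rank(u, p):
    """Selection-corrected rank under Assumption 3 with V ~ Beta(1, theta_0)."""
    if p >= 1.0:
        return u  # full coverage: no correction needed
    exponent = 1.0 / (1.0 - p)  # equals theta_0 + 1
    return (u - (1.0 - p) * (1.0 - (1.0 - u) ** exponent)) / p


def corrected_mean(scores, p, n=10_000):
    """Approximate E[Y*] by integrating q_{Y|S=1}(corrected_rank(u)) over u."""
    s = sorted(scores)
    total = 0.0
    for i in range(n):
        level = corrected_rank((i + 0.5) / n, p)
        idx = min(int(level * len(s)), len(s) - 1)  # empirical quantile index
        total += s[idx]
    return total / n


# Hypothetical observed scores and a coverage rate of 80%.
rank_at_median = corrected_rank(0.5, 0.8)
example_mean = corrected_mean([400, 450, 500, 550, 600], p=0.8)
```

With these hypothetical inputs the corrected mean falls between the Theorem 1 lower bound and the observed mean, as the theory requires.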

3. Application

In this section, I compute estimates of the selection-corrected means to make educational achievement comparisons using PISA 2018.

Table 1 and Table 2, available in the appendix, summarize the results in mathematics and reading respectively. The corrected means are all below the observed ones, which is evidence that selection creates an upward bias in the mean. Since selection does not affect all countries in the same way, the correction also affects them differently (higher values of $\displaystyle p$ correspond to a smaller change in the mean than lower values of $\displaystyle p$ ). Ranks change as a result, as we can observe in the rank columns of the tables. To read the tables: the column Countries lists the participating countries in the PISA 2018 assessment, and $\displaystyle p$ is the coverage rate, that is, the proportion of the target population covered by the PISA sample for each country. Lbound and Ubound are the lower and upper bounds derived in Theorem 1. Mean( $\displaystyle Y^{\ast}$ ) is the corrected mean for each country. Rank 1 is the official ranking of countries based on their reported mean scores in PISA 2018 (it is also the rank based on Ubound). Rank 2 is the corrected ranking of countries after adjusting using Theorem 2, and Rank 3 is the rank based on the lower bound in Theorem 1.

Here are some of the substantial shifts in rankings between Rank 1 (official ranking) and Rank 2 (corrected ranking) after adjusting for sample selection in mathematics: Canada (from 12 to 19): Moved down 7 places. North Macedonia (from 67 to 58): Moved up 9 places. Slovenia (from 14 to 8): Moved up 6 places. Poland (from 10 to 13): Moved down 3 places. United Kingdom (from 19 to 28): Moved down 9 places. Germany (from 20 to 12): Moved up 8 places. Finland (from 16 to 10): Moved up 6 places. Ireland (from 22 to 16): Moved up 6 places. Jordan (from 63 to 72): Moved down 9 places.

We can also see that under Rank 3, the decrease in rank is more severe for countries with lower values of $\displaystyle p$ . In both cases (Rank 2 and Rank 3), accounting for sample selection clearly affects the ranking. The same happens for reading.
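To see how a re-ranking arises, the sketch below ranks four countries using their Ubound and Mean( $Y^{\ast}$ ) values copied from Table 1. The ranks it produces are relative positions within this subset only, not the full-table ranks; the helper name `rank_by` is hypothetical.

```python
# Values copied from Table 1 (PISA 2018 mathematics): (Ubound, corrected mean).
table1_subset = {
    "Korea": (525.15, 500.36),
    "Estonia": (522.99, 510.64),
    "Netherlands": (520.53, 502.70),
    "Slovenia": (508.90, 503.53),
}


def rank_by(column):
    """Rank countries within the subset, highest value first (rank 1 = best)."""
    ordered = sorted(
        table1_subset, key=lambda c: table1_subset[c][column], reverse=True
    )
    return {country: position + 1 for position, country in enumerate(ordered)}


rank_observed = rank_by(0)   # ranking based on the observed mean (Ubound)
rank_corrected = rank_by(1)  # ranking based on the corrected mean
```

Within this subset Korea moves from first to last and Slovenia from last to second, which matches the direction of their moves between Rank 1 and Rank 2 in Table 1.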

For the table presenting the PISA 2018 Reading results, here are some of the more substantial shifts in rankings. Slovenia (from 21 to 11): Moved up 10 places. Germany (from 20 to 8): Moved up 12 places. Czech Republic (from 25 to 16): Moved up 9 places. Belgium (from 23 to 19): Moved up 4 places. Ireland (from 8 to 4): Moved up 4 places. Sweden (from 11 to 23): Moved down 12 places. United Kingdom (from 14 to 22): Moved down 8 places. United States (from 13 to 21): Moved down 8 places. North Macedonia (from 67 to 55): Moved up 12 places. Jordan (from 55 to 69): Moved down 14 places.

Similarly to the mathematics results, these substantial shifts in rankings for the reading assessment underscore the importance of accounting for sample selection in international assessments to ensure a fair and accurate comparison of educational outcomes across countries.

4. Conclusion

In this paper, I have introduced a method to correct for sample selection in country comparisons based on international assessment score data such as PISA. The correction relies on a quantile selection model, and I explain how, under different assumptions, the latent quantiles can be partially identified or point identified. The results of my application to PISA 2018 suggest that the observed quantiles are upward biased and that the official rankings are not accurate.

Appendix A Proofs of the results in the main text

A.1. Proof of Lemma 1

\displaystyle\begin{aligned}
\tilde{u}&=\mathbb{P}(U\leq u|S=1)\\
&=\mathbb{P}(Y^{\ast}\leq q_{Y^{\ast}}(u)|S=1)\\
&=F_{Y^{\ast}|S=1}(q_{Y^{\ast}}(u))\\
&=F_{Y|S=1}(q_{Y^{\ast}}(u))
\end{aligned}

From there, we have that :

\displaystyle q_{Y^{\ast}}(u)=q_{Y|S=1}(\tilde{u})

A.2. Proof of Lemma 2

\displaystyle\begin{aligned}
\tilde{u}&=\mathbb{P}(U\leq u|S=1)\\
&=\frac{\mathbb{P}(U\leq u,S=1)}{p}
\end{aligned}

Now we apply the Fréchet bounds to the joint probability, and we obtain :

\displaystyle\frac{\max\{u+p-1,0\}}{p}\leq\tilde{u}\leq\frac{\min\{u,p\}}{p}

Now, using Lemma 1 and the monotonicity of $\displaystyle q_{Y|S=1}$ , we finally obtain :

\displaystyle q_{Y|S=1}\left(\frac{\max\{u+p-1,0\}}{p}\right)\leq q_{Y^{\ast}}(u)\leq q_{Y|S=1}\left(\frac{\min\{u,p\}}{p}\right)

A.3. Proof of Theorem 1

STEP 1 : Validity

First, we need to prove the validity of the inequalities. By Lemma 2, we already know that :

\displaystyle q_{Y^{\ast}}(u)\geq q_{Y|S=1}\left(\frac{\max\{u+p-1,0\}}{p}\right)

Now consider the fact that:

\displaystyle u=\mathbb{P}\left(U\leq u|S=1\right)p+\mathbb{P}\left(U\leq u|S=0\right)(1-p)

Using the stochastic dominance assumption, we get :

\displaystyle\begin{aligned}
\mathbb{P}\left(U\leq u|S=1\right)p+\mathbb{P}\left(U\leq u|S=1\right)(1-p)&\leq\mathbb{P}\left(U\leq u|S=1\right)p+\mathbb{P}\left(U\leq u|S=0\right)(1-p)\\
&=u
\end{aligned}

which is simply :

\displaystyle\mathbb{P}\left(U\leq u|S=1\right)\leq u

Using this probability bound, Lemma 1, and the monotonicity of $\displaystyle q_{Y|S=1}$ , we have that :

\displaystyle q_{Y^{\ast}}(u)\leq q_{Y|S=1}(u)

STEP 2 : Sharpness

For the lower bound, consider $\displaystyle S=\mathbb{1}\{U\geq 1-p\}$ . We have :

\displaystyle\mathbb{P}(U\geq 1-p)=p

In that case, we have :

\displaystyle\tilde{u}=\frac{\mathbb{P}(U\leq u,U\geq 1-p)}{p}=\frac{\max\{u+p-1,0\}}{p}

And the stochastic dominance assumption also holds :

\displaystyle\mathbb{P}(U\leq u|S=1)=\frac{\max\{u+p-1,0\}}{p}\leq\mathbb{P}(U\leq u|S=0)=\mathbb{P}(U\leq u|U<1-p)=\frac{\min\{u,1-p\}}{1-p}

The lower bound is therefore sharp.

For the upper bound, consider the case where $\displaystyle S$ is independent of $\displaystyle U$ . We then have

\displaystyle\tilde{u}=\mathbb{P}(U\leq u|S=1)=\mathbb{P}(U\leq u|S=0)=u

The upper bound is therefore also sharp.
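The two selection rules used in this sharpness argument can be illustrated by simulation. The sketch below draws uniform ranks, applies the rule $S=\mathbb{1}\{U\geq 1-p\}$, and compares the empirical value of $\mathbb{P}(U\leq u|S=1)$ with the lower bound on the corrected rank. The values of $p$ and $u$, the seed and the sample size are arbitrary choices made for the example.

```python
import random

random.seed(1)

p, u = 0.8, 0.5
draws = [random.random() for _ in range(200_000)]

# Worst-case selection: only ranks at or above 1 - p are observed.
selected = [x for x in draws if x >= 1.0 - p]
share_selected = len(selected) / len(draws)  # close to p

# Empirical P(U <= u | S = 1) versus the lower bound on the corrected rank.
empirical_rank = sum(1 for x in selected if x <= u) / len(selected)
lower_bound_rank = max(u + p - 1.0, 0.0) / p

# Random selection: S independent of U, so P(U <= u | S = 1) is close to u.
kept_at_random = [x for x in draws if random.random() < p]
random_rank = sum(1 for x in kept_at_random if x <= u) / len(kept_at_random)
```

The first rule reproduces the lower bound and the second reproduces the upper bound, which is what sharpness means here.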

A.4. Proof of Theorem 2

We have that :

\displaystyle\begin{aligned}
\tilde{u}&=\mathbb{P}(U\leq u|S=1)\\
&=\frac{1}{p}\left(\mathbb{P}(U\leq u,S=1)\right)\\
&=\frac{1}{p}\left(\mathbb{P}(U\leq u,U\geq V)\right)\\
&=\frac{1}{p}\left(\int\mathbb{P}(U\leq u,U\geq v)dF_{V,\theta_{0}}(v)\right)\\
&=\frac{1}{p}\left(\int_{0}^{u}(u-v)dF_{V,\theta_{0}}(v)\right)\\
&=\frac{1}{p}\left(uF_{V,\theta_{0}}(u)-\int_{0}^{u}vdF_{V,\theta_{0}}(v)\right)
\end{aligned}

Now we also have that :

\displaystyle\begin{aligned}
p&=\mathbb{P}(U\geq V)\\
&=\int\mathbb{P}(U\geq v)dF_{V,\theta_{0}}(v)\\
&=\int(1-v)dF_{V,\theta_{0}}(v)\\
&=1-\mathbb{E}_{\theta_{0}}(V)
\end{aligned}

From there, we have : $\displaystyle E_{\theta_{0}}(V)=1-p$ . Since the mean of a $\displaystyle\beta(1,\theta_{0})$ distribution is $\displaystyle\frac{1}{1+\theta_{0}}$ , this gives $\displaystyle\theta_{0}=\frac{1}{1-p}-1$ .
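The two identities in this proof can be checked by simulation. The sketch below assumes that $U$ and $V$ are drawn independently, which is how the integral step $\mathbb{P}(U\leq u,U\geq v)=u-v$ is evaluated here; that independence, the value of $p$, the seed and the sample size are assumptions made for the example.

```python
import random

random.seed(2)

p = 0.8
theta0 = 1.0 / (1.0 - p) - 1.0  # value of theta_0 implied by E[V] = 1 - p
n = 200_000

kept = []
for _ in range(n):
    u_draw = random.random()                  # rank U ~ Uniform[0, 1]
    v_draw = random.betavariate(1.0, theta0)  # cost V ~ Beta(1, theta_0), drawn independently
    if u_draw >= v_draw:                      # selection rule S = 1{U >= V}
        kept.append(u_draw)

coverage = len(kept) / n  # should be close to p

u = 0.5
empirical_rank = sum(1 for x in kept if x <= u) / len(kept)
# Closed form of (1/p) * integral over [0, u] of (u - v) dF_V(v) for Beta(1, theta_0).
closed_form_rank = (u - (1.0 - p) * (1.0 - (1.0 - u) ** (1.0 / (1.0 - p)))) / p
```

Under these assumptions the simulated coverage is close to $p$ and the simulated corrected rank is close to the closed-form value.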

Appendix B Tables

Table 1. PISA 2018 MATHS

Countries	p	Lbound	Ubound	Mean( $\displaystyle Y^{\ast}$ )	Rank 1	Rank 2	Rank 3
China	0.812	524.98	591.13	560.66	1	1	3
Singapore	0.953	550.87	566.76	555.87	2	2	1
Macau	0.883	512.48	556.64	536.15	3	4	4
Hong Kong	0.984	547.15	551.78	547.83	4	3	2
Taiwan	0.921	501.3	531.97	513.22	5	5	7
Japan	0.909	501.89	528.62	511.54	6	6	6
Korea	0.881	482.78	525.15	500.36	7	11	9
Estonia	0.931	502.42	522.99	510.64	8	7	5
Netherlands	0.912	490.21	520.53	502.7	9	9	8
Poland	0.9	486.89	516.44	497.8	10	13	10
Switzerland	0.889	475.49	515.39	493.65	11	15	11
Canada	0.863	468.6	512.03	487.85	12	19	15
Denmark	0.878	477.75	510.21	490.14	13	17	12
Slovenia	0.979	501	508.9	503.53	14	8	7
Belgium	0.936	488.35	508.28	494.88	15	14	9
Finland	0.963	497.06	507.84	500.46	16	10	8
Norway	0.911	471.94	502.64	484.58	17	20	13
Sweden	0.857	460.94	502.55	477.12	18	25	18
United Kingdom	0.848	444.36	502.2	473.86	19	28	23
Germany	0.993	498.61	499.96	498.71	20	12	6
Austria	0.889	465.62	499.47	478.55	21	23	16
Ireland	0.962	489.24	499.25	492.04	22	16	10
Czech Republic	0.954	485.91	498.94	489.48	23	18	11
Latvia	0.886	469.22	497.19	479.33	24	21	17
France	0.913	470.2	495.58	478.5	25	24	14
Iceland	0.916	471.74	495.07	479.11	26	22	13
New Zealand	0.888	463.03	494.61	474.69	27	27	19
Portugal	0.873	450.56	492.99	468.59	28	30	22
Australia	0.894	457.54	492.15	472.69	29	29	20
Russia	0.936	469.73	487.92	475.68	30	26	12
Slovak Republic	0.862	436.41	487.6	460.11	31	35	28
Italy	0.846	438.56	486.28	458.26	32	37	27
Lithuania	0.903	456.12	484.15	466.53	33	31	21
Luxembourg	0.871	441.17	483.5	459.76	34	36	25
Hungary	0.896	444.36	482.26	462.28	35	34	23
United States	0.861	435.59	477.92	453.76	37	38	30
Belarus	0.876	431.22	471.94	448.89	38	39	33
Malta	0.972	461.7	470.86	464.08	39	33	24
Croatia	0.891	433.32	464.43	445.81	40	40	32
Israel	0.809	395.24	462.21	425.38	41	44	38
Turkey	0.726	372.29	453.63	415.28	42	48	41
Ukraine	0.867	409.65	453.17	428.52	43	42	37
Greece	0.927	429.6	450.96	437.22	44	41	35
Serbia	0.885	413.42	448.25	427.28	45	43	39
Malaysia	0.723	361.52	440.57	403.94	46	51	44
Romania	0.726	339.93	430.68	388.54	50	56	50
United Arab Emirates	0.918	413.32	437.08	421.29	48	46	39
Albania	0.757	363.13	436.71	403.25	49	52	43
Bosnia and Herzegovina	0.823	362.94	406.85	381.56	61	59	46
Mexico	0.664	294.5	408.64	367.08	60	63	52
Georgia	0.826	349.84	398.7	372.02	65	61	48
Peru	0.731	324	399.3	363.14	64	64	51
North Macedonia	0.947	377.12	393.2	382.21	67	58	45
Colombia	0.619	298.78	391.13	347.1	68	70	53
Brazil	0.65	297.08	382.82	341.86	69	73	54
Argentina	0.806	331.35	379.69	351.95	70	69	47
Indonesia	0.849	335.57	378.05	356.25	71	67	49
Saudi Arabia	0.845	336.42	374.13	352.68	72	68	50
Morocco	0.643	284.42	369.02	330.42	73	74	55
Kosovo	0.844	322.43	364.91	342.98	74	71	51
Panama	0.535	242.69	352.43	304.35	75	76	56
Philippines	0.679	267.1	352.39	313.8	76	75	57
Dominican Republic	0.73	261.21	324.53	295.11	77	77	58

Table 2. PISA 2018 READING

Countries	p	Lbound	Ubound	Mean( $\displaystyle Y^{\ast}$ )	Rank 1	Rank 2	Rank 3
China	0.812	493.82	555.25	524.18	1	2	3
Singapore	0.953	532.06	549.5	537.08	2	1	1
Macau	0.883	484.2	525.13	501.71	3	7	4
Hong Kong	0.984	519.11	524.32	519.84	4	3	2
Estonia	0.931	503.73	523.37	510.01	5	6	5
Canada	0.863	475.63	520.12	493.5	6	9	8
Finland	0.963	506.4	519.66	510.08	7	5	6
Ireland	0.962	507.04	518.41	510.21	8	4	7
Korea	0.881	475.92	513.84	488.86	9	12	9
Poland	0.9	478.76	512.16	491.81	10	10	9
Sweden	0.857	453.82	505.74	475.14	11	23	13
New Zealand	0.888	470.81	505.29	482.29	12	15	10
United States	0.861	459.64	505	476.99	13	21	15
United Kingdom	0.848	460.17	504.34	476.6	14	22	14
Japan	0.909	471.65	503.3	484.4	15	14	11
Taiwan	0.921	475.98	502.37	484.59	16	13	11
Australia	0.894	465.19	502.27	479.2	17	17	13
Denmark	0.878	464.52	501.88	479.12	18	18	13
Norway	0.911	468.84	499.11	478.52	19	20	13
Germany	0.993	497.36	498.94	497.48	20	8	8
Slovenia	0.979	490.11	495.64	491.11	21	11	9
France	0.913	464.55	493.03	474.67	22	24	13
Belgium	0.936	472.32	492.87	478.62	23	19	13
Portugal	0.873	451.54	491.95	467.59	24	25	15
Czech Republic	0.954	477.09	490.42	480.71	25	16	10
Netherlands	0.912	454.31	484.58	465.34	26	27	15
Austria	0.889	448.66	483.53	461.95	27	28	15
Switzerland	0.889	448.99	483.5	461.44	28	29	15
Croatia	0.891	446.23	479.08	459.51	29	31	16
Russia	0.936	460.2	478.71	465.92	30	26	15
Latvia	0.886	447.13	478.05	458.59	31	32	16
Spain	0.918	452.01	476.58	460.74	32	30	15
Italy	0.846	426.62	476.17	447.16	33	37	18
Hungary	0.896	445.16	476.1	456.29	34	34	17
Lithuania	0.903	448.38	475.75	457.53	35	33	16
Belarus	0.876	438.37	474.12	452.34	36	36	17
Iceland	0.916	448.49	473.8	455.57	37	35	16
Israel	0.809	404.81	469.99	428.79	38	44	21
Luxembourg	0.871	430.47	469.69	444.31	39	38	19
Ukraine	0.867	422.09	465.42	440.44	40	41	20
Turkey	0.726	387.5	465.23	427.44	41	45	21
Slovak Republic	0.862	419.95	457.78	433.03	42	43	20
Greece	0.927	436.51	457.3	442.82	43	39	20
Chile	0.893	420.92	452.76	433.35	44	42	20
Malta	0.972	439.68	448.28	441.35	45	40	20
Serbia	0.885	409.2	439.25	419.16	46	46	21
United Arab Emirates	0.918	407.66	431.05	414.22	47	47	21
Costa Rica	0.628	337.61	427.14	383.24	48	53	23
Romania	0.726	323.94	427.05	381.25	49	54	23
Uruguay	0.78	362.02	426.45	390.95	50	52	22
Moldova	0.951	409.71	423.94	413.97	51	48	21
Montenegro	0.947	407.86	420.95	411.55	52	49	21
Mexico	0.664	339.31	420.6	379.68	53	56	23
Bulgaria	0.72	348.51	419.9	378.74	54	58	23
Jordan	0.54	262.1	419.01	356.46	55	69	25
Malaysia	0.723	348.04	415.01	377.72	56	59	23
Brazil	0.65	312.14	413.13	364.76	57	65	24
Colombia	0.619	313.85	412.12	365.17	58	64	24
Brunei	0.974	403.21	409.04	404.41	59	50	21
Qatar	0.923	386.21	407.09	392.04	60	51	22
Albania	0.757	344.93	405.35	374.52	61	61	23
Bosnia and Herzegovina	0.823	361.45	402.91	378.86	62	57	22
Argentina	0.806	338.8	401.19	367.98	63	63	24
Peru	0.731	331.68	400.32	362.93	64	66	24
Saudi Arabia	0.845	356.32	398.93	374.45	65	62	23
Thailand	0.724	327.11	393.27	361.47	66	67	24
North Macedonia	0.947	374.82	392.09	380.77	67	55	22
Azerbaijan	0.463	264.3	389.46	339.62	68	71	25
Kazakhstan	0.92	369.82	386.67	376.25	69	60	22
Georgia	0.826	338.46	379.69	356.49	70	68	24
Panama	0.535	259.74	378.23	324.81	71	74	25
Indonesia	0.849	341.16	370.96	352.43	72	70	24
Morocco	0.643	286.39	359.63	323.34	73	75	25
Lebanon	0.867	306.62	352.8	326.22	74	73	25
Kosovo	0.844	321.98	352.5	334.6	75	72	25
Dominican Republic	0.73	279.38	341.09	310.67	76	76	25
Philippines	0.679	282.68	339.47	307.82	77	77	25

References

Arellano and Bonhomme (2016) M. Arellano and S. Bonhomme. Sample selection in quantile regression: A survey. In Handbook of quantile regression, pages 209–224. Chapman and Hall/CRC, 2016.
Arellano and Bonhomme (2017) M. Arellano and S. Bonhomme. Quantile selection models with an application to understanding changes in wage inequality. Econometrica, 85(1):1–28, 2017.
Berliner (1993) D. C. Berliner. International comparisons of student achievement. In National Forum, volume 73, page 25, 1993.
Chernozhukov et al. (2023) V. Chernozhukov, I. Fernández-Val, and S. Luo. Distribution regression with sample selection and UK wage decomposition. arXiv preprint arXiv:1811.11603, 2023.
Cromley (2009) J. G. Cromley. Reading achievement and science proficiency: International comparisons from the programme on international student assessment. Reading Psychology, 30(2):89–118, 2009.
Ferreira and Gignoux (2014) F. H. Ferreira and J. Gignoux. The measurement of educational inequality: Achievement and opportunity. The World Bank Economic Review, 28(2):210–246, 2014.
Heckman (1974) J. Heckman. Shadow prices, market wages, and labor supply. Econometrica: Journal of the Econometric Society, pages 679–694, 1974.
Huber and Melly (2015) M. Huber and B. Melly. A test of the conditional independence assumption in sample selection models. Journal of Applied Econometrics, 30(7):1144–1168, 2015.
Jakubowski and Pokropek (2015) M. Jakubowski and A. Pokropek. Reading achievement progress across countries. International Journal of Educational Development, 45:77–88, 2015.
Martin et al. (2000) M. O. Martin et al. International comparisons of student achievement. In Learning from others, pages 29–47. Springer, 2000.
McEwan and Marshall (2004) P. J. McEwan and J. H. Marshall. Why does academic achievement vary across countries? Evidence from Cuba and Mexico. Education Economics, 12(3):205–217, 2004.
McGaw (2008) B. McGaw. The role of the OECD in international comparative studies of achievement. Assessment in Education: Principles, Policy & Practice, 15(3):223–243, 2008.
Nagy (1996) P. Nagy. International comparisons of student achievement in mathematics and science: A Canadian perspective. Canadian Journal of Education/Revue canadienne de l’éducation, pages 396–413, 1996.
Rotberg (1995) I. C. Rotberg. Myths about test score comparisons. Science, 270(5241):1446–1448, 1995.
Tienken (2008) C. H. Tienken. Rankings of international achievement test performance and economic strength: Correlation or conjecture? International Journal of Education Policy and Leadership, 3(4):1–15, 2008.