-
OffsetBias: Leveraging Debiased Data for Tuning Evaluators
Authors:
Junsoo Park,
Seungyeon Jwa,
Meiying Ren,
Daeyoung Kim,
Sanghyuk Choi
Abstract:
Employing Large Language Models (LLMs) to assess the quality of generated responses, such as prompting instruct-tuned models or fine-tuning judge models, has become a widely adopted evaluation method. It is also known that such evaluators are vulnerable to biases, such as favoring longer responses. While it is important to overcome this problem, the specifics of these biases remain under-explored. In this work, we qualitatively identify six types of biases inherent in various judge models. We propose EvalBiasBench as a meta-evaluation collection of hand-crafted test cases for each bias type. Additionally, we present de-biasing dataset construction methods and the associated preference dataset OffsetBias. Experimental results demonstrate that fine-tuning on our dataset significantly enhances the robustness of judge models against biases and improves performance across most evaluation scenarios. We release our datasets and the fine-tuned judge model to the public.
Submitted 9 July, 2024;
originally announced July 2024.
-
Progress report on testing robustness of the Newton method in data analysis on 2-point correlation function using a MILC HISQ ensemble
Authors:
Tanmoy Bhattacharya,
Benjamin J. Choi,
Rajan Gupta,
Yong-Chull Jang,
Seungyeob Jwa,
Sunghee Kim,
Sunkyu Lee,
Weonjong Lee,
Jaehoon Leem,
Jeonghwan Pak,
Sungwoo Park
Abstract:
We report recent progress in data analysis on the two-point correlation functions, which will be a prerequisite for obtaining semileptonic form factors for the $B_{(s)} \to D_{(s)}\ellν$ decays. We use a MILC HISQ ensemble for the measurement. We use the HISQ action for light quarks, and the Oktay-Kronfeld (OK) action for the heavy quarks ($b$ and $c$). We use a sequential Bayesian method for the data analysis. Here we test the new fitting methodology of Benjamin J. Choi in a completely independent manner.
Submitted 3 January, 2024;
originally announced January 2024.
-
Current progress on the semileptonic form factors for $\bar{B} \to D^{\ast} \ell \barν$ decay using the Oktay-Kronfeld action
Authors:
Tanmoy Bhattacharya,
Benjamin J. Choi,
Rajan Gupta,
Yong-Chull Jang,
Seungyeob Jwa,
Sunghee Kim,
Sunkyu Lee,
Weonjong Lee,
Jaehoon Leem,
Jeonghwan Pak,
Sungwoo Park
Abstract:
We present recent progress in calculating the semileptonic form factors $h_{A_1}(w)$ for the $\bar{B} \to D^{\ast} \ell \barν$ decays. We use the Oktay-Kronfeld (OK) action for the charm and bottom valence quarks and the HISQ action for light quarks. We adopt the Newton method combined with the scanning method to find a good initial guess for the $χ^2$ minimizer in the fitting of the 2pt correlation functions. The main advantage is that the Newton method lets us use all the time slices allowed by physical positivity. We report the first reliable, but preliminary, results for $h_{A_1}(w)/ρ_{A_1}$ at zero recoil ($w=1$). Here we use a MILC HISQ ensemble ($a = 0.12$ fm, $M_π$ = 220 MeV, and $N_f = 2 + 1 + 1$ flavors).
Submitted 3 January, 2024;
originally announced January 2024.
-
2023 Update of $\varepsilon_K$ with lattice QCD inputs
Authors:
Seungyeob Jwa,
Jeehun Kim,
Sunghee Kim,
Sunkyu Lee,
Weonjong Lee,
Jaehoon Leem,
Jeonghwan Pak,
Sungwoo Park
Abstract:
We report recent progress on $\varepsilon_K$ evaluated directly from the standard model (SM) with lattice QCD inputs such as $\hat{B}_K$, $|V_{cb}|$, $|V_{us}|$, $|V_{ud}|$, $ξ_0$, $ξ_2$, $ξ_\text{LD}$, $f_K$, and $m_c$. We find that the standard model with exclusive $|V_{cb}|$ and lattice QCD inputs describes only 66% of the experimental value of $|\varepsilon_K|$ and does not explain its remaining 34%, which corresponds to a strong tension in $|\varepsilon_K|$ at the $4.9σ\sim 3.9σ$ level between the SM theory and experiment. We also find that this tension disappears when we use the inclusive value of $|V_{cb}|$ obtained using the heavy quark expansion based on the QCD sum rule approach.
Submitted 20 December, 2023; v1 submitted 20 November, 2023;
originally announced December 2023.
-
Improved data analysis on two-point correlation function with sequential Bayesian method
Authors:
Tanmoy Bhattacharya,
Benjamin J. Choi,
Rajan Gupta,
Yong-Chull Jang,
Seungyeob Jwa,
Sunkyu Lee,
Weonjong Lee,
Jaehoon Leem,
Sungwoo Park,
Boram Yoon
Abstract:
We report our progress in data analysis on two-point correlation functions of the $B$ meson using the sequential Bayesian method. The measurement data set is obtained using the Oktay-Kronfeld (OK) action for the bottom quarks (valence quarks) and the HISQ action for the light quarks on the MILC HISQ lattices. We find that the old initial guess for the $χ^2$ minimizer in the fitting code is poor enough to slow down the analysis somewhat. In order to find a better initial guess, we adopt the Newton method. We find that the Newton method provides a natural test to check whether the $χ^2$ minimizer finds a local minimum or the global minimum, and it also reduces the number of iterations dramatically.
Submitted 12 April, 2022;
originally announced April 2022.
-
Deep learning study on the Dirac eigenvalue spectrum of staggered quarks
Authors:
Hwancheol Jeong,
Chulwoo Jung,
Seungyeob Jwa,
Jeehun Kim,
Nam Soo Kim,
Sunghee Kim,
Sunkyu Lee,
Weonjong Lee,
Youngjo Lee,
Jeonghwan Pak,
Chanju Park
Abstract:
We study the chirality of staggered quarks on the Dirac eigenvalue spectrum using deep learning (DL) techniques. The Kluberg-Stern method to construct staggered bilinear operators preserves continuum properties such as recursion relations, uniqueness of chirality, and Ward identities, which leads to a unique and characteristic pattern (we call it the "leakage pattern (LP)") in the matrix elements of the chirality operator sandwiched between two quark eigenstates of the staggered Dirac operator. DL analysis gives $99.4(2)\%$ accuracy on normal gauge configurations and $0.998$ AUC (Area Under ROC Curve) for classifying non-zero mode octets in the Dirac eigenvalue spectrum. It confirms that the leakage pattern is universal on normal gauge configurations. The multi-layer perceptron (MLP) method turns out to be the best DL model for our study on the LP.
Submitted 1 March, 2022;
originally announced March 2022.
-
Chiral symmetry and taste symmetry from the eigenvalue spectrum of staggered Dirac operators
Authors:
Hwancheol Jeong,
Chulwoo Jung,
Seungyeob Jwa,
Jangho Kim,
Jeehun Kim,
Nam Soo Kim,
Sunghee Kim,
Sunkyu Lee,
Weonjong Lee,
Youngjo Lee,
Jeonghwan Pak
Abstract:
We investigate general properties of the eigenvalue spectrum for improved staggered quarks. We introduce a new chirality operator $[γ_5 \otimes 1]$ and a new shift operator $[1 \otimes ξ_5]$, which respect the same recursion relation as the $γ_5$ operator in the continuum. Then we show that matrix elements of the chirality operator sandwiched between two eigenstates of the staggered Dirac operator are related to those of the shift operator by the Ward identity of the conserved $U(1)_A$ symmetry of staggered fermion actions. We perform a numerical study in quenched QCD using HYP staggered quarks to demonstrate the Ward identity. We introduce a new concept of leakage patterns which collectively represent the matrix elements of the chirality operator and the shift operator sandwiched between two eigenstates of the staggered Dirac operator. The leakage pattern provides a new method to identify zero modes and non-zero modes in the Dirac eigenvalue spectrum. This method is as robust as the spectral flow method but requires much less computing power. Analysis using a machine learning technique confirms that the leakage pattern is universal, since the staggered Dirac eigenmodes on normal gauge configurations respect it. In addition, the leakage pattern can be used to determine a ratio of renormalization factors as a by-product. We conclude that it might be possible and realistic to measure the topological charge $Q$ using the Atiyah-Singer index theorem and the leakage pattern of the chirality operator in the staggered fermion formalism.
Submitted 6 May, 2021; v1 submitted 21 May, 2020;
originally announced May 2020.
-
Semileptonic $B \to D^{(\ast)} \ellν$ Decay Form Factors using the Oktay-Kronfeld Action
Authors:
Tanmoy Bhattacharya,
Benjamin J. Choi,
Rajan Gupta,
Yong-Chull Jang,
Seungyeob Jwa,
Sunkyu Lee,
Weonjong Lee,
Jaehoon Leem,
Sungwoo Park
Abstract:
We report recent progress in calculating semileptonic form factors for the $\bar{B} \to D^\ast \ell \barν$ and $\bar{B} \to D \ell \barν$ decays using the Oktay-Kronfeld (OK) action for bottom and charm quarks. In this work we use currents improved through second order in heavy quark effective power counting, $\mathcal{O}(λ^2)$. The HISQ action is used for the light spectator quarks. We analyzed four $2+1+1$-flavor MILC HISQ ensembles with $a\approx 0.09\,\mathrm{fm}$, $0.12\,\mathrm{fm}$ and $M_π\approx 220\,\mathrm{MeV}$, $310\,\mathrm{MeV}$: $a09m220$, $a09m310$, $a12m220$, $a12m310$. Preliminary results for the $B\to D^\ast\ellν$ decay form factor $h_{A_1}(w)$ at zero recoil ($w=1$) are reported. Preliminary results for the $B \to D\,\ellν$ decay form factors $h_\pm(w)$ over the kinematic range $1<w<1.3$ are reported as well.
Submitted 20 March, 2020;
originally announced March 2020.
-
Leptonic decays of $B_{(s)}$ and $D_{(s)}$ using the OK action
Authors:
Tanmoy Bhattacharya,
Benjamin J. Choi,
Rajan Gupta,
Yong-Chull Jang,
Seungyeob Jwa,
Sunkyu Lee,
Weonjong Lee,
Jaehoon Leem,
Sungwoo Park
Abstract:
We present recent progress in the lattice calculation of leptonic decay constants for $B_{(s)}$ and $D_{(s)}$ mesons using the Oktay-Kronfeld (OK) action for charm and bottom valence quarks, whose masses are tuned non-perturbatively. The calculations are done on 6 HISQ ensembles generated by the MILC collaboration with $N_f=2+1+1$ flavors. We also use the HISQ action for the light spectator quarks. Results are presented for the ratios $f_{B_s}/f_B$ and $f_{D_s}/f_D$, which reflect $SU(3)$ flavor symmetry breaking, and are independent of the renormalization constants of the axial currents.
Submitted 11 February, 2020;
originally announced February 2020.
-
Update on $B\to D^\ast \ell ν$ form factor at zero-recoil using the Oktay-Kronfeld action
Authors:
Tanmoy Bhattacharya,
Rajan Gupta,
Sungwoo Park,
Yong-Chull Jang,
Jon A. Bailey,
Benjamin J. Choi,
Hwancheol Jeong,
Seungyeob Jwa,
Sunkyu Lee,
Weonjong Lee,
Jeonghwan Pak,
Jaehoon Leem
Abstract:
We present an update on the calculation of the $\bar{B}\to D^\ast \ell \barν$ semileptonic form factor at zero recoil using Oktay-Kronfeld bottom and charm quarks on $N_f=2+1+1$ flavor HISQ ensembles generated by the MILC collaboration. Preliminary results are given for two ensembles with $a\approx 0.12$ and $0.09$ fm and $M_π\approx 310$ MeV. Calculations have been done with a number of valence quark masses, and the dependence of the form factor on them is investigated on the $a\approx 0.12$ fm ensemble. Excited-state contamination is controlled by using multistate fits to the three-point correlators measured at 4--6 source-sink separations.
Submitted 18 December, 2018;
originally announced December 2018.
-
How to identify zero modes for improved staggered fermions
Authors:
Hwancheol Jeong,
Seungyeob Jwa,
Jangho Kim,
Sunghee Kim,
Sunkyu Lee,
Weonjong Lee,
Jeonghwan Pak
Abstract:
We present results of the eigenvalue spectrum for the staggered Dirac operator obtained using a modified Lanczos algorithm. We identify zero modes and non-zero modes. We derive the chiral Ward identity from the conserved $U(1)_A$ symmetry, and check it numerically. This is the first step toward construction of an improved method to identify zero modes reliably with staggered fermions.
Submitted 16 November, 2017; v1 submitted 6 November, 2017;
originally announced November 2017.