-
Considerations for the Interpretation of Bias Measures of Word Embeddings
Authors:
Inom Mirzaev,
Anthony Schulte,
Michael Conover,
Sam Shah
Abstract:
Word embedding spaces are powerful tools for capturing latent semantic relationships between terms in corpora, and have become widely popular for building state-of-the-art natural language processing algorithms. However, studies have shown that societal biases present in text corpora may be incorporated into the word embedding spaces learned from them. Thus, there is an ethical concern that human-like biases contained in the corpora and their derived embedding spaces might be propagated, or even amplified, through the use of the biased embedding spaces in downstream applications. In an attempt to quantify these biases so that they may be better understood and studied, several bias metrics have been proposed. We explore the statistical properties of these proposed measures in the context of their cited applications as well as their supposed utilities. We find that there are caveats to the simple interpretation of these metrics as proposed. In particular, the bias metric proposed by Bolukbasi et al. (2016) is highly sensitive to embedding hyper-parameter selection: in many cases, the variance due to the selection of some hyper-parameters is greater than the variance in the metric due to corpus selection, while in fewer cases the bias rankings of corpora vary with hyper-parameter selection. In light of these observations, bias estimates should perhaps not be thought to directly measure the properties of the underlying corpus, but rather the properties of the specific embedding spaces in question, particularly in the context of the hyper-parameter selections used to generate them. Hence, bias metrics of spaces generated with differing hyper-parameters should be compared only with explicit consideration of the embedding-learning algorithm's particular configuration.
Submitted 19 June, 2019;
originally announced June 2019.
-
Anti-van der Waerden Numbers of Graph Products
Authors:
Hunter Rehm,
Alex Schulte,
Nathan Warnberg
Abstract:
In this paper, anti-van der Waerden numbers on Cartesian products of graphs are investigated and a conjecture made by Schulte et al. (see arXiv:1802.01509) is answered. In particular, the anti-van der Waerden number of the Cartesian product of two graphs has an upper bound of four. This result is then used to determine the anti-van der Waerden number for any Cartesian product of two paths.
Submitted 7 May, 2018;
originally announced May 2018.
-
Anti-van der Waerden numbers on Graphs
Authors:
Zhanar Berikkyzy,
Alex Schulte,
Elizabeth Sprangel,
Shanise Walker,
Nathan Warnberg,
Michael Young
Abstract:
In this paper, arithmetic progressions on the integers and the integers modulo n are extended to graphs. This allows for the definition of the anti-van der Waerden number of a graph. Much of the focus of this paper is on 3-term arithmetic progressions, for which general bounds are obtained based on the radius and diameter of a graph. The general bounds are improved for trees and Cartesian products, and exact values are determined for some classes of graphs. Larger k-term arithmetic progressions are considered, and a connection between the Ramsey number of paths and the anti-van der Waerden number of graphs is established.
Submitted 21 June, 2019; v1 submitted 5 February, 2018;
originally announced February 2018.
-
On Edge-Colored Saturation Problems
Authors:
Michael Ferrara,
Daniel Johnston,
Sarah Loeb,
Florian Pfender,
Alex Schulte,
Heather C. Smith,
Eric Sullivan,
Michael Tait,
Casey Tompkins
Abstract:
Let $\mathcal{C}$ be a family of edge-colored graphs. A $t$-edge-colored graph $G$ is $(\mathcal{C}, t)$-saturated if $G$ does not contain any graph in $\mathcal{C}$ but the addition of any edge in any color in $[t]$ creates a copy of some graph in $\mathcal{C}$. Similarly to classical saturation functions, define $\mathrm{sat}_t(n, \mathcal{C})$ to be the minimum number of edges in a $(\mathcal{C},t)$-saturated graph on $n$ vertices. Let $\mathcal{C}_r(H)$ be the family consisting of every edge-colored copy of $H$ which uses exactly $r$ colors.
In this paper we consider a variety of colored saturation problems. We determine the order of magnitude for $\mathrm{sat}_t(n, \mathcal{C}_r(K_k))$ for all $r$, showing a sharp change in behavior when $r\geq \binom{k-1}{2}+2$. A particular case of this theorem proves a conjecture of Barrus, Ferrara, Vandenbussche, and Wenger. We determine $\mathrm{sat}_t(n, \mathcal{C}_2(K_3))$ exactly and determine the extremal graphs. Additionally, we document some interesting irregularities in the colored saturation function.
Submitted 30 November, 2017;
originally announced December 2017.
-
Anti-van der Waerden numbers of 3-term arithmetic progressions
Authors:
Zhanar Berikkyzy,
Alex Schulte,
Michael Young
Abstract:
The \emph{anti-van der Waerden number}, denoted by $aw([n],k)$, is the smallest $r$ such that every exact $r$-coloring of $[n]$ contains a rainbow $k$-term arithmetic progression. Butler et al. showed that $\lceil \log_3 n \rceil + 2 \le aw([n],3) \le \lceil \log_2 n \rceil + 1$, and conjectured that there exists a constant $C$ such that $aw([n],3) \le \lceil \log_3 n \rceil + C$. In this paper, we show this conjecture is true by determining $aw([n],3)$ for all $n$. We prove that for $7\cdot 3^{m-2}+1 \leq n \leq 21 \cdot 3^{m-2}$, \[ aw([n],3)=\left\{\begin{array}{ll} m+2, & \mbox{if $n=3^m$},\\ m+3, & \mbox{otherwise}. \end{array}\right.\]
Submitted 29 April, 2016;
originally announced April 2016.
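For very small $n$, the definition of $aw([n],3)$ in the abstract above can be checked by exhaustive search. The following is a minimal sketch, not the authors' method: the function names `exists_rainbow_free` and `anti_vdw_3` are hypothetical, colorings are enumerated only up to renaming of colors, and returning $n+1$ when no 3-term progression exists is an assumed convention. The search is exponential in $n$, so it is only practical for roughly $n \le 12$.

```python
def exists_rainbow_free(n, r):
    """Return True if some exact r-coloring of [n] has no rainbow 3-term AP.

    Hypothetical helper: colors are assigned to 1..n in order, and a new
    color may only be the next unused one, so each coloring is visited
    once up to renaming of colors.
    """
    colour = [0] * (n + 1)  # 1-indexed; colour[0] is unused

    def extend(i, used):
        if i > n:
            return used == r          # "exact": all r colors must appear
        if used + (n - i + 1) < r:
            return False              # not enough positions left to reach r colors
        for c in range(min(used + 1, r)):
            colour[i] = c
            # Check every 3-term AP (i - 2d, i - d, i) that ends at i.
            rainbow = False
            d = 1
            while i - 2 * d >= 1:
                if len({colour[i - 2 * d], colour[i - d], c}) == 3:
                    rainbow = True
                    break
                d += 1
            if not rainbow and extend(i + 1, max(used, c + 1)):
                return True
        return False

    return extend(1, 0)


def anti_vdw_3(n):
    """Smallest r such that every exact r-coloring of [n] has a rainbow 3-AP."""
    for r in range(1, n + 1):
        if not exists_rainbow_free(n, r):
            return r
    return n + 1  # assumed convention when [n] contains no 3-term AP (n < 3)


if __name__ == "__main__":
    for n in range(3, 11):
        print(n, anti_vdw_3(n))
```

With $m=2$ the stated range is $8 \le n \le 21$, so the formula gives $aw([8],3)=5$ and $aw([9],3)=4$; the search can be compared against those two values.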
-
Anchoring ceria nanoparticles on reduced graphene oxide and their electronic transport properties
Authors:
Daeha Joung,
Virendra Singh,
Sanghoon Park,
Alfons Schulte,
Sudipta Seal,
Saiful I. Khondaker
Abstract:
This paper has been withdrawn.
Submitted 21 September, 2011; v1 submitted 8 July, 2011;
originally announced July 2011.