Search | arXiv e-print repository

Discipline and Label: A WEIRD Genealogy and Social Theory of Data Annotation

Authors: Andrew Smart, Ding Wang, Ellis Monk, Mark Díaz, Atoosa Kasirzadeh, Erin Van Liemt, Sonja Schmer-Galunder

Abstract: Data annotation remains the sine qua non of machine learning and AI. Recent empirical work on data annotation has begun to highlight the importance of rater diversity for fairness and model performance, and new lines of research have begun to examine the working conditions for data annotation workers, the impacts and role of annotator subjectivity on labels, and the potential psychological harms from aspects of annotation work. This paper outlines a critical genealogy of data annotation, starting with its psychological and perceptual aspects. We draw on similarities with critiques of the rise of computerized lab-based psychological experiments in the 1970s, which question whether these experiments permit the generalization of results beyond the laboratory settings within which these results are typically obtained. Do data annotations permit the generalization of results beyond the settings, or locations, in which they were obtained? Psychology is overly reliant on participants from Western, Educated, Industrialized, Rich, and Democratic societies (WEIRD). Many of the people who work as data annotation platform workers, however, are not from WEIRD countries; most data annotation workers are based in Global South countries. Social categorizations and classifications from WEIRD countries are imposed on non-WEIRD annotators through instructions and tasks, and through them, on data, which is then used to train or evaluate AI models in WEIRD countries.
We synthesize evidence from several recent lines of research and argue that data annotation is a form of automated social categorization that risks entrenching outdated and static social categories that are in reality dynamic and changing. We propose a framework for understanding the interplay of the global social conditions of data annotation with the subjective phenomenological experience of data annotation work.

Submitted 9 February, 2024; originally announced February 2024.

Comments: 18 pages

arXiv:2401.13142

Unsocial Intelligence: an Investigation of the Assumptions of AGI Discourse

Authors: Borhane Blili-Hamelin, Leif Hancox-Li, Andrew Smart

Abstract: Dreams of machines rivaling human intelligence have shaped the field of AI since its inception. Yet, the very meaning of human-level AI or artificial general intelligence (AGI) remains elusive and contested. Definitions of AGI embrace a diverse range of incompatible values and assumptions. Contending with the fractured worldviews of AGI discourse is vital for critiques that pursue different values and futures. To that end, we provide a taxonomy of AGI definitions, laying the ground for examining the key social, political, and ethical assumptions they make. We highlight instances in which these definitions frame AGI or human-level AI as a technical topic and expose the value-laden choices being implicitly made. Drawing on feminist, STS, and social science scholarship on the political and social character of intelligence in both humans and machines, we propose contextual, democratic, and participatory paths to imagining future forms of machine intelligence. The development of future forms of AI must involve explicit attention to the values it encodes, the people it includes or excludes, and a commitment to epistemic justice.

Submitted 14 May, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

arXiv:2312.10075

Assessing LLMs for Moral Value Pluralism

Authors: Noam Benkler, Drisana Mosaphir, Scott Friedman, Andrew Smart, Sonja Schmer-Galunder

Abstract: The field of AI currently lacks methods to quantitatively assess and potentially alter the moral values inherent in the output of large language models (LLMs). However, decades of social science research have developed and refined widely accepted moral value surveys, such as the World Values Survey (WVS), eliciting value judgments from direct questions in various geographies. We have turned those questions into value statements and use NLP to compute how well popular LLMs are aligned with moral values for various demographics and cultures. While the WVS is accepted as an explicit assessment of values, we lack methods for assessing implicit moral and cultural values in media, e.g., encountered in social media, political rhetoric, narratives, and generated by AI systems such as LLMs that are increasingly present in our daily lives. As we consume online content and utilize LLM outputs, we might ask, which moral values are being implicitly promoted or undercut, or -- in the case of LLMs -- if they are intending to represent a cultural identity, are they doing so consistently? In this paper we utilize a Recognizing Value Resonance (RVR) NLP model to identify WVS values that resonate and conflict with a given passage of output text. We apply RVR to the text generated by LLMs to characterize implicit moral values, allowing us to quantify the moral/cultural distance between LLMs and various demographics that have been surveyed using the WVS.
In line with other work we find that LLMs exhibit several Western-centric value biases; they overestimate how conservative people in non-Western countries are, they are less accurate in representing gender for non-Western countries, and portray older populations as having more traditional values. Our results highlight value misalignment and age groups, and a need for social science informed technological solutions addressing value plurality in LLMs.

Submitted 8 December, 2023; originally announced December 2023.

Comments: Accepted Paper to workshop on "AI meets Moral Philosophy and Moral Psychology: An Interdisciplinary Dialogue about Computational Ethics" at NeurIPS 2023

arXiv:2307.10312

Beyond the ML Model: Applying Safety Engineering Frameworks to Text-to-Image Development

Authors: Shalaleh Rismani, Renee Shelby, Andrew Smart, Renelito Delos Santos, AJung Moon, Negar Rostamzadeh

Abstract: Identifying potential social and ethical risks in emerging machine learning (ML) models and their applications remains challenging. In this work, we applied two well-established safety engineering frameworks (FMEA, STPA) to a case study involving text-to-image models at three stages of the ML product development pipeline: data processing, integration of a T2I model with other models, and use. Results of our analysis demonstrate that the safety frameworks - neither of which is designed explicitly to examine social and ethical risks - can uncover failures and hazards that pose social and ethical risks. We discovered a broad range of failures and hazards (i.e., functional, social, and ethical) by analyzing interactions (i.e., between different ML models in the product, between the ML product and user, and between development teams) and processes (i.e., preparation of training data or workflows for using an ML service/product). Our findings underscore the value and importance of looking beyond an ML model when examining social and ethical risks, especially when we have minimal information about an ML model.

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2306.07466

Statistical Methods for Auditing the Quality of Manual Content Reviews

Authors: Xuan Yang, Andrew J Smart, Daniel Theron

Abstract: Large technology firms face the problem of moderating content on their online platforms for compliance with laws and policies. To accomplish this at the scale of billions of pieces of content per day, a combination of human and machine review is necessary to label content. Subjective judgement and human bias are of concern both to human-annotated content and to auditors who may be employed to evaluate the quality of such annotations in conformance with law and/or policy. To address this concern, this paper presents a novel application of statistical analysis methods to identify human error and these sources of audit risk.

Submitted 12 June, 2023; originally announced June 2023.

arXiv:2305.09573

doi 10.1145/3593013.3593990

Walking the Walk of AI Ethics: Organizational Challenges and the Individualization of Risk among Ethics Entrepreneurs

Authors: Sanna J. Ali, Angèle Christin, Andrew Smart, Riitta Katila

Abstract: Amidst decline in public trust in technology, computing ethics have taken center stage, and critics have raised questions about corporate ethics washing. Yet few studies examine the actual implementation of AI ethics values in technology companies. Based on a qualitative analysis of technology workers tasked with integrating AI ethics into product development, we find that workers experience an environment where policies, practices, and outcomes are decoupled. We analyze AI ethics workers as ethics entrepreneurs who work to institutionalize new ethics-related practices within organizations. We show that ethics entrepreneurs face three major barriers to their work. First, they struggle to have ethics prioritized in an environment centered around software product launches. Second, ethics are difficult to quantify in a context where company goals are incentivized by metrics. Third, the frequent reorganization of teams makes it difficult to access knowledge and maintain relationships central to their work. Consequently, individuals take on great personal risk when raising ethics issues, especially when they come from marginalized backgrounds. These findings shed light on complex dynamics of institutional change at technology companies.

Submitted 16 May, 2023; originally announced May 2023.

Comments: 10 pages, to be published in ACM FAccT '23

ACM Class: K.7.1; K.7.2; K.7.4; K.4.1; K.5.2

arXiv:2303.08177

The Equitable AI Research Roundtable (EARR): Towards Community-Based Decision Making in Responsible AI Development

Authors: Jamila Smith-Loud, Andrew Smart, Darlene Neal, Amber Ebinama, Eric Corbett, Paul Nicholas, Qazi Rashid, Anne Peckham, Sarah Murphy-Gray, Nicole Morris, Elisha Smith Arrillaga, Nicole-Marie Cotton, Emnet Almedom, Olivia Araiza, Eliza McCullough, Abbie Langston, Christopher Nellum

Abstract: This paper reports on our initial evaluation of The Equitable AI Research Roundtable -- a coalition of experts in law, education, community engagement, social justice, and technology. EARR was created in collaboration among a large tech firm, nonprofits, NGO research institutions, and universities to provide critical research-based perspectives and feedback on technology's emergent ethical and social harms. Through semi-structured workshops and discussions within the large tech firm, EARR has provided critical perspectives and feedback on how to conceptualize equity and vulnerability as they relate to AI technology. We outline three principles in practice of how EARR has operated thus far that are especially relevant to the concerns of the FAccT community: how EARR expands the scope of expertise in AI development, how it fosters opportunities for epistemic curiosity and responsibility, and how it creates a space for mutual learning. This paper serves as both an analysis and translation of lessons learned through this engagement approach, and the possibilities for future research.

Submitted 14 March, 2023; originally announced March 2023.

Comments: 14 pages, 1 figure

arXiv:2303.00738

What Are the Chances? Explaining the Epsilon Parameter in Differential Privacy

Authors: Priyanka Nanayakkara, Mary Anne Smart, Rachel Cummings, Gabriel Kaptchuk, Elissa Redmiles

Abstract: Differential privacy (DP) is a mathematical privacy notion increasingly deployed across government and industry. With DP, privacy protections are probabilistic: they are bounded by the privacy budget parameter, $ε$. Prior work in health and computational science finds that people struggle to reason about probabilistic risks. Yet, communicating the implications of $ε$ to people contributing their data is vital to avoiding privacy theater -- presenting meaningless privacy protection as meaningful -- and empowering more informed data-sharing decisions. Drawing on best practices in risk communication and usability, we develop three methods to convey probabilistic DP guarantees to end users: two that communicate odds and one offering concrete examples of DP outputs. We quantitatively evaluate these explanation methods in a vignette survey study ($n=963$) via three metrics: objective risk comprehension, subjective privacy understanding of DP guarantees, and self-efficacy. We find that odds-based explanation methods are more effective than (1) output-based methods and (2) state-of-the-art approaches that gloss over information about $ε$. Further, when offered information about $ε$, respondents are more willing to share their data than when presented with a state-of-the-art DP explanation; this willingness to share is sensitive to $ε$ values: as privacy protections weaken, respondents are less likely to share data.

Submitted 1 March, 2023; originally announced March 2023.

arXiv:2210.05791

Sociotechnical Harms of Algorithmic Systems: Scoping a Taxonomy for Harm Reduction

Authors: Renee Shelby, Shalaleh Rismani, Kathryn Henne, AJung Moon, Negar Rostamzadeh, Paul Nicholas, N'Mah Yilla, Jess Gallegos, Andrew Smart, Emilio Garcia, Gurleen Virk

Abstract: Understanding the landscape of potential harms from algorithmic systems enables practitioners to better anticipate consequences of the systems they build. It also supports the prospect of incorporating controls to help minimize harms that emerge from the interplay of technologies and social and cultural dynamics. A growing body of scholarship has identified a wide range of harms across different algorithmic technologies. However, computing research and practitioners lack a high-level and synthesized overview of harms from algorithmic systems. Based on a scoping review of computing research $(n=172)$, we present an applied taxonomy of sociotechnical harms to support a more systematic surfacing of potential harms in algorithmic systems. The final taxonomy builds on and refers to existing taxonomies, classifications, and terminologies. Five major themes related to sociotechnical harms - representational, allocative, quality-of-service, interpersonal harms, and social system/societal harms - and sub-themes are presented along with a description of these categories. We conclude with a discussion of challenges and opportunities for future research.

Submitted 18 July, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

arXiv:2210.03535

From plane crashes to algorithmic harm: applicability of safety engineering frameworks for responsible ML

Authors: Shalaleh Rismani, Renee Shelby, Andrew Smart, Edgar Jatho, Joshua Kroll, AJung Moon, Negar Rostamzadeh

Abstract: Inappropriate design and deployment of machine learning (ML) systems leads to negative downstream social and ethical impact -- described here as social and ethical risks -- for users, society and the environment. Despite the growing need to regulate ML systems, current processes for assessing and mitigating risks are disjointed and inconsistent. We interviewed 30 industry practitioners on their current social and ethical risk management practices, and collected their first reactions on adapting safety engineering frameworks into their practice -- namely, System Theoretic Process Analysis (STPA) and Failure Mode and Effects Analysis (FMEA). Our findings suggest STPA/FMEA can provide appropriate structure toward social and ethical risk assessment and mitigation processes. However, we also find nontrivial challenges in integrating such frameworks in the fast-paced culture of the ML industry. We call on the ML research community to strengthen existing frameworks and assess their efficacy, ensuring that ML systems are safer for all people.

Submitted 5 October, 2022; originally announced October 2022.

arXiv:2202.13028

Healthsheet: Development of a Transparency Artifact for Health Datasets

Authors: Negar Rostamzadeh, Diana Mincu, Subhrajit Roy, Andrew Smart, Lauren Wilcox, Mahima Pushkarna, Jessica Schrouff, Razvan Amironesei, Nyalleng Moorosi, Katherine Heller

Abstract: Machine learning (ML) approaches have demonstrated promising results in a wide range of healthcare applications. Data plays a crucial role in developing ML-based healthcare systems that directly affect people's lives. Many of the ethical issues surrounding the use of ML in healthcare stem from structural inequalities underlying the way we collect, use, and handle data. Developing guidelines to improve documentation practices regarding the creation, use, and maintenance of ML healthcare datasets is therefore of critical importance. In this work, we introduce Healthsheet, a contextualized adaptation of the original datasheet questionnaire (Gebru et al., 2018) for health-specific applications. Through a series of semi-structured interviews, we adapt the datasheets for healthcare data documentation. As part of the Healthsheet development process and to understand the obstacles researchers face in creating datasheets, we worked with three publicly available healthcare datasets as our case studies, each with different types of structured data: Electronic Health Records (EHR), clinical trial study data, and smartphone-based performance outcome measures.
Our findings from the interviewee study and case studies show 1) that datasheets should be contextualized for healthcare, 2) that despite incentives to adopt accountability practices such as datasheets, there is a lack of consistency in the broader use of these practices, 3) how the ML for health community views datasheets and particularly Healthsheets as a diagnostic tool to surface the limitations and strengths of datasets, and 4) the relative importance of different fields in the datasheet to healthcare concerns.

Submitted 25 February, 2022; originally announced February 2022.

arXiv:2111.04439

Addressing Privacy Threats from Machine Learning

Authors: Mary Anne Smart

Abstract: Every year at NeurIPS, machine learning researchers gather and discuss exciting applications of machine learning in areas such as public health, disaster response, climate change, education, and more. However, many of these same researchers are expressing growing concern about applications of machine learning for surveillance (Nanayakkara et al., 2021). This paper presents a brief overview of strategies for resisting these surveillance technologies and calls for greater collaboration between machine learning and human-computer interaction researchers to address the threats that these technologies pose.

Submitted 24 October, 2021; originally announced November 2021.

Comments: 3 pages. Human Centered AI Workshop @ NeurIPS 2021 accepted submission

arXiv:2102.05085

The Use and Misuse of Counterfactuals in Ethical Machine Learning

Authors: Atoosa Kasirzadeh, Andrew Smart

Abstract: The use of counterfactuals for considerations of algorithmic fairness and explainability is gaining prominence within the machine learning community and industry. This paper argues for more caution with the use of counterfactuals when the facts to be considered are social categories such as race or gender. We review a broad body of papers from philosophy and social sciences on social ontology and the semantics of counterfactuals, and we conclude that the counterfactual approach in machine learning fairness and social explainability can require an incoherent theory of what social categories are. Our findings suggest that most often the social categories may not admit counterfactual manipulation, and hence may not appropriately satisfy the demands for evaluating the truth or falsity of counterfactuals. This is important because the widespread use of counterfactuals in machine learning can lead to misleading results when applied in high-stakes domains. Accordingly, we argue that even though counterfactuals play an essential part in some causal inferences, their use for questions of algorithmic fairness and social explanations can create more problems than they resolve. Our positive result is a set of tenets about using counterfactuals for fairness and explanations in machine learning.

Submitted 9 February, 2021; originally announced February 2021.

Comments: 9 pages, 1 table, 1 figure

arXiv:2012.04216

Fairness Preferences, Actual and Hypothetical: A Study of Crowdworker Incentives

Authors: Angie Peng, Jeff Naecker, Ben Hutchinson, Andrew Smart, Nyalleng Moorosi

Abstract: How should we decide which fairness criteria or definitions to adopt in machine learning systems? To answer this question, we must study the fairness preferences of actual users of machine learning systems. Stringent parity constraints on treatment or impact can come with trade-offs, and may not even be preferred by the social groups in question (Zafar et al., 2017). Thus it might be beneficial to elicit what the group's preferences are, rather than rely on a priori defined mathematical fairness constraints. Simply asking for self-reported rankings of users is challenging because research has shown that there are often gaps between people's stated and actual preferences (Bernheim et al., 2013). This paper outlines a research program and experimental designs for investigating these questions. Participants in the experiments are invited to perform a set of tasks in exchange for a base payment; they are told upfront that they may receive a bonus later on, and the bonus could depend on some combination of output quantity and quality. The same group of workers then votes on a bonus payment structure, to elicit preferences. The voting is hypothetical (not tied to an outcome) for half the group and actual (tied to the actual payment outcome) for the other half, so that we can understand the relation between a group's actual preferences and hypothetical (stated) preferences. Connections and lessons from fairness in machine learning are explored.

Submitted 8 December, 2020; originally announced December 2020.

arXiv:2010.13561

Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure

Authors: Ben Hutchinson, Andrew Smart, Alex Hanna, Emily Denton, Christina Greer, Oddur Kjartansson, Parker Barnes, Margaret Mitchell

Abstract: Rising concern for the societal implications of artificial intelligence systems has inspired demands for greater transparency and accountability. However, the datasets which empower machine learning are often used, shared and re-used with little visibility into the processes of deliberation which led to their creation. Which stakeholder groups had their perspectives included when the dataset was conceived? Which domain experts were consulted regarding how to model subgroups and other phenomena? How were questions of representational biases measured and addressed? Who labeled the data? In this paper, we introduce a rigorous framework for dataset development transparency which supports decision-making and accountability. The framework uses the cyclical, infrastructural and engineering nature of dataset development to draw on best practices from the software development lifecycle. Each stage of the data development lifecycle yields a set of documents that facilitate improved communication and decision-making, as well as drawing attention to the value and necessity of careful data work. The proposed framework is intended to contribute to closing the accountability gap in artificial intelligence systems, by making visible the often overlooked work that goes into dataset creation.

Submitted 29 January, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

arXiv:2007.07399

Bringing the People Back In: Contesting Benchmark Machine Learning Datasets

Authors: Remi Denton, Alex Hanna, Razvan Amironesei, Andrew Smart, Hilary Nicole, Morgan Klaus Scheuerman

Abstract: In response to algorithmic unfairness embedded in sociotechnical systems, significant attention has been focused on the contents of machine learning datasets which have revealed biases towards white, cisgender, male, and Western data subjects. In contrast, comparatively less attention has been paid to the histories, values, and norms embedded in such datasets. In this work, we outline a research program - a genealogy of machine learning data - for investigating how and why these datasets have been created, what and whose values influence the choices of data to collect, and the contextual and contingent conditions of their creation. We describe the ways in which benchmark datasets in machine learning operate as infrastructure and pose four research questions for these datasets. This interrogation forces us to "bring the people back in" by aiding us in understanding the labor embedded in dataset construction, and thereby presenting new avenues of contestation for other researchers encountering the data.

Submitted 14 July, 2020; originally announced July 2020.

arXiv:2006.09663 [pdf, other]

Extending the Machine Learning Abstraction Boundary: A Complex Systems Approach to Incorporate Societal Context

Authors: Donald Martin Jr., Vinodkumar Prabhakaran, Jill Kuhlberg, Andrew Smart, William S. Isaac

Abstract: Machine learning (ML) fairness research tends to focus primarily on mathematically-based interventions on often opaque algorithms or models and/or their immediate inputs and outputs. Such oversimplified mathematical models abstract away the underlying societal context where ML models are conceived, developed, and ultimately deployed. As fairness itself is a socially constructed concept that originates from that societal context along with the model inputs and the models themselves, a lack of an in-depth understanding of societal context can easily undermine the pursuit of ML fairness. In this paper, we outline three new tools to improve the comprehension, identification and representation of societal context. First, we propose a complex adaptive systems (CAS) based model and definition of societal context that will help researchers and product developers to expand the abstraction boundary of ML fairness work to include societal context. Second, we introduce collaborative causal theory formation (CCTF) as a key capability for establishing a sociotechnical frame that incorporates diverse mental models and associated causal theories in modeling the problem and solution space for ML-based products. Finally, we identify community based system dynamics (CBSD) as a powerful, transparent and rigorous approach for practicing CCTF during all phases of the ML product development process. We conclude with a discussion of how these systems theoretic approaches to understand the societal context within which sociotechnical systems are embedded can improve the development of fair and inclusive ML-based products.

Submitted 17 June, 2020; originally announced June 2020.

Comments: 11 pages, 5 figures

arXiv:2005.07572 [pdf, other]

Participatory Problem Formulation for Fairer Machine Learning Through Community Based System Dynamics

Authors: Donald Martin Jr., Vinodkumar Prabhakaran, Jill Kuhlberg, Andrew Smart, William S. Isaac

Abstract: Recent research on algorithmic fairness has highlighted that the problem formulation phase of ML system development can be a key source of bias that has significant downstream impacts on ML system fairness outcomes. However, very little attention has been paid to methods for improving the fairness efficacy of this critical phase of ML system development. Current practice neither accounts for the dynamic complexity of high-stakes domains nor incorporates the perspectives of vulnerable stakeholders. In this paper we introduce community based system dynamics (CBSD) as an approach to enable the participation of typically excluded stakeholders in the problem formulation phase of the ML system development process and facilitate the deep problem understanding required to mitigate bias during this crucial stage.

Submitted 22 May, 2020; v1 submitted 15 May, 2020; originally announced May 2020.

Comments: Eighth Annual Conference on Learning Representations (ICLR 2020), Virtual Workshop: Machine Learning in Real Life, April 26, 2020, 6 pages, 1 figure, fix comment typo, fix author name

arXiv:2002.10077 [pdf, other]

Approximate Data Deletion from Machine Learning Models

Authors: Zachary Izzo, Mary Anne Smart, Kamalika Chaudhuri, James Zou

Abstract: Deleting data from a trained machine learning (ML) model is a critical task in many applications. For example, we may want to remove the influence of training points that might be out of date or outliers. Regulations such as the EU's General Data Protection Regulation also stipulate that individuals can request to have their data deleted. The naive approach to data deletion is to retrain the ML model on the remaining data, but this is too time consuming. In this work, we propose a new approximate deletion method for linear and logistic models whose computational cost is linear in the feature dimension $d$ and independent of the number of training data $n$. This is a significant gain over all existing methods, which all have superlinear time dependence on the dimension. We also develop a new feature-injection test to evaluate the thoroughness of data deletion from ML models.

Submitted 23 February, 2021; v1 submitted 24 February, 2020; originally announced February 2020.

Comments: 20 pages, 1 figure, accepted for publication at AISTATS 2021

arXiv:2001.00973 [pdf, other]

Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing

Authors: Inioluwa Deborah Raji, Andrew Smart, Rebecca N. White, Margaret Mitchell, Timnit Gebru, Ben Hutchinson, Jamila Smith-Loud, Daniel Theron, Parker Barnes

Abstract: Rising concern for the societal implications of artificial intelligence systems has inspired a wave of academic and journalistic literature in which deployed systems are audited for harm by investigators from outside the organizations deploying the algorithms. However, it remains challenging for practitioners to identify the harmful repercussions of their own systems prior to deployment, and, once deployed, emergent issues can become difficult or impossible to trace back to their source. In this paper, we introduce a framework for algorithmic auditing that supports artificial intelligence system development end-to-end, to be applied throughout the internal organization development lifecycle. Each stage of the audit yields a set of documents that together form an overall audit report, drawing on an organization's values or principles to assess the fit of decisions made throughout the process. The proposed auditing framework is intended to contribute to closing the accountability gap in the development and deployment of large-scale artificial intelligence systems by embedding a robust process to ensure audit integrity.

Submitted 3 January, 2020; originally announced January 2020.

Comments: Accepted to ACM FAT* (Fairness, Accountability and Transparency) conference 2020. Full workable templates for the documents of the SMACTR framework presented in the paper can be found here: https://drive.google.com/drive/folders/1GWlq8qGZXb2lNHxWBuo2wl-rlHsjNPM0?usp=sharing

arXiv:1912.03593 [pdf, ps, other]

doi 10.1145/3351095.3372826

Towards a Critical Race Methodology in Algorithmic Fairness

Authors: Alex Hanna, Emily Denton, Andrew Smart, Jamila Smith-Loud

Abstract: We examine the way race and racial categories are adopted in algorithmic fairness frameworks. Current methodologies fail to adequately account for the socially constructed nature of race, instead adopting a conceptualization of race as a fixed attribute. Treating race as an attribute, rather than a structural, institutional, and relational phenomenon, can serve to minimize the structural aspects of algorithmic unfairness. In this work, we focus on the history of racial categories and turn to critical race theory and sociological work on race and ethnicity to ground conceptualizations of race for fairness research, drawing on lessons from public health, biomedical research, and social survey research. We argue that algorithmic fairness researchers need to take into account the multidimensionality of race, take seriously the processes of conceptualizing and operationalizing race, focus on social processes which produce racial inequality, and consider perspectives of those most affected by sociotechnical systems.

Submitted 7 December, 2019; originally announced December 2019.

Comments: Conference on Fairness, Accountability, and Transparency (FAT* '20), January 27-30, 2020, Barcelona, Spain

arXiv:1511.05595 [pdf]

High-gradient High-charge CW Superconducting RF gun with CsK2Sb photocathode

Authors: Igor Pinayev, Vladimir N. Litvinenko, Joseph Tuozzolo, Jean Clifford Brutus, Sergey Belomestnykh, Chase Boulware, Charles Folz, David Gassner, Terry Grimm, Yue Hao, James Jamilkowski, Yichao Jing, Dmitry Kayran, George Mahler, Michael Mapes, Toby Miller, Geetha Narayan, Brian Sheehy, Triveni Rao, John Skaritka, Kevin Smith, Louis Snydstrup, Yatming Than, Erdong Wang, Gang Wang, et al. (18 additional authors not shown)

Abstract: High-gradient CW photo-injectors operating at high accelerating gradients promise to revolutionize many sciences and applications. They can establish the basis for super-bright monochromatic X-ray free-electron lasers, super-bright hadron beams, nuclear-waste transmutation or a new generation of microchip production. In this letter we report on our operation of a superconducting RF electron gun with a record-high accelerating gradient at the CsK2Sb photocathode (i.e., ~20 MV/m) generating a record-high bunch charge (i.e., 3 nC). We briefly describe the system and then detail our experimental results. This achievement opens a new era in generating high-power electron beams with a very high brightness.

Submitted 17 November, 2015; originally announced November 2015.

Comments: 13 pages, 5 figures

arXiv:1510.05495 [pdf]

Canine Olfactory Differentiation of Cancer: A Review of the Literature

Authors: Oliver Gould, Amy Smart, Norman Ratcliffe, Ben de Lacy Costello

Abstract: Numerous studies have attempted to demonstrate the olfactory ability of canines to detect several common cancer types from human bodily fluids, breath and tissue. Canines have been reported to detect bladder cancer (sensitivity of 0.63-0.73 and specificity of 0.64-0.92) and prostate cancer (sensitivity of 0.91-0.99 and specificity of 0.91-0.97) from urine; breast cancer (sensitivity of 0.88 and specificity of 0.98) and lung cancer (sensitivity of 0.56-0.99 and specificity of 8.30-0.99) on breath; and colorectal cancer from stools (sensitivity of 0.91-0.97 and specificity of 0.97-0.99). The quoted figures of sensitivity and specificity across differing studies demonstrate that in many cases results are variable from study to study; this raises questions about the reproducibility of methodology and study design, which we have identified herein. Furthermore, in some studies the controls used have resulted in differentiation of samples which are of limited use for clinical diagnosis. These studies provide some evidence that cancer gives rise to different volatile organic compounds (VOCs) compared to healthy samples. Whilst canine detection may be unsuitable for clinical implementation, it can, at least, provide inspiration for more traditional laboratory investigations.

Submitted 5 October, 2015; originally announced October 2015.

Comments: Total of 23 pages including citations, with 1 embedded table

Showing 1–23 of 23 results for author: Smart, A