Search | arXiv e-print repository

Red-Teaming for Generative AI: Silver Bullet or Security Theater?

Authors: Michael Feffer, Anusha Sinha, Wesley Hanwen Deng, Zachary C. Lipton, Hoda Heidari

Abstract: In response to rising concerns surrounding the safety, security, and trustworthiness of Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red-teaming as a key component of their strategies for identifying and mitigating these risks. However, despite AI red-teaming's central role in policy discussions and corporate messaging, significant questions remain about what precisely it means, what role it can play in regulation, and how it relates to conventional red-teaming practices as originally conceived in the field of cybersecurity. In this work, we identify recent cases of red-teaming activities in the AI industry and conduct an extensive survey of relevant research literature to characterize the scope, structure, and criteria for AI red-teaming practices. Our analysis reveals that prior methods and practices of AI red-teaming diverge along several axes, including the purpose of the activity (which is often vague), the artifact under evaluation, the setting in which the activity is conducted (e.g., actors, resources, and methods), and the resulting decisions it informs (e.g., reporting, disclosure, and mitigation). In light of our findings, we argue that while red-teaming may be a valuable big-tent idea for characterizing GenAI harm mitigations, and that industry may effectively apply red-teaming and other strategies behind closed doors to safeguard AI, gestures towards red-teaming (based on public definitions) as a panacea for every possible risk verge on security theater. To move toward a more robust toolbox of evaluations for generative AI, we synthesize our recommendations into a question bank meant to guide and scaffold future AI red-teaming practices.

Submitted 15 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

arXiv:2306.06542

Investigating Practices and Opportunities for Cross-functional Collaboration around AI Fairness in Industry Practice

Authors: Wesley Hanwen Deng, Nur Yildirim, Monica Chang, Motahhare Eslami, Ken Holstein, Michael Madaio

Abstract: An emerging body of research indicates that ineffective cross-functional collaboration -- the interdisciplinary work done by industry practitioners across roles -- represents a major barrier to addressing issues of fairness in AI design and development. In this research, we sought to better understand practitioners' current practices and tactics to enact cross-functional collaboration for AI fairness, in order to identify opportunities to support more effective collaboration. We conducted a series of interviews and design workshops with 23 industry practitioners spanning various roles from 17 companies. We found that practitioners engaged in bridging work to overcome frictions in understanding, contextualization, and evaluation around AI fairness across roles. In addition, in organizational contexts with a lack of resources and incentives for fairness work, practitioners often piggybacked on existing requirements (e.g., for privacy assessments) and AI development norms (e.g., the use of quantitative evaluation metrics), although they worry that these tactics may be fundamentally compromised. Finally, we draw attention to the invisible labor that practitioners take on as part of this bridging and piggybacking work to enact interdisciplinary collaboration for fairness. We close by discussing opportunities for both FAccT researchers and AI practitioners to better support cross-functional collaboration for fairness in the design and development of AI systems.

Submitted 10 June, 2023; originally announced June 2023.

Comments: In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT '23)

arXiv:2304.00167

Towards "Anytime, Anywhere" Community Learning and Engagement around the Design of Public Sector AI

Authors: Wesley Hanwen Deng, Motahhare Eslami, Kenneth Holstein

Abstract: Data-driven algorithmic and AI systems are increasingly being deployed to automate or augment decision processes across a wide range of public service settings. Yet community members are often unaware of the presence, operation, and impacts of these systems on their lives. With the shift towards algorithmic decision-making in public services, technology developers increasingly assume the role of de-facto policymakers, and opportunities for democratic participation are foreclosed. In this position paper, we articulate an early vision around the design of ubiquitous infrastructure for public learning and engagement around civic AI technologies. Building on this vision, we provide a list of questions that we hope can prompt stimulating conversations among the HCI community.

Submitted 21 April, 2023; v1 submitted 31 March, 2023; originally announced April 2023.

Journal ref: AI Literacy: Finding Common Threads between Education, Design, Policy, and Explainability Workshop at CHI 2023

arXiv:2210.03709

doi 10.1145/3544548.3581026

Understanding Practices, Challenges, and Opportunities for User-Engaged Algorithm Auditing in Industry Practice

Authors: Wesley Hanwen Deng, Bill Boyuan Guo, Alicia DeVrio, Hong Shen, Motahhare Eslami, Kenneth Holstein

Abstract: Recent years have seen growing interest among both researchers and practitioners in user-engaged approaches to algorithm auditing, which directly engage users in detecting problematic behaviors in algorithmic systems. However, we know little about industry practitioners' current practices and challenges around user-engaged auditing, nor what opportunities exist for them to better leverage such approaches in practice. To investigate, we conducted a series of interviews and iterative co-design activities with practitioners who employ user-engaged auditing approaches in their work. Our findings reveal several challenges practitioners face in appropriately recruiting and incentivizing user auditors, scaffolding user audits, and deriving actionable insights from user-engaged audit reports. Furthermore, practitioners shared organizational obstacles to user-engaged auditing, surfacing a complex relationship between practitioners and user auditors. Based on these findings, we discuss opportunities for future HCI research to help realize the potential (and mitigate the risks) of user-engaged auditing in industry practice.

Submitted 21 February, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

Comments: 18 pages. In Proceedings of CHI 2023

Journal ref: CHI 2023: ACM Conference on Human Factors in Computing Systems. April 23-28, 2023, Hamburg, Germany

arXiv:2205.06922

doi 10.1145/3531146.3533113

Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits

Authors: Wesley Hanwen Deng, Manish Nagireddy, Michelle Seng Ah Lee, Jatinder Singh, Zhiwei Steven Wu, Kenneth Holstein, Haiyi Zhu

Abstract: Recent years have seen the development of many open-source ML fairness toolkits aimed at helping ML practitioners assess and address unfairness in their systems. However, there has been little research investigating how ML practitioners actually use these toolkits in practice. In this paper, we conducted the first in-depth empirical exploration of how industry practitioners (try to) work with existing fairness toolkits. In particular, we conducted think-aloud interviews to understand how participants learn about and use fairness toolkits, and explored the generality of our findings through an anonymous online survey. We identified several opportunities for fairness toolkits to better address practitioner needs and scaffold them in using toolkits effectively and responsibly. Based on these findings, we highlight implications for the design of future open-source fairness toolkits that can support practitioners in better contextualizing, communicating, and collaborating around ML fairness efforts.

Submitted 10 January, 2023; v1 submitted 13 May, 2022; originally announced May 2022.

Comments: ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2022)

arXiv:2205.06920

Beyond General Purpose Machine Translation: The Need for Context-specific Empirical Research to Design for Appropriate User Trust

Authors: Wesley Hanwen Deng, Nikita Mehandru, Samantha Robertson, Niloufar Salehi

Abstract: Machine Translation (MT) has the potential to help people overcome language barriers and is widely used in high-stakes scenarios, such as in hospitals. However, in order to use MT reliably and safely, users need to understand when to trust MT outputs and how to assess the quality of often imperfect translation results. In this paper, we discuss research directions to support users to calibrate trust in MT systems. We share findings from an empirical study in which we conducted semi-structured interviews with 20 clinicians to understand how they communicate with patients across language barriers, and if and how they use MT systems. Based on our findings, we advocate for empirical research on how MT systems are used in practice as an important first step to addressing the challenges in building appropriate trust between users and MT tools.

Submitted 13 May, 2022; originally announced May 2022.

Comments: Workshop on Trust and Reliance in AI-Human Teams (TRAIT): https://doi.org/10.1145/3491101.3503704

arXiv:2010.11411

doi 10.1145/3442188.3445971

Value Cards: An Educational Toolkit for Teaching Social Impacts of Machine Learning through Deliberation

Authors: Hong Shen, Wesley Hanwen Deng, Aditi Chattopadhyay, Zhiwei Steven Wu, Xu Wang, Haiyi Zhu

Abstract: Recently, there have been increasing calls for computer science curricula to complement existing technical training with topics related to Fairness, Accountability, Transparency, and Ethics. In this paper, we present Value Cards, an educational toolkit to inform students and practitioners of the social impacts of different machine learning models via deliberation. This paper presents an early use of our approach in a college-level computer science course. Through an in-class activity, we report empirical data for the initial effectiveness of our approach. Our results suggest that the use of the Value Cards toolkit can improve students' understanding of both the technical definitions and trade-offs of performance metrics and their ability to apply them in real-world contexts, help them recognize the significance of considering diverse social values in the development and deployment of algorithmic systems, and enable them to communicate, negotiate, and synthesize the perspectives of diverse stakeholders. Our study also demonstrates a number of caveats we need to consider when using the different variants of the Value Cards toolkit. Finally, we discuss the challenges as well as future applications of our approach.

Submitted 10 January, 2023; v1 submitted 21 October, 2020; originally announced October 2020.

Journal ref: ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2021)

Showing 1–7 of 7 results for author: Deng, W H