
Showing 1–46 of 46 results for author: Kalai, A

  1. arXiv:2401.12954  [pdf, other]

    cs.CL cs.AI cs.HC

    Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding

    Authors: Mirac Suzgun, Adam Tauman Kalai

    Abstract: We introduce meta-prompting, an effective scaffolding technique designed to enhance the functionality of language models (LMs). This approach transforms a single LM into a multi-faceted conductor, adept at managing and integrating multiple independent LM queries. By employing high-level instructions, meta-prompting guides the LM to break down complex tasks into smaller, more manageable subtasks. T…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: https://github.com/suzgunmirac/meta-prompting

  2. arXiv:2311.14648  [pdf, other]

    cs.CL cs.AI

    Calibrated Language Models Must Hallucinate

    Authors: Adam Tauman Kalai, Santosh S. Vempala

    Abstract: Recent language models generate false but plausible-sounding text with surprising frequency. Such "hallucinations" are an obstacle to the usability of language-based AI systems and can harm people who rely upon their outputs. This work shows that there is an inherent statistical lower-bound on the rate that pretrained language models hallucinate certain types of facts, having nothing to do with th…

    Submitted 19 March, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

    Comments: In Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC) 2024

  3. arXiv:2311.10538  [pdf, other]

    cs.AI

    Testing Language Model Agents Safely in the Wild

    Authors: Silen Naihin, David Atkinson, Marc Green, Merwane Hamadi, Craig Swift, Douglas Schonholtz, Adam Tauman Kalai, David Bau

    Abstract: A prerequisite for safe autonomy-in-the-wild is safe testing-in-the-wild. Yet real-world autonomous tests face several unique safety challenges, both due to the possibility of causing harm during a test, as well as the risk of encountering new unsafe agent behavior through interactions with real-world and potentially malicious actors. We propose a framework for conducting safe autonomous agent tes…

    Submitted 3 December, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

  4. arXiv:2310.02304  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation

    Authors: Eric Zelikman, Eliana Lorch, Lester Mackey, Adam Tauman Kalai

    Abstract: Several recent advances in AI systems (e.g., Tree-of-Thoughts and Program-Aided Language Models) solve problems by providing a "scaffolding" program that structures multiple calls to language models to generate better outputs. A scaffolding program is written in a programming language such as Python. In this work, we use a language-model-infused scaffolding program to improve itself. We start with…

    Submitted 1 March, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

  5. arXiv:2306.11644  [pdf, other]

    cs.CL cs.AI cs.LG

    Textbooks Are All You Need

    Authors: Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li

    Abstract: We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accu…

    Submitted 2 October, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: 26 pages; changed color scheme of plot, fixed minor typos, and added a couple of clarifications

  6. arXiv:2305.18248  [pdf, other]

    cs.CL cs.AI

    Do Language Models Know When They're Hallucinating References?

    Authors: Ayush Agrawal, Mirac Suzgun, Lester Mackey, Adam Tauman Kalai

    Abstract: State-of-the-art language models (LMs) are notoriously susceptible to generating hallucinated information. Such inaccurate outputs not only undermine the reliability of these models but also limit their use and raise serious concerns about misinformation and propaganda. In this work, we focus on hallucinated book and article references and present them as the "model organism" of language model hal…

    Submitted 20 March, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

  7. arXiv:2304.09424  [pdf, other]

    cs.LG cs.AI stat.ML

    Loss Minimization Yields Multicalibration for Large Neural Networks

    Authors: Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Adam Tauman Kalai, Preetum Nakkiran

    Abstract: Multicalibration is a notion of fairness for predictors that requires them to provide calibrated predictions across a large set of protected groups. Multicalibration is known to be a distinct goal from loss minimization, even for simple predictors such as linear functions. In this work, we consider the setting where the protected groups can be represented by neural networks of size $k$, and the…

    Submitted 7 December, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: In ITCS 2024

  8. arXiv:2211.11081  [pdf, other]

    cs.CL cs.LG

    A Theory of Unsupervised Translation Motivated by Understanding Animal Communication

    Authors: Shafi Goldwasser, David F. Gruber, Adam Tauman Kalai, Orr Paradise

    Abstract: Neural networks are capable of translating between languages -- in some cases even between two languages where there is little or no access to parallel translations, in what is known as Unsupervised Machine Translation (UMT). Given this progress, it is intriguing to ask whether machine learning tools can ultimately enable understanding animal communication, particularly that of highly intelligent…

    Submitted 3 November, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

    ACM Class: I.2.7; I.2.6

  9. arXiv:2209.00735  [pdf, other]

    cs.LG stat.ML

    Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms

    Authors: Surbhi Goel, Sham Kakade, Adam Tauman Kalai, Cyril Zhang

    Abstract: Neural networks (NNs) struggle to efficiently solve certain problems, such as learning parities, even when there are simple learning algorithms for those problems. Can NNs discover learning algorithms on their own? We exhibit a NN architecture that, in polynomial time, learns as well as any efficient learning algorithm describable by a constant-sized program. For example, on parity problems, the N…

    Submitted 15 January, 2023; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: v2: final camera-ready revisions for NeurIPS 2022

  10. arXiv:2208.12063  [pdf, other]

    cs.LG cs.DS cs.IR

    Partial Matrix Completion

    Authors: Elad Hazan, Adam Tauman Kalai, Varun Kanade, Clara Mohri, Y. Jennifer Sun

    Abstract: The matrix completion problem aims to reconstruct a low-rank matrix based on a revealed set of possibly noisy entries. Prior works consider completing the entire matrix with generalization error guarantees. However, the completion accuracy can be drastically different over different entries. This work establishes a new framework of partial matrix completion, where the goal is to identify a large s…

    Submitted 17 December, 2023; v1 submitted 25 August, 2022; originally announced August 2022.

    Comments: NeurIPS 2023

  11. arXiv:2208.10264  [pdf, other]

    cs.CL cs.AI cs.LG

    Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies

    Authors: Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai

    Abstract: We introduce a new type of test, called a Turing Experiment (TE), for evaluating to what extent a given language model, such as GPT models, can simulate different aspects of human behavior. A TE can also reveal consistent distortions in a language model's simulation of a specific human behavior. Unlike the Turing Test, which involves simulating a single arbitrary individual, a TE requires simulati…

    Submitted 9 July, 2023; v1 submitted 18 August, 2022; originally announced August 2022.

    Comments: Accepted for oral presentation at International Conference on Machine Learning (ICML) 2023

  12. arXiv:2207.14502  [pdf, other]

    cs.LG cs.AI

    Language Models Can Teach Themselves to Program Better

    Authors: Patrick Haluptzok, Matthew Bowers, Adam Tauman Kalai

    Abstract: Recent Language Models (LMs) achieve breakthrough performance in code generation when trained on human-authored problems, even solving some competitive-programming problems. Self-play has proven useful in games such as Go, and thus it is natural to ask whether LMs can generate their own instructive programming problems to improve their performance. We show that it is possible for an LM to synthesi…

    Submitted 12 April, 2023; v1 submitted 29 July, 2022; originally announced July 2022.

    Comments: 22 pages, 14 figures

  13. arXiv:2205.09838  [pdf, ps, other]

    cs.LG stat.ML

    Why GANs are overkill for NLP

    Authors: David Alvarez-Melis, Vikas Garg, Adam Tauman Kalai

    Abstract: This work offers a novel theoretical perspective on why, despite numerous attempts, adversarial approaches to generative modeling (e.g., GANs) have not been as popular for certain generation tasks, particularly sequential tasks such as Natural Language Generation, as they have in others, such as Computer Vision. In particular, on sequential data such as text, maximum-likelihood approaches are sign…

    Submitted 19 May, 2022; originally announced May 2022.

  14. arXiv:2109.05389  [pdf, other]

    cs.LG stat.ML

    Omnipredictors

    Authors: Parikshit Gopalan, Adam Tauman Kalai, Omer Reingold, Vatsal Sharan, Udi Wieder

    Abstract: Loss minimization is a dominant paradigm in machine learning, where a predictor is trained to minimize some loss function that depends on an uncertain event (e.g., "will it rain tomorrow?"). Different loss functions imply different learning algorithms and, at times, very different predictors. While widespread and appealing, a clear drawback of this approach is that the loss function may not be kn…

    Submitted 11 September, 2021; originally announced September 2021.

    Comments: 35 pages, 1 figure

  15. Social Norm Bias: Residual Harms of Fairness-Aware Algorithms

    Authors: Myra Cheng, Maria De-Arteaga, Lester Mackey, Adam Tauman Kalai

    Abstract: Many modern machine learning algorithms mitigate bias by enforcing fairness constraints across coarsely-defined groups related to a sensitive attribute like gender or race. However, these algorithms seldom account for within-group heterogeneity and biases that may disproportionately affect some members of a group. In this work, we characterize Social Norm Bias (SNoB), a subtle but consequential ty…

    Submitted 10 August, 2022; v1 submitted 25 August, 2021; originally announced August 2021.

    Comments: Spotlighted at the 2021 ICML Machine Learning for Data Workshop and presented at the 2021 ICML Socially Responsible Machine Learning Workshop

    Report number: Data Min Knowl Disc (2023)

  16. arXiv:2106.05784  [pdf, other]

    cs.LG cs.AI cs.CL cs.PL cs.SE

    Programming Puzzles

    Authors: Tal Schuster, Ashwin Kalyan, Oleksandr Polozov, Adam Tauman Kalai

    Abstract: We introduce a new type of programming challenge called programming puzzles, as an objective and comprehensive evaluation of program synthesis, and release an open-source dataset of Python Programming Puzzles (P3). Each puzzle is defined by a short Python program $f$, and the goal is to find an input which makes $f$ return True. The puzzles are objective in that each one is specified entirely by t…

    Submitted 6 November, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 (Datasets and Benchmarks Track). Puzzles repository: https://github.com/microsoft/PythonProgrammingPuzzles

  17. arXiv:2105.14119  [pdf, other]

    cs.LG cs.AI cs.DS stat.ML

    Towards optimally abstaining from prediction with OOD test examples

    Authors: Adam Tauman Kalai, Varun Kanade

    Abstract: A common challenge across all areas of machine learning is that training data is not distributed like test data, due to natural shifts, "blind spots," or adversarial examples; such test examples are referred to as out-of-distribution (OOD) test examples. We consider a model where one may abstain from predicting, at a fixed cost. In particular, our transductive abstention algorithm takes labeled tr…

    Submitted 27 October, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

    Comments: In NeurIPS 2021 (+spotlight), 24 pages

  18. arXiv:2102.07802  [pdf, ps, other]

    cs.LG stat.ML

    Efficient Learning with Arbitrary Covariate Shift

    Authors: Adam Kalai, Varun Kanade

    Abstract: We give an efficient algorithm for learning a binary function in a given class C of bounded VC dimension, with training data distributed according to P and test data according to Q, where P and Q may be arbitrary distributions over X. This is the generic form of what is called covariate shift, which is impossible in general as arbitrary P and Q may not even overlap. However, recently guarantees we…

    Submitted 15 February, 2021; originally announced February 2021.

  19. arXiv:2007.05145  [pdf, other]

    cs.LG stat.ML

    Beyond Perturbations: Learning Guarantees with Arbitrary Adversarial Test Examples

    Authors: Shafi Goldwasser, Adam Tauman Kalai, Yael Tauman Kalai, Omar Montasser

    Abstract: We present a transductive learning algorithm that takes as input training examples from a distribution $P$ and arbitrary (unlabeled) test examples, possibly chosen by an adversary. This is unlike prior work that assumes that test examples are small perturbations of $P$. Our algorithm outputs a selective classifier, which abstains from predicting on some examples. By considering selective transduct…

    Submitted 30 September, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: To appear in NeurIPS 2020

  20. arXiv:2002.05660  [pdf, other]

    cs.LG stat.ML

    Learn to Expect the Unexpected: Probably Approximately Correct Domain Generalization

    Authors: Vikas K. Garg, Adam Kalai, Katrina Ligett, Zhiwei Steven Wu

    Abstract: Domain generalization is the problem of machine learning when the training data and the test data come from different data domains. We present a simple theoretical model of learning to generalize across domains in which there is a meta-distribution over data distributions, and those data distributions may even have different supports. In our model, the training data given to a learning algorithm c…

    Submitted 13 February, 2020; originally announced February 2020.

  21. arXiv:1910.04123  [pdf, other]

    cs.GT econ.TH

    The Disparate Equilibria of Algorithmic Decision Making when Individuals Invest Rationally

    Authors: Lydia T. Liu, Ashia Wilson, Nika Haghtalab, Adam Tauman Kalai, Christian Borgs, Jennifer Chayes

    Abstract: The long-term impact of algorithmic decision making is shaped by the dynamics between the deployed decision rule and individuals' response. Focusing on settings where each individual desires a positive classification, including many important applications such as hiring and school admissions, we study a dynamic learning setting where individuals invest in a positive outcome based on their group's…

    Submitted 4 October, 2019; originally announced October 2019.

    Comments: 30 pages, 7 figures

  22. arXiv:1904.11875  [pdf, other]

    cs.LG stat.ML

    Learning to Prune: Speeding up Repeated Computations

    Authors: Daniel Alabi, Adam Tauman Kalai, Katrina Ligett, Cameron Musco, Christos Tzamos, Ellen Vitercik

    Abstract: It is common to encounter situations where one must solve a sequence of similar computational problems. Running a standard algorithm with worst-case runtime guarantees on each instance will fail to take advantage of valuable structure shared across the problem instances. For example, when a commuter drives from work to home, there are typically only a handful of routes that will ever be the shorte…

    Submitted 26 April, 2019; originally announced April 2019.

  23. arXiv:1904.05233  [pdf, other]

    cs.LG cs.CL stat.ML

    What's in a Name? Reducing Bias in Bios without Access to Protected Attributes

    Authors: Alexey Romanov, Maria De-Arteaga, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Anna Rumshisky, Adam Tauman Kalai

    Abstract: There is a growing body of work that proposes methods for mitigating bias in machine learning systems. These methods typically rely on access to protected attributes such as race, gender, or age. However, this raises two significant challenges: (1) protected attributes may not be available or it may not be legal to use them, and (2) it is often desirable to simultaneously consider multiple protect…

    Submitted 10 April, 2019; originally announced April 2019.

    Comments: Accepted at NAACL 2019; Best Thematic Paper

  24. arXiv:1902.02783  [pdf, other]

    cs.CL cs.LG stat.ML

    Humor in Word Embeddings: Cockamamie Gobbledegook for Nincompoops

    Authors: Limor Gultchin, Genevieve Patterson, Nancy Baym, Nathaniel Swinger, Adam Tauman Kalai

    Abstract: While humor is often thought to be beyond the reach of Natural Language Processing, we show that several aspects of single-word humor correlate with simple linear directions in Word Embeddings. In particular: (a) the word vectors capture multiple aspects discussed in humor theories from various disciplines; (b) each individual's sense of humor can be represented by a vector, which can predict diff…

    Submitted 24 May, 2019; v1 submitted 8 February, 2019; originally announced February 2019.

  25. arXiv:1901.09451  [pdf, other]

    cs.IR cs.LG stat.ML

    Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting

    Authors: Maria De-Arteaga, Alexey Romanov, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Adam Tauman Kalai

    Abstract: We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes on people's lives. We analyze the potential allocation harms that can result from semantic representation bias. To do so, we study the impact on occupation classification of including explicit gender indicators, such as first names and pronouns, in di…

    Submitted 27 January, 2019; originally announced January 2019.

    Comments: Accepted at ACM Conference on Fairness, Accountability, and Transparency (ACM FAT*), 2019

  26. arXiv:1812.08769  [pdf, other]

    cs.CL cs.LG

    What are the biases in my word embedding?

    Authors: Nathaniel Swinger, Maria De-Arteaga, Neil Thomas Heffernan IV, Mark DM Leiserson, Adam Tauman Kalai

    Abstract: This paper presents an algorithm for enumerating biases in word embeddings. The algorithm exposes a large number of offensive associations related to sensitive features such as race and gender on publicly available embeddings, including a supposedly "debiased" embedding. These biases are concerning in light of the widespread use of word embeddings. The associations are identified by geometric patt…

    Submitted 19 June, 2019; v1 submitted 20 December, 2018; originally announced December 2018.

    Comments: At AIES 2019: the AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society

  27. arXiv:1804.04503  [pdf, other]

    cs.LG cs.DS stat.ML

    Unleashing Linear Optimizers for Group-Fair Learning and Optimization

    Authors: Daniel Alabi, Nicole Immorlica, Adam Tauman Kalai

    Abstract: Most systems and learning algorithms optimize average performance or average loss -- one reason being computational complexity. However, many objectives of practical interest are more complex than simply average loss. This arises, for example, when balancing performance or loss with fairness across people. We prove that, from a computational perspective, optimizing arbitrary objectives that take i…

    Submitted 4 June, 2018; v1 submitted 10 April, 2018; originally announced April 2018.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2018

  28. arXiv:1802.07229  [pdf, other]

    cs.LG cs.DS stat.ML

    Actively Avoiding Nonsense in Generative Models

    Authors: Steve Hanneke, Adam Kalai, Gautam Kamath, Christos Tzamos

    Abstract: A generative model may generate utter nonsense when it is fit to maximize the likelihood of observed data. This happens due to "model error," i.e., when the true data generating distribution does not fit within the class of generative models being learned. To address this, we propose a model of active distribution learning using a binary invalidity oracle that identifies some examples as clearly i…

    Submitted 20 February, 2018; originally announced February 2018.

  29. arXiv:1712.03650  [pdf, other]

    cs.HC cs.CR cs.CY

    Usability of Humanly Computable Passwords

    Authors: Samira Samadi, Santosh Vempala, Adam Tauman Kalai

    Abstract: Reusing passwords across multiple websites is a common practice that compromises security. Recently, Blum and Vempala have proposed password strategies to help people calculate, in their heads, passwords for different sites without dependence on third-party tools or external devices. Thus far, the security and efficiency of these "mental algorithms" have been analyzed only theoretically. But are su…

    Submitted 24 May, 2018; v1 submitted 11 December, 2017; originally announced December 2017.

  30. arXiv:1710.01799  [pdf, other]

    cs.CL

    Counterfactual Language Model Adaptation for Suggesting Phrases

    Authors: Kenneth C. Arnold, Kai-Wei Chang, Adam T. Kalai

    Abstract: Mobile devices use language models to suggest words and phrases for use in text entry. Traditional language models are based on contextual word frequency in a static corpus of text. However, certain types of phrases, when offered to writers as suggestions, may be systematically chosen more often than their frequency would predict. In this paper, we propose the task of generating suggestions that w…

    Submitted 4 October, 2017; originally announced October 2017.

  31. arXiv:1709.08669  [pdf, other]

    cs.LG cs.AI stat.ML

    Glass-Box Program Synthesis: A Machine Learning Approach

    Authors: Konstantina Christakopoulou, Adam Tauman Kalai

    Abstract: Recently proposed models which learn to write computer programs from data use either input/output examples or rich execution traces. Instead, we argue that a novel alternative is to use a glass-box loss function, given as a program itself that can be directly inspected. Glass-box optimization covers a wide range of problems, from computing the greatest common divisor of two integers, to learning-t…

    Submitted 25 September, 2017; originally announced September 2017.

  32. arXiv:1709.05262  [pdf, other]

    cs.AI cs.LG stat.ML

    Supervising Unsupervised Learning

    Authors: Vikas K. Garg, Adam Kalai

    Abstract: We introduce a framework to leverage knowledge acquired from a repository of (heterogeneous) supervised datasets to new unsupervised datasets. Our perspective avoids the subjectivity inherent in unsupervised learning by reducing it to supervised learning, and provides a principled way to evaluate unsupervised algorithms. We demonstrate the versatility of our framework via simple agnostic bounds on…

    Submitted 16 February, 2018; v1 submitted 14 September, 2017; originally announced September 2017.

    Comments: 11 two column pages. arXiv admin note: substantial text overlap with arXiv:1612.09030

  33. arXiv:1707.06613  [pdf, other]

    cs.LG cs.CY

    Decoupled classifiers for fair and efficient machine learning

    Authors: Cynthia Dwork, Nicole Immorlica, Adam Tauman Kalai, Max Leiserson

    Abstract: When it is ethical and legal to use a sensitive attribute (such as gender or race) in machine learning systems, the question remains how to do so. We show that the naive application of machine learning algorithms using sensitive features leads to an inherent tradeoff in accuracy between groups. We provide a simple and efficient decoupling technique that can be added on top of any black-box machin…

    Submitted 20 July, 2017; originally announced July 2017.

  34. arXiv:1706.08160  [pdf, other]

    cs.CL cs.LG

    Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context

    Authors: Shyam Upadhyay, Kai-Wei Chang, Matt Taddy, Adam Kalai, James Zou

    Abstract: Word embeddings, which represent a word as a point in a vector space, have become ubiquitous in several NLP tasks. A recent line of work uses bilingual (two languages) corpora to learn a different vector for each sense of a word, by exploiting crosslingual signals to aid sense identification. We present a multi-view Bayesian non-parametric algorithm which improves multi-sense word embeddings by (a…

    Submitted 25 June, 2017; originally announced June 2017.

    Comments: ACL 2017 Repl4NLP workshop

  35. arXiv:1612.09030  [pdf, other]

    cs.LG cs.AI cs.CV

    Meta-Unsupervised-Learning: A supervised approach to unsupervised learning

    Authors: Vikas K. Garg, Adam Tauman Kalai

    Abstract: We introduce a new paradigm to investigate unsupervised learning, reducing unsupervised learning to supervised learning. Specifically, we mitigate the subjectivity in unsupervised decision-making by leveraging knowledge acquired from prior, possibly heterogeneous, supervised learning tasks. We demonstrate the versatility of our framework via comprehensive expositions and detailed experiments on se…

    Submitted 3 January, 2017; v1 submitted 28 December, 2016; originally announced December 2016.

    Comments: 22 pages

  36. arXiv:1607.06520  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

    Authors: Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, Adam Kalai

    Abstract: The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing…

    Submitted 21 July, 2016; originally announced July 2016.

  37. arXiv:1606.06121  [pdf, other]

    cs.CL cs.LG stat.ML

    Quantifying and Reducing Stereotypes in Word Embeddings

    Authors: Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, Adam Kalai

    Abstract: Machine learning algorithms are optimized to model statistical properties of the training data. If the input data reflects stereotypes and biases of the broader society, then the output of the learning algorithm also captures these stereotypes. In this paper, we initiate the study of gender stereotypes in word embedding, a popular framework to represent text data. As their use becomes increa…

    Submitted 20 June, 2016; originally announced June 2016.

    Comments: presented at 2016 ICML Workshop on #Data4Good: Machine Learning in Social Good Applications, New York, NY

  38. arXiv:1504.00064  [pdf, other]

    stat.ML cs.LG

    Crowdsourcing Feature Discovery via Adaptively Chosen Comparisons

    Authors: James Y. Zou, Kamalika Chaudhuri, Adam Tauman Kalai

    Abstract: We introduce an unsupervised approach to efficiently discover the underlying features in a data set via crowdsourcing. Our queries ask crowd members to articulate a feature common to two out of three displayed examples. In addition we also ask the crowd to provide binary labels to the remaining examples based on the discovered features. The triples are chosen adaptively based on the labels of the…

    Submitted 31 March, 2015; originally announced April 2015.

  39. arXiv:1302.4297  [pdf, other]

    cs.LG stat.ML

    Feature Multi-Selection among Subjective Features

    Authors: Sivan Sabato, Adam Kalai

    Abstract: When dealing with subjective, noisy, or otherwise nebulous features, the "wisdom of crowds" suggests that one may benefit from multiple judgments of the same feature on the same object. We give theoretically-motivated 'feature multi-selection' algorithms that choose, among a large set of candidate features, not only which features to judge but how many times to judge each one. We demonstrate the e…

    Submitted 14 May, 2013; v1 submitted 18 February, 2013; originally announced February 2013.

    Journal ref: S. Sabato and A. Kalai, "Feature Multi-Selection among Subjective Features", Proceedings of the 30th International Conference on Machine Learning (ICML), 2013

  40. arXiv:1209.3811  [pdf, other]

    cs.AI

    Textual Features for Programming by Example

    Authors: Aditya Krishna Menon, Omer Tamuz, Sumit Gulwani, Butler Lampson, Adam Tauman Kalai

    Abstract: In Programming by Example, a system attempts to infer a program from input and output examples, generally by searching for a composition of certain base functions. Performing a naive brute force search is infeasible for even mildly involved tasks. We note that the examples themselves often present clues as to which functions to compose, and how to rank the resulting programs. In text processing, w…

    Submitted 17 September, 2012; originally announced September 2012.

  41. arXiv:1105.1033  [pdf, other]

    cs.LG

    Adaptively Learning the Crowd Kernel

    Authors: Omer Tamuz, Ce Liu, Serge Belongie, Ohad Shamir, Adam Tauman Kalai

    Abstract: We introduce an algorithm that, given n objects, learns a similarity matrix over all n^2 pairs, from crowdsourced data alone. The algorithm samples responses to adaptively chosen triplet-based relative-similarity queries. Each query has the form "is object 'a' more similar to 'b' or to 'c'?" and is chosen to be maximally informative given the preceding responses. The output is an embedding of the…

    Submitted 25 June, 2011; v1 submitted 5 May, 2011; originally announced May 2011.

    Comments: 9 pages, 7 figures, Accepted to the 28th International Conference on Machine Learning (ICML), 2011

    Journal ref: The 28th International Conference on Machine Learning, 2011

  42. arXiv:1104.2018  [pdf, other]

    cs.AI cs.LG stat.ML

    Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression

    Authors: Sham Kakade, Adam Tauman Kalai, Varun Kanade, Ohad Shamir

    Abstract: Generalized Linear Models (GLMs) and Single Index Models (SIMs) provide powerful generalizations of linear regression, where the target variable is assumed to be a (possibly unknown) 1-dimensional function of a linear predictor. In general, these problems entail non-convex estimation procedures, and, in practice, iterative local search heuristics are often used. Kalai and Sastry (2009) recently pr…

    Submitted 11 April, 2011; originally announced April 2011.

  43. arXiv:1101.2883  [pdf, ps, other]

    cs.GT cs.DS

    Dueling Algorithms

    Authors: Nicole Immorlica, Adam Tauman Kalai, Brendan Lucier, Ankur Moitra, Andrew Postlewaite, Moshe Tennenholtz

    Abstract: We revisit classic algorithmic search and optimization problems from the perspective of competition. Rather than a single optimizer minimizing expected cost, we consider a zero-sum game in which an optimization problem is presented to two players, whose only goal is to outperform the opponent. Such games are typically exponentially large zero-sum games, but they often have a rich combinatorial str…

    Submitted 14 January, 2011; originally announced January 2011.

    Comments: 26 pages

  44. arXiv:0812.0933  [pdf, ps, other]

    cs.LG cs.CC

    Decision trees are PAC-learnable from most product distributions: a smoothed analysis

    Authors: Adam Tauman Kalai, Shang-Hua Teng

    Abstract: We consider the problem of PAC-learning decision trees, i.e., learning a decision tree over the n-dimensional hypercube from independent random labeled examples. Despite significant effort, no polynomial-time algorithm is known for learning polynomial-sized decision trees (even trees of any super-constant size), even when examples are assumed to be drawn from the uniform distribution on {0,1}^n.…

    Submitted 4 December, 2008; originally announced December 2008.

  45. arXiv:cs/0408007  [pdf, ps, other]

    cs.LG cs.CC

    Online convex optimization in the bandit setting: gradient descent without a gradient

    Authors: Abraham D. Flaxman, Adam Tauman Kalai, H. Brendan McMahan

    Abstract: We consider the general online convex optimization framework introduced by Zinkevich. In this setting, there is a sequence of convex functions. Each period, we must choose a single point (from some feasible set) and pay a cost equal to the value of the next function on our chosen point. Zinkevich shows that, if each function is revealed after the choice is made, then one can achieve vanish…

    Submitted 2 August, 2004; originally announced August 2004.

    Comments: 12 pages

  46. arXiv:cs/0010022  [pdf, ps, other]

    cs.LG cs.AI cs.DS

    Noise-Tolerant Learning, the Parity Problem, and the Statistical Query Model

    Authors: Avrim Blum, Adam Kalai, Hal Wasserman

    Abstract: We describe a slightly sub-exponential time algorithm for learning parity functions in the presence of random classification noise. This results in a polynomial-time algorithm for the case of parity functions that depend on only the first O(log n log log n) bits of input. This is the first known instance of an efficient noise-tolerant algorithm for a concept class that is provably not learnable…

    Submitted 15 October, 2000; originally announced October 2000.

    ACM Class: I.2.6