Showing 1–13 of 13 results for author: Khani, F

Searching in archive cs.
  1. arXiv:2403.06134  [pdf]

    cs.DC

    A Two-Level Thermal Cycling-aware Task Mapping Technique for Reliability Management in Manycore Systems

    Authors: Fatemeh Hossein Khani, Omid Akbari, Muhammad Shafique

    Abstract: Reliability management is one of the primary concerns in manycore systems design. Different aging mechanisms such as Negative-Bias Temperature Instability (NBTI), Electromigration (EM), and thermal cycling can reduce the reliability of these systems. However, state-of-the-art works mainly focused on NBTI and EM, whereas a few works have considered the thermal cycling effect. The thermal cycling ef…

    Submitted 10 March, 2024; originally announced March 2024.

  2. arXiv:2311.05661  [pdf, other]

    cs.CL cs.AI cs.LG

    Prompt Engineering a Prompt Engineer

    Authors: Qinyuan Ye, Maxamed Axmed, Reid Pryzant, Fereshte Khani

    Abstract: Prompt engineering is a challenging yet crucial task for optimizing the performance of large language models on customized tasks. It requires complex reasoning to examine the model's errors, hypothesize what is missing or misleading in the current prompt, and communicate the task with clarity. While recent works indicate that large language models can be meta-prompted to perform automatic prompt e…

    Submitted 2 July, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: Accepted to ACL 2024 Findings. Camera-ready version

  3. arXiv:2305.17804  [pdf, other]

    cs.CL

    Targeted Data Generation: Finding and Fixing Model Weaknesses

    Authors: Zexue He, Marco Tulio Ribeiro, Fereshte Khani

    Abstract: Even when aggregate accuracy is high, state-of-the-art NLP models often fail systematically on specific subgroups of data, resulting in unfair outcomes and eroding user trust. Additional data collection may not help in addressing these weaknesses, as such challenging subgroups may be unknown to users, and underrepresented in the existing and new data. We propose Targeted Data Generation (TDG), a f…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  4. arXiv:2305.12219  [pdf, other]

    cs.LG cs.AI cs.CL

    Collaborative Development of NLP models

    Authors: Fereshte Khani, Marco Tulio Ribeiro

    Abstract: Despite substantial advancements, Natural Language Processing (NLP) models often require post-training adjustments to enforce business rules, rectify undesired behavior, and align with user values. These adjustments involve operationalizing "concepts"--dictating desired model responses to certain inputs. However, it's difficult for a single entity to enumerate and define all possible concepts, ind…

    Submitted 24 May, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

  5. arXiv:2210.00055  [pdf, other]

    cs.LG cs.CV

    MaskTune: Mitigating Spurious Correlations by Forcing to Explore

    Authors: Saeid Asgari Taghanaki, Aliasghar Khani, Fereshte Khani, Ali Gholami, Linh Tran, Ali Mahdavi-Amiri, Ghassan Hamarneh

    Abstract: A fundamental challenge of over-parameterized deep learning models is learning meaningful data representations that yield good performance on a downstream task without over-fitting spurious input features. This work proposes MaskTune, a masking strategy that prevents over-reliance on spurious (or a limited number of) features. MaskTune forces the trained model to explore new features during a sing…

    Submitted 8 October, 2022; v1 submitted 30 September, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022

  6. arXiv:2207.01548  [pdf, other]

    cs.LG cs.CV

    Counterbalancing Teacher: Regularizing Batch Normalized Models for Robustness

    Authors: Saeid Asgari Taghanaki, Ali Gholami, Fereshte Khani, Kristy Choi, Linh Tran, Ran Zhang, Aliasghar Khani

    Abstract: Batch normalization (BN) is a ubiquitous technique for training deep neural networks that accelerates their convergence to reach higher accuracy. However, we demonstrate that BN comes with a fundamental drawback: it incentivizes the model to rely on low-variance features that are highly specific to the training (in-domain) data, hurting generalization performance on out-of-domain examples. In this…

    Submitted 4 July, 2022; originally announced July 2022.

  7. arXiv:2108.07258  [pdf, other]

    cs.LG cs.AI cs.CY

    On the Opportunities and Risks of Foundation Models

    Authors: Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, et al. (89 additional authors not shown)

    Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their cap…

    Submitted 12 July, 2022; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Report page with citation guidelines: https://crfm.stanford.edu/report.html

  8. arXiv:2012.04550  [pdf, other]

    cs.LG stat.ML

    In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness

    Authors: Sang Michael Xie, Ananya Kumar, Robbie Jones, Fereshte Khani, Tengyu Ma, Percy Liang

    Abstract: Consider a prediction setting with few in-distribution labeled examples and many unlabeled examples both in- and out-of-distribution (OOD). The goal is to learn a model which performs well both in-distribution and OOD. In these settings, auxiliary information is often cheaply available for every input. How should we best leverage this auxiliary information for the prediction task? Empirically acro…

    Submitted 7 April, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: ICLR 2021

  9. arXiv:2012.04104  [pdf, other]

    cs.LG cs.AI cs.CY stat.ML

    Removing Spurious Features can Hurt Accuracy and Affect Groups Disproportionately

    Authors: Fereshte Khani, Percy Liang

    Abstract: The presence of spurious features interferes with the goal of obtaining robust models that perform well across many groups within the population. A natural remedy is to remove spurious features from the model. However, in this work we show that removal of spurious features can decrease accuracy due to the inductive biases of overparameterized models. We completely characterize how the removal of s…

    Submitted 7 December, 2020; originally announced December 2020.

  10. arXiv:1911.09876  [pdf, other]

    cs.LG stat.ML

    Feature Noise Induces Loss Discrepancy Across Groups

    Authors: Fereshte Khani, Percy Liang

    Abstract: The performance of standard learning procedures has been observed to differ widely across groups. Recent studies usually attribute this loss discrepancy to an information deficiency for one group (e.g., one group has less data). In this work, we point to a more subtle source of loss discrepancy---feature noise. Our main result is that even when there is no information deficiency specific to one gr…

    Submitted 5 November, 2020; v1 submitted 22 November, 2019; originally announced November 2019.

    Comments: ICML 2020

  11. arXiv:1906.03518  [pdf, other]

    cs.LG stat.ML

    Maximum Weighted Loss Discrepancy

    Authors: Fereshte Khani, Aditi Raghunathan, Percy Liang

    Abstract: Though machine learning algorithms excel at minimizing the average loss over a population, this might lead to large discrepancies between the losses across groups within the population. To capture this inequality, we introduce and study a notion we call maximum weighted loss discrepancy (MWLD), the maximum (weighted) difference between the loss of a group and the loss of the population. We relate…

    Submitted 8 June, 2019; originally announced June 2019.

    Comments: ICLR 2019 Workshop. Safe Machine Learning: Specification, Robustness, and Assurance

  12. arXiv:1805.11774  [pdf, other]

    cs.CL

    Planning, Inference and Pragmatics in Sequential Language Games

    Authors: Fereshte Khani, Noah D. Goodman, Percy Liang

    Abstract: We study sequential language games in which two players, each with private information, communicate to achieve a common goal. In such games, a successful player must (i) infer the partner's private information from the partner's messages, (ii) generate messages that are most likely to help with the goal, and (iii) reason pragmatically about the partner's strategy. We propose a model that captures…

    Submitted 29 May, 2018; originally announced May 2018.

    Comments: In proceedings of TACL 2018

  13. arXiv:1606.06368  [pdf, other]

    cs.LG cs.AI cs.CL

    Unanimous Prediction for 100% Precision with Application to Learning Semantic Mappings

    Authors: Fereshte Khani, Martin Rinard, Percy Liang

    Abstract: Can we train a system that, on any new input, either says "don't know" or makes a prediction that is guaranteed to be correct? We answer the question in the affirmative provided our model family is well-specified. Specifically, we introduce the unanimity principle: only predict when all models consistent with the training data predict the same output. We operationalize this principle for semantic…

    Submitted 23 June, 2016; v1 submitted 20 June, 2016; originally announced June 2016.

    Comments: ACL 2016, Removed the duplicate author name of the previous version

    ACM Class: I.2.7; I.2.6