Showing 1–9 of 9 results for author: Chee, J

  1. arXiv:2406.09882

    cs.IR cs.CY cs.LG

    Harm Mitigation in Recommender Systems under User Preference Dynamics

    Authors: Jerry Chee, Shankar Kalyanaraman, Sindhu Kiranmai Ernala, Udi Weinsberg, Sarah Dean, Stratis Ioannidis

    Abstract: We consider a recommender system that takes into account the interplay between recommendations, the evolution of user interests, and harmful content. We model the impact of recommendations on user behavior, particularly the tendency to consume harmful content. We seek recommendation policies that establish a tradeoff between maximizing click-through rate (CTR) and mitigating harm. We establish con…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Recommender Systems; Harm Mitigation; Amplification; User Preference Modeling

  2. arXiv:2402.04396

    cs.LG cs.AI cs.CL

    QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks

    Authors: Albert Tseng, Jerry Chee, Qingyao Sun, Volodymyr Kuleshov, Christopher De Sa

    Abstract: Post-training quantization (PTQ) reduces the memory footprint of LLMs by quantizing their weights to low-precision. In this work, we introduce QuIP#, a weight-only PTQ method that achieves state-of-the-art results in extreme compression regimes ($\le$ 4 bits per weight) using three novel techniques. First, QuIP# improves QuIP's (Chee et al., 2023) incoherence processing by using the randomized Had…

    Submitted 4 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  3. arXiv:2401.15221

    cs.HC

    Designing and Testing a Mobile Application for Collecting WhatsApp Chat Data While Preserving Privacy

    Authors: Brennan Schaffner, Archie Brohn, Jason Chee, K. J. Feng, Marshini Chetty

    Abstract: It is common practice for researchers to join public WhatsApp chats and scrape their contents for analysis. However, research shows collecting data this way contradicts user expectations and preferences, even if the data is effectively public. To overcome these issues, we outline design considerations for collecting WhatsApp chat data with improved user privacy by heightening user control and over…

    Submitted 26 January, 2024; originally announced January 2024.

  4. arXiv:2307.13304

    cs.LG cs.CL

    QuIP: 2-Bit Quantization of Large Language Models With Guarantees

    Authors: Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, Christopher De Sa

    Abstract: This work studies post-training parameter quantization in large language models (LLMs). We introduce quantization with incoherence processing (QuIP), a new method based on the insight that quantization benefits from $\textit{incoherent}$ weight and Hessian matrices, i.e., from the weights being even in magnitude and the directions in which it is important to round them accurately being unaligned w…

    Submitted 15 January, 2024; v1 submitted 25 July, 2023; originally announced July 2023.

  5. arXiv:2110.04378

    eess.AS cs.LG cs.SD

    Performance optimizations on deep noise suppression models

    Authors: Jerry Chee, Sebastian Braun, Vishak Gopal, Ross Cutler

    Abstract: We study the role of magnitude structured pruning as an architecture search to speed up the inference time of a deep noise suppression (DNS) model. While deep learning approaches have been remarkably successful in enhancing audio quality, their increased complexity inhibits their deployment in real-time applications. We achieve up to a 7.25X inference speedup over the baseline, with a smooth model…

    Submitted 8 October, 2021; originally announced October 2021.

  6. arXiv:2108.00065

    cs.LG

    Model Preserving Compression for Neural Networks

    Authors: Jerry Chee, Megan Renz, Anil Damle, Christopher De Sa

    Abstract: After training complex deep learning models, a common task is to compress the model to reduce compute and storage demands. When compressing, it is desirable to preserve the original model's per-example decisions (e.g., to go beyond top-1 accuracy or preserve robustness), maintain the network's structure, automatically determine per-layer compression levels, and eliminate the need for fine tuning.…

    Submitted 14 October, 2022; v1 submitted 30 July, 2021; originally announced August 2021.

    Comments: 26 pages, 15 figures. To be published in Advances in Neural Information Processing Systems 35

    MSC Class: 68W99; 65F55

  7. arXiv:2106.09686

    cs.LG cs.AI

    How Low Can We Go: Trading Memory for Error in Low-Precision Training

    Authors: Chengrun Yang, Ziyang Wu, Jerry Chee, Christopher De Sa, Madeleine Udell

    Abstract: Low-precision arithmetic trains deep learning models using less energy, less memory and less time. However, we pay a price for the savings: lower precision may yield larger round-off error and hence larger prediction error. As applications proliferate, users must choose which precision to use to train a new model, and chip manufacturers must decide which precisions to manufacture. We view these pr…

    Submitted 17 March, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

    Comments: ICLR 2022

  8. arXiv:2008.12224

    cs.LG stat.ML

    Understanding and Detecting Convergence for Stochastic Gradient Descent with Momentum

    Authors: Jerry Chee, Ping Li

    Abstract: Convergence detection of iterative stochastic optimization methods is of great practical interest. This paper considers stochastic gradient descent (SGD) with a constant learning rate and momentum. We show that there exists a transient phase in which iterates move towards a region of interest, and a stationary phase in which iterates remain bounded in that region around a minimum point. We constru…

    Submitted 27 August, 2020; originally announced August 2020.

  9. arXiv:1710.06382

    stat.ML cs.LG math.ST stat.CO

    Convergence diagnostics for stochastic gradient descent with constant step size

    Authors: Jerry Chee, Panos Toulis

    Abstract: Many iterative procedures in stochastic optimization exhibit a transient phase followed by a stationary phase. During the transient phase the procedure converges towards a region of interest, and during the stationary phase the procedure oscillates in that region, commonly around a single point. In this paper, we develop a statistical diagnostic test to detect such phase transition in the context…

    Submitted 22 February, 2018; v1 submitted 17 October, 2017; originally announced October 2017.

    Comments: Accepted to Artificial Intelligence and Statistics, 2018