
Showing 1–16 of 16 results for author: Tseng, A

Searching in archive cs.
  1. arXiv:2406.11235  [pdf, other]

    cs.LG

    QTIP: Quantization with Trellises and Incoherence Processing

    Authors: Albert Tseng, Qingyao Sun, David Hou, Christopher De Sa

    Abstract: Post-training quantization (PTQ) reduces the memory footprint of LLMs by quantizing weights to low-precision datatypes. Since LLM inference is usually memory-bound, PTQ methods can improve inference throughput. Recent state-of-the-art PTQ approaches have converged on using vector quantization (VQ) to quantize multiple weights at once, which improves information utilization through better shaping. …

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2402.16359  [pdf, other]

    cs.LG cs.AI q-bio.QM stat.ML

    Feedback Efficient Online Fine-Tuning of Diffusion Models

    Authors: Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M Tseng, Sergey Levine, Tommaso Biancalani

    Abstract: Diffusion models excel at modeling complex data distributions, including those of images, proteins, and small molecules. However, in many cases, our goal is to model parts of the distribution that maximize certain properties: for example, we may want to generate images with high aesthetic quality, or molecules with high bioactivity. It is natural to frame this as a reinforcement learning (RL) prob…

    Submitted 27 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Under review (codes will be released soon)

  3. arXiv:2402.15194  [pdf, other]

    cs.LG cs.AI stat.ML

    Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control

    Authors: Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M Tseng, Tommaso Biancalani, Sergey Levine

    Abstract: Diffusion models excel at capturing complex data distributions, such as those of natural images and proteins. While diffusion models are trained to represent the distribution in the training dataset, we often are more concerned with other properties, such as the aesthetic quality of the generated images or the functional properties of generated proteins. Diffusion models can be finetuned in a goal…

    Submitted 28 February, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Under review (codes will be released soon)

  4. arXiv:2402.04396  [pdf, other]

    cs.LG cs.AI cs.CL

    QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks

    Authors: Albert Tseng, Jerry Chee, Qingyao Sun, Volodymyr Kuleshov, Christopher De Sa

    Abstract: Post-training quantization (PTQ) reduces the memory footprint of LLMs by quantizing their weights to low-precision. In this work, we introduce QuIP#, a weight-only PTQ method that achieves state-of-the-art results in extreme compression regimes ($\le$ 4 bits per weight) using three novel techniques. First, QuIP# improves QuIP's (Chee et al., 2023) incoherence processing by using the randomized Had…

    Submitted 4 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  5. arXiv:2306.02957  [pdf, other]

    cs.LG stat.ML

    Complex Preferences for Different Convergent Priors in Discrete Graph Diffusion

    Authors: Alex M. Tseng, Nathaniel Diamant, Tommaso Biancalani, Gabriele Scalia

    Abstract: Diffusion models have achieved state-of-the-art performance in generating many different kinds of data, including images, text, and videos. Despite their success, there has been limited research on how the underlying diffusion process and the final convergent prior can affect generative performance; this research has also been limited to continuous data types and a score-based diffusion framework.…

    Submitted 21 June, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

  6. arXiv:2306.00392  [pdf, other]

    cs.LG

    Coneheads: Hierarchy Aware Attention

    Authors: Albert Tseng, Tao Yu, Toni J. B. Liu, Christopher De Sa

    Abstract: Attention networks such as transformers have achieved state-of-the-art performance in many domains. These networks rely heavily on the dot product attention operator, which computes the similarity between two points by taking their inner product. However, the inner product does not explicitly model the complex structural properties of real world datasets, such as hierarchies between data points. T…

    Submitted 3 December, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023

  7. arXiv:2305.19800  [pdf, other]

    q-bio.BM cs.LG

    RINGER: Rapid Conformer Generation for Macrocycles with Sequence-Conditioned Internal Coordinate Diffusion

    Authors: Colin A. Grambow, Hayley Weir, Nathaniel L. Diamant, Alex M. Tseng, Tommaso Biancalani, Gabriele Scalia, Kangway V. Chuang

    Abstract: Macrocyclic peptides are an emerging therapeutic modality, yet computational approaches for accurately sampling their diverse 3D ensembles remain challenging due to their conformational diversity and geometric constraints. Here, we introduce RINGER, a diffusion-based transformer model for sequence-conditioned generation of macrocycle structures based on internal coordinates. RINGER provides fast b…

    Submitted 30 May, 2023; originally announced May 2023.

  8. arXiv:2305.15215  [pdf, other]

    cs.LG

    Shadow Cones: A Generalized Framework for Partial Order Embeddings

    Authors: Tao Yu, Toni J. B. Liu, Albert Tseng, Christopher De Sa

    Abstract: Hyperbolic space has proven to be well-suited for capturing hierarchical relations in data, such as trees and directed acyclic graphs. Prior work introduced the concept of entailment cones, which uses partial orders defined by nested cones in the Poincaré ball to model hierarchies. Here, we introduce the "shadow cones" framework, a physics-inspired entailment cone construction. Specifically, we m…

    Submitted 8 April, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: ICLR 2024

  9. arXiv:2302.03790  [pdf, other]

    cs.LG

    GraphGUIDE: interpretable and controllable conditional graph generation with discrete Bernoulli diffusion

    Authors: Alex M. Tseng, Nathaniel Diamant, Tommaso Biancalani, Gabriele Scalia

    Abstract: Diffusion models achieve state-of-the-art performance in generating realistic objects and have been successfully applied to images, text, and videos. Recent work has shown that diffusion can also be defined on graphs, including graph representations of drug-like molecules. Unfortunately, it remains difficult to perform conditional generation on graphs in a way which is interpretable and controllab…

    Submitted 7 February, 2023; originally announced February 2023.

  10. arXiv:2301.10857  [pdf, other]

    cs.LG cs.SI

    Improving Graph Generation by Restricting Graph Bandwidth

    Authors: Nathaniel Diamant, Alex M. Tseng, Kangway V. Chuang, Tommaso Biancalani, Gabriele Scalia

    Abstract: Deep graph generative modeling has proven capable of learning the distribution of complex, multi-scale structures characterizing real-world graphs. However, one of the main limitations of existing methods is their large output space, which limits generation scalability and hinders accurate modeling of the underlying distribution. To overcome these limitations, we propose a novel approach that sign…

    Submitted 30 May, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: Accepted at ICML 2023

  11. arXiv:2212.10777  [pdf, other]

    cs.LG cs.AI

    Hierarchically branched diffusion models leverage dataset structure for class-conditional generation

    Authors: Alex M. Tseng, Max Shen, Tommaso Biancalani, Gabriele Scalia

    Abstract: Class-labeled datasets, particularly those common in scientific domains, are rife with internal structure, yet current class-conditional diffusion models ignore these relationships and implicitly diffuse on all classes in a flat fashion. To leverage this structure, we propose hierarchically branched diffusion models as a novel framework for class-conditional generation. Branched diffusion models r…

    Submitted 1 February, 2024; v1 submitted 21 December, 2022; originally announced December 2022.

  12. arXiv:2210.12192  [pdf, other]

    cs.LG

    Conditional Diffusion with Less Explicit Guidance via Model Predictive Control

    Authors: Max W. Shen, Ehsan Hajiramezanali, Gabriele Scalia, Alex Tseng, Nathaniel Diamant, Tommaso Biancalani, Andreas Loukas

    Abstract: How much explicit guidance is necessary for conditional diffusion? We consider the problem of conditional sampling using an unconditional diffusion model and limited explicit guidance (e.g., a noised classifier, or a conditional diffusion model) that is restricted to a small number of time steps. We explore a model predictive control (MPC)-like approach to approximate guidance by simulating uncond…

    Submitted 21 October, 2022; originally announced October 2022.

  13. arXiv:2111.15186  [pdf, other]

    cs.LG cs.CV

    Automatic Synthesis of Diverse Weak Supervision Sources for Behavior Analysis

    Authors: Albert Tseng, Jennifer J. Sun, Yisong Yue

    Abstract: Obtaining annotations for large training sets is expensive, especially in settings where domain knowledge is required, such as behavior analysis. Weak supervision has been studied to reduce annotation costs by using weak labels from task-specific labeling functions (LFs) to augment ground truth labels. However, domain experts still need to hand-craft different LFs for different tasks, limiting sca…

    Submitted 11 May, 2022; v1 submitted 30 November, 2021; originally announced November 2021.

    Comments: 8 pages, to appear at CVPR 2022

  14. arXiv:2005.10284  [pdf, other]

    cs.LG cs.AI stat.ML

    An Adversarial Approach for Explaining the Predictions of Deep Neural Networks

    Authors: Arash Rahnama, Andrew Tseng

    Abstract: Machine learning models have been successfully applied to a wide range of applications including computer vision, natural language processing, and speech recognition. A successful implementation of these models however, usually relies on deep neural networks (DNNs) which are treated as opaque black-box systems due to their incomprehensible complexity and intricate internal mechanism. In this work,…

    Submitted 28 September, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

  15. arXiv:1910.01179  [pdf, other]

    cs.LG stat.ML

    Learning Calibratable Policies using Programmatic Style-Consistency

    Authors: Eric Zhan, Albert Tseng, Yisong Yue, Adith Swaminathan, Matthew Hausknecht

    Abstract: We study the problem of controllable generation of long-term sequential behaviors, where the goal is to calibrate to multiple behavior styles simultaneously. In contrast to the well-studied areas of controllable generation of images, text, and speech, there are two questions that pose significant challenges when generating long-term behaviors: how should we specify the factors of variation to cont…

    Submitted 16 July, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

  16. arXiv:1802.03358  [pdf, other]

    cs.LG cs.CR stat.ML

    Deep Learning for Malicious Flow Detection

    Authors: Yun-Chun Chen, Yu-Jhe Li, Aragorn Tseng, Tsungnan Lin

    Abstract: Cyber security has grown up to be a hot issue in recent years. How to identify potential malware becomes a challenging task. To tackle this challenge, we adopt deep learning approaches and perform flow detection on real data. However, real data often encounters an issue of imbalanced data distribution which will lead to a gradient dilution issue. When training a neural network, this problem will n…

    Submitted 9 February, 2018; originally announced February 2018.

    Comments: 7 pages, 5 figures, 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC) (IEEE PIMRC 2017 Track 1)