
Showing 1–18 of 18 results for author: Isik, B

  1. arXiv:2406.16797  [pdf, other]

    cs.CL cs.AI

    Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs

    Authors: Ashwinee Panda, Berivan Isik, Xiangyu Qi, Sanmi Koyejo, Tsachy Weissman, Prateek Mittal

    Abstract: Existing methods for adapting large language models (LLMs) to new tasks are not suited to multi-task adaptation because they modify all the model weights -- causing destructive interference between tasks. The resulting effects, such as catastrophic forgetting of earlier tasks, make it challenging to obtain good performance on multiple tasks at the same time. To mitigate this, we propose Lottery Ti…

    Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2406.09366  [pdf, other]

    cs.LG cs.CV q-bio.NC

    Towards an Improved Understanding and Utilization of Maximum Manifold Capacity Representations

    Authors: Rylan Schaeffer, Victor Lecomte, Dhruv Bhandarkar Pai, Andres Carranza, Berivan Isik, Alyssa Unell, Mikail Khona, Thomas Yerxa, Yann LeCun, SueYeon Chung, Andrey Gromov, Ravid Shwartz-Ziv, Sanmi Koyejo

    Abstract: Maximum Manifold Capacity Representations (MMCR) is a recent multi-view self-supervised learning (MVSSL) method that matches or surpasses other leading MVSSL methods. MMCR is intriguing because it does not fit neatly into any of the commonplace MVSSL lineages, instead originating from a statistical mechanical perspective on the linear separability of data manifolds. In this paper, we seek to impro…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2405.17512  [pdf, other]

    cs.LG cs.AI cs.CY

    On Fairness of Low-Rank Adaptation of Large Models

    Authors: Zhoujie Ding, Ken Ziyu Liu, Pura Peetathawatchai, Berivan Isik, Sanmi Koyejo

    Abstract: Low-rank adaptation of large models, particularly LoRA, has gained traction due to its computational efficiency. This efficiency, contrasted with the prohibitive costs of full-model fine-tuning, means that practitioners often turn to LoRA and sometimes without a complete understanding of its ramifications. In this study, we focus on fairness and ask whether LoRA has an unexamined impact on utility…

    Submitted 27 May, 2024; originally announced May 2024.

  4. arXiv:2405.02341  [pdf, other]

    cs.CR cs.LG

    Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy

    Authors: Wei-Ning Chen, Berivan Isik, Peter Kairouz, Albert No, Sewoong Oh, Zheng Xu

    Abstract: We study $L_2$ mean estimation under central differential privacy and communication constraints, and address two key challenges: firstly, existing mean estimation schemes that simultaneously handle both constraints are usually optimized for $L_\infty$ geometry and rely on random rotation or Kashin's representation to adapt to $L_2$ geometry, resulting in suboptimal leading constants in mean square…

    Submitted 1 May, 2024; originally announced May 2024.

  5. arXiv:2402.05887  [pdf, other]

    eess.IV cs.MM

    Sandwiched Compression: Repurposing Standard Codecs with Neural Network Wrappers

    Authors: Onur G. Guleryuz, Philip A. Chou, Berivan Isik, Hugues Hoppe, Danhang Tang, Ruofei Du, Jonathan Taylor, Philip Davidson, Sean Fanello

    Abstract: We propose sandwiching standard image and video codecs between pre- and post-processing neural networks. The networks are jointly trained through a differentiable codec proxy to minimize a given rate-distortion loss. This sandwich architecture not only improves the standard codec's performance on its intended content, it can effectively adapt the codec to other types of image/video content and to…

    Submitted 8 February, 2024; originally announced February 2024.

  6. arXiv:2402.04177  [pdf, other]

    cs.CL cs.LG stat.ML

    Scaling Laws for Downstream Task Performance of Large Language Models

    Authors: Berivan Isik, Natalia Ponomareva, Hussein Hazimeh, Dimitris Paparas, Sergei Vassilvitskii, Sanmi Koyejo

    Abstract: Scaling laws provide important insights that can guide the design of large language models (LLMs). Existing work has primarily focused on studying scaling laws for pretraining (upstream) loss. However, in transfer learning settings, in which LLMs are pretrained on an unsupervised dataset and then finetuned on a downstream task, we often also care about the downstream performance. In this work, we…

    Submitted 6 February, 2024; originally announced February 2024.

  7. arXiv:2306.12625  [pdf, other]

    cs.LG cs.DC stat.ML

    Adaptive Compression in Federated Learning via Side Information

    Authors: Berivan Isik, Francesco Pase, Deniz Gunduz, Sanmi Koyejo, Tsachy Weissman, Michele Zorzi

    Abstract: The high communication cost of sending model updates from the clients to the server is a significant bottleneck for scalable federated learning (FL). Among existing approaches, state-of-the-art bitrate-accuracy tradeoffs have been achieved using stochastic compression methods -- in which the client $n$ sends a sample from a client-only probability distribution $q_{φ^{(n)}}$, and the server estimat…

    Submitted 21 April, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: Published at the International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

  8. arXiv:2306.04924  [pdf, other]

    cs.LG cs.CR cs.DC cs.IT stat.ML

    Exact Optimality of Communication-Privacy-Utility Tradeoffs in Distributed Mean Estimation

    Authors: Berivan Isik, Wei-Ning Chen, Ayfer Ozgur, Tsachy Weissman, Albert No

    Abstract: We study the mean estimation problem under communication and local differential privacy constraints. While previous work has proposed \emph{order}-optimal algorithms for the same problem (i.e., asymptotically optimal as we spend more bits), \emph{exact} optimality (in the non-asymptotic setting) still has not been achieved. In this work, we take a step towards characterizing the \emph{exact}-optim…

    Submitted 28 October, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: Published at the Conference on Neural Information Processing Systems (NeurIPS), 2023

  9. arXiv:2303.11473  [pdf, other]

    eess.IV cs.LG cs.MM

    Sandwiched Video Compression: Efficiently Extending the Reach of Standard Codecs with Neural Wrappers

    Authors: Berivan Isik, Onur G. Guleryuz, Danhang Tang, Jonathan Taylor, Philip A. Chou

    Abstract: We propose sandwiched video compression -- a video compression system that wraps neural networks around a standard video codec. The sandwich framework consists of a neural pre- and post-processor with a standard video codec between them. The networks are trained jointly to optimize a rate-distortion loss function with the goal of significantly improving over the standard codec in various compressi…

    Submitted 5 July, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: Published at the International Conference on Image Processing (ICIP), 2023

  10. arXiv:2210.07437  [pdf, other]

    cs.IT

    Upper bounds on the Rate of Uniformly-Random Codes for the Deletion Channel

    Authors: Berivan Isik, Francisco Pernice, Tsachy Weissman

    Abstract: We consider the maximum coding rate achievable by uniformly-random codes for the deletion channel. We prove an upper bound that's within 0.1 of the best known lower bounds for all values of the deletion probability $d,$ and much closer for small and large $d.$ We give simulation results which suggest that our upper bound is within 0.05 of the exact value for all $d$, and within $0.01$ for…

    Submitted 13 October, 2022; originally announced October 2022.

  11. arXiv:2209.15328  [pdf, other]

    cs.LG stat.AP stat.ML

    Sparse Random Networks for Communication-Efficient Federated Learning

    Authors: Berivan Isik, Francesco Pase, Deniz Gunduz, Tsachy Weissman, Michele Zorzi

    Abstract: One main challenge in federated learning is the large communication cost of exchanging weight updates from clients to the server at each round. While prior work has made great progress in compressing the weight updates through gradient compression methods, we propose a radically different approach that does not update the weights at all. Instead, our method freezes the weights at their initial \em…

    Submitted 8 February, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: Published at the International Conference on Learning Representations (ICLR) 2023

  12. arXiv:2202.02892  [pdf, other]

    cs.IT cs.LG eess.SP

    Lossy Compression of Noisy Data for Private and Data-Efficient Learning

    Authors: Berivan Isik, Tsachy Weissman

    Abstract: Storage-efficient privacy-preserving learning is crucial due to increasing amounts of sensitive user data required for modern learning tasks. We propose a framework for reducing the storage cost of user data while at the same time providing privacy guarantees, without essential loss in the utility of the data for learning. Our method comprises noise injection followed by lossy compression. We show…

    Submitted 22 March, 2023; v1 submitted 6 February, 2022; originally announced February 2022.

    Comments: Published at the IEEE Journal on Selected Areas in Information Theory (JSAIT). Preliminary version was presented at the IEEE International Symposium on Information Theory (ISIT), 2022, with a slightly different title, "Learning under Storage and Privacy Constraints."

  13. arXiv:2111.08988  [pdf, other]

    cs.GR cs.LG eess.IV eess.SP

    LVAC: Learned Volumetric Attribute Compression for Point Clouds using Coordinate Based Networks

    Authors: Berivan Isik, Philip A. Chou, Sung Jin Hwang, Nick Johnston, George Toderici

    Abstract: We consider the attributes of a point cloud as samples of a vector-valued volumetric function at discrete positions. To compress the attributes given the positions, we compress the parameters of the volumetric function. We model the volumetric function by tiling space into blocks, and representing the function over each block by shifts of a coordinate-based, or implicit, neural network. Inputs to…

    Submitted 17 November, 2021; originally announced November 2021.

    Comments: 30 pages, 29 figures

  14. arXiv:2105.03120  [pdf, other]

    cs.CV cs.LG

    Neural 3D Scene Compression via Model Compression

    Authors: Berivan Isik

    Abstract: Rendering 3D scenes requires access to arbitrary viewpoints from the scene. Storage of such a 3D scene can be done in two ways; (1) storing 2D images taken from the 3D scene that can reconstruct the scene back through interpolations, or (2) storing a representation of the 3D scene itself that already encodes views from all directions. So far, traditional 3D compression methods have focused on the…

    Submitted 7 May, 2021; originally announced May 2021.

    Comments: Stanford CS 231A Final Project, 2021. WiCV at CVPR 2021

  15. arXiv:2102.08329  [pdf, other]

    cs.LG cs.IT eess.SP stat.ML

    An Information-Theoretic Justification for Model Pruning

    Authors: Berivan Isik, Tsachy Weissman, Albert No

    Abstract: We study the neural network (NN) compression problem, viewing the tension between the compression ratio and NN performance through the lens of rate-distortion theory. We choose a distortion metric that reflects the effect of NN compression on the model output and derive the tradeoff between rate (compression) and distortion. In addition to characterizing theoretical limits of NN compression, this…

    Submitted 9 February, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: Published in the International Conference on Artificial Intelligence and Statistics (AISTATS) 2022. Previous titles: 1) Rate-Distortion Theoretic Model Compression: Successive Refinement for Pruning, 2) Successive pruning for model compression via rate distortion theory

  16. arXiv:2102.07725  [pdf, other]

    cs.LG

    Neural Network Compression for Noisy Storage Devices

    Authors: Berivan Isik, Kristy Choi, Xin Zheng, Tsachy Weissman, Stefano Ermon, H. -S. Philip Wong, Armin Alaghi

    Abstract: Compression and efficient storage of neural network (NN) parameters is critical for applications that run on resource-constrained devices. Despite the significant progress in NN model compression, there has been considerably less investigation in the actual \textit{physical} storage of NN parameters. Conventionally, model compression and physical storage are decoupled, as digital storage media wit…

    Submitted 13 March, 2023; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: Published at the ACM Transactions on Embedded Computing Systems (TECS), 2023

  17. arXiv:2005.10761  [pdf, other]

    cs.LG cs.IT math.ST stat.ML

    rTop-k: A Statistical Estimation Approach to Distributed SGD

    Authors: Leighton Pate Barnes, Huseyin A. Inan, Berivan Isik, Ayfer Ozgur

    Abstract: The large communication cost for exchanging gradients between different nodes significantly limits the scalability of distributed training for large-scale learning models. Motivated by this observation, there has been significant recent interest in techniques that reduce the communication cost of distributed Stochastic Gradient Descent (SGD), with gradient sparsification techniques such as top-k a…

    Submitted 2 December, 2020; v1 submitted 21 May, 2020; originally announced May 2020.

  18. arXiv:2005.09303  [pdf]

    cs.SE

    Visual GUI testing in practice: An extended industrial case study

    Authors: Vahid Garousi, Wasif Afzal, Adem Çağlar, İhsan Berk Işık, Berker Baydan, Seçkin Çaylak, Ahmet Zeki Boyraz, Burak Yolaçan, Kadir Herkiloğlu

    Abstract: Context: Visual GUI testing (VGT) is referred to as the latest generation GUI-based testing. It is a tool-driven technique, which uses image recognition for interacting with and asserting the behavior of the system under test. Motivated by the industrial need of a large Turkish software and systems company providing solutions in the areas of defense and IT sector, an action-research project was re…

    Submitted 20 May, 2020; v1 submitted 19 May, 2020; originally announced May 2020.