
Showing 1–30 of 30 results for author: Karnin, Z

  1. arXiv:2309.03831  [pdf, other]

    cs.CL cs.AI cs.LG

    Uncovering Drift in Textual Data: An Unsupervised Method for Detecting and Mitigating Drift in Machine Learning Models

    Authors: Saeed Khaki, Akhouri Abhinav Aditya, Zohar Karnin, Lan Ma, Olivia Pan, Samarth Marudheri Chandrashekar

    Abstract: Drift in machine learning refers to the phenomenon where the statistical properties of the data or context in which a model operates change over time, leading to a decrease in its performance. Continuous monitoring of machine learning model performance is therefore crucial for proactively preventing performance regression. However, supervised drift detection…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: 8 pages, Accepted in 2023 Amazon Internal Machine Learning Conference

  2. arXiv:2205.11603  [pdf, other]

    cs.CL

    Representation Projection Invariance Mitigates Representation Collapse

    Authors: Anastasia Razdaibiedina, Ashish Khetan, Zohar Karnin, Daniel Khashabi, Vishaal Kapoor, Vivek Madan

    Abstract: Fine-tuning contextualized representations learned by pre-trained language models remains a prevalent practice in NLP. However, fine-tuning can lead to representation degradation (also known as representation collapse), which may result in instability, sub-optimal performance, and weak generalization. In this paper, we propose Representation Projection Invariance (REPINA), a novel regularization…

    Submitted 21 November, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: 41 pages, 6 figures

  3. arXiv:2203.14380  [pdf, other]

    cs.CL

    Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection

    Authors: Xin Huang, Ashish Khetan, Rene Bidart, Zohar Karnin

    Abstract: Transformer-based language models such as BERT have achieved state-of-the-art performance on various NLP tasks, but are computationally prohibitive. A recent line of work uses various heuristics to successively shorten the sequence length while transforming tokens through encoders, in tasks such as classification and ranking that require a single token embedding for prediction. We present a novel…

    Submitted 27 March, 2022; originally announced March 2022.

    Comments: 20 pages, 10 figures

  4. arXiv:2111.13657  [pdf, other]

    cs.LG cs.AI stat.ML

    Amazon SageMaker Model Monitor: A System for Real-Time Insights into Deployed Machine Learning Models

    Authors: David Nigenda, Zohar Karnin, Muhammad Bilal Zafar, Raghu Ramesha, Alan Tan, Michele Donini, Krishnaram Kenthapadi

    Abstract: With the increasing adoption of machine learning (ML) models and systems in high-stakes settings across different industries, guaranteeing a model's performance after deployment has become crucial. Monitoring models in production is a critical aspect of ensuring their continued performance and reliability. We present Amazon SageMaker Model Monitor, a fully managed service that continuously monitor…

    Submitted 5 August, 2022; v1 submitted 26 November, 2021; originally announced November 2021.

  5. arXiv:2107.11094  [pdf, other]

    cs.CL

    Improving Early Sepsis Prediction with Multi Modal Learning

    Authors: Fred Qin, Vivek Madan, Ujjwal Ratan, Zohar Karnin, Vishaal Kapoor, Parminder Bhatia, Taha Kass-Hout

    Abstract: Sepsis is a life-threatening disease with high morbidity, mortality and healthcare costs. The early prediction and administration of antibiotics and intravenous fluids is considered crucial for the treatment of sepsis and can potentially save millions of lives and billions in health care costs. Professional clinical care practitioners have proposed clinical criteria that aid in early detection o…

    Submitted 23 July, 2021; originally announced July 2021.

  6. arXiv:2012.08483  [pdf, other]

    cs.LG

    Amazon SageMaker Autopilot: a white box AutoML solution at scale

    Authors: Piali Das, Valerio Perrone, Nikita Ivkin, Tanya Bansal, Zohar Karnin, Huibin Shen, Iaroslav Shcherbatyi, Yotam Elor, Wilton Wu, Aida Zolic, Thibaut Lienart, Alex Tang, Amr Ahmed, Jean Baptiste Faddoul, Rodolphe Jenatton, Fela Winkelmolen, Philip Gautier, Leo Dirac, Andre Perunicic, Miroslav Miladinovic, Giovanni Zappella, Cédric Archambeau, Matthias Seeger, Bhaskar Dutt, Laurence Rouesnel

    Abstract: AutoML systems provide a black-box solution to machine learning problems by selecting the right way of processing features, choosing an algorithm and tuning the hyperparameters of the entire pipeline. Although these systems perform well on many datasets, there is still a non-negligible number of datasets for which the one-shot solution produced by each particular system would provide sub-par perfo…

    Submitted 16 December, 2020; v1 submitted 15 December, 2020; originally announced December 2020.

  7. arXiv:2012.06678  [pdf, other]

    cs.LG cs.AI

    TabTransformer: Tabular Data Modeling Using Contextual Embeddings

    Authors: Xin Huang, Ashish Khetan, Milan Cvitkovic, Zohar Karnin

    Abstract: We propose TabTransformer, a novel deep tabular data modeling architecture for supervised and semi-supervised learning. The TabTransformer is built upon self-attention based Transformers. The Transformer layers transform the embeddings of categorical features into robust contextual embeddings to achieve higher prediction accuracy. Through extensive experiments on fifteen publicly available dataset…

    Submitted 11 December, 2020; originally announced December 2020.

    Comments: 7 pages, 5 figures

  8. arXiv:2011.06015  [pdf, other]

    cs.LG

    GANMEX: One-vs-One Attributions Guided by GAN-based Counterfactual Explanation Baselines

    Authors: Sheng-Min Shih, Pin-Ju Tien, Zohar Karnin

    Abstract: Attribution methods have been shown to be promising approaches for identifying key features that led to learned model predictions. While most existing attribution methods rely on a baseline input for performing feature perturbations, limited research has been conducted to address baseline selection. Poor choices of baselines limit the ability of one-vs-one (1-vs-1) explanations for multi-c…

    Submitted 23 June, 2021; v1 submitted 11 November, 2020; originally announced November 2020.

    Comments: International Conference on Machine Learning 2021

  9. arXiv:2007.13382  [pdf, other]

    stat.ML cs.LG

    Practical and sample efficient zero-shot HPO

    Authors: Fela Winkelmolen, Nikita Ivkin, H. Furkan Bozkurt, Zohar Karnin

    Abstract: Zero-shot hyperparameter optimization (HPO) is a simple yet effective use of transfer learning for constructing a small list of hyperparameter (HP) configurations that complement each other. That is to say, for any given dataset, at least one of them is expected to perform well. Current techniques for obtaining this list are computationally expensive as they rely on running training jobs on a dive…

    Submitted 27 July, 2020; originally announced July 2020.

  10. arXiv:2006.11685  [pdf, other]

    cs.LG stat.ML

    An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

    Authors: Julian Katz-Samuels, Lalit Jain, Zohar Karnin, Kevin Jamieson

    Abstract: This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings. Leveraging ideas from the theory of suprema of empirical processes, we provide an algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms. Unlike previous approaches which sample b…

    Submitted 20 June, 2020; originally announced June 2020.

  11. arXiv:2005.11282  [pdf, other]

    cs.LG cs.CV stat.ML

    PruneNet: Channel Pruning via Global Importance

    Authors: Ashish Khetan, Zohar Karnin

    Abstract: Channel pruning is one of the predominant approaches for accelerating deep neural networks. Most existing pruning methods either train from scratch with a sparsity-inducing term such as group lasso, or prune redundant channels in a pretrained network and then fine-tune the network. Both strategies suffer from some limitations: the use of group lasso is computationally expensive, difficult to conve…

    Submitted 22 May, 2020; originally announced May 2020.

    Comments: 12 pages, 3 figures, Published in ICLR 2020 NAS Workshop

  12. arXiv:2005.06628  [pdf, other]

    cs.CL cs.LG

    schuBERT: Optimizing Elements of BERT

    Authors: Ashish Khetan, Zohar Karnin

    Abstract: Transformers (Vaswani et al., 2017) have gradually become a key component for many state-of-the-art natural language representation models. A recent Transformer-based model, BERT (Devlin et al., 2018), achieved state-of-the-art results on various natural language processing tasks, including GLUE, SQuAD v1.1, and SQuAD v2.0. This model, however, is computationally prohibitive and has a huge num…

    Submitted 9 May, 2020; originally announced May 2020.

    Comments: 11 pages, 6 figures, Accepted for publication in ACL 2020 as a long paper

  13. Relative Error Streaming Quantiles

    Authors: Graham Cormode, Zohar Karnin, Edo Liberty, Justin Thaler, Pavel Veselý

    Abstract: Estimating ranks, quantiles, and distributions over streaming data is a central task in data analysis and monitoring. Given a stream of $n$ items from a data universe equipped with a total order, the task is to compute a sketch (data structure) of size polylogarithmic in $n$. Given the sketch and a query item $y$, one should be able to approximate its rank in the stream, i.e., the number of stream…

    Submitted 24 August, 2023; v1 submitted 3 April, 2020; originally announced April 2020.

    Comments: Final version of the paper to appear in Journal of the ACM. Compared to the previous version, we removed any restrictions on the accuracy parameters in the main result and thoroughly revised the paper. 48 pages, 2 figures

    ACM Class: F.2.2

  14. arXiv:1907.00236  [pdf, other]

    cs.DS cs.DB cs.LG

    Streaming Quantiles Algorithms with Small Space and Update Time

    Authors: Nikita Ivkin, Edo Liberty, Kevin Lang, Zohar Karnin, Vladimir Braverman

    Abstract: Approximating quantiles and distributions over streaming data has been studied for roughly two decades now. Recently, Karnin, Lang, and Liberty proposed the first asymptotically optimal algorithm for doing so. This manuscript complements their theoretical result by providing practical variants of their algorithm with improved constants. For a given sketch size, our techniques provably reduce the…

    Submitted 29 June, 2019; originally announced July 2019.

  15. arXiv:1906.09489  [pdf, other]

    cs.LG stat.ML

    Asymmetric Random Projections

    Authors: Nick Ryder, Zohar Karnin, Edo Liberty

    Abstract: Random projections (RP) are a popular tool for reducing dimensionality while preserving local geometry. In many applications the data set to be projected is given to us in advance, yet the current RP techniques do not make use of information about the data. In this paper, we provide a computationally light way to extract statistics from the data that allows designing a data dependent RP with super…

    Submitted 22 June, 2019; originally announced June 2019.

    Comments: 14 pages, 5 figures

  16. arXiv:1906.04845  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Discrepancy, Coresets, and Sketches in Machine Learning

    Authors: Zohar Karnin, Edo Liberty

    Abstract: This paper defines the notion of class discrepancy for families of functions. It shows that low discrepancy classes admit small offline and streaming coresets. We provide general techniques for bounding the class discrepancy of machine learning problems. As corollaries of the general technique we bound the discrepancy (and therefore coreset complexity) of logistic regression, sigmoid activation lo…

    Submitted 11 June, 2019; originally announced June 2019.

  17. arXiv:1905.08170  [pdf, other]

    cs.LG cs.CV stat.ML

    DARC: Differentiable ARchitecture Compression

    Authors: Shashank Singh, Ashish Khetan, Zohar Karnin

    Abstract: In many learning situations, resources at inference time are significantly more constrained than resources at training time. This paper studies a general paradigm, called Differentiable ARchitecture Compression (DARC), that combines model compression and architecture search to learn models that are resource-efficient at inference time. Given a resource-intensive base architecture, DARC utilizes th…

    Submitted 20 May, 2019; originally announced May 2019.

  18. arXiv:1706.04690  [pdf, ps, other]

    cs.LG

    Adaptive Feature Selection: Computationally Efficient Online Sparse Linear Regression under RIP

    Authors: Satyen Kale, Zohar Karnin, Tengyuan Liang, Dávid Pál

    Abstract: Online sparse linear regression is an online problem where an algorithm repeatedly chooses a subset of coordinates to observe in an adversarially chosen feature vector, makes a real-valued prediction, receives the true label, and incurs the squared loss. The goal is to design an online learning algorithm with sublinear regret to the best sparse linear predictor in hindsight. Without any assumption…

    Submitted 14 June, 2017; originally announced June 2017.

    Comments: Appearing in 34th International Conference on Machine Learning (ICML), 2017

    Journal ref: Proceedings of the 34th International Conference on Machine Learning 70 (2017) 1780-1788

  19. arXiv:1607.01381  [pdf, other]

    stat.ML cs.AI cs.IR

    One-Shot Session Recommendation Systems with Combinatorial Items

    Authors: Yahel David, Dotan Di Castro, Zohar Karnin

    Abstract: In recent years, content recommendation systems in large websites (or content providers) have attracted increased attention. While the type of content varies (e.g. movies, articles, music, advertisements), the high-level problem remains the same: based on the knowledge obtained so far about the user, recommend the most desired content. In this paper we present a method to handle the well known user-…

    Submitted 5 July, 2016; originally announced July 2016.

  20. arXiv:1606.09296  [pdf, other]

    cs.AI cs.HC cs.SI

    How Many Folders Do You Really Need?

    Authors: Mihajlo Grbovic, Guy Halawi, Zohar Karnin, Yoelle Maarek

    Abstract: Email classification is still a mostly manual task. Consequently, most Web mail users never define a single folder. Recently however, automatic classification offering the same categories to all users has started to appear in some Web mail clients, such as AOL or Gmail. We adopt this approach, rather than previous (unsuccessful) personalized approaches because of the change in the nature of consum…

    Submitted 29 June, 2016; originally announced June 2016.

    Comments: 10 pages, 12 figures, Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM 2014), Shanghai, China

    ACM Class: H.4.3

    Journal ref: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM 2014), Shanghai, China

  21. arXiv:1603.05346  [pdf, other]

    cs.DS

    Optimal Quantile Approximation in Streams

    Authors: Zohar Karnin, Kevin Lang, Edo Liberty

    Abstract: This paper resolves one of the longest standing basic problems in the streaming computational model, namely, the optimal construction of quantile sketches. An $\varepsilon$-approximate quantile sketch receives a stream of items $x_1,\ldots,x_n$ and allows one to approximate the rank of any query up to additive error $\varepsilon n$ with probability at least $1-\delta$. The rank of a query $x$ is the number…

    Submitted 5 April, 2016; v1 submitted 16 March, 2016; originally announced March 2016.
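
The additive-error guarantee quoted in the abstract above can be illustrated with a minimal sketch. This is not the paper's algorithm: it only computes an exact rank and checks whether a given estimate is within $\varepsilon n$ of it. The function names `exact_rank` and `within_additive_error` are hypothetical, and because the abstract's definition of rank is cut off in this listing, the convention used here (number of stream items less than or equal to the query) is an assumption.

```python
from bisect import bisect_right


def exact_rank(stream, x):
    """Exact rank of x: the number of stream items <= x.

    Assumed convention; the abstract's own definition is truncated.
    """
    return bisect_right(sorted(stream), x)


def within_additive_error(estimate, stream, x, eps):
    """True if the estimate is within eps * n of the exact rank of x."""
    n = len(stream)
    return abs(estimate - exact_rank(stream, x)) <= eps * n


# Toy stream of n = 10 items; with eps = 0.1 the allowed error is 1.
stream = [5, 1, 9, 3, 7, 2, 8, 6, 4, 10]
print(exact_rank(stream, 4))
print(within_additive_error(5, stream, 4, 0.1))
print(within_additive_error(6, stream, 4, 0.1))
```

A real sketch would produce the estimate from a small summary of the stream rather than from the sorted data; the check itself stays the same.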

  22. arXiv:1507.04330  [pdf, ps, other]

    cs.DC

    Optimal Dynamic Distributed MIS

    Authors: Keren Censor-Hillel, Elad Haramaty, Zohar Karnin

    Abstract: Finding a maximal independent set (MIS) in a graph is a cornerstone task in distributed computing. The local nature of an MIS allows for fast solutions in a static distributed setting, which are logarithmic in the number of nodes or in their degrees. The result trivially applies for the dynamic distributed model, in which edges or nodes may be inserted or deleted. In this paper, we take a differen…

    Submitted 16 July, 2015; v1 submitted 15 July, 2015; originally announced July 2015.

    Comments: 19 pages including appendix and references

  23. arXiv:1506.00312  [pdf, other]

    cs.LG

    Copeland Dueling Bandits

    Authors: Masrour Zoghi, Zohar Karnin, Shimon Whiteson, Maarten de Rijke

    Abstract: A version of the dueling bandit problem is addressed in which a Condorcet winner may not exist. Two algorithms are proposed that instead seek to minimize regret with respect to the Copeland winner, which, unlike the Condorcet winner, is guaranteed to exist. The first, Copeland Confidence Bound (CCB), is designed for small numbers of arms, while the second, Scalable Copeland Bandits (SCB), works be…

    Submitted 31 May, 2015; originally announced June 2015.

    Comments: 33 pages, 8 figures
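
The distinction between the two winner notions in the abstract above can be made concrete with a small sketch. It is not the CCB or SCB algorithm; it only evaluates a known preference matrix. The definitions used are the standard ones, which the visible abstract does not spell out: a Condorcet winner beats every other arm, while a Copeland winner beats the largest number of other arms. The function names and the matrix `cyclic` are hypothetical.

```python
def copeland_scores(prefs):
    """prefs[i][j] is the probability that arm i beats arm j.

    The score of arm i is the number of other arms it beats
    with probability greater than 0.5.
    """
    k = len(prefs)
    return [
        sum(1 for j in range(k) if j != i and prefs[i][j] > 0.5)
        for i in range(k)
    ]


def condorcet_winner(prefs):
    """Index of the arm that beats all others, or None if there is none."""
    k = len(prefs)
    for arm, score in enumerate(copeland_scores(prefs)):
        if score == k - 1:
            return arm
    return None


def copeland_winners(prefs):
    """All arms with the maximal Copeland score; never empty."""
    scores = copeland_scores(prefs)
    best = max(scores)
    return [arm for arm, score in enumerate(scores) if score == best]


# Cyclic preferences: 0 beats 1, 1 beats 2, 2 beats 0.
cyclic = [
    [0.5, 0.6, 0.4],
    [0.4, 0.5, 0.6],
    [0.6, 0.4, 0.5],
]
print(condorcet_winner(cyclic))
print(copeland_winners(cyclic))
```

In the cyclic example no arm beats both others, so there is no Condorcet winner, yet the set of Copeland winners is still non-empty.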

  24. arXiv:1406.2431  [pdf, other]

    cs.IR cs.LG

    Budget-Constrained Item Cold-Start Handling in Collaborative Filtering Recommenders via Optimal Design

    Authors: Oren Anava, Shahar Golan, Nadav Golbandi, Zohar Karnin, Ronny Lempel, Oleg Rokhlenko, Oren Somekh

    Abstract: It is well known that collaborative filtering (CF) based recommender systems provide better modeling of users and items associated with considerable rating history. The lack of historical ratings results in the user and the item cold-start problems. The latter is the main focus of this work. Most of the current literature addresses this problem by integrating content-based recommendation technique…

    Submitted 20 September, 2016; v1 submitted 10 June, 2014; originally announced June 2014.

    Comments: 11 pages, 2 figures

    MSC Class: 62K05

  25. arXiv:1405.3396  [pdf, other]

    cs.LG

    Reducing Dueling Bandits to Cardinal Bandits

    Authors: Nir Ailon, Thorsten Joachims, Zohar Karnin

    Abstract: We present algorithms for reducing the Dueling Bandits problem to the conventional (stochastic) Multi-Armed Bandits problem. The Dueling Bandits problem is an online model of learning with ordinal feedback of the form "A is preferred to B" (as opposed to cardinal feedback like "A has value 2.5"), giving it wide applicability in learning from implicit user feedback and revealed and stated preferenc…

    Submitted 14 May, 2014; originally announced May 2014.

  26. arXiv:1312.6214  [pdf, other]

    cs.LG cs.AI cs.DS

    Volumetric Spanners: an Efficient Exploration Basis for Learning

    Authors: Elad Hazan, Zohar Karnin, Raghu Meka

    Abstract: Numerous machine learning problems require an exploration basis, that is, a mechanism to explore the action space. We define a novel geometric notion of exploration basis with low variance, called volumetric spanners, and give efficient algorithms to construct such a basis. We show how efficient volumetric spanners give rise to the first efficient and optimal regret algorithm for bandit linear optimizat…

    Submitted 25 May, 2014; v1 submitted 21 December, 2013; originally announced December 2013.

  27. arXiv:1311.4643  [pdf, other]

    cs.LG cs.IT math.NA stat.ML

    Near-Optimal Entrywise Sampling for Data Matrices

    Authors: Dimitris Achlioptas, Zohar Karnin, Edo Liberty

    Abstract: We consider the problem of selecting non-zero entries of a matrix $A$ in order to produce a sparse sketch of it, $B$, that minimizes $\|A-B\|_2$. For large $m \times n$ matrices, such that $n \gg m$ (for example, representing $n$ observations over $m$ attributes) we give sampling distributions that exhibit four important properties. First, they have closed forms computable from minimal information…

    Submitted 19 November, 2013; originally announced November 2013.

    Comments: 14 pages, to appear in NIPS' 13

  28. arXiv:1311.0800  [pdf, ps, other]

    cs.LG

    Distributed Exploration in Multi-Armed Bandits

    Authors: Eshcar Hillel, Zohar Karnin, Tomer Koren, Ronny Lempel, Oren Somekh

    Abstract: We study exploration in Multi-Armed Bandits in a setting where $k$ players collaborate in order to identify an $\varepsilon$-optimal arm. Our motivation comes from recent employment of bandit algorithms in computationally intensive, large-scale applications. Our results demonstrate a non-trivial tradeoff between the number of arm pulls required by each of the players, and the amount of communication between…

    Submitted 4 November, 2013; originally announced November 2013.

  29. arXiv:1204.6588  [pdf, ps, other]

    cs.CC

    A note on: No need to choose: How to get both a PTAS and Sublinear Query Complexity

    Authors: Nir Ailon, Zohar Karnin

    Abstract: We revisit various PTASs (Polynomial Time Approximation Schemes) for minimization versions of dense problems, and show that they can be performed with sublinear query complexity. This means that not only do we obtain a $(1+\varepsilon)$-approximation to the NP-hard problems in polynomial time, but we also avoid reading the entire input. This setting is particularly advantageous when the price of reading parts…

    Submitted 30 April, 2012; originally announced April 2012.

  30. arXiv:1107.1358  [pdf, ps, other]

    cs.CC cs.DS cs.LG

    On the Furthest Hyperplane Problem and Maximal Margin Clustering

    Authors: Zohar Karnin, Edo Liberty, Shachar Lovett, Roy Schwartz, Omri Weinstein

    Abstract: This paper introduces the Furthest Hyperplane Problem (FHP), which is an unsupervised counterpart of Support Vector Machines. Given a set of $n$ points in $\mathbb{R}^d$, the objective is to produce the hyperplane (passing through the origin) which maximizes the separation margin, that is, the minimal distance between the hyperplane and any input point. To the best of our knowledge, this is the first paper achi…

    Submitted 2 February, 2012; v1 submitted 7 July, 2011; originally announced July 2011.
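
The FHP objective described in the abstract above can be sketched in a few lines. This shows only how the separation margin of one candidate hyperplane is evaluated, not how the paper searches for the best one. A hyperplane through the origin is represented by its normal vector `w`, and the distance from a point `x` to it is $|\langle w, x\rangle| / \|w\|$. The function name `margin` and the sample points are hypothetical.

```python
import math


def margin(w, points):
    """Separation margin of the hyperplane {x : <w, x> = 0}.

    Returns the minimal distance between the hyperplane and any
    input point, i.e. min over points of |<w, x>| / ||w||.
    """
    norm = math.sqrt(sum(c * c for c in w))
    return min(
        abs(sum(wi * xi for wi, xi in zip(w, x))) / norm
        for x in points
    )


# Three points in the plane and two candidate normals.
points = [(1.0, 2.0), (-2.0, -1.0), (3.0, 0.5)]
print(margin((1.0, 0.0), points))
print(margin((0.0, 1.0), points))
```

Of the two candidates, the one with the larger margin is the better FHP solution; the problem asks for the maximum over all directions `w`.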