
Showing 1–50 of 50 results for author: Arik, S O

  1. arXiv:2406.15708  [pdf, other]

    cs.CL cs.AI cs.LG

    Teach Better or Show Smarter? On Instructions and Exemplars in Automatic Prompt Optimization

    Authors: Xingchen Wan, Ruoxi Sun, Hootan Nakhost, Sercan O. Arik

    Abstract: Large language models have demonstrated remarkable capabilities, but their performance is heavily reliant on effective prompt engineering. Automatic prompt optimization (APO) methods are designed to automate this and can be broadly categorized into those targeting instructions (instruction optimization, IO) vs. those targeting exemplars (exemplar selection, ES). Despite their shared objective, the…

    Submitted 21 June, 2024; originally announced June 2024.

  2. arXiv:2406.02818  [pdf, other]

    cs.CL

    Chain of Agents: Large Language Models Collaborating on Long-Context Tasks

    Authors: Yusen Zhang, Ruoxi Sun, Yanfei Chen, Tomas Pfister, Rui Zhang, Sercan Ö. Arik

    Abstract: Addressing the challenge of effectively processing long contexts has become a critical issue for Large Language Models (LLMs). Two common strategies have emerged: 1) reducing the input length, such as retrieving relevant chunks by Retrieval-Augmented Generation (RAG), and 2) expanding the context window limit of LLMs. However, both strategies have drawbacks: input reduction has no guarantee of cov…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 19 pages, 6 figures

  3. arXiv:2406.00222  [pdf, other]

    cs.CL cs.AI cs.LG

    Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training

    Authors: Maximillian Chen, Ruoxi Sun, Sercan Ö. Arık, Tomas Pfister

    Abstract: Large language models (LLMs) aligned through reinforcement learning from human feedback (RLHF) have quickly become one of the dominant paradigms for building intelligent conversational assistant agents. However, despite their strong performance across many benchmarks, LLM-based agents still lack conversational skills such as disambiguation: when generalized assistants are faced with ambiguity, the…

    Submitted 31 May, 2024; originally announced June 2024.

  4. arXiv:2405.18654  [pdf, other]

    cs.CV

    Mitigating Object Hallucination via Data Augmented Contrastive Tuning

    Authors: Pritam Sarkar, Sayna Ebrahimi, Ali Etemad, Ahmad Beirami, Sercan Ö. Arık, Tomas Pfister

    Abstract: Despite their remarkable progress, Multimodal Large Language Models (MLLMs) tend to hallucinate factually inaccurate information. In this work, we address object hallucinations in MLLMs, where information is offered about an object that is not present in the model input. We introduce a contrastive tuning method that can be applied to a pretrained off-the-shelf MLLM for mitigating hallucinations wh…

    Submitted 28 May, 2024; originally announced May 2024.

  5. arXiv:2404.09491  [pdf, other]

    cs.LG

    Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning

    Authors: Sungwon Han, Jinsung Yoon, Sercan O Arik, Tomas Pfister

    Abstract: Large Language Models (LLMs), with their remarkable ability to tackle challenging and unseen reasoning problems, hold immense potential for tabular learning, that is vital for many real-world applications. In this paper, we propose a novel in-context learning framework, FeatLLM, which employs LLMs as feature engineers to produce an input data set that is optimally suited for tabular predictions. T…

    Submitted 6 May, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted to ICML, 2024

  6. arXiv:2312.01279  [pdf, other]

    cs.CL cs.AI cs.LG

    TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents

    Authors: James Enouen, Hootan Nakhost, Sayna Ebrahimi, Sercan O Arik, Yan Liu, Tomas Pfister

    Abstract: Large language models (LLMs) have attracted huge interest in practical applications given their increasingly accurate responses and coherent reasoning abilities. Given their nature as black-boxes using complex reasoning processes on their inputs, it is inevitable that the demand for scalable and faithful explanations for LLMs' generated content will continue to grow. There have been major developm…

    Submitted 2 December, 2023; originally announced December 2023.

  7. arXiv:2311.09533  [pdf, other]

    cs.CL

    Effective Large Language Model Adaptation for Improved Grounding and Citation Generation

    Authors: Xi Ye, Ruoxi Sun, Sercan Ö. Arik, Tomas Pfister

    Abstract: Large language models (LLMs) have achieved remarkable advancements in natural language understanding and generation. However, one major issue towards their widespread deployment in the real world is that they can generate "hallucinated" answers that are not factual. Towards this end, this paper focuses on improving LLMs by grounding their responses in retrieved passages and by providing citations.…

    Submitted 2 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  8. arXiv:2311.02883  [pdf, other]

    cs.CL

    SQLPrompt: In-Context Text-to-SQL with Minimal Labeled Data

    Authors: Ruoxi Sun, Sercan Ö. Arik, Rajarishi Sinha, Hootan Nakhost, Hanjun Dai, Pengcheng Yin, Tomas Pfister

    Abstract: Text-to-SQL aims to automate the process of generating SQL queries on a database from natural language text. In this work, we propose "SQLPrompt", tailored to improve the few-shot prompting capabilities of Text-to-SQL for Large Language Models (LLMs). Our methods include innovative prompt design, execution-based consistency decoding strategy which selects the SQL with the most consistent execution…

    Submitted 6 November, 2023; originally announced November 2023.

  9. arXiv:2311.00886  [pdf, other]

    cs.LG

    COSTAR: Improved Temporal Counterfactual Estimation with Self-Supervised Learning

    Authors: Chuizheng Meng, Yihe Dong, Sercan Ö. Arık, Yan Liu, Tomas Pfister

    Abstract: Estimation of temporal counterfactual outcomes from observed history is crucial for decision-making in many domains such as healthcare and e-commerce, particularly when randomized controlled trials (RCTs) suffer from high cost or impracticality. For real-world datasets, modeling time-dependent confounders is challenging due to complex dynamics, long-range dependencies and both past treatments and…

    Submitted 12 February, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

  10. arXiv:2310.11689  [pdf, other]

    cs.CL cs.LG

    Adaptation with Self-Evaluation to Improve Selective Prediction in LLMs

    Authors: Jiefeng Chen, Jinsung Yoon, Sayna Ebrahimi, Sercan O Arik, Tomas Pfister, Somesh Jha

    Abstract: Large language models (LLMs) have recently shown great advances in a variety of tasks, including natural language understanding and generation. However, their use in high-stakes decision-making scenarios is still limited due to the potential for errors. Selective prediction is a technique that can be used to improve the reliability of the LLMs by allowing them to abstain from making predictions wh…

    Submitted 11 November, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Paper published at Findings of the Association for Computational Linguistics: EMNLP, 2023

  11. arXiv:2310.08750  [pdf, other]

    cs.LG

    Search-Adaptor: Embedding Customization for Information Retrieval

    Authors: Jinsung Yoon, Sercan O Arik, Yanfei Chen, Tomas Pfister

    Abstract: Embeddings extracted by pre-trained Large Language Models (LLMs) have significant potential to improve information retrieval and search. Beyond the zero-shot setup in which they are being conventionally used, being able to take advantage of the information from the relevant query-corpus paired data can further boost the LLM capabilities. In this paper, we propose a novel method, Search-Adaptor, fo…

    Submitted 12 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

  12. arXiv:2310.04948  [pdf, other]

    cs.LG cs.CL

    TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting

    Authors: Defu Cao, Furong Jia, Sercan O Arik, Tomas Pfister, Yixiang Zheng, Wen Ye, Yan Liu

    Abstract: The past decade has witnessed significant advances in time series modeling with deep learning. While achieving state-of-the-art results, the best-performing architectures vary highly across applications and domains. Meanwhile, for natural language processing, the Generative Pre-trained Transformer (GPT) has demonstrated impressive performance via training one general-purpose model across various t…

    Submitted 2 April, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: Accepted by ICLR 2024. Camera Ready Version

  13. arXiv:2308.13703  [pdf, other]

    cs.LG

    PAITS: Pretraining and Augmentation for Irregularly-Sampled Time Series

    Authors: Nicasia Beebe-Wang, Sayna Ebrahimi, Jinsung Yoon, Sercan O. Arik, Tomas Pfister

    Abstract: Real-world time series data that commonly reflect sequential human behavior are often uniquely irregularly sampled and sparse, with highly nonuniform sampling over time and entities. Yet, commonly-used pretraining and augmentation methods for time series are not specifically designed for such scenarios. In this paper, we present PAITS (Pretraining and Augmentation for Irregularly-sampled Time Seri…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: Code: https://github.com/google-research/google-research/tree/master/irregular_timeseries_pretraining

  14. arXiv:2308.13118  [pdf, other]

    cs.LG

    Business Metric-Aware Forecasting for Inventory Management

    Authors: Helen Zhou, Sercan O. Arik, Jingtao Wang

    Abstract: Time-series forecasts play a critical role in business planning. However, forecasters typically optimize objectives that are agnostic to downstream business goals and thus can produce forecasts misaligned with business preferences. In this work, we demonstrate that optimization of conventional forecasting metrics can often lead to sub-optimal downstream business performance. Focusing on the invent…

    Submitted 24 August, 2023; originally announced August 2023.

  15. arXiv:2306.00739  [pdf, other]

    cs.CL cs.AI cs.DB

    SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended)

    Authors: Ruoxi Sun, Sercan Ö. Arik, Alex Muzio, Lesly Miculicich, Satya Gundabathula, Pengcheng Yin, Hanjun Dai, Hootan Nakhost, Rajarishi Sinha, Zifeng Wang, Tomas Pfister

    Abstract: Text-to-SQL, the process of translating natural language into Structured Query Language (SQL), represents a transformative application of large language models (LLMs), potentially revolutionizing how humans interact with data. This paper introduces the SQL-PaLM framework, a comprehensive solution for understanding and enhancing Text-to-SQL using LLMs, using in the learning regimes of few-shot prom…

    Submitted 30 March, 2024; v1 submitted 26 May, 2023; originally announced June 2023.

  16. arXiv:2305.16556  [pdf, other]

    cs.LG cs.AI

    LANISTR: Multimodal Learning from Structured and Unstructured Data

    Authors: Sayna Ebrahimi, Sercan O. Arik, Yihe Dong, Tomas Pfister

    Abstract: Multimodal large-scale pretraining has shown impressive performance for unstructured data such as language and image. However, a prevalent real-world scenario involves structured data types, tabular and time-series, along with unstructured data. Such scenarios have been understudied. To bridge this gap, we propose LANISTR, an attention-based framework to learn from LANguage, Image, and STRuctured…

    Submitted 24 April, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

  17. arXiv:2305.14926  [pdf, other]

    cs.CL cs.AI cs.LG

    Universal Self-Adaptive Prompting

    Authors: Xingchen Wan, Ruoxi Sun, Hootan Nakhost, Hanjun Dai, Julian Martin Eisenschlos, Sercan O. Arik, Tomas Pfister

    Abstract: A hallmark of modern large language models (LLMs) is their impressive general zero-shot and few-shot abilities, often elicited through in-context learning (ICL) via prompting. However, while highly coveted and being the most general, zero-shot performances in LLMs are still typically weaker due to the lack of guidance and the difficulty of applying existing automatic prompt design methods in gener…

    Submitted 20 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 (Main). 10 pages, 5 figures, 4 tables (26 pages, 9 figures and 13 tables including references and appendices)

  18. arXiv:2305.14106  [pdf, other]

    cs.CL cs.AI cs.LG

    Better Zero-Shot Reasoning with Self-Adaptive Prompting

    Authors: Xingchen Wan, Ruoxi Sun, Hanjun Dai, Sercan O. Arik, Tomas Pfister

    Abstract: Modern large language models (LLMs) have demonstrated impressive capabilities at sophisticated tasks, often through step-by-step reasoning similar to humans. This is made possible by their strong few and zero-shot abilities -- they can effectively learn from a handful of handcrafted, completed responses ("in-context examples"), or are prompted to reason spontaneously through specially designed tri…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: Findings of the Association for Computational Linguistics: ACL 2023. 10 pages, 2 tables, 4 figures (20 pages, 8 tables, 7 figures including references and appendices)

  19. arXiv:2304.03202  [pdf, other]

    cs.LG

    SLM: End-to-end Feature Selection via Sparse Learnable Masks

    Authors: Yihe Dong, Sercan O. Arik

    Abstract: Feature selection has been widely used to alleviate compute requirements during training, elucidate model interpretability, and improve model generalizability. We propose SLM -- Sparse Learnable Masks -- a canonical approach for end-to-end feature selection that scales well with respect to both the feature dimension and the number of samples. At the heart of SLM lies a simple but effective learnab…

    Submitted 6 April, 2023; originally announced April 2023.

  20. arXiv:2303.06053  [pdf, other]

    cs.LG cs.AI

    TSMixer: An All-MLP Architecture for Time Series Forecasting

    Authors: Si-An Chen, Chun-Liang Li, Nate Yoder, Sercan O. Arik, Tomas Pfister

    Abstract: Real-world time-series datasets are often multivariate with complex dynamics. To capture this complexity, high capacity architectures like recurrent- or attention-based sequential deep learning models have become popular. However, recent work demonstrates that simple univariate linear models can outperform such deep learning models on several commonly used academic benchmarks. Extending them, in t…

    Submitted 11 September, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    Journal ref: Transactions on Machine Learning Research (TMLR), 09/2023

  21. arXiv:2301.04857  [pdf, other]

    cs.AI stat.ME

    Neural Spline Search for Quantile Probabilistic Modeling

    Authors: Ruoxi Sun, Chun-Liang Li, Sercan O. Arik, Michael W. Dusenberry, Chen-Yu Lee, Tomas Pfister

    Abstract: Accurate estimation of output quantiles is crucial in many use cases, where it is desired to model the range of possibility. Modeling target distribution at arbitrary quantile levels and at arbitrary input attribute levels are important to offer a comprehensive picture of the data, and requires the quantile function to be expressive enough. The quantile function describing the target distribution…

    Submitted 12 January, 2023; originally announced January 2023.

  22. arXiv:2212.00173  [pdf, other]

    cs.LG

    SPADE: Semi-supervised Anomaly Detection under Distribution Mismatch

    Authors: Jinsung Yoon, Kihyuk Sohn, Chun-Liang Li, Sercan O. Arik, Tomas Pfister

    Abstract: Semi-supervised anomaly detection is a common problem, as often the datasets containing anomalies are partially labeled. We propose a canonical framework: Semi-supervised Pseudo-labeler Anomaly Detection with Ensembling (SPADE) that isn't limited by the assumption that labeled and unlabeled data come from the same distribution. Indeed, the assumption is often violated in many applications - for ex…

    Submitted 30 November, 2022; originally announced December 2022.

  23. arXiv:2211.06582  [pdf, other]

    cs.LG cs.CR stat.ML

    Provable Membership Inference Privacy

    Authors: Zachary Izzo, Jinsung Yoon, Sercan O. Arik, James Zou

    Abstract: In applications involving sensitive data, such as finance and healthcare, the necessity for preserving data privacy can be a significant barrier to machine learning model development. Differential privacy (DP) has emerged as one canonical standard for provable privacy. However, DP's strong theoretical guarantees often come at the cost of a large drop in its utility for machine learning, and DP gua…

    Submitted 12 November, 2022; originally announced November 2022.

    Comments: 19 pages, 2 figures

  24. arXiv:2210.03675  [pdf, other]

    cs.LG stat.ML

    Koopman Neural Forecaster for Time Series with Temporal Distribution Shifts

    Authors: Rui Wang, Yihe Dong, Sercan Ö. Arik, Rose Yu

    Abstract: Temporal distributional shifts, with underlying dynamics changing over time, frequently occur in real-world time series and pose a fundamental challenge for deep neural networks (DNNs). In this paper, we propose a novel deep sequence model based on the Koopman theory for time series forecasting: Koopman Neural Forecaster (KNF) which leverages DNNs to learn the linear Koopman space and the coeffici…

    Submitted 28 February, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

  25. arXiv:2209.07999  [pdf, other]

    cs.LG cs.AI cs.CV cs.IT eess.IV

    Self-Supervised Learning with an Information Maximization Criterion

    Authors: Serdar Ozsoy, Shadi Hamdan, Sercan Ö. Arik, Deniz Yuret, Alper T. Erdogan

    Abstract: Self-supervised learning allows AI systems to learn effective representations from large amounts of data using tasks that do not require costly labeling. Mode collapse, i.e., the model producing identical representations for all inputs, is a central problem to many self-supervised learning approaches, making self-supervised tasks, such as matching distorted variants of the inputs, ineffective. In…

    Submitted 16 September, 2022; originally announced September 2022.

    ACM Class: I.2; I.4; I.5

  26. arXiv:2206.07240  [pdf, other]

    cs.CV cs.AI cs.LG

    Test-Time Adaptation for Visual Document Understanding

    Authors: Sayna Ebrahimi, Sercan O. Arik, Tomas Pfister

    Abstract: For visual document understanding (VDU), self-supervised pretraining has been shown to successfully generate transferable representations, yet, effective adaptation of such representations to distribution shifts at test-time remains to be an unexplored area. We propose DocTTA, a novel test-time adaptation method for documents, that does source-free domain adaptation using unlabeled target document…

    Submitted 23 August, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: Accepted at TMLR 2023

  27. arXiv:2206.06469  [pdf]

    cs.LG stat.ML

    Invariant Structure Learning for Better Generalization and Causal Explainability

    Authors: Yunhao Ge, Sercan Ö. Arik, Jinsung Yoon, Ao Xu, Laurent Itti, Tomas Pfister

    Abstract: Learning the causal structure behind data is invaluable for improving generalization and obtaining high-quality explanations. We propose a novel framework, Invariant Structure Learning (ISL), that is designed to improve causal structure discovery by utilizing generalization as an indication. ISL splits the data into different environments, and learns a structure that is invariant to the target acr…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: 16 pages (including Appendix), 4 figures

  28. arXiv:2206.02107  [pdf, other]

    cs.LG

    Interpretable Mixture of Experts

    Authors: Aya Abdelsalam Ismail, Sercan Ö. Arik, Jinsung Yoon, Ankur Taly, Soheil Feizi, Tomas Pfister

    Abstract: The need for reliable model explanations is prominent for many machine learning applications, particularly for tabular and time-series data as their use cases often involve high-stakes decision making. Towards this goal, we introduce a novel interpretable modeling framework, Interpretable Mixture of Experts (IME), that yields high accuracy, comparable to `black-box' Deep Neural Networks (DNNs) in…

    Submitted 25 May, 2023; v1 submitted 5 June, 2022; originally announced June 2022.

  29. arXiv:2202.02403  [pdf, other]

    cs.LG cs.AI

    Self-Adaptive Forecasting for Improved Deep Learning on Non-Stationary Time-Series

    Authors: Sercan O. Arik, Nathanael C. Yoder, Tomas Pfister

    Abstract: Real-world time-series datasets often violate the assumptions of standard supervised learning for forecasting -- their distributions evolve over time, rendering the conventional training and model selection procedures suboptimal. In this paper, we propose a novel method, Self-Adaptive Forecasting (SAF), to modify the training of time-series forecasting models to improve their performance on foreca…

    Submitted 26 September, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

  30. arXiv:2106.07804  [pdf, other]

    cs.LG stat.ML

    Controlling Neural Networks with Rule Representations

    Authors: Sungyong Seo, Sercan O. Arik, Jinsung Yoon, Xiang Zhang, Kihyuk Sohn, Tomas Pfister

    Abstract: We propose a novel training method that integrates rules into deep learning, in a way the strengths of the rules are controllable at inference. Deep Neural Networks with Controllable Rule Representations (DeepCTRL) incorporates a rule encoder into the model coupled with a rule-based objective, enabling a shared representation for decision making. DeepCTRL is agnostic to data type and model archite…

    Submitted 16 November, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: Thirty-Fifth Conference on Neural Information Processing Systems (NeurIPS 2021)

  31. arXiv:2106.06115  [pdf, other]

    cs.LG

    Self-supervise, Refine, Repeat: Improving Unsupervised Anomaly Detection

    Authors: Jinsung Yoon, Kihyuk Sohn, Chun-Liang Li, Sercan O. Arik, Chen-Yu Lee, Tomas Pfister

    Abstract: Anomaly detection (AD), separating anomalies from normal data, has many applications across domains, from security to healthcare. While most previous works were shown to be effective for cases with fully or partially labeled data, that setting is in practice less common due to labeling being particularly tedious for this task. In this paper, we focus on fully unsupervised AD, in which the entire t…

    Submitted 4 August, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

    Comments: Published in Transactions on Machine Learning Research (TMLR) - August, 2022 - https://openreview.net/forum?id=b3v1UrtF6G

  32. arXiv:2105.12723  [pdf, other]

    cs.CV

    Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

    Authors: Zizhao Zhang, Han Zhang, Long Zhao, Ting Chen, Sercan O. Arik, Tomas Pfister

    Abstract: Hierarchical structures are popular in recent vision transformers, however, they require sophisticated designs and massive datasets to work well. In this paper, we explore the idea of nesting basic local transformers on non-overlapping image blocks and aggregating them in a hierarchical way. We find that the block aggregation function plays a critical role in enabling cross-block non-local informa…

    Submitted 30 December, 2021; v1 submitted 26 May, 2021; originally announced May 2021.

    Comments: AAAI2022

  33. arXiv:2008.00646  [pdf, other]

    cs.LG stat.ML

    Interpretable Sequence Learning for COVID-19 Forecasting

    Authors: Sercan O. Arik, Chun-Liang Li, Jinsung Yoon, Rajarishi Sinha, Arkady Epshteyn, Long T. Le, Vikas Menon, Shashank Singh, Leyou Zhang, Nate Yoder, Martin Nikoltchev, Yash Sonthalia, Hootan Nakhost, Elli Kanal, Tomas Pfister

    Abstract: We propose a novel approach that integrates machine learning into compartmental disease modeling to predict the progression of COVID-19. Our model is explainable by design as it explicitly shows how different compartments evolve and it uses interpretable encoders to incorporate covariates and improve performance. Explainability is valuable to ensure that the model's forecasts are credible to epide…

    Submitted 13 January, 2021; v1 submitted 3 August, 2020; originally announced August 2020.

  34. arXiv:2007.07477  [pdf, other]

    cs.CV cs.AI cs.LG

    Explaining Deep Neural Networks using Unsupervised Clustering

    Authors: Yu-han Liu, Sercan O. Arik

    Abstract: We propose a novel method to explain trained deep neural networks (DNNs), by distilling them into surrogate models using unsupervised clustering. Our method can be applied flexibly to any subset of layers of a DNN architecture and can incorporate low-level and high-level information. On image datasets given pre-trained DNNs, we demonstrate the strength of our method in finding similar training sam…

    Submitted 15 July, 2020; v1 submitted 15 July, 2020; originally announced July 2020.

  35. arXiv:1912.09363  [pdf, other]

    stat.ML cs.LG

    Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting

    Authors: Bryan Lim, Sercan O. Arik, Nicolas Loeff, Tomas Pfister

    Abstract: Multi-horizon forecasting problems often contain a complex mix of inputs -- including static (i.e. time-invariant) covariates, known future inputs, and other exogenous time series that are only observed historically -- without any prior information on how they interact with the target. While several deep learning models have been proposed for multi-step prediction, they typically comprise black-bo…

    Submitted 27 September, 2020; v1 submitted 19 December, 2019; originally announced December 2019.

  36. arXiv:1910.07969  [pdf, other]

    cs.LG stat.ML

    On Completeness-aware Concept-Based Explanations in Deep Neural Networks

    Authors: Chih-Kuan Yeh, Been Kim, Sercan O. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar

    Abstract: Human explanations of high-level decisions are often expressed in terms of key concepts the decisions are based on. In this paper, we study such concept-based explainability for Deep Neural Networks (DNNs). First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is in explaining a model's prediction behavior based on the assumption that complete co…

    Submitted 7 February, 2022; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: Updated supplementary

  37. arXiv:1910.07153  [pdf, other]

    cs.LG cs.CV

    Consistency-based Semi-supervised Active Learning: Towards Minimizing Labeling Cost

    Authors: Mingfei Gao, Zizhao Zhang, Guo Yu, Sercan O. Arik, Larry S. Davis, Tomas Pfister

    Abstract: Active learning (AL) combines data labeling and model training to minimize the labeling cost by prioritizing the selection of high value data that can best improve model performance. In pool-based active learning, accessible unlabeled data are not used for model training in most conventional methods. Here, we propose to unify unlabeled sample selection and model training towards minimizing labelin…

    Submitted 18 July, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: Accepted by ECCV2020

  38. arXiv:1910.00701  [pdf, other]

    cs.LG cs.CV stat.ML

    Distilling Effective Supervision from Severe Label Noise

    Authors: Zizhao Zhang, Han Zhang, Sercan O. Arik, Honglak Lee, Tomas Pfister

    Abstract: Collecting large-scale data with clean labels for supervised training of neural networks is practically challenging. Although noisy labels are usually cheap to acquire, existing methods suffer a lot from label noise. This paper targets at the challenge of robust training at high label noise regimes. The key insight to achieve this goal is to wisely leverage a small trusted set to estimate exemplar…

    Submitted 12 June, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

    Comments: CVPR2020

  39. arXiv:1909.12367  [pdf, other]

    cs.LG stat.ML

    LIMIS: Locally Interpretable Modeling using Instance-wise Subsampling

    Authors: Jinsung Yoon, Sercan O. Arik, Tomas Pfister

    Abstract: Understanding black-box machine learning models is crucial for their widespread adoption. Learning globally interpretable models is one approach, but achieving high performance with them is challenging. An alternative approach is to explain individual predictions using locally interpretable models. For locally interpretable modeling, various methods have been proposed and indeed commonly used, but…

    Submitted 21 September, 2022; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: Published in Transactions on Machine Learning Research (TMLR) - September, 2022 - https://openreview.net/forum?id=S8eABAy8P3

  40. arXiv:1909.11671  [pdf, other]

    cs.LG stat.ML

    Data Valuation using Reinforcement Learning

    Authors: Jinsung Yoon, Sercan O. Arik, Tomas Pfister

    Abstract: Quantifying the value of data is a fundamental problem in machine learning. Data valuation has multiple important use cases: (1) building insights about the learning task, (2) domain adaptation, (3) corrupted sample discovery, and (4) robust learning. To adaptively learn data values jointly with the target task predictor model, we propose a meta learning framework which we name Data Valuation usin…

    Submitted 25 September, 2019; originally announced September 2019.

    Comments: 17 pages, 12 figures, 6 tables

  41. arXiv:1908.11406  [pdf, other]

    cs.LG cs.AI

    Learning to Transfer Learn: Reinforcement Learning-Based Selection for Adaptive Transfer Learning

    Authors: Linchao Zhu, Sercan O. Arik, Yi Yang, Tomas Pfister

    Abstract: We propose a novel adaptive transfer learning framework, learning to transfer learn (L2TL), to improve performance on a target dataset by careful extraction of the related information from a source dataset. Our framework considers cooperative optimization of shared weights between models for source and target tasks, and adjusts the constituent loss weights adaptively. The adaptation of the weights…

    Submitted 16 July, 2020; v1 submitted 29 August, 2019; originally announced August 2019.

  42. arXiv:1908.07442  [pdf, other]

    cs.LG stat.ML

    TabNet: Attentive Interpretable Tabular Learning

    Authors: Sercan O. Arik, Tomas Pfister

    Abstract: We propose a novel high-performance and interpretable canonical deep tabular data learning architecture, TabNet. TabNet uses sequential attention to choose which features to reason from at each decision step, enabling interpretability and more efficient learning as the learning capacity is used for the most salient features. We demonstrate that TabNet outperforms other neural network and decision…

    Submitted 9 December, 2020; v1 submitted 20 August, 2019; originally announced August 2019.

  43. arXiv:1902.06292  [pdf, other

    cs.LG cs.CV

    ProtoAttend: Attention-Based Prototypical Learning

    Authors: Sercan O. Arik, Tomas Pfister

    Abstract: We propose a novel inherently interpretable machine learning method that bases decisions on few relevant examples that we call prototypes. Our method, ProtoAttend, can be integrated into a wide range of neural network architectures including pre-trained models. It utilizes an attention mechanism that relates the encoded representations to samples in order to determine prototypes. The resulting mod…

    Submitted 25 September, 2019; v1 submitted 17 February, 2019; originally announced February 2019.

  44. arXiv:1808.06719  [pdf, other]

    cs.SD cs.LG eess.AS

    Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks

    Authors: Sercan O. Arik, Heewoo Jun, Gregory Diamos

    Abstract: We propose the multi-head convolutional neural network (MCNN) architecture for waveform synthesis from spectrograms. Nonlinear interpolation in MCNN is employed with transposed convolution layers in parallel heads. MCNN achieves more than an order of magnitude higher compute intensity than commonly-used iterative algorithms like Griffin-Lim, yielding efficient utilization for modern multi-core pro…

    Submitted 5 November, 2018; v1 submitted 20 August, 2018; originally announced August 2018.

  45. arXiv:1806.07912  [pdf, other]

    cs.NE cs.AI

    Resource-Efficient Neural Architect

    Authors: Yanqi Zhou, Siavash Ebrahimi, Sercan Ö. Arık, Haonan Yu, Hairong Liu, Greg Diamos

    Abstract: Neural Architecture Search (NAS) is a laborious process. Prior work on automated NAS targets mainly on improving accuracy, but lacks consideration of computational resource use. We propose the Resource-Efficient Neural Architect (RENA), an efficient resource-constrained NAS using reinforcement learning with network embedding. RENA uses a policy network to process the network embeddings to generate…

    Submitted 12 June, 2018; originally announced June 2018.

  46. arXiv:1802.06006  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Neural Voice Cloning with a Few Samples

    Authors: Sercan O. Arik, Jitong Chen, Kainan Peng, Wei Ping, Yanqi Zhou

    Abstract: Voice cloning is a highly desired feature for personalized speech interfaces. Neural network based speech synthesis has been shown to generate high quality speech for a large number of speakers. In this paper, we introduce a neural voice cloning system that takes a few audio samples as input. We study two approaches: speaker adaptation and speaker encoding. Speaker adaptation is based on fine-tuni…

    Submitted 12 October, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

  47. arXiv:1710.07654  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG eess.AS

    Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning

    Authors: Wei Ping, Kainan Peng, Andrew Gibiansky, Sercan O. Arik, Ajay Kannan, Sharan Narang, Jonathan Raiman, John Miller

    Abstract: We present Deep Voice 3, a fully-convolutional attention-based neural text-to-speech (TTS) system. Deep Voice 3 matches state-of-the-art neural speech synthesis systems in naturalness while training ten times faster. We scale Deep Voice 3 to data set sizes unprecedented for TTS, training on more than eight hundred hours of audio from over two thousand speakers. In addition, we identify common erro…

    Submitted 22 February, 2018; v1 submitted 20 October, 2017; originally announced October 2017.

    Comments: Published as a conference paper at ICLR 2018. (v3 changed paper title)

  48. Low-complexity implementation of convex optimization-based phase retrieval

    Authors: Sercan O. Arik, Joseph M. Kahn

    Abstract: Phase retrieval has important applications in optical imaging, communications and sensing. Lifting the dimensionality of the problem allows phase retrieval to be approximated as a convex optimization problem in a higher-dimensional space. Convex optimization-based phase retrieval has been shown to yield high accuracy, yet its low-complexity implementation has not been explored. In this paper, we s…

    Submitted 19 March, 2018; v1 submitted 18 July, 2017; originally announced July 2017.

  49. arXiv:1703.05390  [pdf]

    cs.CL cs.AI cs.LG

    Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting

    Authors: Sercan O. Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Chris Fougner, Ryan Prenger, Adam Coates

    Abstract: Keyword spotting (KWS) constitutes a major component of human-technology interfaces. Maximizing the detection accuracy at a low false alarm (FA) rate, while minimizing the footprint size, latency and complexity are the goals for KWS. Towards achieving them, we study Convolutional Recurrent Neural Networks (CRNNs). Inspired by large-scale state-of-the-art speech recognition systems, we combine the…

    Submitted 4 July, 2017; v1 submitted 15 March, 2017; originally announced March 2017.

    Comments: Accepted to Interspeech 2017

  50. arXiv:1702.07825  [pdf, other]

    cs.CL cs.LG cs.NE cs.SD

    Deep Voice: Real-time Neural Text-to-Speech

    Authors: Sercan O. Arik, Mike Chrzanowski, Adam Coates, Gregory Diamos, Andrew Gibiansky, Yongguo Kang, Xian Li, John Miller, Andrew Ng, Jonathan Raiman, Shubho Sengupta, Mohammad Shoeybi

    Abstract: We present Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks. Deep Voice lays the groundwork for truly end-to-end neural speech synthesis. The system comprises five major building blocks: a segmentation model for locating phoneme boundaries, a grapheme-to-phoneme conversion model, a phoneme duration prediction model, a fundamental frequency predi…

    Submitted 7 March, 2017; v1 submitted 24 February, 2017; originally announced February 2017.

    Comments: Submitted to ICML 2017