Showing 1–41 of 41 results for author: Astudillo, R

Searching in archive cs.
  1. arXiv:2406.20062  [pdf, other]

    cs.LG stat.ML

    Cost-aware Bayesian optimization via the Pandora's Box Gittins index

    Authors: Qian Xie, Raul Astudillo, Peter Frazier, Ziv Scully, Alexander Terenin

    Abstract: Bayesian optimization is a technique for efficiently optimizing unknown functions in a black-box manner. To handle practical settings where gathering data requires use of finite resources, it is desirable to explicitly incorporate function evaluation costs into Bayesian optimization policies. To understand how to do so, we develop a previously-unexplored connection between cost-aware Bayesian opti…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2406.14699  [pdf, other]

    cs.LG math.OC stat.ML

    Preferential Multi-Objective Bayesian Optimization

    Authors: Raul Astudillo, Kejun Li, Maegan Tucker, Chu Xin Cheng, Aaron D. Ames, Yisong Yue

    Abstract: Preferential Bayesian optimization (PBO) is a framework for optimizing a decision-maker's latent preferences over available design choices. While preferences often involve multiple conflicting objectives, existing work in PBO assumes that preferences can be encoded by a single objective function. For example, in robotic assistive devices, technicians often attempt to maximize user comfort while si…

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2403.00827  [pdf, other]

    cs.CL cs.AI cs.LG

    Self-Refinement of Language Models from External Proxy Metrics Feedback

    Authors: Keshav Ramji, Young-Suk Lee, Ramón Fernandez Astudillo, Md Arafat Sultan, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos

    Abstract: It is often desirable for Large Language Models (LLMs) to capture multiple objectives when providing a response. In document-grounded response generation, for example, agent responses are expected to be relevant to a user's query while also being grounded in a given document. In this paper, we introduce Proxy Metric-based Self-Refinement (ProMiSe), which enables an LLM to refine its own initial re…

    Submitted 27 February, 2024; originally announced March 2024.

  4. arXiv:2402.11770  [pdf, other]

    cs.CL

    Structured Chain-of-Thought Prompting for Few-Shot Generation of Content-Grounded QA Conversations

    Authors: Md Arafat Sultan, Jatin Ganhotra, Ramón Fernandez Astudillo

    Abstract: We introduce a structured chain-of-thought (SCoT) prompting approach to generating content-grounded multi-turn question-answer conversations using a pre-trained large language model (LLM). At the core of our proposal is a structured breakdown of the complex task into a number of states in a state machine, so that actions corresponding to various subtasks, e.g., content reading and utterance genera…

    Submitted 19 February, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  5. arXiv:2402.02479  [pdf, other]

    cs.LG cs.AI cs.CL cs.HC

    BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback

    Authors: Gaurav Pandey, Yatin Nandwani, Tahira Naseem, Mayank Mishra, Guangxuan Xu, Dinesh Raghu, Sachindra Joshi, Asim Munawar, Ramón Fernandez Astudillo

    Abstract: Distribution matching methods for language model alignment such as Generation with Distributional Control (GDC) and Distributional Policy Gradient (DPG) have not received the same level of attention in reinforcement learning from human feedback (RLHF) as contrastive methods such as Sequence Likelihood Calibration (SLiC), Direct Preference Optimization (DPO) and its variants. We identify high varia…

    Submitted 10 June, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted at ICML 2024 (main conference)

  6. arXiv:2311.02146  [pdf, other]

    stat.ML cs.LG math.OC

    Bayesian Optimization of Function Networks with Partial Evaluations

    Authors: Poompol Buathong, Jiayue Wan, Raul Astudillo, Samuel Daulton, Maximilian Balandat, Peter I. Frazier

    Abstract: Bayesian optimization is a powerful framework for optimizing functions that are expensive or time-consuming to evaluate. Recent work has considered Bayesian optimization of function networks (BOFN), where the objective function is given by a network of functions, each taking as input the output of previous nodes in the network as well as additional parameters. Leveraging this network structure has…

    Submitted 12 June, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: 34 pages, 15 figures, 3 tables

  7. arXiv:2310.13961  [pdf, other]

    cs.CL cs.AI

    Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

    Authors: Young-Suk Lee, Md Arafat Sultan, Yousef El-Kurdi, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos, Ramón Fernandez Astudillo

    Abstract: Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision. One limitation of these approaches is that they resort to very large language models (around 175B parameters) that are also proprietary and non-public. Here we exp…

    Submitted 21 October, 2023; originally announced October 2023.

    Journal ref: EMNLP 2023

  8. arXiv:2310.08535  [pdf, other]

    cs.AI cs.CL

    Formally Specifying the High-Level Behavior of LLM-Based Agents

    Authors: Maxwell Crouse, Ibrahim Abdelaziz, Ramon Astudillo, Kinjal Basu, Soham Dan, Sadhana Kumaravel, Achille Fokoue, Pavan Kapanipathi, Salim Roukos, Luis Lastras

    Abstract: Autonomous, goal-driven agents powered by LLMs have recently emerged as promising tools for solving challenging problems without the need for task-specific finetuned models that can be expensive to procure. Currently, the design and implementation of such agents is ad hoc, as the wide variety of tasks that LLM-based agents may be applied to naturally means there can be no one-size-fits-all approac…

    Submitted 24 January, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Preprint under review

  9. arXiv:2305.20018  [pdf, other]

    cs.CL cs.AI

    Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency

    Authors: Maxwell Crouse, Ramon Astudillo, Tahira Naseem, Subhajit Chaudhury, Pavan Kapanipathi, Salim Roukos, Alexander Gray

    Abstract: We introduce Logical Offline Cycle Consistency Optimization (LOCCO), a scalable, semi-supervised method for training a neural semantic parser. Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text that are then used as new supervision. To increase the quality of annotations, our method utilizes a coun…

    Submitted 31 May, 2023; originally announced May 2023.

  10. arXiv:2305.17273  [pdf, other]

    cs.CL cs.AI

    Slide, Constrain, Parse, Repeat: Synchronous Sliding Windows for Document AMR Parsing

    Authors: Sadhana Kumaravel, Tahira Naseem, Ramon Fernandez Astudillo, Radu Florian, Salim Roukos

    Abstract: The sliding window approach provides an elegant way to handle contexts of sizes larger than the Transformer's input window, for tasks like language modeling. Here we extend this approach to the sequence-to-sequence task of document parsing. For this, we exploit recent progress in transition-based parsing to implement a parser with synchronous sliding windows over source and target. We develop an o…

    Submitted 26 May, 2023; originally announced May 2023.

  11. arXiv:2305.04346  [pdf, other]

    cs.CL cs.AI

    Laziness Is a Virtue When It Comes to Compositionality in Neural Semantic Parsing

    Authors: Maxwell Crouse, Pavan Kapanipathi, Subhajit Chaudhury, Tahira Naseem, Ramon Astudillo, Achille Fokoue, Tim Klinger

    Abstract: Nearly all general-purpose neural semantic parsers generate logical forms in a strictly top-down autoregressive fashion. Though such systems have achieved impressive results across a variety of datasets and domains, recent works have called into question whether they are ultimately limited in their ability to compositionally generalize. In this work, we approach semantic parsing from, quite litera…

    Submitted 7 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL main conference

  12. arXiv:2304.12272  [pdf, other]

    cs.CL cs.AI

    AMR Parsing with Instruction Fine-tuned Pre-trained Language Models

    Authors: Young-Suk Lee, Ramón Fernandez Astudillo, Radu Florian, Tahira Naseem, Salim Roukos

    Abstract: Language models instruction fine-tuned on a collection of instruction-annotated datasets (FLAN) have proven highly effective at improving model performance and generalization to unseen tasks. However, a majority of standard parsing tasks, including abstract meaning representation (AMR), universal dependency (UD), and semantic role labeling (SRL), have been excluded from the FLAN collections for both model t…

    Submitted 24 April, 2023; originally announced April 2023.

  13. arXiv:2303.15746  [pdf, other]

    cs.LG stat.ML

    qEUBO: A Decision-Theoretic Acquisition Function for Preferential Bayesian Optimization

    Authors: Raul Astudillo, Zhiyuan Jerry Lin, Eytan Bakshy, Peter I. Frazier

    Abstract: Preferential Bayesian optimization (PBO) is a framework for optimizing a decision maker's latent utility function using preference feedback. This work introduces the expected utility of the best option (qEUBO) as a novel acquisition function for PBO. When the decision maker's responses are noise-free, we show that qEUBO is one-step Bayes optimal and thus equivalent to the popular knowledge gradien…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: In Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023

  14. arXiv:2205.01464  [pdf, other]

    cs.CL

    Inducing and Using Alignments for Transition-based AMR Parsing

    Authors: Andrew Drozdov, Jiawei Zhou, Radu Florian, Andrew McCallum, Tahira Naseem, Yoon Kim, Ramon Fernandez Astudillo

    Abstract: Transition-based parsers for Abstract Meaning Representation (AMR) rely on node-to-word alignments. These alignments are learned separately from parser training and require a complex pipeline of rule-based components, pre-processing, and post-processing to satisfy domain-specific constraints. Parsers also train on a point-estimate of the alignment pipeline, neglecting the uncertainty due to the in…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022

  15. arXiv:2203.11382  [pdf, other]

    cs.LG math.OC stat.ML

    Preference Exploration for Efficient Bayesian Optimization with Multiple Outcomes

    Authors: Zhiyuan Jerry Lin, Raul Astudillo, Peter I. Frazier, Eytan Bakshy

    Abstract: We consider Bayesian optimization of expensive-to-evaluate experiments that generate vector-valued outcomes over which a decision-maker (DM) has preferences. These preferences are encoded by a utility function that is not known in closed form but can be estimated by asking the DM to express preferences over pairs of outcome vectors. To address this problem, we develop Bayesian optimization with pr…

    Submitted 21 March, 2022; originally announced March 2022.

    Journal ref: AISTATS 2022

  16. arXiv:2201.00272  [pdf, other]

    cs.LG math.OC stat.ML

    Thinking inside the box: A tutorial on grey-box Bayesian optimization

    Authors: Raul Astudillo, Peter I. Frazier

    Abstract: Bayesian optimization (BO) is a framework for global optimization of expensive-to-evaluate objective functions. Classical BO methods assume that the objective function is a black box. However, internal information about objective function computation is often available. For example, when optimizing a manufacturing line's throughput with simulation, we observe the number of parts waiting at each wo…

    Submitted 1 January, 2022; originally announced January 2022.

    Comments: Published as an advanced tutorial in the proceedings of the 2021 Winter Simulation Conference

  17. arXiv:2112.15311  [pdf, other]

    cs.LG math.OC stat.ML

    Bayesian Optimization of Function Networks

    Authors: Raul Astudillo, Peter I. Frazier

    Abstract: We consider Bayesian optimization of the output of a network of functions, where each function takes as input the output of its parent nodes, and where the network takes significant time to evaluate. Such problems arise, for example, in reinforcement learning, engineering design, and manufacturing. While the standard Bayesian optimization approach observes only the final output, our approach deliv…

    Submitted 31 December, 2021; originally announced December 2021.

    Comments: In Advances in Neural Information Processing Systems, 2021

  18. arXiv:2112.08513  [pdf, other]

    cs.CL

    DocAMR: Multi-Sentence AMR Representation and Evaluation

    Authors: Tahira Naseem, Austin Blodgett, Sadhana Kumaravel, Tim O'Gorman, Young-Suk Lee, Jeffrey Flanigan, Ramón Fernandez Astudillo, Radu Florian, Salim Roukos, Nathan Schneider

    Abstract: Despite extensive research on parsing of English sentences into Abstract Meaning Representation (AMR) graphs, which are compared to gold graphs via the Smatch metric, full-document parsing into a unified graph representation lacks well-defined representation and evaluation. Taking advantage of a super-sentential level of coreference annotation from previous work, we introduce a simple algorithm…

    Submitted 6 May, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    MSC Class: I.2.7

  19. arXiv:2112.07877  [pdf, other]

    cs.CL

    Learning to Transpile AMR into SPARQL

    Authors: Mihaela Bornea, Ramon Fernandez Astudillo, Tahira Naseem, Nandana Mihindukulasooriya, Ibrahim Abdelaziz, Pavan Kapanipathi, Radu Florian, Salim Roukos

    Abstract: We propose a transition-based system to transpile Abstract Meaning Representation (AMR) into SPARQL for Knowledge Base Question Answering (KBQA). This allows us to delegate part of the semantic representation to a strongly pre-trained semantic parser, while learning transpiling with a small amount of paired data. We depart from recent work relating AMR and SPARQL constructs, but rather than applying…

    Submitted 8 December, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

  20. arXiv:2112.07790  [pdf, ps, other]

    cs.CL cs.AI

    Maximum Bayes Smatch Ensemble Distillation for AMR Parsing

    Authors: Young-Suk Lee, Ramon Fernandez Astudillo, Thanh Lam Hoang, Tahira Naseem, Radu Florian, Salim Roukos

    Abstract: AMR parsing has experienced an unprecedented increase in performance in the last three years, due to a mixture of effects including architecture improvements and transfer learning. Self-learning techniques have also played a role in pushing performance forward. However, for the most recent high-performing parsers, the effect of self-learning and silver data augmentation seems to be fading. In this pa…

    Submitted 2 May, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Journal ref: NAACL-HLT 2022

  21. arXiv:2111.06537  [pdf, other]

    cs.LG math.OC stat.ML

    Multi-Step Budgeted Bayesian Optimization with Unknown Evaluation Costs

    Authors: Raul Astudillo, Daniel R. Jiang, Maximilian Balandat, Eytan Bakshy, Peter I. Frazier

    Abstract: Bayesian optimization (BO) is a sample-efficient approach to optimizing costly-to-evaluate black-box functions. Most BO methods ignore how evaluation costs may vary over the optimization domain. However, these costs can be highly heterogeneous and are often unknown in advance. This occurs in many practical settings, such as hyperparameter tuning of machine learning algorithms or physics-based simu…

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: In Advances in Neural Information Processing Systems, 2021

  22. arXiv:2110.15534  [pdf, other]

    cs.CL

    Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing

    Authors: Jiawei Zhou, Tahira Naseem, Ramón Fernandez Astudillo, Young-Suk Lee, Radu Florian, Salim Roukos

    Abstract: Predicting linearized Abstract Meaning Representation (AMR) graphs using pre-trained sequence-to-sequence Transformer models has recently led to large improvements on AMR parsing benchmarks. These parsers are simple and avoid explicit modeling of structure but lack desirable properties such as graph well-formedness guarantees or built-in graph-sentence alignments. In this work we explore the integ…

    Submitted 29 October, 2021; originally announced October 2021.

    Comments: EMNLP 2021 main conference

  23. arXiv:2110.09131  [pdf, other]

    cs.CL cs.AI

    Ensembling Graph Predictions for AMR Parsing

    Authors: Hoang Thanh Lam, Gabriele Picco, Yufang Hou, Young-Suk Lee, Lam M. Nguyen, Dzung T. Phan, Vanessa López, Ramon Fernandez Astudillo

    Abstract: In many machine learning tasks, models are trained to predict structured data such as graphs. For example, in natural language processing, it is very common to parse texts into dependency trees or abstract meaning representation (AMR) graphs. On the other hand, ensemble methods combine predictions from multiple models to create a new one that is more robust and accurate than individual predictions.…

    Submitted 24 January, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

    Comments: Published at NeurIPS 2021

  24. arXiv:2108.00104  [pdf, other]

    cs.CL

    Structural Guidance for Transformer Language Models

    Authors: Peng Qian, Tahira Naseem, Roger Levy, Ramón Fernandez Astudillo

    Abstract: Transformer-based language models pre-trained on large amounts of text data have proven remarkably successful in learning generic transferable linguistic representations. Here we study whether structural guidance leads to more human-like systematic linguistic generalization in Transformer language models without resorting to pre-training on very large amounts of data. We explore two general ideas.…

    Submitted 30 July, 2021; originally announced August 2021.

    Comments: To be issued as paper revision for ACL 2021

  25. arXiv:2104.14674  [pdf, other]

    cs.CL

    AMR Parsing with Action-Pointer Transformer

    Authors: Jiawei Zhou, Tahira Naseem, Ramón Fernandez Astudillo, Radu Florian

    Abstract: Abstract Meaning Representation parsing is a sentence-to-graph prediction task where target nodes are not explicitly aligned to sentence tokens. However, since graph nodes are semantically based on one or more sentence tokens, implicit alignments can be derived. Transition-based parsers operate over the sentence from left to right, capturing this inductive bias via alignments at the cost of limite…

    Submitted 18 May, 2021; v1 submitted 29 April, 2021; originally announced April 2021.

    Comments: Accepted at NAACL 2021

  26. arXiv:2104.07474  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD

    EAT: Enhanced ASR-TTS for Self-supervised Speech Recognition

    Authors: Murali Karthick Baskar, Lukáš Burget, Shinji Watanabe, Ramon Fernandez Astudillo, Jan "Honza" Černocký

    Abstract: Self-supervised ASR-TTS models suffer in out-of-domain data conditions. Here we propose an enhanced ASR-TTS (EAT) model that incorporates two main features: 1) The ASR$\rightarrow$TTS direction is equipped with a language model reward to penalize the ASR hypotheses before forwarding them to TTS. 2) In the TTS$\rightarrow$ASR direction, a hyper-parameter is introduced to scale the attention context f…

    Submitted 13 April, 2021; originally announced April 2021.

  27. arXiv:2102.02189  [pdf, other]

    cs.CL cs.AI

    Bootstrapping Multilingual AMR with Contextual Word Alignments

    Authors: Janaki Sheth, Young-Suk Lee, Ramon Fernandez Astudillo, Tahira Naseem, Radu Florian, Salim Roukos, Todd Ward

    Abstract: We develop high performance multilingual Abstract Meaning Representation (AMR) systems by projecting English AMR annotations to other languages with weak supervision. We achieve this goal by bootstrapping transformer-based multilingual word embeddings, in particular those from cross-lingual RoBERTa (XLM-R large). We develop a novel technique for foreign-text-to-English AMR alignment, using the contex…

    Submitted 3 February, 2021; originally announced February 2021.

    Journal ref: EACL 2021

  28. arXiv:2012.01707  [pdf, other]

    cs.CL cs.AI

    Leveraging Abstract Meaning Representation for Knowledge Base Question Answering

    Authors: Pavan Kapanipathi, Ibrahim Abdelaziz, Srinivas Ravishankar, Salim Roukos, Alexander Gray, Ramon Astudillo, Maria Chang, Cristina Cornelio, Saswati Dana, Achille Fokoue, Dinesh Garg, Alfio Gliozzo, Sairam Gurajada, Hima Karanam, Naweed Khan, Dinesh Khandelwal, Young-Suk Lee, Yunyao Li, Francois Luus, Ndivhuwo Makondo, Nandana Mihindukulasooriya, Tahira Naseem, Sumit Neelam, Lucian Popa, Revanth Reddy, et al. (5 additional authors not shown)

    Abstract: Knowledge base question answering (KBQA) is an important task in Natural Language Processing. Existing approaches face significant challenges including complex question understanding, necessity for reasoning, and lack of large end-to-end training datasets. In this work, we propose Neuro-Symbolic Question Answering (NSQA), a modular KBQA system, that leverages (1) Abstract Meaning Representation (AM…

    Submitted 2 June, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: Accepted to Findings of ACL

  29. arXiv:2010.10673  [pdf, other]

    cs.CL

    Pushing the Limits of AMR Parsing with Self-Learning

    Authors: Young-Suk Lee, Ramon Fernandez Astudillo, Tahira Naseem, Revanth Gangi Reddy, Radu Florian, Salim Roukos

    Abstract: Abstract Meaning Representation (AMR) parsing has experienced a notable growth in performance in the last two years, due both to the impact of transfer learning and the development of novel architectures specific to AMR. At the same time, self-learning techniques have helped push the performance boundaries of other natural language processing applications, such as machine translation or question a…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted to Findings of EMNLP2020, open review https://openreview.net/forum?id=4q5-oJgLiO, code https://github.com/IBM/transition-amr-parser

  30. arXiv:2010.10669  [pdf, other]

    cs.CL

    Transition-based Parsing with Stack-Transformers

    Authors: Ramon Fernandez Astudillo, Miguel Ballesteros, Tahira Naseem, Austin Blodgett, Radu Florian

    Abstract: Modeling the parser state is key to good performance in transition-based parsing. Recurrent Neural Networks considerably improved the performance of transition-based systems by modelling the global state, e.g. stack-LSTM parsers, or local state modeling of contextualized features, e.g. Bi-LSTM parsers. Given the success of Transformer architectures in recent parsing systems, this work explores mod…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted to Findings of EMNLP2020, open review https://openreview.net/forum?id=b36spsuUAde, code https://github.com/IBM/transition-amr-parser

  31. arXiv:2007.05554  [pdf, other]

    stat.ML cs.LG math.OC

    Bayesian Optimization of Risk Measures

    Authors: Sait Cakmak, Raul Astudillo, Peter Frazier, Enlu Zhou

    Abstract: We consider Bayesian optimization of objective functions of the form $ρ[ F(x, W) ]$, where $F$ is a black-box expensive-to-evaluate function and $ρ$ denotes either the VaR or CVaR risk measure, computed with respect to the randomness induced by the environmental random variable $W$. Such problems arise in decision making under uncertainty, such as in portfolio optimization and robust systems desig…

    Submitted 3 November, 2020; v1 submitted 10 July, 2020; originally announced July 2020.

    Comments: Main paper: 15 pages with 2 figures. Supplement: 14 pages with 3 figures. To appear in NeurIPS 2020

  32. arXiv:2005.09123  [pdf, ps, other]

    cs.CL cs.LG

    GPT-too: A language-model-first approach for AMR-to-text generation

    Authors: Manuel Mager, Ramon Fernandez Astudillo, Tahira Naseem, Md Arafat Sultan, Young-Suk Lee, Radu Florian, Salim Roukos

    Abstract: Abstract Meaning Representations (AMRs) are broad-coverage sentence-level semantic graphs. Existing approaches to generating text from AMR have focused on training sequence-to-sequence or graph-to-sequence models on AMR annotated data only. In this paper, we propose an alternative approach that combines a strong pre-trained language model with cycle consistency-based re-scoring. Despite the simplicity of t…

    Submitted 27 May, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

    Comments: Paper accepted to the Annual Meeting of the Association for Computational Linguistics (ACL 2020)

  33. arXiv:1911.05934  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Multi-Attribute Bayesian Optimization With Interactive Preference Learning

    Authors: Raul Astudillo, Peter I. Frazier

    Abstract: We consider black-box global optimization of time-consuming-to-evaluate functions on behalf of a decision-maker (DM) whose preferences must be learned. Each feasible design is associated with a time-consuming-to-evaluate vector of attributes and each vector of attributes is assigned a utility by the DM's utility function, which may be learned approximately using preferences expressed over pairs of…

    Submitted 3 March, 2020; v1 submitted 13 November, 2019; originally announced November 2019.

    Comments: In Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

  34. arXiv:1906.01537  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Bayesian Optimization of Composite Functions

    Authors: Raul Astudillo, Peter I. Frazier

    Abstract: We consider optimization of composite objective functions, i.e., of the form $f(x)=g(h(x))$, where $h$ is a black-box derivative-free expensive-to-evaluate function with vector-valued outputs, and $g$ is a cheap-to-evaluate real-valued function. While these problems can be solved with standard Bayesian optimization, we propose a novel approach that exploits the composite structure of the objective…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: In Proceedings of the 36th International Conference on Machine Learning, PMLR 97:354-363, 2019

    Journal ref: In Proceedings of the 36th International Conference on Machine Learning, PMLR 97:354-363, 2019

  35. arXiv:1905.01152  [pdf, ps, other]

    eess.AS cs.CL cs.IR cs.LG cs.SD

    Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text

    Authors: Murali Karthick Baskar, Shinji Watanabe, Ramon Astudillo, Takaaki Hori, Lukáš Burget, Jan Černocký

    Abstract: Sequence-to-sequence automatic speech recognition (ASR) models require large quantities of data to attain high performance. For this reason, there has been a recent surge in interest for unsupervised and semi-supervised training in such models. This work builds upon recent results showing notable improvements in semi-supervised training using cycle-consistency and related techniques. Such techniqu…

    Submitted 20 August, 2019; v1 submitted 30 April, 2019; originally announced May 2019.

    Comments: INTERSPEECH 2019

  36. arXiv:1811.01690  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Cycle-consistency training for end-to-end speech recognition

    Authors: Takaaki Hori, Ramon Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux

    Abstract: This paper presents a method to train end-to-end automatic speech recognition (ASR) models using unpaired data. Although the end-to-end approach can eliminate the need for expert knowledge such as pronunciation dictionaries to build ASR systems, it still requires a large amount of paired data, i.e., speech utterances and their transcriptions. Cycle-consistency losses have been recently proposed as…

    Submitted 22 May, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

    Comments: Submitted to ICASSP'19

  37. arXiv:1807.10893  [pdf, ps, other]

    cs.CL

    Back-Translation-Style Data Augmentation for End-to-End ASR

    Authors: Tomoki Hayashi, Shinji Watanabe, Yu Zhang, Tomoki Toda, Takaaki Hori, Ramon Astudillo, Kazuya Takeda

    Abstract: In this paper we propose a novel data augmentation method for attention-based end-to-end automatic speech recognition (E2E-ASR), utilizing a large amount of text which is not paired with speech signals. Inspired by the back-translation technique proposed in the field of machine translation, we build a neural text-to-encoder model which predicts a sequence of hidden states extracted by a pre-traine…

    Submitted 28 July, 2018; originally announced July 2018.

  38. arXiv:1701.00145  [pdf, other]

    cs.CL

    Expanding Subjective Lexicons for Social Media Mining with Embedding Subspaces

    Authors: Silvio Amir, Rámon Astudillo, Wang Ling, Paula C. Carvalho, Mário J. Silva

    Abstract: Recent approaches for sentiment lexicon induction have capitalized on pre-trained word embeddings that capture latent semantic properties. However, embeddings obtained by optimizing performance of a given task (e.g. predicting contextual words) are sub-optimal for other applications. In this paper, we address this problem by exploiting task-specific representations, induced via embedding sub-space…

    Submitted 6 January, 2017; v1 submitted 31 December, 2016; originally announced January 2017.

  39. arXiv:1609.02082  [pdf, ps, other]

    cs.LG cs.CL cs.SD

    An improved uncertainty decoding scheme with weighted samples for DNN-HMM hybrid systems

    Authors: Christian Huemmer, Ramón Fernández Astudillo, Walter Kellermann

    Abstract: In this paper, we advance a recently-proposed uncertainty decoding scheme for DNN-HMM (deep neural network - hidden Markov model) hybrid systems. This numerical sampling concept averages DNN outputs produced by a finite set of feature samples (drawn from a probabilistic distortion model) to approximate the posterior likelihoods of the context-dependent HMM states. As the main innovation, we propose a…

    Submitted 4 August, 2016; originally announced September 2016.

    Comments: 5 pages

  40. arXiv:1602.02068  [pdf, other]

    cs.CL cs.LG stat.ML

    From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification

    Authors: André F. T. Martins, Ramón Fernandez Astudillo

    Abstract: We propose sparsemax, a new activation function similar to the traditional softmax, but able to output sparse probabilities. After deriving its properties, we show how its Jacobian can be efficiently computed, enabling its use in a network trained with backpropagation. Then, we propose a new smooth and convex loss function which is the sparsemax analogue of the logistic loss. We reveal an unexpect…

    Submitted 8 February, 2016; v1 submitted 5 February, 2016; originally announced February 2016.

    Comments: Minor corrections

  41. arXiv:1508.02096  [pdf, other]

    cs.CL

    Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation

    Authors: Wang Ling, Tiago Luís, Luís Marujo, Ramón Fernandez Astudillo, Silvio Amir, Chris Dyer, Alan W. Black, Isabel Trancoso

    Abstract: We introduce a model for constructing vector representations of words by composing characters using bidirectional LSTMs. Relative to traditional word representation models that have independent vectors for each word type, our model requires only a single vector per character type and a fixed set of parameters for the compositional model. Despite the compactness of this model and, more importantly,…

    Submitted 23 May, 2016; v1 submitted 9 August, 2015; originally announced August 2015.