Showing 1–15 of 15 results for author: Mamou, J

Searching in archive cs.
  1. arXiv:2405.14105  [pdf, other]

    cs.DC cs.AI cs.CL cs.LG

    Distributed Speculative Inference of Large Language Models

    Authors: Nadav Timor, Jonathan Mamou, Daniel Korat, Moshe Berchansky, Oren Pereg, Moshe Wasserblat, Tomer Galanti, Michal Gordon, David Harel

    Abstract: Accelerating the inference of large language models (LLMs) is an important challenge in artificial intelligence. This paper introduces distributed speculative inference (DSI), a novel distributed inference algorithm that is provably faster than speculative inference (SI) [leviathan2023fast, chen2023accelerating, miao2023specinfer] and traditional autoregressive inference (non-SI). Like other SI al…

    Submitted 28 June, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  2. arXiv:2405.04304  [pdf, other]

    cs.CL

    Dynamic Speculation Lookahead Accelerates Speculative Decoding of Large Language Models

    Authors: Jonathan Mamou, Oren Pereg, Daniel Korat, Moshe Berchansky, Nadav Timor, Moshe Wasserblat, Roy Schwartz

    Abstract: Speculative decoding is commonly used for reducing the inference latency of large language models. Its effectiveness depends highly on the speculation lookahead (SL), the number of tokens generated by the draft model at each iteration. In this work we show that the common practice of using the same SL for all iterations (static SL) is suboptimal. We introduce DISCO (DynamIc SpeCulation lookahead Op…

    Submitted 23 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  3. arXiv:2306.02307  [pdf, other]

    cs.CL cs.AI cs.LG

    Finding the SWEET Spot: Analysis and Improvement of Adaptive Inference in Low Resource Settings

    Authors: Daniel Rotem, Michael Hassid, Jonathan Mamou, Roy Schwartz

    Abstract: Adaptive inference is a simple method for reducing inference costs. The method works by maintaining multiple classifiers of different capacities, and allocating resources to each test instance according to its difficulty. In this work, we compare the two main approaches for adaptive inference, Early-Exit and Multi-Model, when training data is limited. First, we observe that for models with the sam…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: Proceedings of ACL 2023

  4. arXiv:2204.06271  [pdf, other]

    cs.CL cs.AI

    TangoBERT: Reducing Inference Cost by using Cascaded Architecture

    Authors: Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Roy Schwartz

    Abstract: The remarkable success of large transformer-based models such as BERT, RoBERTa and XLNet in many NLP tasks comes with a large increase in monetary and environmental cost due to their high computational load and energy consumption. In order to reduce this computational load in inference time, we present TangoBERT, a cascaded model architecture in which instances are first processed by an efficient…

    Submitted 13 April, 2022; originally announced April 2022.

  5. arXiv:2104.07578  [pdf, other]

    cs.CL

    Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models

    Authors: Matteo Alleman, Jonathan Mamou, Miguel A Del Rio, Hanlin Tang, Yoon Kim, SueYeon Chung

    Abstract: While vector-based language representations from pretrained language models have set a new standard for many NLP tasks, there is not yet a complete accounting of their inner workings. In particular, it is not entirely clear what aspects of sentence-level syntax are captured by these representations, nor how (if at all) they are built along the stacked layers of the network. In this paper, we aim t…

    Submitted 15 April, 2021; originally announced April 2021.

    Comments: 12 pages, 7 figures

  6. arXiv:2006.01095  [pdf, other]

    cs.CL cs.NE

    Emergence of Separable Manifolds in Deep Language Representations

    Authors: Jonathan Mamou, Hang Le, Miguel Del Rio, Cory Stephenson, Hanlin Tang, Yoon Kim, SueYeon Chung

    Abstract: Deep neural networks (DNNs) have shown much empirical success in solving perceptual tasks across various cognitive modalities. While they are only loosely inspired by the biological brain, recent studies report considerable similarities between representations extracted from task-optimized DNNs and neural populations in the brain. DNNs have subsequently become a popular model class to infer comput…

    Submitted 8 July, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: 9 pages. 10 figures. Accepted to ICML 2020. Included supplemental materials

  7. arXiv:1911.03243  [pdf, ps, other]

    cs.CL

    Controlled Crowdsourcing for High-Quality QA-SRL Annotation

    Authors: Paul Roit, Ayal Klein, Daniela Stepanov, Jonathan Mamou, Julian Michael, Gabriel Stanovsky, Luke Zettlemoyer, Ido Dagan

    Abstract: Question-answer driven Semantic Role Labeling (QA-SRL) was proposed as an attractive open and natural flavour of SRL, potentially attainable from laymen. Recently, a large-scale crowdsourced QA-SRL corpus and a trained parser were released. Trying to replicate the QA-SRL annotation for new texts, we found that the resulting annotations were lacking in quality, particularly in coverage, making them…

    Submitted 13 May, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

  8. arXiv:1910.09061  [pdf, other]

    eess.IV cs.CV cs.LG

    Deep Mouse: An End-to-end Auto-context Refinement Framework for Brain Ventricle and Body Segmentation in Embryonic Mice Ultrasound Volumes

    Authors: Tongda Xu, Ziming Qiu, William Das, Chuiyu Wang, Jack Langerman, Nitin Nair, Orlando Aristizabal, Jonathan Mamou, Daniel H. Turnbull, Jeffrey A. Ketterling, Yao Wang

    Abstract: High-frequency ultrasound (HFU) is well suited for imaging embryonic mice due to its noninvasive and real-time characteristics. However, manual segmentation of the brain ventricles (BVs) and body requires substantial time and expertise. This work proposes a novel deep learning based end-to-end auto-context refinement framework, consisting of two stages. The first stage produces a low resolution se…

    Submitted 29 October, 2019; v1 submitted 20 October, 2019; originally announced October 2019.

    Comments: Full Paper Submission to ISBI 2020

  9. arXiv:1909.10555  [pdf]

    eess.IV cs.CV

    Automatic Mouse Embryo Brain Ventricle & Body Segmentation and Mutant Classification From Ultrasound Data Using Deep Learning

    Authors: Ziming Qiu, Nitin Nair, Jack Langerman, Orlando Aristizabal, Jonathan Mamou, Daniel H. Turnbull, Jeffrey A. Ketterling, Yao Wang

    Abstract: High-frequency ultrasound (HFU) is well suited for imaging embryonic mice in vivo because it is non-invasive and real-time. Manual segmentation of the brain ventricles (BVs) and whole body from 3D HFU images is time-consuming and requires specialized training. This paper presents a deep-learning-based segmentation pipeline which automates several time-consuming, repetitive tasks currently performe…

    Submitted 23 September, 2019; originally announced September 2019.

    Comments: 4 pages, 6 figures, the 2019 IEEE International Ultrasonics Symposium

  10. arXiv:1909.05608  [pdf, other]

    cs.CL cs.AI

    ABSApp: A Portable Weakly-Supervised Aspect-Based Sentiment Extraction System

    Authors: Oren Pereg, Daniel Korat, Moshe Wasserblat, Jonathan Mamou, Ido Dagan

    Abstract: We present ABSApp, a portable system for weakly-supervised aspect-based sentiment extraction. The system is interpretable and user friendly and does not require labeled training data, hence can be rapidly and cost-effectively used across different domains in applied setups. The system flow includes three stages: First, it generates domain-specific aspect and opinion lexicons based on an unlabeled…

    Submitted 12 September, 2019; originally announced September 2019.

    Comments: 6 pages, demo paper at EMNLP 2019

  11. arXiv:1904.02496  [pdf, ps, other]

    cs.CL cs.IR

    Multi-Context Term Embeddings: the Use Case of Corpus-based Term Set Expansion

    Authors: Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Ido Dagan

    Abstract: In this paper, we present a novel algorithm that combines multi-context term embeddings using a neural classifier and we test this approach on the use case of corpus-based term set expansion. In addition, we present a novel and unique dataset for intrinsic evaluation of corpus-based term set expansion algorithms. We show that, over this dataset, our algorithm provides up to 5 mean average precisio…

    Submitted 10 April, 2019; v1 submitted 4 April, 2019; originally announced April 2019.

    Comments: 6 pages, RepEval 2019 (NAACL-HLT workshop)

  12. arXiv:1811.03601  [pdf]

    eess.IV cs.CV cs.LG q-bio.QM stat.ML

    Deep BV: A Fully Automated System for Brain Ventricle Localization and Segmentation in 3D Ultrasound Images of Embryonic Mice

    Authors: Ziming Qiu, Jack Langerman, Nitin Nair, Orlando Aristizabal, Jonathan Mamou, Daniel H. Turnbull, Jeffrey Ketterling, Yao Wang

    Abstract: Volumetric analysis of brain ventricle (BV) structure is a key tool in the study of central nervous system development in embryonic mice. High-frequency ultrasound (HFU) is the only non-invasive, real-time modality available for rapid volumetric imaging of embryos in utero. However, manual segmentation of the BV from HFU volumes is tedious, time-consuming, and requires specialized expertise. In th…

    Submitted 5 November, 2018; originally announced November 2018.

    Comments: IEEE Signal Processing in Medicine and Biology Symposium - 2018, 6 pages, 5 figures

    ACM Class: I.2.6; I.4.6; I.5.1; I.5.4; I.5.5; J.3

  13. arXiv:1808.08953  [pdf, other]

    cs.AI cs.CL

    Term Set Expansion based NLP Architect by Intel AI Lab

    Authors: Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Alon Eirew, Yael Green, Shira Guskin, Peter Izsak, Daniel Korat

    Abstract: We present SetExpander, a corpus-based system for expanding a seed set of terms into a more complete set of terms that belong to the same semantic class. SetExpander implements an iterative end-to-end workflow. It enables users to easily select a seed set of terms, expand it, view the expanded set, validate it, re-expand the validated set and store it, thus simplifying the extraction of domain-spec…

    Submitted 15 October, 2018; v1 submitted 27 August, 2018; originally announced August 2018.

    Comments: EMNLP 2018 System Demonstrations. arXiv admin note: substantial text overlap with arXiv:1807.10104

  14. arXiv:1807.10104  [pdf, other]

    cs.AI cs.CL

    Term Set Expansion based on Multi-Context Term Embeddings: an End-to-end Workflow

    Authors: Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Ido Dagan, Yoav Goldberg, Alon Eirew, Yael Green, Shira Guskin, Peter Izsak, Daniel Korat

    Abstract: We present SetExpander, a corpus-based system for expanding a seed set of terms into a more complete set of terms that belong to the same semantic class. SetExpander implements an iterative end-to-end workflow for term set expansion. It enables users to easily select a seed set of terms, expand it, view the expanded set, validate it, re-expand the validated set and store it, thus simplifying the e…

    Submitted 26 July, 2018; originally announced July 2018.

    Comments: COLING 2018 System Demonstration paper

    MSC Class: 68T50

    ACM Class: I.2.7

  15. arXiv:1705.07015  [pdf, other]

    cs.CV

    Segmentation of 3D High-frequency Ultrasound Images of Human Lymph Nodes Using Graph Cut with Energy Functional Adapted to Local Intensity Distribution

    Authors: Jen-wei Kuo, Jonathan Mamou, Yao Wang, Emi Saegusa-Beecroft, Junji Machi, Ernest J. Feleppa

    Abstract: Previous studies by our group have shown that three-dimensional high-frequency quantitative ultrasound methods have the potential to differentiate metastatic lymph nodes from cancer-free lymph nodes dissected from human cancer patients. To successfully perform these methods inside the lymph node parenchyma, an automatic segmentation method is highly desired to exclude the surrounding thin layer of…

    Submitted 19 May, 2017; originally announced May 2017.