Showing 1–11 of 11 results for author: Ittycheriah, A

Searching in archive cs.
  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  3. arXiv:2305.10403  [pdf, other]

    cs.CL cs.AI

    PaLM 2 Technical Report

    Authors: Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, et al. (103 additional authors not shown)

    Abstract: We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on…

    Submitted 13 September, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

  4. arXiv:2005.11216  [pdf, other]

    cs.CL

    A Generative Approach to Titling and Clustering Wikipedia Sections

    Authors: Anjalie Field, Sascha Rothe, Simon Baumgartner, Cong Yu, Abe Ittycheriah

    Abstract: We evaluate the performance of transformer encoders with various decoders for information organization through a new task: generation of section headings for Wikipedia articles. Our analysis shows that decoders containing attention mechanisms over the encoder output achieve high-scoring results by generating extractive text. In contrast, a decoder without attention better facilitates semantic enco…

    Submitted 22 May, 2020; originally announced May 2020.

    Comments: Accepted to WNGT Workshop at ACL 2020

  5. arXiv:1608.02927  [pdf, other]

    cs.CL

    Temporal Attention Model for Neural Machine Translation

    Authors: Baskaran Sankaran, Haitao Mi, Yaser Al-Onaizan, Abe Ittycheriah

    Abstract: Attention-based Neural Machine Translation (NMT) models suffer from attention deficiency issues as has been observed in recent research. We propose a novel mechanism to address some of these limitations and improve the NMT attention. Specifically, our approach memorizes the alignments temporally (within each sentence) and modulates the attention with the accumulated temporal memory, as the decoder…

    Submitted 9 August, 2016; originally announced August 2016.

    Comments: 8 pages

  6. arXiv:1608.00112  [pdf, other]

    cs.CL

    Supervised Attentions for Neural Machine Translation

    Authors: Haitao Mi, Zhiguo Wang, Abe Ittycheriah

    Abstract: In this paper, we improve the attention or alignment accuracy of neural machine translation by utilizing the alignments of training sentence pairs. We simply compute the distance between the machine attentions and the "true" alignments, and minimize this cost in the training procedure. Our experiments on large-scale Chinese-to-English task show that our model improves both translation and alignmen…

    Submitted 30 July, 2016; originally announced August 2016.

    Comments: 6 pages. In Proceedings of EMNLP 2016. arXiv admin note: text overlap with arXiv:1605.03148

  7. arXiv:1605.03209  [pdf, other]

    cs.CL

    Vocabulary Manipulation for Neural Machine Translation

    Authors: Haitao Mi, Zhiguo Wang, Abe Ittycheriah

    Abstract: In order to capture rich language phenomena, neural machine translation models have to use a large vocabulary size, which requires high computing time and large memory usage. In this paper, we alleviate this issue by introducing a sentence-level or batch-level vocabulary, which is only a very small sub-set of the full output vocabulary. For each sentence or batch, we only predict the target words…

    Submitted 10 May, 2016; originally announced May 2016.

    Comments: 6 pages

  8. arXiv:1605.03148  [pdf, other]

    cs.CL

    Coverage Embedding Models for Neural Machine Translation

    Authors: Haitao Mi, Baskaran Sankaran, Zhiguo Wang, Abe Ittycheriah

    Abstract: In this paper, we enhance the attention-based neural machine translation (NMT) by adding explicit coverage embedding models to alleviate issues of repeating and dropping translations in NMT. For each source word, our model starts with a full coverage embedding vector to track the coverage status, and then keeps updating it with neural networks as the translation goes. Experiments on the large-scal…

    Submitted 29 August, 2016; v1 submitted 10 May, 2016; originally announced May 2016.

    Comments: 6 pages; In Proceedings of EMNLP 2016

  9. arXiv:1602.07019  [pdf, other]

    cs.CL

    Sentence Similarity Learning by Lexical Decomposition and Composition

    Authors: Zhiguo Wang, Haitao Mi, Abraham Ittycheriah

    Abstract: Most conventional sentence similarity methods only focus on similar parts of two input sentences, and simply ignore the dissimilar parts, which usually give us some clues and semantic meanings about the sentences. In this work, we propose a model to take into account both the similarities and dissimilarities by decomposing and composing lexical semantics over sentences. The model represents each w…

    Submitted 14 July, 2017; v1 submitted 22 February, 2016; originally announced February 2016.

    Comments: In Proceedings of Coling 2016

  10. arXiv:1602.06797  [pdf, other]

    cs.CL

    Semi-supervised Clustering for Short Text via Deep Representation Learning

    Authors: Zhiguo Wang, Haitao Mi, Abraham Ittycheriah

    Abstract: In this work, we propose a semi-supervised method for short text clustering, where we represent texts as distributed vectors with neural networks, and use a small amount of labeled data to specify our intention for clustering. We design a novel objective to combine the representation learning process and the k-means clustering process together, and optimize the objective with both labeled data and…

    Submitted 14 July, 2017; v1 submitted 22 February, 2016; originally announced February 2016.

    Comments: In Proceedings of CoNLL 2016

  11. arXiv:1507.02628  [pdf, other]

    cs.CL

    FAQ-based Question Answering via Word Alignment

    Authors: Zhiguo Wang, Abraham Ittycheriah

    Abstract: In this paper, we propose a novel word-alignment-based method to solve the FAQ-based question answering task. First, we employ a neural network model to calculate question similarity, where the word alignment between two questions is used for extracting features. Second, we design a bootstrap-based feature extraction method to extract a small set of effective lexical features. Third, we propose a…

    Submitted 9 July, 2015; originally announced July 2015.