
Showing 1–37 of 37 results for author: Palangi, H

  1. arXiv:2402.08225  [pdf, other]

    cs.LG

    Improving Black-box Robustness with In-Context Rewriting

    Authors: Kyle O'Brien, Nathan Ng, Isha Puri, Jorge Mendez, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi, Thomas Hartvigsen

    Abstract: Machine learning models often excel on in-distribution (ID) data but struggle with unseen out-of-distribution (OOD) inputs. Most techniques for improving OOD robustness are not applicable to settings where the model is effectively a black box, such as when the weights are frozen, retraining is costly, or the model is leveraged via an API. Test-time augmentation (TTA) is a simple post-hoc technique…

    Submitted 15 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  2. arXiv:2402.06120  [pdf, other]

    cs.CL

    Exploring Group and Symmetry Principles in Large Language Models

    Authors: Shima Imani, Hamid Palangi

    Abstract: Large Language Models (LLMs) have demonstrated impressive performance across a wide range of applications; however, assessing their reasoning capabilities remains a significant challenge. In this paper, we introduce a framework grounded in group and symmetry principles, which have played a crucial role in fields such as physics and mathematics, and offer another way to evaluate their capabilities.…

    Submitted 8 February, 2024; originally announced February 2024.

  3. arXiv:2312.02073  [pdf, other]

    cs.CL

    A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia

    Authors: Giovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kıcıman, Hamid Palangi, Barun Patra, Robert West

    Abstract: Large language models (LLMs) have an impressive ability to draw on novel information supplied in their context. Yet the mechanisms underlying this contextual grounding remain unknown, especially in situations where contextual information contradicts factual knowledge stored in the parameters, which LLMs also excel at recalling. Favoring the contextual information is critical for retrieval-augmente…

    Submitted 10 June, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: Accepted at ACL 2024 (main conference)

  4. arXiv:2311.11045  [pdf, other]

    cs.AI

    Orca 2: Teaching Small Language Models How to Reason

    Authors: Arindam Mitra, Luciano Del Corro, Shweti Mahajan, Andres Codas, Clarisse Simoes, Sahaj Agarwal, Xuxi Chen, Anastasia Razdaibiedina, Erik Jones, Kriti Aggarwal, Hamid Palangi, Guoqing Zheng, Corby Rosset, Hamed Khanpour, Ahmed Awadallah

    Abstract: Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We…

    Submitted 21 November, 2023; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: Added URL to model weights; fixed typo in author name

  5. arXiv:2310.17750  [pdf, other]

    cs.CL

    A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications

    Authors: Ahmed Magooda, Alec Helyar, Kyle Jackson, David Sullivan, Chad Atalla, Emily Sheng, Dan Vann, Richard Edgar, Hamid Palangi, Roman Lutz, Hongliang Kong, Vincent Yun, Eslam Kamal, Federico Zarfati, Hanna Wallach, Sarah Bird, Mei Chen

    Abstract: We present a framework for the automated measurement of responsible AI (RAI) metrics for large language models (LLMs) and associated products and services. Our framework for automatically measuring harms from LLMs builds on existing technical and sociotechnical expertise and leverages the capabilities of state-of-the-art LLMs, such as GPT-4. We use this framework to run through several case studie…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: This is a living document

  6. arXiv:2310.07088  [pdf, other]

    cs.CL cs.AI

    Diversity of Thought Improves Reasoning Abilities of LLMs

    Authors: Ranjita Naik, Varun Chandrasekaran, Mert Yuksekgonul, Hamid Palangi, Besmira Nushi

    Abstract: Large language models (LLMs) are documented to struggle in settings that require complex reasoning. Nevertheless, instructing the model to break down the problem into smaller reasoning steps, or ensembling various generations through modifying decoding steps boosts performance. However, these methods assume that the input prompt is fixed and expect the decoding strategies to introduce the diversit…

    Submitted 23 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  7. arXiv:2310.06827  [pdf, other]

    cs.CL cs.LG

    Teaching Language Models to Hallucinate Less with Synthetic Tasks

    Authors: Erik Jones, Hamid Palangi, Clarisse Simões, Varun Chandrasekaran, Subhabrata Mukherjee, Arindam Mitra, Ahmed Awadallah, Ece Kamar

    Abstract: Large language models (LLMs) frequently hallucinate on abstractive summarization tasks such as document-based question-answering, meeting summarization, and clinical report generation, even though all necessary information is included in context. However, optimizing LLMs to hallucinate less on these tasks is challenging, as hallucination is hard to efficiently evaluate at each optimization step. I…

    Submitted 7 November, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

  8. arXiv:2309.15129  [pdf, other]

    cs.AI cs.CL cs.LG

    Evaluating Cognitive Maps and Planning in Large Language Models with CogEval

    Authors: Ida Momennejad, Hosein Hasanbeig, Felipe Vieira, Hiteshi Sharma, Robert Osazuwa Ness, Nebojsa Jojic, Hamid Palangi, Jonathan Larson

    Abstract: Recently an influx of studies claim emergent cognitive abilities in large language models (LLMs). Yet, most rely on anecdotes, overlook contamination of training sets, or lack systematic evaluation involving multiple tasks, control conditions, multiple iterations, and statistical robustness tests. Here we make two major contributions. First, we propose CogEval, a cognitive science-inspired protoco…

    Submitted 24 September, 2023; originally announced September 2023.

  9. arXiv:2309.15098  [pdf, other]

    cs.CL cs.AI cs.LG

    Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

    Authors: Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi

    Abstract: We investigate the internal behavior of Transformer-based Large Language Models (LLMs) when they generate factually incorrect text. We propose modeling factual queries as constraint satisfaction problems and use this framework to investigate how the LLM interacts internally with factual constraints. We find a strong positive relationship between the LLM's attention to constraint tokens and the fac…

    Submitted 17 April, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

    Comments: Published at ICLR 2024

  10. arXiv:2307.10522  [pdf, other]

    cs.CL

    Gender-tuning: Empowering Fine-tuning for Debiasing Pre-trained Language Models

    Authors: Somayeh Ghanbarzadeh, Yan Huang, Hamid Palangi, Radames Cruz Moreno, Hamed Khanpour

    Abstract: Recent studies have revealed that the widely-used Pre-trained Language Models (PLMs) propagate societal biases from the large unmoderated pre-training corpora. Existing solutions require debiasing training processes and datasets for debiasing, which are resource-intensive and costly. Furthermore, these methods hurt the PLMs' performance on downstream tasks. In this study, we propose Gender-tuning,…

    Submitted 19 July, 2023; originally announced July 2023.

    Journal ref: ACL 2023

  11. arXiv:2307.10457  [pdf, other]

    cs.CL

    Improving the Reusability of Pre-trained Language Models in Real-world Applications

    Authors: Somayeh Ghanbarzadeh, Hamid Palangi, Yan Huang, Radames Cruz Moreno, Hamed Khanpour

    Abstract: The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their generalization problem, where their performance drastically decreases when evaluated on examples that differ from the training dataset, known as Out-of-Distribution (OOD)/unseen examples. This limitation arises from PLMs' reliance on spurious correlations, which work well for frequent example types but…

    Submitted 8 August, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted as a long paper and awarded Best Research Paper at IEEE IRI'23 (IEEE 24th International Conference on Information Reuse and Integration for Data Science)

  12. arXiv:2306.02707  [pdf, other]

    cs.CL cs.LG

    Orca: Progressive Learning from Complex Explanation Traces of GPT-4

    Authors: Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi, Ahmed Awadallah

    Abstract: Recent research has focused on enhancing the capability of smaller models through imitation learning, drawing on the outputs generated by large foundation models (LFMs). A number of issues impact the quality of these models, ranging from limited imitation signals from shallow LFM outputs; small scale homogeneous training data; and most notably a lack of rigorous evaluation resulting in overestimat…

    Submitted 5 June, 2023; originally announced June 2023.

  13. arXiv:2304.03916  [pdf, other]

    cs.LG cs.AI

    Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning

    Authors: Yu Yang, Besmira Nushi, Hamid Palangi, Baharan Mirzasoleiman

    Abstract: Spurious correlations that degrade model generalization or lead the model to be right for the wrong reasons are one of the main robustness concerns for real-world deployments. However, mitigating these correlations during pre-training for large-scale models can be costly and impractical, particularly for those without access to high-performance computing resources. This paper proposes a novel appr…

    Submitted 30 May, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

  14. arXiv:2303.12712  [pdf, other]

    cs.CL cs.AI

    Sparks of Artificial General Intelligence: Early experiments with GPT-4

    Authors: Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang

    Abstract: Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an earl…

    Submitted 13 April, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

  15. arXiv:2301.09211  [pdf, other]

    cs.CL cs.AI

    An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models

    Authors: Saghar Hosseini, Hamid Palangi, Ahmed Hassan Awadallah

    Abstract: Large-scale Pre-Trained Language Models (PTLMs) capture knowledge from massive human-written data which contains latent societal biases and toxic contents. In this paper, we leverage the primary task of PTLMs, i.e., language modeling, and propose a new metric to quantify manifested implicit representational harms in PTLMs towards 13 marginalized demographics. Using this metric, we conducted an emp…

    Submitted 22 January, 2023; originally announced January 2023.

    Comments: 17 pages

    ACM Class: I.2.7

  16. arXiv:2212.10015  [pdf, other]

    cs.CV cs.AI cs.CL

    Benchmarking Spatial Relationships in Text-to-Image Generation

    Authors: Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang

    Abstract: Spatial understanding is a fundamental aspect of computer vision and integral for human-level reasoning about images, making it an important component for grounded language understanding. While recent text-to-image synthesis (T2I) models have shown unprecedented improvements in photorealism, it is unclear whether they have reliable spatial understanding capabilities. We investigate the ability of…

    Submitted 27 October, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: preprint; Code and Data at https://github.com/microsoft/VISOR and https://huggingface.co/datasets/tgokhale/sr2d_visor

  17. arXiv:2211.11109  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Deep Learning on a Healthy Data Diet: Finding Important Examples for Fairness

    Authors: Abdelrahman Zayed, Prasanna Parthasarathi, Goncalo Mordido, Hamid Palangi, Samira Shabanian, Sarath Chandar

    Abstract: Data-driven predictive solutions predominant in commercial applications tend to suffer from biases and stereotypes, which raises equity concerns. Prediction models may discover, use, or amplify spurious correlations based on gender or other protected personal characteristics, thus discriminating against marginalized groups. Mitigating gender bias has become an important research focus in natural l…

    Submitted 24 November, 2022; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: In Proceedings of AAAI 2023

  18. arXiv:2211.11031  [pdf, other]

    cs.LG

    Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors

    Authors: Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi

    Abstract: Deployed language models decay over time due to shifting inputs, changing user needs, or emergent world-knowledge gaps. When such problems are identified, we want to make targeted edits while avoiding expensive retraining. However, current model editors, which modify such behaviors of pre-trained models, degrade model performance quickly across multiple, sequential edits. We propose GRACE, a lifel…

    Submitted 17 October, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Accepted to NeurIPS 2023

  19. arXiv:2211.04364  [pdf, other]

    cs.CL

    NaturalAdversaries: Can Naturalistic Adversaries Be as Effective as Artificial Adversaries?

    Authors: Saadia Gabriel, Hamid Palangi, Yejin Choi

    Abstract: While a substantial body of prior work has explored adversarial example generation for natural language understanding tasks, these examples are often unrealistic and diverge from the real-world data distributions. In this work, we introduce a two-stage adversarial example generation framework (NaturalAdversaries), for designing adversaries that are effective at fooling a given classifier and demon…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: Findings of EMNLP 2022

  20. arXiv:2208.06061  [pdf, other]

    cs.CL

    Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages

    Authors: Paul Soulos, Sudha Rao, Caitlin Smith, Eric Rosen, Asli Celikyilmaz, R. Thomas McCoy, Yichen Jiang, Coleman Haley, Roland Fernandez, Hamid Palangi, Jianfeng Gao, Paul Smolensky

    Abstract: Machine translation has seen rapid progress with the advent of Transformer-based models. These models have no explicit linguistic structure built into them, yet they may still implicitly learn structured relationships by attending to relevant tokens. We hypothesize that this structural learning could be made more robust by explicitly endowing Transformers with a structural bias, and we investigate…

    Submitted 11 August, 2022; originally announced August 2022.

    Comments: Revised edition to 4th Workshop on Technologies for MT of Low Resource Languages

    Journal ref: Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages (LoResMT2021)

  21. arXiv:2207.02159  [pdf, other]

    cs.CV cs.MM

    Robustness Analysis of Video-Language Models Against Visual and Language Perturbations

    Authors: Madeline C. Schiappa, Shruti Vyas, Hamid Palangi, Yogesh S. Rawat, Vibhav Vineet

    Abstract: Joint visual and language modeling on large-scale datasets has recently shown good progress in multi-modal tasks when compared to single modal learning. However, robustness of these approaches against real-world perturbations has not been studied. In this work, we perform the first extensive robustness study of video-language models against various real-world perturbations. We focus on text-to-vid…

    Submitted 18 July, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: NeurIPS 2022 Datasets and Benchmarks Track. The project's webpage is located at https://bit.ly/3CNOly4

    Journal ref: Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (2022)

  22. arXiv:2207.01398  [pdf, other]

    cs.CV eess.IV

    Large-scale Robustness Analysis of Video Action Recognition Models

    Authors: Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh Rawat

    Abstract: We have seen great progress in video action recognition in recent years. There are several models based on convolutional neural networks (CNNs) and some recent transformer-based approaches which provide top performance on existing benchmarks. In this work, we perform a large-scale robustness analysis of these existing models for video action recognition. We focus on robustness against real-world d…

    Submitted 7 April, 2023; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted in 2023 Conference on Computer Vision and Pattern Recognition (CVPR)

  23. arXiv:2203.09509  [pdf, other]

    cs.CL

    ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection

    Authors: Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, Maarten Sap, Dipankar Ray, Ece Kamar

    Abstract: Toxic language detection systems often falsely flag text that contains minority group mentions as toxic, as those groups are often the targets of online hate. Such over-reliance on spurious correlations also causes systems to struggle with detecting implicitly toxic language. To help mitigate these issues, we create ToxiGen, a new large-scale and machine-generated dataset of 274k toxic and benign…

    Submitted 14 July, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Published as a long paper at ACL 2022. Code: https://github.com/microsoft/TOXIGEN

  24. arXiv:2106.01317  [pdf, other]

    cs.CL cs.AI cs.LG

    Enriching Transformers with Structured Tensor-Product Representations for Abstractive Summarization

    Authors: Yichen Jiang, Asli Celikyilmaz, Paul Smolensky, Paul Soulos, Sudha Rao, Hamid Palangi, Roland Fernandez, Caitlin Smith, Mohit Bansal, Jianfeng Gao

    Abstract: Abstractive summarization, the task of generating a concise summary of input documents, requires: (1) reasoning over the source document to determine the salient pieces of information scattered across the long document, and (2) composing a cohesive text by reconstructing these salient facts into a shorter summary that faithfully reflects the complex relations connecting these facts. In this paper,…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: NAACL 2021 (14 pages)

  25. arXiv:2105.08961  [pdf, other]

    cs.LG cs.AI cs.CL

    Compositional Processing Emerges in Neural Networks Solving Math Problems

    Authors: Jacob Russin, Roland Fernandez, Hamid Palangi, Eric Rosen, Nebojsa Jojic, Paul Smolensky, Jianfeng Gao

    Abstract: A longstanding question in cognitive science concerns the learning mechanisms underlying compositionality in human cognition. Humans can infer the structured relationships (e.g., grammatical rules) implicit in their sensory observations (e.g., auditory speech), and use this knowledge to guide the composition of simpler meanings into complex wholes. Recent progress in artificial neural networks has…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: 7 pages, 2 figures, Accepted to CogSci 2021 for poster presentation

  26. arXiv:2011.09530  [pdf, other]

    cs.CV cs.AI eess.IV

    Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language

    Authors: Hassan Akbari, Hamid Palangi, Jianwei Yang, Sudha Rao, Asli Celikyilmaz, Roland Fernandez, Paul Smolensky, Jianfeng Gao, Shih-Fu Chang

    Abstract: Neuro-symbolic representations have proved effective in learning structure information in vision and language. In this paper, we propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning. Our approach uses a dictionary learning-based method of learning relations between videos and their paired text descriptions. We refer to these relations as rel…

    Submitted 18 November, 2020; originally announced November 2020.

  27. arXiv:2006.11524  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE cs.SC stat.ML

    Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"

    Authors: Saeed Amizadeh, Hamid Palangi, Oleksandr Polozov, Yichen Huang, Kazuhito Koishida

    Abstract: Visual reasoning tasks such as visual question answering (VQA) require an interplay of visual perception with reasoning about the question semantics grounded in perception. However, recent advances in this area are still primarily driven by perception improvements (e.g. scene graph generation) rather than reasoning. Neuro-symbolic models such as Neural Module Networks bring the benefits of composi…

    Submitted 25 August, 2020; v1 submitted 20 June, 2020; originally announced June 2020.

    Comments: Published in Proceedings of the 37th International Conference on Machine Learning (ICML), Online, PMLR 119, 2020

  28. arXiv:2005.11406  [pdf, other]

    cs.CV

    Novel Human-Object Interaction Detection via Adversarial Domain Generalization

    Authors: Yuhang Song, Wenbo Li, Lei Zhang, Jianwei Yang, Emre Kiciman, Hamid Palangi, Jianfeng Gao, C. -C. Jay Kuo, Pengchuan Zhang

    Abstract: We study in this paper the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios. The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations. As a result, most existing HOI methods heavily re…

    Submitted 22 May, 2020; originally announced May 2020.

  29. arXiv:1910.12647  [pdf, other]

    cs.CL cs.LG stat.ML

    HUBERT Untangles BERT to Improve Transfer across NLP Tasks

    Authors: Mehrad Moradshahi, Hamid Palangi, Monica S. Lam, Paul Smolensky, Jianfeng Gao

    Abstract: We introduce HUBERT which combines the structured-representational power of Tensor-Product Representations (TPRs) and BERT, a pre-trained bidirectional Transformer language model. We show that there is shared structure between different NLP datasets that HUBERT, but not BERT, is able to learn and leverage. We validate the effectiveness of our model on the GLUE benchmark and HANS dataset. Our exper…

    Submitted 25 April, 2021; v1 submitted 25 October, 2019; originally announced October 2019.

  30. arXiv:1910.02339  [pdf, other]

    cs.CL cs.LG

    Mapping Natural-language Problems to Formal-language Solutions Using Structured Neural Representations

    Authors: Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Kenneth D. Forbus, Jianfeng Gao

    Abstract: Generating formal-language programs represented by relational tuples, such as Lisp programs or mathematical operations, to solve problems stated in natural language is a challenging task because it requires explicitly capturing discrete symbolic structural information implicit in the input. However, most general neural sequence models do not explicitly capture such structural information, limiting…

    Submitted 1 August, 2020; v1 submitted 5 October, 2019; originally announced October 2019.

  31. arXiv:1909.11059  [pdf, other]

    cs.CV

    Unified Vision-Language Pre-Training for Image Captioning and VQA

    Authors: Luowei Zhou, Hamid Palangi, Lei Zhang, Houdong Hu, Jason J. Corso, Jianfeng Gao

    Abstract: This paper presents a unified Vision-Language Pre-training (VLP) model. The model is unified in that (1) it can be fine-tuned for either vision-language generation (e.g., image captioning) or understanding (e.g., visual question answering) tasks, and (2) it uses a shared multi-layer transformer network for both encoding and decoding, which differs from many existing methods where the encoder and d…

    Submitted 4 December, 2019; v1 submitted 24 September, 2019; originally announced September 2019.

    Comments: AAAI 2020 camera-ready version. The code and the pre-trained models are available at https://github.com/LuoweiZhou/VLP

  32. arXiv:1909.09953  [pdf, other]

    cs.CV cs.AI

    Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators

    Authors: Kuang-Huei Lee, Hamid Palangi, Xi Chen, Houdong Hu, Jianfeng Gao

    Abstract: Grounding language to visual relations is critical to various language-and-vision applications. In this work, we tackle two fundamental language-and-vision tasks: image-text matching and image captioning, and demonstrate that neural scene graph generators can learn effective visual relation features to facilitate grounding language to visual relations and subsequently improve the two end applicati…

    Submitted 22 September, 2019; originally announced September 2019.

  33. arXiv:1705.08432  [pdf, other]

    cs.CL

    Question-Answering with Grammatically-Interpretable Representations

    Authors: Hamid Palangi, Paul Smolensky, Xiaodong He, Li Deng

    Abstract: We introduce an architecture, the Tensor Product Recurrent Network (TPRN). In our application of TPRN, internal representations learned by end-to-end optimization in a deep neural network performing a textual question-answering (QA) task can be interpreted using basic concepts from linguistic theory. No performance penalty need be paid for this increased interpretability: the proposed model perfor…

    Submitted 25 September, 2017; v1 submitted 23 May, 2017; originally announced May 2017.

  34. Distributed Compressive Sensing: A Deep Learning Approach

    Authors: Hamid Palangi, Rabab Ward, Li Deng

    Abstract: Various studies that address the compressed sensing problem with Multiple Measurement Vectors (MMVs) have recently been carried out. These studies assume the vectors of the different channels to be jointly sparse. In this paper, we relax this condition. Instead we assume that these sparse vectors depend on each other but that this dependency is unknown. We capture this dependency by computing the cond…

    Submitted 11 May, 2016; v1 submitted 20 August, 2015; originally announced August 2015.

    Comments: To appear in IEEE Transactions on Signal Processing

    Journal ref: IEEE Transactions on Signal Processing, Volume: 64, Issue: 17, pp. 4504-4518, 2016

  35. arXiv:1502.06922  [pdf, other]

    cs.CL cs.IR cs.LG cs.NE

    Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval

    Authors: Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, Rabab Ward

    Abstract: This paper develops a model that addresses sentence embedding, a hot topic in current natural language processing research, using recurrent neural networks with Long Short-Term Memory (LSTM) cells. Due to its ability to capture long term memory, the LSTM-RNN accumulates increasingly richer information as it goes through the sentence, and when it reaches the last word, the hidden layer of the netwo…

    Submitted 16 January, 2016; v1 submitted 24 February, 2015; originally announced February 2015.

    Comments: To appear in IEEE/ACM Transactions on Audio, Speech, and Language Processing

  36. arXiv:1412.6629  [pdf, other]

    cs.IR

    Semantic Modelling with Long-Short-Term Memory for Information Retrieval

    Authors: H. Palangi, L. Deng, Y. Shen, J. Gao, X. He, J. Chen, X. Song, R. Ward

    Abstract: In this paper we address the following problem in web document and information retrieval (IR): How can we use long-term context information to gain better IR performance? Unlike common IR methods that use bag of words representation for queries and documents, we treat them as a sequence of words and use long short term memory (LSTM) to capture contextual dependencies. To the best of our knowledge,…

    Submitted 27 February, 2015; v1 submitted 20 December, 2014; originally announced December 2014.

  37. arXiv:1311.2987  [pdf, ps, other]

    cs.LG

    Learning Input and Recurrent Weight Matrices in Echo State Networks

    Authors: Hamid Palangi, Li Deng, Rabab K Ward

    Abstract: Echo State Networks (ESNs) are a special type of the temporally deep network model, the Recurrent Neural Network (RNN), where the recurrent matrix is carefully designed and both the recurrent and input matrices are fixed. An ESN uses the linearity of the activation function of the output units to simplify the learning of the output matrix. In this paper, we devise a special technique that take adv…

    Submitted 12 November, 2013; originally announced November 2013.

    Comments: Deep Learning Workshop NIPS 2013