Showing 1–6 of 6 results for author: Smiley, C

Searching in archive cs.

  1. arXiv:2403.18152  [pdf, other]

    cs.CL

    Large Language Models as Financial Data Annotators: A Study on Effectiveness and Efficiency

    Authors: Toyin Aguda, Suchetha Siddagangappa, Elena Kochkina, Simerjot Kaur, Dongsheng Wang, Charese Smiley, Sameena Shah

    Abstract: Collecting labeled datasets in finance is challenging due to scarcity of domain experts and higher cost of employing them. While Large Language Models (LLMs) have demonstrated remarkable performance in data annotation tasks on general domain datasets, their effectiveness on domain specific datasets remains underexplored. To address this gap, we investigate the potential of LLMs as efficient data a…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted to LREC-COLING 2024

  2. arXiv:2401.00942  [pdf, other]

    cs.CE

    The Influence of Biomedical Research on Future Business Funding: Analyzing Scientific Impact and Content in Industrial Investments

    Authors: Reza Khanmohammadi, Simerjot Kaur, Charese H. Smiley, Tuka Alhanai, Ivan Brugere, Armineh Nourbakhsh, Mohammad M. Ghassemi

    Abstract: This paper investigates the relationship between scientific innovation in biomedical sciences and its impact on industrial activities, focusing on how the historical impact and content of scientific papers influenced future funding and innovation grant application content for small businesses. The research incorporates bibliometric analyses along with SBIR (Small Business Innovation Research) data…

    Submitted 1 January, 2024; originally announced January 2024.

  3. REFinD: Relation Extraction Financial Dataset

    Authors: Simerjot Kaur, Charese Smiley, Akshat Gupta, Joy Sain, Dongsheng Wang, Suchetha Siddagangappa, Toyin Aguda, Sameena Shah

    Abstract: A number of datasets for Relation Extraction (RE) have been created to aid downstream tasks such as information retrieval, semantic search, question answering and textual entailment. However, these datasets fail to capture financial-domain specific challenges since most of these datasets are compiled using general knowledge sources such as Wikipedia, web-based text and news articles, hindering re…

    Submitted 22 May, 2023; originally announced May 2023.

  4. arXiv:2211.00083  [pdf, other]

    cs.CL cs.AI cs.LG

    WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain

    Authors: Raj Sanjay Shah, Kunal Chawla, Dheeraj Eidnani, Agam Shah, Wendi Du, Sudheer Chava, Natraj Raman, Charese Smiley, Jiaao Chen, Diyi Yang

    Abstract: Pre-trained language models have shown impressive performance on a variety of tasks and domains. Previous research on financial language models usually employs a generic training scheme to train standard model architectures, without completely leveraging the richness of the financial data. We propose a novel domain specific Financial LANGuage model (FLANG) which uses financial keywords and phrases…

    Submitted 31 October, 2022; originally announced November 2022.

  5. arXiv:2210.03849  [pdf, other]

    cs.CL

    ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering

    Authors: Zhiyu Chen, Shiyang Li, Charese Smiley, Zhiqiang Ma, Sameena Shah, William Yang Wang

    Abstract: With the recent advance in large pre-trained language models, researchers have achieved record performances in NLP tasks that mostly focus on language pattern matching. The community is experiencing the shift of the challenge from how to model language to the imitation of complex reasoning abilities like human beings. In this work, we investigate the application domain of finance that involves rea…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022

  6. arXiv:2109.00122  [pdf, other]

    cs.CL

    FinQA: A Dataset of Numerical Reasoning over Financial Data

    Authors: Zhiyu Chen, Wenhu Chen, Charese Smiley, Sameena Shah, Iana Borova, Dylan Langdon, Reema Moussa, Matt Beane, Ting-Hao Huang, Bryan Routledge, William Yang Wang

    Abstract: The sheer volume of financial statements makes it difficult for humans to access and analyze a business's financials. Robust numerical reasoning likewise faces unique challenges in this domain. In this work, we focus on answering deep questions over financial data, aiming to automate the analysis of a large corpus of financial documents. In contrast to existing tasks on general domain, the finance…

    Submitted 7 May, 2022; v1 submitted 31 August, 2021; originally announced September 2021.

    Comments: EMNLP 2021