Showing 1–3 of 3 results for author: Yousuf, R B

  1. arXiv:2406.14541  [pdf, other]

    cs.LG

    Are LLMs Naturally Good at Synthetic Tabular Data Generation?

    Authors: Shengzhe Xu, Cho-Ting Lee, Mandar Sharma, Raquib Bin Yousuf, Nikhil Muralidhar, Naren Ramakrishnan

    Abstract: Large language models (LLMs) have demonstrated their prowess in generating synthetic text and images; however, their potential for generating tabular data -- arguably the most common data type in business and scientific applications -- is largely underexplored. This paper demonstrates that LLMs, used as-is, or after traditional fine-tuning, are severely inadequate as synthetic table generators. Du…

    Submitted 21 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2406.14005  [pdf, other]

    cs.CL cs.AI cs.LG

    Information Guided Regularization for Fine-tuning Language Models

    Authors: Mandar Sharma, Nikhil Muralidhar, Shengzhe Xu, Raquib Bin Yousuf, Naren Ramakrishnan

    Abstract: The pretraining-fine-tuning paradigm has been the de facto strategy for transfer learning in modern language modeling. With the understanding that task adaptation in LMs is often a function of parameters shared across tasks, we argue that a more surgical approach to regularization needs to exist for smoother transfer learning. Towards this end, we investigate how the pretraining loss landscape is…

    Submitted 21 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2207.04029  [pdf, other]

    cs.IR cs.AI

    Lessons from Deep Learning applied to Scholarly Information Extraction: What Works, What Doesn't, and Future Directions

    Authors: Raquib Bin Yousuf, Subhodip Biswas, Kulendra Kumar Kaushal, James Dunham, Rebecca Gelles, Sathappan Muthiah, Nathan Self, Patrick Butler, Naren Ramakrishnan

    Abstract: Understanding key insights from full-text scholarly articles is essential as it enables us to determine interesting trends, give insight into the research and development, and build knowledge graphs. However, some of the interesting key insights are only available when considering full-text. Although researchers have made significant progress in information extraction from short documents, extract…

    Submitted 8 July, 2022; originally announced July 2022.

    Comments: ACM KDD 2022 Workshop on Data-driven Science of Science

    ACM Class: I.2; I.2.7; H.3