Showing 1–10 of 10 results for author: Lausen, L

  1. arXiv:2310.00789  [pdf, ps, other]

    cs.CL cs.LG

    Testing the Limits of Unified Sequence to Sequence LLM Pretraining on Diverse Table Data Tasks

    Authors: Soumajyoti Sarkar, Leonard Lausen

    Abstract: Tables stored in databases and tables which are present in web pages and articles account for a large part of semi-structured data that is available on the internet. It then becomes pertinent to develop a modeling approach with large language models (LLMs) that can be used to solve diverse table tasks such as semantic parsing, question answering as well as classification problems. Traditionally, t…

    Submitted 1 October, 2023; originally announced October 2023.

  2. arXiv:2307.08623  [pdf, other]

    cs.LG cs.AI cs.CL

    HYTREL: Hypergraph-enhanced Tabular Data Representation Learning

    Authors: Pei Chen, Soumajyoti Sarkar, Leonard Lausen, Balasubramaniam Srinivasan, Sheng Zha, Ruihong Huang, George Karypis

    Abstract: Language models pretrained on large collections of tabular data have demonstrated their effectiveness in several downstream tasks. However, many of these models do not take into account the row/column permutation invariances, hierarchical structure, etc. that exist in tabular data. To alleviate these limitations, we propose HYTREL, a tabular language model, that captures the permutation invariance…

    Submitted 26 October, 2023; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023 (spotlight)

  3. arXiv:2306.03438  [pdf, other]

    cs.LG cs.AI cs.CL cs.SE

    Large Language Models of Code Fail at Completing Code with Potential Bugs

    Authors: Tuan Dinh, Jinman Zhao, Samson Tan, Renato Negrinho, Leonard Lausen, Sheng Zha, George Karypis

    Abstract: Large language models of code (Code-LLMs) have recently brought tremendous advances to code completion, a fundamental feature of programming assistance and code intelligence. However, most existing works ignore the possible presence of bugs in the code context for generation, which are inevitable in software development. Therefore, we introduce and study the buggy-code completion problem, inspired…

    Submitted 30 November, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: 27 pages, accepted to NeurIPS 2023

  4. arXiv:2306.00381  [pdf, other]

    cs.SE cs.LG

    Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion

    Authors: Hengzhi Pei, Jinman Zhao, Leonard Lausen, Sheng Zha, George Karypis

    Abstract: Pretrained code language models have enabled great progress towards program synthesis. However, common approaches only consider in-file local context and thus miss information and constraints imposed by other parts of the codebase and its external dependencies. Existing code completion benchmarks also lack such context. To resolve these restrictions we curate a new dataset of permissively licensed…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 12 pages. Accepted to ACL 2023

    ACM Class: I.2.2; I.2.7

  5. arXiv:2211.03966  [pdf, ps, other]

    cs.CL cs.LG

    Parameter and Data Efficient Continual Pre-training for Robustness to Dialectal Variance in Arabic

    Authors: Soumajyoti Sarkar, Kaixiang Lin, Sailik Sengupta, Leonard Lausen, Sheng Zha, Saab Mansour

    Abstract: The use of multilingual language models for tasks in low and high-resource languages has been a success story in deep learning. In recent times, Arabic has been receiving widespread attention on account of its dialectal variance. While prior research studies have tried to adapt these multilingual models for dialectal variants of Arabic, it still remains a challenging problem owing to the lack of s…

    Submitted 7 November, 2022; originally announced November 2022.

  6. arXiv:2204.11117  [pdf, other]

    cs.CL cs.LG

    Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning

    Authors: Vishakh Padmakumar, Leonard Lausen, Miguel Ballesteros, Sheng Zha, He He, George Karypis

    Abstract: Recent work has found that multi-task training with a large number of diverse tasks can uniformly improve downstream performance on unseen target tasks. In contrast, literature on task transferability has established that the choice of intermediate tasks can heavily affect downstream task performance. In this work, we aim to disentangle the effect of scale and relatedness of tasks in multi-task re…

    Submitted 12 July, 2022; v1 submitted 23 April, 2022; originally announced April 2022.

    Comments: NAACL 2022 - Camera ready version

  7. arXiv:1907.04433  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

    Authors: Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

    Abstract: We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating). These toolkits provide state-of-the-art pre-trained models, training scripts, and training logs, to facilitate rapid prototyping and promote reproducible research. We also provide modular APIs with flexible building blocks to enable efficient customiza…

    Submitted 12 February, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

    Journal ref: Journal of Machine Learning Research 21 (2020) 1-7

  8. arXiv:1712.05902  [pdf, other]

    cs.LG cs.DC

    NSML: A Machine Learning Platform That Enables You to Focus on Your Models

    Authors: Nako Sung, Minkyu Kim, Hyunwoo Jo, Youngil Yang, Jingwoong Kim, Leonard Lausen, Youngkwan Kim, Gayoung Lee, Donghyun Kwak, Jung-Woo Ha, Sunghun Kim

    Abstract: Machine learning libraries such as TensorFlow and PyTorch simplify model implementation. However, researchers are still required to perform a non-trivial amount of manual tasks such as GPU allocation, training status tracking, and comparison of models with different hyperparameter settings. We propose a system to handle these tasks and help researchers focus on models. We present the requirements…

    Submitted 15 December, 2017; originally announced December 2017.

    Comments: 8 pages, 4 figures

  9. arXiv:1706.03458  [pdf, other]

    cs.CV

    Deep Learning for Precipitation Nowcasting: A Benchmark and A New Model

    Authors: Xingjian Shi, Zhihan Gao, Leonard Lausen, Hao Wang, Dit-Yan Yeung, Wai-kin Wong, Wang-chun Woo

    Abstract: With the goal of making high-resolution forecasts of regional rainfall, precipitation nowcasting has become an important and fundamental technology underlying various public services ranging from rainstorm warnings to flight safety. Recently, the Convolutional LSTM (ConvLSTM) model has been shown to outperform traditional optical flow based methods for precipitation nowcasting, suggesting that dee…

    Submitted 5 October, 2017; v1 submitted 12 June, 2017; originally announced June 2017.

    Comments: NIPS 2017 Spotlight

  10. arXiv:1609.04695  [pdf]

    cond-mat.mes-hall physics.optics quant-ph

    Excitation of surface plasmon polariton modes with multiple nitrogen vacancy centers in single nanodiamonds

    Authors: Shailesh Kumar, Jens L. Lausen, Cesar E. Garcia-Ortiz, Sebastian K. H. Andersen, Alexander S. Roberts, Ilya P. Radko, Cameron L. C. Smith, Anders Kristensen, Sergey I. Bozhevolnyi

    Abstract: Nitrogen-vacancy (NV) centers in diamonds are interesting due to their remarkable characteristics that are well suited to applications in quantum-information processing and magnetic field sensing, as well as representing stable fluorescent sources. Multiple NV centers in nanodiamonds (NDs) are especially useful as biological fluorophores due to their chemical neutrality, brightness and room-temper…

    Submitted 15 September, 2016; originally announced September 2016.

    Comments: 22 pages, 13 figures

    Journal ref: J. Opt. 18 (2016) 024002