Showing 1–14 of 14 results for author: Tam, L

  1. arXiv:2406.12793  [pdf, other]

    cs.CL

    ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

    Authors: Team GLM: Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, et al. (32 additional authors not shown)

    Abstract: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2402.15810  [pdf, other]

    cs.DL cs.CL cs.LG

    OAG-Bench: A Human-Curated Benchmark for Academic Graph Mining

    Authors: Fanjin Zhang, Shijie Shi, Yifan Zhu, Bo Chen, Yukuo Cen, Jifan Yu, Yelin Chen, Lulu Wang, Qingfei Zhao, Yuqing Cheng, Tianyi Han, Yuwei An, Dan Zhang, Weng Lam Tam, Kun Cao, Yunhe Pang, Xinyu Guan, Huihui Yuan, Jian Song, Xiaoyan Li, Yuxiao Dong, Jie Tang

    Abstract: With the rapid proliferation of scientific literature, versatile academic knowledge services increasingly rely on comprehensive academic graph mining. Despite the availability of public academic graphs, benchmarks, and datasets, these resources often fall short in multi-aspect and fine-grained annotations, are constrained to specific task types and domains, or lack underlying real academic graphs.…

    Submitted 20 June, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

    Comments: KDD'24, 9 pages, 5 appendix pages

    Journal ref: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '24), August 25--29, 2024, Barcelona, Spain

  3. arXiv:2311.18743  [pdf, other]

    cs.CL cs.AI cs.LG

    AlignBench: Benchmarking Chinese Alignment of Large Language Models

    Authors: Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Zhuoer Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, Jie Tang

    Abstract: Alignment has become a critical step for instruction-tuned Large Language Models (LLMs) to become helpful assistants. However, effective evaluation of alignment for emerging Chinese LLMs is still significantly lacking, calling for real-scenario grounded, open-ended, challenging and automatic evaluations tailored for alignment. To fill in this gap, we introduce AlignBench, a comprehensive multi-dim…

    Submitted 5 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

  4. arXiv:2306.06629  [pdf, other]

    cs.CL cs.AI

    GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model

    Authors: Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Yang Yang, Hongyin Tang, Keqing He, Jiahao Liu, Jingang Wang, Shu Zhao, Peng Zhang, Jie Tang

    Abstract: Currently, the reduction in the parameter scale of large-scale pre-trained language models (PLMs) through knowledge distillation has greatly facilitated their widespread deployment on various devices. However, the deployment of knowledge distillation systems faces great challenges in real-world industrial-strength applications, which require the use of complex distillation methods on even larger-s…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: accepted for ACL 2023 industry track

  5. arXiv:2306.06625  [pdf, other]

    cs.CL cs.AI

    Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method

    Authors: Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Shu Zhao, Peng Zhang, Jie Tang

    Abstract: The large scale of pre-trained language models poses a challenge for their deployment on various devices, with a growing emphasis on methods to compress these models, particularly knowledge distillation. However, current knowledge distillation methods rely on the model's intermediate layer features and the golden labels (also called hard labels), which usually require aligned model architecture an…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: Accepted to Findings of ACL2023

  6. arXiv:2305.08316  [pdf, other]

    q-bio.MN cs.AI cs.CE cs.LG

    SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network for Efficient and Generalizable Protein-Protein Interaction Prediction

    Authors: Ziyuan Zhao, Peisheng Qian, Xulei Yang, Zeng Zeng, Cuntai Guan, Wai Leong Tam, Xiaoli Li

    Abstract: Protein-protein interactions (PPIs) are crucial in various biological processes and their study has significant implications for drug development and disease diagnosis. Existing deep learning methods suffer from significant performance degradation under complex real-world scenarios due to various factors, e.g., label scarcity and domain shift. In this paper, we propose a self-ensembling multigraph…

    Submitted 14 May, 2023; originally announced May 2023.

    Comments: Accepted by IJCAI 2023

  7. arXiv:2210.02414  [pdf, other]

    cs.CL cs.AI cs.LG

    GLM-130B: An Open Bilingual Pre-trained Model

    Authors: Aohan Zeng, Xiao Liu, Zhengxiao Du, Zihan Wang, Hanyu Lai, Ming Ding, Zhuoyi Yang, Yifan Xu, Wendi Zheng, Xiao Xia, Weng Lam Tam, Zixuan Ma, Yufei Xue, Jidong Zhai, Wenguang Chen, Peng Zhang, Yuxiao Dong, Jie Tang

    Abstract: We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters. It is an attempt to open-source a 100B-scale model at least as good as GPT-3 (davinci) and unveil how models of such a scale can be successfully pre-trained. Over the course of this effort, we face numerous unexpected technical and engineering challenges, particularly on loss spikes and…

    Submitted 25 October, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Accepted to ICLR 2023

  8. arXiv:2207.07087  [pdf, other]

    cs.CL cs.IR cs.LG

    Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers

    Authors: Weng Lam Tam, Xiao Liu, Kaixuan Ji, Lilong Xue, Xingjian Zhang, Yuxiao Dong, Jiahua Liu, Maodi Hu, Jie Tang

    Abstract: Prompt tuning attempts to update few task-specific parameters in pre-trained models. It has achieved comparable performance to fine-tuning of the full parameter set on both language understanding and generation tasks. In this work, we study the problem of prompt tuning for neural text retrievers. We introduce parameter-efficient prompt tuning for text retrieval across in-domain, cross-domain, and…

    Submitted 14 July, 2022; originally announced July 2022.

  9. arXiv:2110.07602  [pdf, other]

    cs.CL

    P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

    Authors: Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

    Abstract: Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training. However, in the context of NLU, prior work reveals that prompt tuning does not perform well for normal-sized pretrained models. We also find that existing methods of prompt tuning cannot handle hard sequence labeling tasks, indicating a lack of unive…

    Submitted 20 March, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: Proceedings of the 60th Annual Meeting of the Association of Computational Linguistics, 2022

  10. arXiv:2110.03094  [pdf, other]

    eess.IV cs.CV

    Improving Pneumonia Localization via Cross-Attention on Medical Images and Reports

    Authors: Riddhish Bhalodia, Ali Hatamizadeh, Leo Tam, Ziyue Xu, Xiaosong Wang, Evrim Turkbey, Daguang Xu

    Abstract: Localization and characterization of diseases like pneumonia are primary steps in a clinical pipeline, facilitating detailed clinical diagnosis and subsequent treatment planning. Additionally, such location annotated datasets can provide a pathway for deep learning models to be used for downstream tasks. However, acquiring quality annotations is expensive on human resources and usually requires do…

    Submitted 6 October, 2021; originally announced October 2021.

    Comments: Published at MICCAI 2021

  11. arXiv:2103.16022  [pdf, other]

    cs.CV

    Self-supervised Image-text Pre-training With Mixed Data In Chest X-rays

    Authors: Xiaosong Wang, Ziyue Xu, Leo Tam, Dong Yang, Daguang Xu

    Abstract: Pre-trained models, e.g., from ImageNet, have proven to be effective in boosting the performance of many downstream applications. It is too demanding to acquire large-scale annotations to build such models for medical imaging. Meanwhile, there are numerous clinical data (in the form of images and text reports) stored in the hospital information systems. The paired image-text data from the same pat…

    Submitted 29 March, 2021; originally announced March 2021.

  12. arXiv:2012.04682  [pdf, other]

    cs.CL cs.IR cs.LG

    Transformer Query-Target Knowledge Discovery (TEND): Drug Discovery from CORD-19

    Authors: Leo K. Tam, Xiaosong Wang, Daguang Xu

    Abstract: Previous work established skip-gram word2vec models could be used to mine knowledge in the materials science literature for the discovery of thermoelectrics. Recent transformer architectures have shown great progress in language modeling and associated fine-tuned tasks, but they have yet to be adapted for drug discovery. We present a RoBERTa transformer-based method that extends the masked languag…

    Submitted 10 December, 2020; v1 submitted 27 November, 2020; originally announced December 2020.

  13. arXiv:2009.10325  [pdf, other]

    cs.CV cs.AI

    Learning Image Labels On-the-fly for Training Robust Classification Models

    Authors: Xiaosong Wang, Ziyue Xu, Dong Yang, Leo Tam, Holger Roth, Daguang Xu

    Abstract: Current deep learning paradigms largely benefit from the tremendous amount of annotated data. However, the quality of the annotations often varies among labelers. Multi-observer studies have been conducted to study these annotation variances (by labeling the same data for multiple times) and its effects on critical applications like medical image analysis. This process indeed adds an extra burden…

    Submitted 2 October, 2020; v1 submitted 22 September, 2020; originally announced September 2020.

    Comments: v2: Minor Corrections

  14. arXiv:2007.15778  [pdf, other]

    cs.CV cs.LG eess.IV

    Weakly supervised one-stage vision and language disease detection using large scale pneumonia and pneumothorax studies

    Authors: Leo K. Tam, Xiaosong Wang, Evrim Turkbey, Kevin Lu, Yuhong Wen, Daguang Xu

    Abstract: Detecting clinically relevant objects in medical images is a challenge despite large datasets due to the lack of detailed labels. To address the label issue, we utilize the scene-level labels with a detection architecture that incorporates natural language information. We present a challenging new set of radiologist paired bounding box and natural language annotations on the publicly available MIM…

    Submitted 30 July, 2020; originally announced July 2020.

    Comments: Accepted at Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2020