Showing 1–7 of 7 results for author: Xiang, H

Searching in archive stat.
  1. arXiv:2406.01252  [pdf, other]

    cs.CL cs.AI stat.ML

    Towards Scalable Automated Alignment of LLMs: A Survey

    Authors: Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, Bowen Yu

    Abstract: Alignment is the most critical step in building large language models (LLMs) that meet human needs. With the rapid development of LLMs gradually surpassing human capabilities, traditional alignment methods based on human-annotation are increasingly unable to meet the scalability demands. Therefore, there is an urgent need to explore new sources of automated alignment signals and technical approach…

    Submitted 3 June, 2024; originally announced June 2024.

  2. arXiv:2009.03873  [pdf]

    cs.LG cs.AI stat.ML

    Machine Intelligence for Outcome Predictions of Trauma Patients During Emergency Department Care

    Authors: Joshua D. Cardosi, Herman Shen, Jonathan I. Groner, Megan Armstrong, Henry Xiang

    Abstract: Trauma mortality results from a multitude of non-linear dependent risk factors including patient demographics, injury characteristics, medical care provided, and characteristics of medical facilities; yet traditional approaches attempted to capture these relationships using rigid regression models. We hypothesized that a transfer learning based machine learning algorithm could deeply understand a tr…

    Submitted 9 September, 2020; v1 submitted 8 September, 2020; originally announced September 2020.

    Comments: 23 pages, 1 figure, 4 tables

  3. arXiv:2005.13326  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency

    Authors: Keyu An, Hongyu Xiang, Zhijian Ou

    Abstract: In this paper, we present a new open source toolkit for speech recognition, named CAT (CTC-CRF based ASR Toolkit). CAT inherits the data-efficiency of the hybrid approach and the simplicity of the E2E approach, providing a full-fledged implementation of CTC-CRFs and complete training and testing scripts for a number of English and Chinese benchmarks. Experiments show CAT obtains state-of-the-art r…

    Submitted 4 August, 2020; v1 submitted 27 May, 2020; originally announced May 2020.

    Comments: Accepted into INTERSPEECH 2020. arXiv admin note: text overlap with arXiv:1911.08747

  4. arXiv:1911.08747  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    CAT: CRF-based ASR Toolkit

    Authors: Keyu An, Hongyu Xiang, Zhijian Ou

    Abstract: In this paper, we present a new open source toolkit for automatic speech recognition (ASR), named CAT (CRF-based ASR Toolkit). A key feature of CAT is discriminative training in the framework of conditional random field (CRF), particularly with connectionist temporal classification (CTC) inspired state topology. CAT contains a full-fledged implementation of CTC-CRF and provides a complete workflow…

    Submitted 20 November, 2019; originally announced November 2019.

    Comments: Code released at: https://github.com/thu-spmi/cat

  5. arXiv:1907.07843  [pdf, other]

    stat.ML cs.LG

    An Adaptive Approach for Anomaly Detector Selection and Fine-Tuning in Time Series

    Authors: Hui Ye, Xiaopeng Ma, Qingfeng Pan, Huaqiang Fang, Hang Xiang, Tongzhen Shao

    Abstract: Anomaly detection in time series is a hot topic in time series data mining. The characteristics of each anomaly detector determine the kinds of abnormal data it detects well. No single detector is optimal for all types of anomalies. Moreover, anomaly detection remains difficult in industrial production due to problems such as a single detector can't be optimized at different time windows of…

    Submitted 17 July, 2019; originally announced July 2019.

    Comments: 7 pages, 5 figures; accepted to the DLP-KDD 2019 workshop

  6. arXiv:1907.06582  [pdf, other]

    cs.LG stat.ML

    AMAD: Adversarial Multiscale Anomaly Detection on High-Dimensional and Time-Evolving Categorical Data

    Authors: Zheng Gao, Lin Guo, Chi Ma, Xiao Ma, Kai Sun, Hang Xiang, Xiaoqiang Zhu, Hongsong Li, Xiaozhong Liu

    Abstract: Anomaly detection faces emerging challenges in many important industry domains, such as cyber security and online recommendation and advertising. The recent trend in these areas calls for anomaly detection on time-evolving data with high-dimensional categorical features without labeled samples. Also, there is an increasing demand for identifying and monitoring irregular patterns at multip…

    Submitted 12 July, 2019; originally announced July 2019.

    Comments: Accepted by 2019 KDD Workshop on Deep Learning Practice for High-Dimensional Sparse Data

  7. arXiv:1806.08541  [pdf, other]

    stat.ML cs.LG

    Visualizing and Understanding Deep Neural Networks in CTR Prediction

    Authors: Lin Guo, Hui Ye, Wenbo Su, Henhuan Liu, Kai Sun, Hang Xiang

    Abstract: Although deep learning techniques have been successfully applied to many tasks, interpreting deep neural network models is still a big challenge to us. Recently, many works have been done on visualizing and analyzing the mechanism of deep neural networks in the areas of image processing and natural language processing. In this paper, we present our approaches to visualize and understand deep neura…

    Submitted 22 June, 2018; originally announced June 2018.

    Comments: Accepted by the 2018 SIGIR Workshop on eCommerce