-
HyperCLOVA X Technical Report
Authors:
Kang Min Yoo,
Jaegeun Han,
Sookyo In,
Heewon Jeon,
Jisu Jeong,
Jaewook Kang,
Hyunwook Kim,
Kyung-Min Kim,
Munhyong Kim,
Sungju Kim,
Donghyun Kwak,
Hanock Kwak,
Se Jung Kwon,
Bado Lee,
Dongsoo Lee,
Gichang Lee,
Jooho Lee,
Baeseong Park,
Seong** Shin,
Joonsang Yu,
Seolki Baek,
Sumin Byeon,
Eungsup Cho,
Dooseok Choe,
Jeesung Han,
et al. (371 additional authors not shown)
Abstract:
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.
Submitted 13 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
A Machine Learning Approach to Predicting Single Event Upsets
Authors:
Archit Gupta,
Chong Yock Eng,
Deon Lim Meng Wee,
Rashna Analia Ahmed,
See Min Sim
Abstract:
A single event upset (SEU) is a critical soft error that occurs in semiconductor devices on exposure to ionising particles from space environments. SEUs cause bit flips in the memory component of semiconductors. This creates a multitude of safety hazards as stored information becomes less reliable. Currently, SEUs are only detected several hours after their occurrence. CREMER, the model presented in this paper, predicts SEUs in advance using machine learning. CREMER uses only positional data to predict SEU occurrence, making it robust, inexpensive and scalable. Upon implementation, the improved reliability of memory devices will create a digitally safer environment onboard space vehicles.
Submitted 9 October, 2023;
originally announced October 2023.
-
Interactive Natural Language Processing
Authors:
Zekun Wang,
Ge Zhang,
Kexin Yang,
Ning Shi,
Wangchunshu Zhou,
Shaochun Hao,
Guangzheng Xiong,
Yizhi Li,
Mong Yuan Sim,
Xiuying Chen,
Qingqing Zhu,
Zhenzhu Yang,
Adam Nik,
Qi Liu,
Chenghua Lin,
Shi Wang,
Ruibo Liu,
Wenhu Chen,
Ke Xu,
Dayiheng Liu,
Yike Guo,
Jie Fu
Abstract:
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence. This paradigm considers language models as agents capable of observing, acting, and receiving feedback iteratively from external entities. Specifically, language models in this context can: (1) interact with humans for better understanding and addressing user needs, personalizing responses, aligning with human values, and improving the overall user experience; (2) interact with knowledge bases for enriching language representations with factual knowledge, enhancing the contextual relevance of responses, and dynamically leveraging external information to generate more accurate and informed responses; (3) interact with models and tools for effectively decomposing and addressing complex tasks, leveraging specialized expertise for specific subtasks, and fostering the simulation of social behaviors; and (4) interact with environments for learning grounded representations of language, and effectively tackling embodied tasks such as reasoning, planning, and decision-making in response to environmental observations. This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept. We then provide a systematic classification of iNLP, dissecting its various components, including interactive objects, interaction interfaces, and interaction methods. We proceed to delve into the evaluation methodologies used in the field, explore its diverse applications, scrutinize its ethical and safety issues, and discuss prospective research directions. This survey serves as an entry point for researchers who are interested in this rapidly evolving area and offers a broad view of the current landscape and future trajectory of iNLP.
Submitted 22 May, 2023;
originally announced May 2023.
-
ANNA: Enhanced Language Representation for Question Answering
Authors:
Changwook Jun,
Hansol Jang,
Myoseop Sim,
Hyun Kim,
Jooyoung Choi,
Kyungkoo Min,
Kyunghoon Bae
Abstract:
Pre-trained language models have brought significant improvements in performance in a variety of natural language processing tasks. Most existing models achieving state-of-the-art results have presented their approaches from the separate perspectives of data processing, pre-training tasks, neural network modeling, or fine-tuning. In this paper, we demonstrate how the approaches affect performance individually, and that the language model achieves the best results on a specific question answering task when those approaches are jointly considered in pre-training models. In particular, we propose an extended pre-training task, and a new neighbor-aware mechanism that attends more to neighboring tokens to capture the richness of context for pre-training language modeling. Our best model achieves new state-of-the-art results of 95.7% F1 and 90.6% EM on SQuAD 1.1 and also outperforms existing pre-trained language models such as RoBERTa, ALBERT, ELECTRA, and XLNet on the SQuAD 2.0 benchmark.
Submitted 3 April, 2022; v1 submitted 28 March, 2022;
originally announced March 2022.
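The F1 and EM figures quoted in the abstract above are the two standard SQuAD answer metrics. As a minimal sketch of how they are commonly computed for a single prediction and a single reference answer (the function names here are illustrative and are not taken from the paper):

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """EM: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

In a full evaluation, each question may have several reference answers; the usual practice is to take the maximum score over the references and average over all questions.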
-
Korean-Specific Dataset for Table Question Answering
Authors:
Changwook Jun,
Jooyoung Choi,
Myoseop Sim,
Hyun Kim,
Hansol Jang,
Kyungkoo Min
Abstract:
Existing question answering systems mainly focus on dealing with text data. However, much of the data produced daily is stored in the form of tables that can be found in documents and relational databases, or on the web. To solve the task of question answering over tables, there exist many datasets for table question answering written in English, but few Korean datasets. In this paper, we demonstrate how we construct Korean-specific datasets for table question answering: the Korean tabular dataset is a collection of 1.4M tables with corresponding descriptions for unsupervised pre-training of language models, and the Korean table question answering corpus consists of 70k pairs of questions and answers created by crowd-sourced workers. We then build a pre-trained language model based on Transformer, fine-tune the model for table question answering with these datasets, and report the evaluation results of our model. We make our datasets publicly available via our GitHub repository and hope that those datasets will help further studies for question answering over tables, and for the transformation of table formats.
Submitted 1 May, 2022; v1 submitted 17 January, 2022;
originally announced January 2022.
-
An intelligent financial portfolio trading strategy using deep Q-learning
Authors:
Hyungjun Park,
Min Kyu Sim,
Dong Gu Choi
Abstract:
Portfolio traders strive to identify dynamic portfolio allocation schemes so that their total budgets are efficiently allocated through the investment horizon. This study proposes a novel portfolio trading strategy in which an intelligent agent is trained to identify an optimal trading action by using deep Q-learning. We formulate a Markov decision process model for the portfolio trading process, and the model adopts a discrete combinatorial action space, determining the trading direction at prespecified trading size for each asset, to ensure practical applicability. Our novel portfolio trading strategy takes advantage of three features to outperform in real-world trading. First, a mapping function is devised to handle and transform an initially found but infeasible action into a feasible action closest to the originally proposed ideal action. Second, by overcoming the dimensionality problem, this study establishes models of agent and Q-network for deriving a multi-asset trading strategy in the predefined action space. Last, this study introduces a technique that has the advantage of deriving a well-fitted multi-asset trading strategy by designing an agent to simulate all feasible actions in each state. To validate our approach, we conduct backtests for two representative portfolios and demonstrate superior results over the benchmark strategies.
Submitted 28 November, 2019; v1 submitted 8 July, 2019;
originally announced July 2019.
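The abstract above describes a discrete combinatorial action space (one trading direction per asset at a fixed trade size) and a mapping from an infeasible action to the closest feasible one. The sketch below illustrates that general idea only. The feasibility rule (no short selling, no borrowing), the L1 distance, and all names are assumptions made for illustration, not the authors' actual mapping function.

```python
from itertools import product

SELL, HOLD, BUY = -1, 0, 1


def is_feasible(action, holdings, cash, prices, trade_size):
    """Assumed rule: cannot sell more than is held, cannot spend more than cash."""
    cash_after = cash
    for direction, held, price in zip(action, holdings, prices):
        if direction == SELL and held < trade_size:
            return False
        # Buying reduces cash; selling (direction == -1) increases it.
        cash_after -= direction * trade_size * price
    return cash_after >= 0


def map_to_feasible(action, holdings, cash, prices, trade_size):
    """Return the action itself if feasible, else the nearest feasible action.

    Nearness is measured by L1 distance between direction vectors (an
    assumption). The search enumerates all 3**n actions, so it is only
    practical for a small number of assets.
    """
    if is_feasible(action, holdings, cash, prices, trade_size):
        return tuple(action)
    candidates = [
        c
        for c in product((SELL, HOLD, BUY), repeat=len(action))
        if is_feasible(c, holdings, cash, prices, trade_size)
    ]
    return min(candidates, key=lambda c: sum(abs(x - y) for x, y in zip(c, action)))
```

For example, with holdings `(0, 10)`, cash `50`, prices `(100, 20)`, and a trade size of one share, the action `(BUY, HOLD)` is unaffordable and is mapped to `(HOLD, HOLD)`. Ties between equally distant feasible actions are broken by enumeration order in this sketch; a real strategy would need an explicit tie-breaking rule.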
-
A Comparative Study of Analog/Digital Self-Interference Cancellation for Full Duplex Radios
Authors:
Jong Woo Kwak,
Min Soo Sim,
In-Woong Kang,
Jong Sung Park,
Jaedon Park,
Chan-Byoung Chae
Abstract:
Self-interference (SI) is the main obstacle to full-duplex radios. To overcome the SI, researchers have proposed several analog and digital domain self-interference cancellation (SIC) techniques. How well the digital cancellation works depends on the results of analog cancellation. Therefore, to analyze overall SIC performance, one should do so in an integrated manner. In this paper, we build a simulator that can analyze the performance of analog and digital SIC techniques. Through this simulator, we can analyze the overall SIC performance within various system parameters such as the resolution of an analog-to-digital converter (ADC) and/or nonlinearity of a power amplifier (PA). With our simulator, we expect that configurations and tuning algorithms of an active analog canceller can be optimized before real hardware implementation.
Submitted 23 May, 2019;
originally announced May 2019.
-
Map-based Millimeter-Wave Channel Models: An Overview, Hybrid Modeling, Data, and Learning
Authors:
Yeon-Geun Lim,
Yae Jee Cho,
MinSoo Sim,
Younsun Kim,
Chan-Byoung Chae,
Reinaldo A. Valenzuela
Abstract:
Compared to the current wireless communication systems, millimeter wave (mm-Wave) promises a wide range of spectrum. As viable alternatives to existing mm-Wave channel models, various map-based channel models with different modeling methods have been widely discussed. Map-based channel models are based on a ray-tracing algorithm and include realistic channel parameters in a given map. Such parameters enable researchers to accurately evaluate novel technologies in the mm-Wave range. Diverse map-based modeling methods result in different modeling objectives, including the characteristics of channel parameters and different complexities of the modeling procedure. This article outlines an overview of map-based mm-Wave channel models and proposes a concept of how they can be utilized to integrate a hardware testbed/sounder with a software testbed/sounder. In addition, we categorize map-based channel parameters and provide guidelines for hybrid modeling. Next, we share the measurement data and the map-based channel parameters with the public. Lastly, we evaluate a machine learning-based beam selection algorithm through the shared database. We expect that the offered guidelines and the shared database will enable researchers to readily design a map-based channel model.
Submitted 10 July, 2019; v1 submitted 24 November, 2017;
originally announced November 2017.
-
Compact Full Duplex MIMO Radios in D2D Underlaid Cellular Networks: From System Design to Prototype Results
Authors:
MinKeun Chung,
Min Soo Sim,
Dong Ku Kim,
Chan-Byoung Chae
Abstract:
This paper considers the implementation and application possibilities of a compact full duplex multiple-input multiple-output (MIMO) architecture where direct communication exists between users, e.g., device-to-device (D2D) and cellular links coexisting on the same spectrum. For the architecture of the compact full duplex radio, we combine an analog self-interference canceler based on dual polarization with high cross-polarization discrimination (XPD) and a Long Term Evolution (LTE)-based per-subcarrier digital self-interference canceler. While we consider the compactness and power efficiency of an analog solution, we focus on the digital canceler design with robustness to a frequency-selective channel and high compatibility with a conventional LTE system. For an over-the-air wireless experiment of a full duplex testbed with a two-user pair, we implement a full duplex MIMO physical layer (PHY), supporting 20 MHz bandwidth, on an FPGA-based software-defined radio platform. Further, we propose a novel timing synchronization method to construct a more viable full duplex MIMO link. By having the full duplex link prototype fully operating in real-time, we present the first characterization of the proposed compact full duplex MIMO performance depending on the transmit power of the full duplex node. We also show the link quality between nodes. One of the crucial insights of this work is that the full duplex operation of a user is capable of acquiring the throughput gain if the user has self-interference cancellation capability with guaranteed performance.
Submitted 17 March, 2017; v1 submitted 19 December, 2016;
originally announced December 2016.
-
Nonlinear Self-Interference Cancellation for Full-Duplex Radios: From Link- and System-Level Performance Perspectives
Authors:
Min Soo Sim,
MinKeun Chung,
Dongkyu Kim,
Jaehoon Chung,
Dong Ku Kim,
Chan-Byoung Chae
Abstract:
One of the promising technologies for LTE Evolution is full-duplex radio, an innovation expected to double the spectral efficiency. To realize full-duplex in practice, the main challenge is overcoming self-interference, and to do so, researchers have developed self-interference cancellation techniques. Since most wireless transceivers use power amplifiers, especially in cellular systems, researchers have revealed the importance of nonlinear self-interference cancellation. In this article, we first explore several nonlinear digital self-interference cancellation techniques. We then propose a low complexity pre-calibration-based nonlinear digital self-interference cancellation technique. Next we discuss issues about reference signal allocation and the overhead of each technique. For performance evaluations, we carry out extensive measurements through a real-time prototype and link-/system-level simulations. For link-level analysis, we measure the amount of cancelled self-interference for each technique. We also evaluate system-level performances through 3D ray-tracing-based simulations. Numerical results confirm the significant performance improvement over a half-duplex system even in interference-limited indoor environments.
Submitted 14 February, 2017; v1 submitted 7 July, 2016;
originally announced July 2016.
-
Compressed Channel Feedback for Correlated Massive MIMO Systems
Authors:
Min Soo Sim,
Jeonghun Park,
Chan-Byoung Chae,
Robert W. Heath Jr
Abstract:
Massive multiple-input multiple-output (MIMO) is a promising approach for cellular communication due to its energy efficiency and high achievable data rate. These advantages, however, can be realized only when channel state information (CSI) is available at the transmitter. Since there are many antennas, CSI is too large to feed back without compression. To compress CSI, prior work has applied compressive sensing (CS) techniques and the fact that CSI can be sparsified. The adopted sparsifying bases fail, however, to reflect the spatial correlation and channel conditions or to be feasible in practice. In this paper, we propose a new sparsifying basis that reflects the long-term characteristics of the channel, and needs no change as long as the spatial correlation model does not change. We propose a new reconstruction algorithm for CS, and also suggest dimensionality reduction as a compression method. To feed back compressed CSI in practice, we propose a new codebook for the compressed channel quantization assuming no other-cell interference. Numerical results confirm that the proposed channel feedback mechanisms show better performance in point-to-point (single-user) and point-to-multi-point (multi-user) scenarios.
Submitted 31 March, 2015;
originally announced March 2015.
-
Prototyping Real-Time Full Duplex Radios
Authors:
MinKeun Chung,
Min Soo Sim,
Jaeweon Kim,
Dong Ku Kim,
Chan-Byoung Chae
Abstract:
In this article, we present a real-time full duplex radio system for 5G wireless networks. Full duplex radios are capable of opening new possibilities in contexts of high traffic demand where there are limited radio resources. A critical issue, however, to implementing full duplex radios, in real wireless environments, is being able to cancel self-interference. To overcome the self-interference ch…
▽ More
In this article, we present a real-time full duplex radio system for 5G wireless networks. Full duplex radios are capable of opening new possibilities in contexts of high traffic demand where there are limited radio resources. A critical issue, however, in implementing full duplex radios in real wireless environments is being able to cancel self-interference. To overcome the self-interference challenge, we prototype our design on a software-defined radio (SDR) platform. This design combines a dual-polarization antenna-based analog part with a digital self-interference canceller that operates in real-time. Prototype test results confirm that the proposed full-duplex system achieves about 1.9 times higher throughput than a half-duplex system. This article concludes with a discussion of implementation challenges that remain for researchers seeking the most viable solution for full duplex communications.
Submitted 18 December, 2016; v1 submitted 10 March, 2015;
originally announced March 2015.