-
Graph Neural Network as Computationally Efficient Emulator of Ice-sheet and Sea-level System Model (ISSM)
Authors:
Younghyun Koo,
Maryam Rahnemoonfar
Abstract:
The Ice-sheet and Sea-level System Model (ISSM) provides solutions for Stokes equations relevant to ice sheet dynamics by employing finite element and fine mesh adaption. However, since its finite element method is compatible only with Central Processing Units (CPU), the ISSM has limits on further economizing computational time. Thus, by taking advantage of Graphics Processing Units (GPUs), we design a graph convolutional network (GCN) as a fast emulator for ISSM. The GCN is trained and tested using the 20-year transient ISSM simulations in the Pine Island Glacier (PIG). The GCN reproduces ice thickness and velocity with a correlation coefficient greater than 0.998, outperforming the traditional convolutional neural network (CNN). Additionally, GCN shows 34 times faster computational speed than the CPU-based ISSM modeling. The GPU-based GCN emulator allows us to predict how the PIG will change in the future under different melting rate scenarios with high fidelity and much faster computational time.
Submitted 26 June, 2024;
originally announced July 2024.
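
The abstract above describes a graph convolutional network that runs directly on the nodes of a finite-element mesh. As a rough illustration only, the sketch below applies one graph-convolution layer with self-loops and symmetric degree normalization to a tiny mesh. It is not the authors' architecture: the function name `gcn_layer`, the toy mesh, and the weights are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def gcn_layer(
    features: Sequence[Sequence[float]],
    edges: Sequence[Tuple[int, int]],
    weight: Sequence[Sequence[float]],
) -> List[List[float]]:
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    features: one row of input features per mesh node.
    edges:    undirected mesh edges as (node, node) pairs.
    weight:   in_dim x out_dim weight matrix (hypothetical values).
    """
    n = len(features)
    in_dim = len(weight)
    out_dim = len(weight[0])

    # Neighbor sets with a self-loop on every node (the "+ I" term).
    neighbors = [{i} for i in range(n)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = [len(s) for s in neighbors]

    output = []
    for i in range(n):
        # Aggregate neighbor features with symmetric normalization.
        agg = [0.0] * in_dim
        for j in sorted(neighbors[i]):
            coeff = 1.0 / math.sqrt(degree[i] * degree[j])
            for k in range(in_dim):
                agg[k] += coeff * features[j][k]
        # Linear transform followed by ReLU.
        row = [
            max(0.0, sum(agg[k] * weight[k][m] for k in range(in_dim)))
            for m in range(out_dim)
        ]
        output.append(row)
    return output


# A single triangular element: three nodes, all connected.
triangle_edges = [(0, 1), (1, 2), (0, 2)]
node_features = [[1.0], [2.0], [3.0]]
smoothed = gcn_layer(node_features, triangle_edges, [[1.0]])
```

Because the layer only needs node features and mesh edges, it can follow an unstructured, locally refined mesh without resampling to a regular grid, which is the property the abstract contrasts with CNNs.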
-
NAIST Simultaneous Speech Translation System for IWSLT 2024
Authors:
Yuka Ko,
Ryo Fukuda,
Yuta Nishikawa,
Yasumasa Kano,
Tomoya Yanagita,
Kosuke Doi,
Mana Makinae,
Haotian Tan,
Makoto Sakai,
Sakriani Sakti,
Katsuhito Sudoh,
Satoshi Nakamura
Abstract:
This paper describes NAIST's submission to the simultaneous track of the IWSLT 2024 Evaluation Campaign: English-to-{German, Japanese, Chinese} speech-to-text translation and English-to-Japanese speech-to-speech translation. We develop a multilingual end-to-end speech-to-text translation model combining two pre-trained language models, HuBERT and mBART. We trained this model with two decoding policies, Local Agreement (LA) and AlignAtt. The submitted models employ the LA policy because it outperformed the AlignAtt policy in previous models. Our speech-to-speech translation method is a cascade of the above speech-to-text model and an incremental text-to-speech (TTS) module that incorporates a phoneme estimation model, a parallel acoustic model, and a parallel WaveGAN vocoder. We improved our incremental TTS by applying the Transformer architecture with the AlignAtt policy for the estimation model. The results show that our upgraded TTS module contributed to improving the system performance.
Submitted 30 June, 2024;
originally announced July 2024.
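
The abstract above names the Local Agreement (LA) decoding policy without describing it. As it is commonly described, LA commits only the tokens on which the hypotheses from two consecutive input chunks agree, that is, their longest common prefix. The sketch below illustrates that idea under this assumption; the function names and token lists are hypothetical and are not taken from the NAIST system.

```python
from typing import List


def longest_common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Return the longest shared prefix of two token sequences."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


def local_agreement_step(
    prev_hyp: List[str], curr_hyp: List[str], committed: List[str]
) -> List[str]:
    """Return the tokens that become newly committed after this chunk.

    prev_hyp:  hypothesis decoded from the previous audio chunk.
    curr_hyp:  hypothesis decoded after one more chunk arrived.
    committed: tokens already emitted to the user.
    """
    agreed = longest_common_prefix(prev_hyp, curr_hyp)
    # Emit only what lies beyond the already committed output.
    return agreed[len(committed):]


new_tokens = local_agreement_step(
    prev_hyp=["ich", "habe", "einen"],
    curr_hyp=["ich", "habe", "ein", "Buch"],
    committed=["ich"],
)
# new_tokens == ["habe"]: "einen" vs. "ein" disagree, so output stops there.
```

Waiting for agreement trades some latency for stability, since a token is emitted only once two successive hypotheses confirm it.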
-
Graph Neural Networks for Emulation of Finite-Element Ice Dynamics in Greenland and Antarctic Ice Sheets
Authors:
Younghyun Koo,
Maryam Rahnemoonfar
Abstract:
Although numerical models provide accurate solutions for ice sheet dynamics based on physics laws, they accompany intensified computational demands to solve partial differential equations. In recent years, convolutional neural networks (CNNs) have been widely used as statistical emulators for those numerical models. However, since CNNs operate on regular grids, they cannot represent the refined meshes and computational efficiency of finite-element numerical models. Therefore, instead of CNNs, this study adopts an equivariant graph convolutional network (EGCN) as an emulator for the ice sheet dynamics modeling. EGCN reproduces ice thickness and velocity changes in the Helheim Glacier, Greenland, and Pine Island Glacier, Antarctica, with 260 times and 44 times faster computation time, respectively. Compared to the traditional CNN and graph convolutional network, EGCN shows outstanding accuracy in thickness prediction near fast ice streams by preserving the equivariance to the translation and rotation of graphs.
Submitted 26 June, 2024;
originally announced June 2024.
-
Word Order in English-Japanese Simultaneous Interpretation: Analyses and Evaluation using Chunk-wise Monotonic Translation
Authors:
Kosuke Doi,
Yuka Ko,
Mana Makinae,
Katsuhito Sudoh,
Satoshi Nakamura
Abstract:
This paper analyzes the features of monotonic translations, which follow the word order of the source language, in simultaneous interpreting (SI). The word order differences are one of the biggest challenges in SI, especially for language pairs with significant structural differences like English and Japanese. We analyzed the characteristics of monotonic translations using the NAIST English-to-Japanese Chunk-wise Monotonic Translation Evaluation Dataset and found some grammatical structures that make monotonic translation difficult in English-Japanese SI. We further investigated the features of monotonic translations through evaluating the output from the existing speech translation (ST) and simultaneous speech translation (simulST) models on NAIST English-to-Japanese Chunk-wise Monotonic Translation Evaluation Dataset as well as on existing test sets. The results suggest that the existing SI-based test set underestimates the model performance. We also found that the monotonic-translation-based dataset would better evaluate simulST models, while using an offline-based test set for evaluating simulST models underestimates the model performance.
Submitted 13 June, 2024;
originally announced June 2024.
-
MultiPragEval: Multilingual Pragmatic Evaluation of Large Language Models
Authors:
Dojun Park,
Jiwoo Lee,
Seohyun Park,
Hyeyun Jeong,
Youngeun Koo,
Soonha Hwang,
Seonwoo Park,
Sungeun Lee
Abstract:
As the capabilities of LLMs expand, it becomes increasingly important to evaluate them beyond basic knowledge assessment, focusing on higher-level language understanding. This study introduces MultiPragEval, a robust test suite designed for the multilingual pragmatic evaluation of LLMs across English, German, Korean, and Chinese. Comprising 1200 question units categorized according to Grice's Cooperative Principle and its four conversational maxims, MultiPragEval enables an in-depth assessment of LLMs' contextual awareness and their ability to infer implied meanings. Our findings demonstrate that Claude3-Opus significantly outperforms other models in all tested languages, establishing a state-of-the-art in the field. Among open-source models, Solar-10.7B and Qwen1.5-14B emerge as strong competitors. This study not only leads the way in the multilingual evaluation of LLMs in pragmatic inference but also provides valuable insights into the nuanced capabilities necessary for advanced language comprehension in AI systems.
Submitted 11 June, 2024;
originally announced June 2024.
-
Recover as It is Designed to Be: Recovering from Compatibility Mobile App Crashes by Reusing User Flows
Authors:
Donghwi Kim,
Hyungjun Yoon,
Chang Min Park,
Sujin Han,
Youngjin Kwon,
Steven Y. Ko,
Sung-Ju Lee
Abstract:
Android OS is severely fragmented by API updates and device vendors' OS customization, creating a market condition where vastly different OS versions coexist. This gives rise to compatibility crash problems where Android apps crash on certain Android versions but not on others. Although well-known, this problem is extremely challenging for app developers to overcome due to the sheer number of Android versions in the market that must be tested. We present RecoFlow, a framework for enabling app developers to automatically recover an app from a crash by programming user flows with our API and visual tools. RecoFlow tracks app feature usage with the user flows on user devices and recovers an app from a crash by replaying UI actions of the app feature disrupted by the crash. To prevent recurring compatibility crashes, RecoFlow executes a previously crashed app in compatibility mode that is enabled by our novel Android OS virtualization technique. Our evaluation with professional Android developers shows that our API and tools are easy to use and effective in recovering from compatibility crashes.
Submitted 3 June, 2024;
originally announced June 2024.
-
Physics-Informed Machine Learning On Polar Ice: A Survey
Authors:
Zesheng Liu,
YoungHyun Koo,
Maryam Rahnemoonfar
Abstract:
The mass loss of the polar ice sheets contributes considerably to ongoing sea-level rise and changing ocean circulation, leading to coastal flooding and risking the homes and livelihoods of tens of millions of people globally. To address the complex problem of ice behavior, physical models and data-driven models have been proposed in the literature. Although traditional physical models can guarantee physically meaningful results, they have limitations in producing high-resolution results. On the other hand, data-driven approaches require large amounts of high-quality and labeled data, which is rarely available in the polar regions. Hence, as a promising framework that leverages the advantages of physical models and data-driven methods, physics-informed machine learning (PIML) has been widely studied in recent years. In this paper, we review the existing algorithms of PIML, provide our own taxonomy based on the methods of combining physics and data-driven approaches, and analyze the advantages of PIML in the aspects of accuracy and efficiency. Further, our survey discusses some current challenges and highlights future opportunities, including PIML on sea ice studies, PIML with different combination methods and backbone networks, and neural operator methods.
Submitted 30 April, 2024;
originally announced April 2024.
-
A Study of Vulnerability Repair in JavaScript Programs with Large Language Models
Authors:
Tan Khang Le,
Saba Alimadadi,
Steven Y. Ko
Abstract:
In recent years, JavaScript has become the most widely used programming language, especially in web development. However, writing secure JavaScript code is not trivial, and programmers often make mistakes that lead to security vulnerabilities in web applications. Large Language Models (LLMs) have demonstrated substantial advancements across multiple domains, and their evolving capabilities indicate their potential for automatic code generation based on a required specification, including automatic bug fixing. In this study, we explore the accuracy of LLMs, namely ChatGPT and Bard, in finding and fixing security vulnerabilities in JavaScript programs. We also investigate the impact of context in a prompt on directing LLMs to produce a correct patch of vulnerable JavaScript code. Our experiments on real-world software vulnerabilities show that while LLMs are promising in automatic program repair of JavaScript code, achieving a correct bug fix often requires an appropriate amount of context in the prompt.
Submitted 19 March, 2024;
originally announced March 2024.
-
HearHere: Mitigating Echo Chambers in News Consumption through an AI-based Web System
Authors:
Youngseung Jeon,
Jaehoon Kim,
Sohyun Park,
Yunyong Ko,
Seongeun Ryu,
Sang-Wook Kim,
Kyungsik Han
Abstract:
Considerable efforts are currently underway to mitigate the negative impacts of echo chambers, such as increased susceptibility to fake news and resistance towards accepting scientific evidence. Prior research has presented the development of computer systems that support the consumption of news information from diverse political perspectives to mitigate the echo chamber effect. However, existing studies still lack the ability to effectively support the key processes of news information consumption and quantitatively identify a political stance towards the information. In this paper, we present HearHere, an AI-based web system designed to help users accommodate information and opinions from diverse perspectives. HearHere facilitates the key processes of news information consumption through two visualizations. Visualization 1 provides political news with quantitative political stance information, derived from our graph-based political classification model, and users can experience diverse perspectives (Hear). Visualization 2 allows users to express their opinions on specific political issues in a comment form and observe the position of their own opinions relative to pro-liberal and pro-conservative comments presented on a map interface (Here). Through a user study with 94 participants, we demonstrate the feasibility of HearHere in supporting the consumption of information from various perspectives. Our findings highlight the importance of providing political stance information and quantifying users' political status as a means to mitigate political polarization. In addition, we propose design implications for system development, including the consideration of demographics such as political interest and providing users with initiatives.
Submitted 29 February, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Graph Neural Networks as Fast and High-fidelity Emulators for Finite-Element Ice Sheet Modeling
Authors:
Maryam Rahnemoonfar,
Younghyun Koo
Abstract:
Although the finite element approach of the Ice-sheet and Sea-level System Model (ISSM) solves ice dynamics problems governed by Stokes equations quickly and accurately, such numerical modeling requires intensive computation on central processing units (CPUs). In this study, we develop graph neural networks (GNNs) as fast surrogate models that preserve the finite element structure of ISSM. Using the 20-year transient simulations in the Pine Island Glacier (PIG), we train and test three GNNs: graph convolutional network (GCN), graph attention network (GAT), and equivariant graph convolutional network (EGCN). These GNNs reproduce ice thickness and velocity with better accuracy than the classic convolutional neural network (CNN) and multi-layer perceptron (MLP). In particular, GNNs successfully capture the ice mass loss and acceleration induced by higher basal melting rates in the PIG. When our GNN emulators are implemented on graphics processing units (GPUs), they run up to 50 times faster than the CPU-based ISSM simulation.
Submitted 7 February, 2024;
originally announced February 2024.
-
Flash: A Hybrid Private Inference Protocol for Deep CNNs with High Accuracy and Low Latency on CPU
Authors:
Hyeri Roh,
Jinsu Yeo,
Yeongil Ko,
Gu-Yeon Wei,
David Brooks,
Woo-Seok Choi
Abstract:
This paper presents Flash, an optimized private inference (PI) hybrid protocol utilizing both homomorphic encryption (HE) and secure two-party computation (2PC), which can reduce the end-to-end PI latency for deep CNN models to less than 1 minute on CPU. To this end, first, Flash proposes a low-latency convolution algorithm built upon a fast slot rotation operation and a novel data encoding scheme, which results in a 4-94x performance gain over the state-of-the-art. Second, to minimize the communication cost introduced by the standard nonlinear activation function ReLU, Flash replaces all ReLUs with the polynomial $x^2+x$ and trains deep CNN models with the new activation function. The trained models improve the inference accuracy for CIFAR-10/100 and TinyImageNet by 16% on average (up to 40% for ResNet-32) compared to prior art. Last, Flash proposes an efficient 2PC-based $x^2+x$ evaluation protocol that does not require any offline communication and that reduces the total communication cost to process the activation layer by 84-196x over the state-of-the-art. As a result, the end-to-end PI latency of Flash implemented on CPU is 0.02 minutes for CIFAR-100 and 0.57 minutes for TinyImageNet classification, while the total data communication is 0.07GB for CIFAR-100 and 0.22GB for TinyImageNet. Flash improves the state-of-the-art PI by 16-45x in latency and 84-196x in communication cost. Moreover, even for ImageNet, Flash can deliver latency of less than 1 minute on CPU with total communication of less than 1GB.
Submitted 29 January, 2024;
originally announced January 2024.
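
The abstract above replaces ReLU with the polynomial $x^2+x$. A plain-Python sketch of the two activations, on unencrypted numbers, shows the structural difference: the polynomial needs only one multiplication and one addition, whereas ReLU needs a comparison. The function names are hypothetical, and nothing here reproduces Flash's HE or 2PC protocols.

```python
from typing import List


def relu(x: float) -> float:
    """Standard ReLU: requires a comparison against zero."""
    return x if x > 0.0 else 0.0


def poly_activation(x: float) -> float:
    """The polynomial x^2 + x: one multiply and one add, no comparison."""
    return x * x + x


def apply_activation(values: List[float], fn) -> List[float]:
    """Apply an activation function element-wise to a list of pre-activations."""
    return [fn(v) for v in values]


pre_activations = [-1.0, -0.5, 0.0, 0.5, 2.0]
relu_out = apply_activation(pre_activations, relu)
poly_out = apply_activation(pre_activations, poly_activation)
# relu_out == [0.0, 0.0, 0.0, 0.5, 2.0]
# poly_out == [0.0, -0.25, 0.0, 0.75, 6.0]
```

The two functions differ noticeably away from zero, which is consistent with the abstract's point that the models are trained with the new activation rather than having it swapped in after training.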
-
Instruction Fine-Tuning: Does Prompt Loss Matter?
Authors:
Mathew Huerta-Enochian,
Seung Yong Ko
Abstract:
We present a novel study analyzing the effects of various prompt loss token weights (PLW) for supervised instruction fine-tuning (SIFT). While prompt-masking (PLW = 0) is common for SIFT, some fine-tuning APIs support fractional PLWs and suggest that using a small non-zero PLW can help stabilize learning when fine-tuning on short-completion data. However, there has never been a study confirming this claim, and OpenAI, a major cloud-based SIFT provider, recently removed this parameter from their fine-tuning API. We found that performance of models fine-tuned on short-completion data had a statistically-significant negative quadratic relationship with PLW. Using small values (0.01 - 0.5) of PLW produced better results on multiple-choice and short-generation benchmarks (outperforming models fine-tuned on long-completion data) while large values (~ 1.0) of PLW produced better results on long-generation benchmarks. We explained this effect and verified its importance through additional experiments. This research serves as a warning to API providers about the importance of providing a PLW parameter for SIFT.
Submitted 18 June, 2024; v1 submitted 24 January, 2024;
originally announced January 2024.
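
The prompt loss weight (PLW) in the abstract above can be read as a per-token weight on the training loss: prompt tokens are scaled by PLW and completion tokens by 1. The sketch below shows one way to compute such a loss from per-token negative log-likelihoods. The normalization by the sum of weights is an assumption made for this illustration, and the function name and sample values are hypothetical; the paper's exact formulation may differ.

```python
from typing import List


def weighted_sift_loss(
    token_nll: List[float], is_prompt: List[bool], plw: float
) -> float:
    """Weighted mean of per-token losses.

    token_nll: negative log-likelihood of each token in the sequence.
    is_prompt: True where the token belongs to the prompt.
    plw:       prompt loss weight; 0.0 reproduces prompt masking,
               1.0 treats prompt and completion tokens equally.
    """
    weights = [plw if prompt else 1.0 for prompt in is_prompt]
    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0  # nothing contributes (e.g. prompt-only input with PLW = 0)
    weighted_sum = sum(w * nll for w, nll in zip(weights, token_nll))
    return weighted_sum / total_weight  # normalization choice is an assumption


# Two prompt tokens followed by two completion tokens (made-up losses).
nll = [2.0, 2.0, 1.0, 1.0]
mask = [True, True, False, False]

masked_loss = weighted_sift_loss(nll, mask, plw=0.0)   # 1.0
partial_loss = weighted_sift_loss(nll, mask, plw=0.5)  # 4.0 / 3.0
full_loss = weighted_sift_loss(nll, mask, plw=1.0)     # 1.5
```

With short completions, most tokens in a sequence are prompt tokens, so even a small non-zero PLW changes how much of the gradient signal comes from the prompt.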
-
BEC: Bit-Level Static Analysis for Reliability against Soft Errors
Authors:
Yousun Ko,
Bernd Burgstaller
Abstract:
Soft errors are a type of transient digital signal corruption that occurs in digital hardware components such as the internal flip-flops of CPU pipelines, the register file, memory cells, and even internal communication buses. Soft errors are caused by environmental radioactivity, magnetic interference, lasers, and temperature fluctuations, either unintentionally, or as part of a deliberate attempt to compromise a system and expose confidential data.
We propose a bit-level error coalescing (BEC) static program analysis and its two use cases to understand and improve program reliability against soft errors. The BEC analysis tracks each bit corruption in the register file and classifies the effect of the corruption by its semantics at compile time. The usefulness of the proposed analysis is demonstrated in two scenarios, fault injection campaign pruning, and reliability-aware program transformation. Experimental results show that bit-level analysis pruned up to 30.04 % of exhaustive fault injection campaigns (13.71 % on average), without loss of accuracy. Program vulnerability was reduced by up to 13.11 % (4.94 % on average) through bit-level vulnerability-aware instruction scheduling. The analysis has been implemented within LLVM and evaluated on the RISC-V architecture.
To the best of our knowledge, the proposed BEC analysis is the first bit-level compiler analysis for program reliability against soft errors. The proposed method is generic and not limited to a specific computer architecture.
Submitted 11 January, 2024;
originally announced January 2024.
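
To make the idea of classifying individual bit corruptions concrete, the sketch below flips each bit of a register value in turn and checks whether the result of a small computation changes. This is a dynamic, brute-force illustration of why some bit flips are masked and need not all be injected. It is not the BEC static analysis, which reasons at compile time, and all names here are hypothetical.

```python
from typing import Callable, List


def flip_bit(value: int, bit: int, width: int = 32) -> int:
    """Return value with a single bit flipped, truncated to the register width."""
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def keep_low_byte(x: int) -> int:
    """Example computation: only the lowest 8 bits of the register are used."""
    return x & 0xFF


def masked_bits(fn: Callable[[int], int], value: int, width: int = 32) -> List[int]:
    """List the bit positions whose corruption does not change fn's output."""
    golden = fn(value)  # fault-free reference result
    return [
        bit for bit in range(width)
        if fn(flip_bit(value, bit, width)) == golden
    ]


harmless = masked_bits(keep_low_byte, 0x1234)
# Flips in bits 8..31 are masked by the AND, so only bits 0..7
# would need to be covered by a fault injection campaign here.
```

A static analysis that proves the same masking from the instruction semantics can skip those injections without running them, which is the kind of pruning the abstract reports.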
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Explore, Select, Derive, and Recall: Augmenting LLM with Human-like Memory for Mobile Task Automation
Authors:
Sunjae Lee,
Junyoung Choi,
Jungjae Lee,
Munim Hasan Wasi,
Hojun Choi,
Steven Y. Ko,
Sangeun Oh,
Insik Shin
Abstract:
The advent of large language models (LLMs) has opened up new opportunities in the field of mobile task automation. Their superior language understanding and reasoning capabilities allow users to automate complex and repetitive tasks. However, due to the inherent unreliability and high operational cost of LLMs, their practical applicability is quite limited. To address these issues, this paper introduces MobileGPT, an innovative LLM-based mobile task automator equipped with a human-like app memory. MobileGPT emulates the cognitive process of humans interacting with a mobile app -- explore, select, derive, and recall. This approach allows for more precise and efficient learning of a task's procedure by breaking it down into smaller, modular sub-tasks that can be re-used, re-arranged, and adapted for various objectives. We implement MobileGPT using online LLM services (GPT-3.5 and GPT-4) and evaluate its performance on a dataset of 160 user instructions across 8 widely used mobile apps. The results indicate that MobileGPT can automate and learn new tasks with 82.5% accuracy, and is able to adapt them to different contexts with near-perfect (98.75%) accuracy while reducing latency and cost by 62.5% and 68.8%, respectively, compared to the GPT-4 powered baseline.
Submitted 16 March, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap
Authors:
Hyogun Lee,
Kyungho Bae,
Seong Jong Ha,
Yumin Ko,
Gyeong-Moon Park,
Jinwoo Choi
Abstract:
In this work, we tackle the challenging problem of unsupervised video domain adaptation (UVDA) for action recognition. We specifically focus on scenarios with a substantial domain gap, in contrast to existing works, which primarily deal with small domain gaps between labeled source domains and unlabeled target domains. To establish a more realistic setting, we introduce a novel UVDA scenario, denoted as Kinetics->BABEL, with a more considerable domain gap in terms of both temporal dynamics and background shifts. To tackle the temporal shift, i.e., the action duration difference between the source and target domains, we propose a global-local view alignment approach. To mitigate the background shift, we propose to learn temporal-order-sensitive representations by temporal order learning and background-invariant representations by background augmentation. We empirically validate that the proposed method shows significant improvement over the existing methods on the Kinetics->BABEL dataset with a large domain gap. The code is available at https://github.com/KHUVLL/GLAD.
Submitted 22 November, 2023; v1 submitted 21 November, 2023;
originally announced November 2023.
-
Multi-task Deep Convolutional Network to Predict Sea Ice Concentration and Drift in the Arctic Ocean
Authors:
Younghyun Koo,
Maryam Rahnemoonfar
Abstract:
Forecasting sea ice concentration (SIC) and sea ice drift (SID) in the Arctic Ocean is of great significance as the Arctic environment has been changed by the recent warming climate. Given that physical sea ice models require high computational costs with complex parameterization, deep learning techniques can effectively replace the physical model and improve the performance of sea ice prediction.…
▽ More
Forecasting sea ice concentration (SIC) and sea ice drift (SID) in the Arctic Ocean is of great significance as the Arctic environment has been changed by the recent warming climate. Given that physical sea ice models require high computational costs with complex parameterization, deep learning techniques can effectively replace the physical model and improve the performance of sea ice prediction. This study proposes a novel multi-task fully conventional network architecture named hierarchical information-sharing U-net (HIS-Unet) to predict daily SIC and SID. Instead of learning SIC and SID separately at each branch, we allow the SIC and SID layers to share their information and assist each other's prediction through the weighting attention modules (WAMs). Consequently, our HIS-Unet outperforms other statistical approaches, sea ice physical models, and neural networks without such information-sharing units. The improvement of HIS-Unet is obvious both for SIC and SID prediction when and where sea ice conditions change seasonally, which implies that the information sharing through WAMs allows the model to learn the sudden changes of SIC and SID. The weight values of the WAMs imply that SIC information plays a more critical role in SID prediction, compared to that of SID information in SIC prediction, and information sharing is more active in sea ice edges (seasonal sea ice) than in the central Arctic (multi-year sea ice).
Submitted 31 October, 2023;
originally announced November 2023.
-
CROWN: A Novel Approach to Comprehending Users' Preferences for Accurate Personalized News Recommendation
Authors:
Yunyong Ko,
Seongeun Ryu,
Sang-Wook Kim
Abstract:
Personalized news recommendation aims to assist users in finding news articles that align with their interests, which plays a pivotal role in mitigating users' information overload problem. Although many recent works have studied personalized news recommendation, the following challenges deserve further exploration: (C1) Comprehending manifold intents coupled within a news article, (C2) Differentiating varying post-read preferences of news articles, and (C3) Addressing the cold-start user problem. To tackle these challenges together, in this paper, we propose a novel personalized news recommendation framework (CROWN) that employs (1) category-guided intent disentanglement for (C1), (2) consistency-based news representation for (C2), and (3) GNN-enhanced hybrid user representation for (C3). Furthermore, we incorporate category prediction into the training process of CROWN as an auxiliary task, which provides supplementary supervisory signals to enhance intent disentanglement. Extensive experiments on two real-world datasets reveal that (1) CROWN provides consistent performance improvements over ten state-of-the-art news recommendation methods and (2) the proposed strategies significantly improve the accuracy of CROWN.
Submitted 13 February, 2024; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Enhancing Hyperedge Prediction with Context-Aware Self-Supervised Learning
Authors:
Yunyong Ko,
Hanghang Tong,
Sang-Wook Kim
Abstract:
Hypergraphs can naturally model group-wise relations (e.g., a group of users who co-purchase an item) as hyperedges. Hyperedge prediction is to predict future or unobserved hyperedges, which is a fundamental task in many real-world applications (e.g., group recommendation). Despite the recent breakthrough of hyperedge prediction methods, the following challenges have been rarely studied: (C1) How to aggregate the nodes in each hyperedge candidate for accurate hyperedge prediction? and (C2) How to mitigate the inherent data sparsity problem in hyperedge prediction? To tackle both challenges together, in this paper, we propose a novel hyperedge prediction framework (CASH) that employs (1) context-aware node aggregation to precisely capture complex relations among nodes in each hyperedge for (C1) and (2) self-supervised contrastive learning in the context of hyperedge prediction to enhance hypergraph representations for (C2). Furthermore, as for (C2), we propose a hyperedge-aware augmentation method to fully exploit the latent semantics behind the original hypergraph and consider both node-level and group-level contrasts (i.e., dual contrasts) for better node and hyperedge representations. Extensive experiments on six real-world hypergraphs reveal that CASH consistently outperforms all competing methods in terms of the accuracy in hyperedge prediction and each of the proposed strategies is effective in improving the model accuracy of CASH. For the detailed information of CASH, we provide the code and datasets at: https://github.com/yy-ko/cash.
Submitted 11 September, 2023;
originally announced September 2023.
-
SAGE: A Storage-Based Approach for Scalable and Efficient Sparse Generalized Matrix-Matrix Multiplication
Authors:
Myung-Hwan Jang,
Yunyong Ko,
Hyuck-Moo Gwon,
Ikhyeon Jo,
Yongjun Park,
Sang-Wook Kim
Abstract:
Sparse generalized matrix-matrix multiplication (SpGEMM) is a fundamental operation for real-world network analysis. With the increasing size of real-world networks, the single-machine-based SpGEMM approach cannot perform SpGEMM on large-scale networks, exceeding the size of main memory (i.e., not scalable). Although the distributed-system-based approach could handle large-scale SpGEMM based on multiple machines, it suffers from severe inter-machine communication overhead to aggregate results of multiple machines (i.e., not efficient). To address this dilemma, in this paper, we propose a novel storage-based SpGEMM approach (SAGE) that stores given networks in storage (e.g., SSD) and loads only the necessary parts of the networks into main memory when they are required for processing via a 3-layer architecture. Furthermore, we point out three challenges that could degrade the overall performance of SAGE and propose three effective strategies to address them: (1) block-based workload allocation for balancing workloads across threads, (2) in-memory partial aggregation for reducing the amount of unnecessarily generated storage-memory I/Os, and (3) distribution-aware memory allocation for preventing unexpected buffer overflows in main memory. Via extensive evaluation, we verify the superiority of SAGE over existing SpGEMM methods in terms of scalability and efficiency.
Submitted 25 August, 2023;
originally announced August 2023.
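For readers unfamiliar with the operation named in the SAGE abstract, the following is a minimal in-memory sketch of row-wise (Gustavson-style) SpGEMM. It is not SAGE's storage-based method: the dict-of-dicts layout, the function name `spgemm`, and the sample matrices are assumptions chosen only for illustration.

```python
def spgemm(a, b):
    """Multiply two sparse matrices stored as {row: {col: value}}.

    Row-wise formulation: each output row i is accumulated from the rows
    of b that are selected by the non-zero columns of a's row i.
    """
    c = {}
    for i, a_row in a.items():
        acc = {}  # accumulator for output row i
        for k, a_ik in a_row.items():
            # Rows of b with no non-zeros are simply absent from the dict.
            for j, b_kj in b.get(k, {}).items():
                acc[j] = acc.get(j, 0) + a_ik * b_kj
        if acc:
            c[i] = acc
    return c


# Hypothetical 2x3 and 3x2 matrices.
a = {0: {0: 1, 2: 2}, 1: {1: 3}}
b = {0: {1: 4}, 1: {0: 5}, 2: {1: 6}}
print(spgemm(a, b))  # {0: {1: 16}, 1: {0: 15}}
```

Everything here lives in main memory, which is exactly the limitation the abstract describes for single-machine approaches on large networks.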
-
Distributionally Robust Stratified Sampling for Stochastic Simulations with Multiple Uncertain Input Models
Authors:
Seung Min Baik,
Eunshin Byon,
Young Myoung Ko
Abstract:
This paper presents a robust version of the stratified sampling method when multiple uncertain input models are considered for stochastic simulation. Various variance reduction techniques have demonstrated their superior performance in accelerating simulation processes. Nevertheless, they often use a single input model and further assume that the input model is exactly known and fixed. We consider more general cases in which it is necessary to assess a simulation's response to a variety of input models, such as when evaluating the reliability of wind turbines under nonstationary wind conditions or the operation of a service system when the distribution of customer inter-arrival time is heterogeneous at different times. Moreover, the estimation variance may be considerably impacted by uncertainty in input models. To address such nonstationary and uncertain input models, we offer a distributionally robust (DR) stratified sampling approach with the goal of minimizing the maximum of worst-case estimator variances among plausible but uncertain input models. Specifically, we devise a bi-level optimization framework for formulating DR stochastic problems with different ambiguity set designs, based on the $L_2$-norm, 1-Wasserstein distance, parametric family of distributions, and distribution moments. To cope with the non-convexity of the objective function, we present a solution approach that uses Bayesian optimization. Numerical experiments and the wind turbine case study demonstrate the robustness of the proposed approach.
Submitted 15 June, 2023;
originally announced June 2023.
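The minimax idea in the abstract above can be illustrated with a heavily simplified sketch. It assumes a finite list of candidate input models, each given as hypothetical per-stratum standard deviations, and it replaces the paper's ambiguity sets and Bayesian optimization with a brute-force search over integer allocations. All function names and numbers are illustrative assumptions.

```python
from itertools import product


def stratified_variance(weights, sigmas, alloc):
    """Variance of the stratified estimator: sum of w_h^2 * sigma_h^2 / n_h."""
    return sum(w * w * s * s / n for w, s, n in zip(weights, sigmas, alloc))


def worst_case_variance(weights, models, alloc):
    """Largest estimator variance over the candidate input models."""
    return max(stratified_variance(weights, sigmas, alloc) for sigmas in models)


def minimax_allocation(weights, models, budget):
    """Brute-force search for the allocation (n_h >= 1, sum n_h == budget)
    that minimizes the worst-case variance. Only practical for tiny cases."""
    best = None
    for alloc in product(range(1, budget + 1), repeat=len(weights)):
        if sum(alloc) != budget:
            continue
        v = worst_case_variance(weights, models, alloc)
        if best is None or v < best[1]:
            best = (alloc, v)
    return best


# Two equally weighted strata; two hypothetical input models that disagree
# about which stratum is noisier.
weights = [0.5, 0.5]
models = [[1.0, 3.0], [3.0, 1.0]]
alloc, variance = minimax_allocation(weights, models, budget=10)
print(alloc, variance)
```

In this symmetric toy case the robust choice splits the budget evenly, whereas an allocation tuned to either single model would favor its noisier stratum and do worse under the other model.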
-
Tagged End-to-End Simultaneous Speech Translation Training using Simultaneous Interpretation Data
Authors:
Yuka Ko,
Ryo Fukuda,
Yuta Nishikawa,
Yasumasa Kano,
Katsuhito Sudoh,
Satoshi Nakamura
Abstract:
Simultaneous speech translation (SimulST) translates partial speech inputs incrementally. Although a monotonic correspondence between input and output is preferable for lower latency, it does not hold for distant language pairs such as English and Japanese. A promising approach to this problem is to mimic simultaneous interpretation (SI) by using SI data to train a SimulST model. However, the amount of such SI data is limited, so the SI data should be used together with ordinary bilingual data whose translations are produced offline. In this paper, we propose an effective way to train a SimulST model using a mixture of SI and offline data. The proposed method trains a single model on the mixed data with style tags that tell the model to generate SI- or offline-style outputs. Experimental results show BLEURT improvements in different latency ranges, and our analyses reveal that the proposed model generates more SI-style outputs than the baseline.
Submitted 14 June, 2023;
originally announced June 2023.
-
NAIST-SIC-Aligned: an Aligned English-Japanese Simultaneous Interpretation Corpus
Authors:
**ming Zhao,
Yuka Ko,
Kosuke Doi,
Ryo Fukuda,
Katsuhito Sudoh,
Satoshi Nakamura
Abstract:
It remains an open question how simultaneous interpretation (SI) data affects simultaneous machine translation (SiMT). Research has been limited due to the lack of a large-scale training corpus. In this work, we aim to fill this gap by introducing NAIST-SIC-Aligned, an automatically aligned parallel English-Japanese SI dataset. Starting with a non-aligned corpus, NAIST-SIC, we propose a two-stage alignment approach to make the corpus parallel and thus suitable for model training. The first stage is coarse alignment, where we perform a many-to-many mapping between source and target sentences, and the second stage is fine-grained alignment, where we perform intra- and inter-sentence filtering to improve the quality of aligned pairs. To ensure the quality of the corpus, each step has been validated either quantitatively or qualitatively. This is the first open-sourced large-scale parallel SI dataset in the literature. We also manually curated a small test set for evaluation purposes. Our results show that models trained with SI data achieve significant improvements in translation quality and latency over baselines. We hope our work advances research on SI corpora construction and SiMT. Our data can be found at https://github.com/mingzi151/AHC-SI.
Submitted 31 March, 2024; v1 submitted 23 April, 2023;
originally announced April 2023.
-
Toward Polar Sea-Ice Classification using Color-based Segmentation and Auto-labeling of Sentinel-2 Imagery to Train an Efficient Deep Learning Model
Authors:
Jurdana Masuma Iqrah,
Younghyun Koo,
Wei Wang,
Hongjie Xie,
Sushil Prasad
Abstract:
Global warming is an urgent issue that is generating catastrophic environmental changes, such as the melting of sea ice and glaciers, particularly in the polar regions. The melting pattern and retreat of polar sea ice cover are essential indicators of global warming. The Sentinel-2 satellite (S2) captures high-resolution optical imagery over the polar regions. This research aims at developing a robust and effective system for classifying polar sea ice as thick or snow-covered, young or thin, or open water using S2 images. A key challenge is the lack of labeled S2 training data to serve as the ground truth. We demonstrate a high-precision method to segment and automatically label the S2 images based on suitably determined color thresholds, and employ these auto-labeled data to train a U-Net machine model (a fully convolutional neural network), yielding good classification accuracy. Evaluation results over S2 data from the polar summer season in the Ross Sea region of the Antarctic show that the U-Net model trained on auto-labeled data has an accuracy of 90.18% over the original S2 images, whereas the U-Net model trained on manually labeled data has an accuracy of 91.39%. Filtering out the thin clouds and shadows from the S2 images further improves U-Net's accuracy to 98.97% and 98.40% for the auto-labeled and manually labeled training datasets, respectively.
Submitted 8 March, 2023;
originally announced March 2023.
-
KHAN: Knowledge-Aware Hierarchical Attention Networks for Accurate Political Stance Prediction
Authors:
Yunyong Ko,
Seongeun Ryu,
Soeun Han,
Youngseung Jeon,
Jaehoon Kim,
Sohyun Park,
Kyungsik Han,
Hanghang Tong,
Sang-Wook Kim
Abstract:
Political stance prediction for news articles has been widely studied to mitigate the echo chamber effect, in which people fall into their own thoughts and reinforce their pre-existing beliefs. Previous works on the political stance problem focus on (1) identifying political factors that could reflect the political stance of a news article and (2) capturing those factors effectively. Despite their empirical successes, they do not sufficiently justify how effective their identified factors are in political stance prediction. Motivated by this, in this work, we conduct a user study to investigate important factors in political stance prediction, and observe that the context and tone of a news article (implicit) and external knowledge for real-world entities appearing in the article (explicit) are important in determining its political stance. Based on this observation, we propose a novel knowledge-aware approach to political stance prediction (KHAN), employing (1) hierarchical attention networks (HAN) to learn the relationships among words and sentences at three different levels and (2) knowledge encoding (KE) to incorporate external knowledge for real-world entities into the process of political stance prediction. Also, to take into account the subtle and important difference between opposite political stances, we build two independent political knowledge graphs (KG) (i.e., KG-lib and KG-con) ourselves and learn to fuse the different political knowledge. Through extensive evaluations on three real-world datasets, we demonstrate the superiority of KHAN in terms of (1) accuracy, (2) efficiency, and (3) effectiveness.
Submitted 4 April, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
Minimax AUC Fairness: Efficient Algorithm with Provable Convergence
Authors:
Zhenhuan Yang,
Yan Lok Ko,
Kush R. Varshney,
Yiming Ying
Abstract:
The use of machine learning models in consequential decision making often exacerbates societal inequity, in particular yielding disparate impact on members of marginalized groups defined by race and gender. The area under the ROC curve (AUC) is widely used to evaluate the performance of a scoring function in machine learning, but is studied in algorithmic fairness less than other performance metrics. Due to the pairwise nature of the AUC, defining an AUC-based group fairness metric is pairwise-dependent and may involve both intra-group and inter-group AUCs. Importantly, considering only one category of AUCs is not sufficient to mitigate unfairness in AUC optimization. In this paper, we propose a minimax learning and bias mitigation framework that incorporates both intra-group and inter-group AUCs while maintaining utility. Based on this Rawlsian framework, we design an efficient stochastic optimization algorithm and prove its convergence to the minimum group-level AUC. We conduct numerical experiments on both synthetic and real-world datasets to validate the effectiveness of the minimax framework and the proposed optimization algorithm.
Submitted 28 November, 2022; v1 submitted 22 August, 2022;
originally announced August 2022.
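The intra-group and inter-group AUCs mentioned in the abstract above can be made concrete with a small sketch. It computes, for every ordered pair of groups (a, b), the fraction of pairs in which a positive from group a is scored above a negative from group b, and then takes the minimum. This shows only the metric, not the paper's optimization algorithm; the function names and sample data are assumptions for illustration.

```python
def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def group_aucs(scores, labels, groups):
    """Return {(a, b): AUC of group-a positives against group-b negatives}.

    Keys with a == b are intra-group AUCs; keys with a != b are inter-group.
    """
    names = sorted(set(groups))
    rows = list(zip(scores, labels, groups))
    result = {}
    for a in names:
        pos = [s for s, y, g in rows if y == 1 and g == a]
        for b in names:
            neg = [s for s, y, g in rows if y == 0 and g == b]
            result[(a, b)] = pairwise_auc(pos, neg)
    return result


# Hypothetical scores for six examples in two groups.
scores = [0.9, 0.8, 0.3, 0.6, 0.4, 0.7]
labels = [1, 0, 0, 1, 1, 0]
groups = ["a", "a", "a", "b", "b", "b"]
aucs = group_aucs(scores, labels, groups)
print(aucs)
print("worst group-level AUC:", min(aucs.values()))
```

A minimax objective of the kind the abstract describes would train the scoring function to raise the smallest of these values rather than the overall AUC alone.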
-
Multi-modal Semantic SLAM for Complex Dynamic Environments
Authors:
Han Wang,
**g Ying Ko,
Lihua Xie
Abstract:
Simultaneous Localization and Mapping (SLAM) is one of the most essential techniques in many real-world robotic applications. Most SLAM algorithms assume a static environment, which, however, does not hold in most applications. Recent work on semantic SLAM aims to understand the objects in an environment and distinguish dynamic information from a scene context by performing image-based segmentation. However, the segmentation results are often imperfect or incomplete, which can subsequently reduce the quality of mapping and the accuracy of localization. In this paper, we present a robust multi-modal semantic framework to solve the SLAM problem in complex and highly dynamic environments. We propose to learn a more powerful object feature representation and deploy the mechanism of looking and thinking twice in the backbone network, which leads to better recognition results for our baseline instance segmentation model. Moreover, geometric-only clustering and visual semantic information are combined to reduce the effect of segmentation error due to small-scale objects, occlusion, and motion blur. Thorough experiments have been conducted to evaluate the performance of the proposed method. The results show that our method can precisely identify dynamic objects under recognition imperfection and motion blur. Moreover, the proposed SLAM framework is able to efficiently build a static dense map at a processing rate of more than 10 Hz, which makes it suitable for many practical applications. Both the training data and the proposed method are open-sourced at https://github.com/wh200720041/MMS_SLAM.
Submitted 14 May, 2022; v1 submitted 9 May, 2022;
originally announced May 2022.
-
Exploiting Session Information in BERT-based Session-aware Sequential Recommendation
Authors:
**seok Seol,
Youngrok Ko,
Sang-goo Lee
Abstract:
In recommendation systems, utilizing the user interaction history as sequential information has resulted in great performance improvement. However, in many online services, user interactions are commonly grouped by sessions that presumably share preferences, which requires a different approach from ordinary sequence representation techniques. To this end, sequence representation models with a hierarchical structure or various viewpoints have been developed but with a rather complex network structure. In this paper, we propose three methods to improve recommendation performance by exploiting session information while minimizing additional parameters in a BERT-based sequential recommendation model: using session tokens, adding session segment embeddings, and a time-aware self-attention. We demonstrate the feasibility of the proposed methods through experiments on widely used recommendation datasets.
Submitted 19 May, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Light Robust Monocular Depth Estimation For Outdoor Environment Via Monochrome And Color Camera Fusion
Authors:
Hyeonsoo Jang,
Yeongmin Ko,
Younkwan Lee,
Moongu Jeon
Abstract:
Depth estimation plays an important role in SLAM, odometry, and autonomous driving. In particular, monocular depth estimation is an attractive technology because of its low cost, memory, and computation requirements. However, it often fails to predict a sufficiently accurate depth map because the camera cannot capture a clean image under poor lighting conditions. To solve this problem, various sensor fusion methods have been proposed. Although sensor fusion is a powerful approach, it requires expensive sensors, additional memory, and high computational performance.
In this paper, we present pixel-level fusion of color and monochrome images and stereo matching with partially enhanced correlation coefficient maximization. Our methods not only outperform state-of-the-art works across all metrics but are also efficient in terms of cost, memory, and computation. We also validate the effectiveness of our design with an ablation study.
Submitted 24 February, 2022;
originally announced February 2022.
-
Terra: Imperative-Symbolic Co-Execution of Imperative Deep Learning Programs
Authors:
Taebum Kim,
Eunji Jeong,
Geon-Woo Kim,
Yunmo Koo,
Sehoon Kim,
Gyeong-In Yu,
Byung-Gon Chun
Abstract:
Imperative programming allows users to implement their deep neural networks (DNNs) easily and has become an essential part of recent deep learning (DL) frameworks. Recently, several systems have been proposed to combine the usability of imperative programming with the optimized performance of symbolic graph execution. Such systems convert imperative Python DL programs to optimized symbolic graphs and execute them. However, they cannot fully support the usability of imperative programming. For example, if an imperative DL program contains a Python feature with no corresponding symbolic representation (e.g., third-party library calls or unsupported dynamic control flows) they fail to execute the program. To overcome this limitation, we propose Terra, an imperative-symbolic co-execution system that can handle any imperative DL programs while achieving the optimized performance of symbolic graph execution. To achieve this, Terra builds a symbolic graph by decoupling DL operations from Python features. Then, Terra conducts the imperative execution to support all Python features, while delegating the decoupled operations to the symbolic execution. We evaluated the performance improvement and coverage of Terra with ten imperative DL programs for several DNN architectures. The results show that Terra can speed up the execution of all ten imperative DL programs, whereas AutoGraph, one of the state-of-the-art systems, fails to execute five of them.
Submitted 23 January, 2022;
originally announced January 2022.
-
Learning Emergent Random Access Protocol for LEO Satellite Networks
Authors:
Ju-Hyung Lee,
Hyowoon Seo,
Jihong Park,
Mehdi Bennis,
Young-Chai Ko
Abstract:
A mega-constellation of low-altitude earth orbit (LEO) satellites (SATs) is envisaged to provide a global coverage SAT network in beyond fifth-generation (5G) cellular systems. LEO SAT networks exhibit extremely long link distances for many users under a time-varying SAT network topology. This makes existing multiple access protocols, such as the random access channel (RACH) based cellular protocol designed for a fixed terrestrial network topology, ill-suited. To overcome this issue, in this paper, we propose a novel grant-free random access solution for LEO SAT networks, dubbed emergent random access channel protocol (eRACH). In stark contrast to existing model-based and standardized protocols, eRACH is a model-free approach that emerges through interaction with the non-stationary network environment, using multi-agent deep reinforcement learning (MADRL). Furthermore, by exploiting known SAT orbiting patterns, eRACH does not require central coordination or additional communication across users, while training convergence is stabilized through the regular orbiting patterns. Compared to RACH, we show through various simulations that our proposed eRACH yields 54.6% higher average network throughput with around two times lower average access delay while achieving a Jain's fairness index of 0.989.
Submitted 3 December, 2021;
originally announced December 2021.
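The fairness figure quoted in the abstract above is Jain's fairness index, which for n users with allocations x_i equals (sum of x_i)^2 / (n * sum of x_i^2) and ranges from 1/n (one user gets everything) to 1 (all users get the same). A minimal sketch follows; the per-user throughput values are hypothetical and are not taken from the paper.

```python
def jain_fairness_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2)."""
    n = len(throughputs)
    if n == 0:
        raise ValueError("need at least one user")
    total = sum(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if sum_sq == 0:
        raise ValueError("at least one throughput must be non-zero")
    return (total * total) / (n * sum_sq)


# Hypothetical per-user throughputs.
print(jain_fairness_index([2.0, 2.0, 2.0, 2.0]))  # 1.0  (perfectly fair)
print(jain_fairness_index([4.0, 0.0, 0.0, 0.0]))  # 0.25 (one user served, 1/n)
```

A value of 0.989 therefore indicates that users receive nearly equal shares.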
-
Task-Driven Deep Image Enhancement Network for Autonomous Driving in Bad Weather
Authors:
Younkwan Lee,
Jihyo Jeon,
Yeongmin Ko,
Byunggwan Jeon,
Moongu Jeon
Abstract:
Visual perception is crucial for an autonomous vehicle to navigate safely and sustainably in different traffic conditions. However, in bad weather such as heavy rain and haze, the performance of visual perception is greatly affected by several degrading effects. Recently, deep learning-based perception methods have addressed multiple degrading effects to reflect real-world bad weather cases but have shown limited success due to 1) high computational costs for deployment on mobile devices and 2) poor relevance between image enhancement and visual perception in terms of the model ability. To solve these issues, we propose a task-driven image enhancement network connected to the high-level vision task, which takes an image corrupted by bad weather as input. Specifically, we introduce a novel low-memory network that removes most of the layer connections of dense blocks to reduce memory and computational cost while maintaining high performance. We also introduce a new task-driven training strategy to robustly guide the high-level task model so that it is suitable for both high-quality restoration of images and highly accurate perception. Experimental results demonstrate that the proposed method substantially improves the performance of lane detection, 2D object detection, and depth estimation under adverse weather, in terms of both memory usage and accuracy.
Submitted 14 October, 2021;
originally announced October 2021.
-
A Policy-based Versioning SSD with Intel SGX
Authors:
**woo Ahn,
Seung** Lee,
**hoon Lee,
Youngwoo Ko,
Junghee Lee,
Youngjae Kim
Abstract:
Privileged malware neutralizes software-based versioning systems and destroys data. To counter this threat, a versioning solid-state drive (SSD) that performs versioning inside the SSD has been studied. An SSD is a suitable candidate for data versioning because it can preserve previous versions without additional copying, and provide high security with a very small trusted computing base (TCB). However, the versioning SSDs studied so far commonly use a full disk versioning method that preserves all file versions in a batch. This paper demonstrates that SSDs that provide full disk versioning can be exposed to data tampering attacks when the retention time of data is less than the malware's dwell time. To deal with this threat, we propose SGX-SSD, a policy-based per-file versioning SSD that keeps a deeper history for only the important files of users. However, since the SSD is not aware of file semantics, and the versioning policy information must be securely received from the untrusted host computer, implementing per-file versioning in an SSD is a huge challenge. To solve this problem, SGX-SSD utilizes Intel SGX and has a secure host interface to securely receive policy information (configuration values) from the user. Also, to solve the SSD's unawareness of file semantics, a piggyback module is designed to give a file hint at the host layer, and an algorithm for selective versioning based on the policy is implemented in the SSD. To validate our system, we prototyped SGX-SSD on the Jasmine OpenSSD platform in a Linux environment. In the experimental evaluation, we showed that SGX-SSD provides strong security with little additional overhead for selective per-file versioning.
Submitted 14 August, 2021;
originally announced August 2021.
-
Joint Association and Resource Allocation for Multi-Hop Integrated Access and Backhaul (IAB) Network
Authors:
Byungju Lim,
Ju-Hyung Lee,
Jae-Hong Kwon,
Young-Chai Ko
Abstract:
The integrated access and backhaul (IAB) network is envisioned as a novel network architecture for increasing network capacity and coverage. To facilitate the IAB network, appropriate methods of wireless link association and resource management are required. In this paper, we investigate the joint optimization of association and resource allocation, in terms of subchannel and power, for the IAB network. In particular, we handle the association and resource allocation problems for wireless backhaul and access links considering multi-hop backhauling. Since the optimization problem for the IAB network is formulated as a mixed integer non-linear program (MINLP), we divide it into three subproblems for association, subchannel allocation, and power allocation, and solve these subproblems alternately to obtain a locally optimal solution. For the association problem, we adopt the Lagrangian duality approach to configure the backhaul and access links, and a successive convex approximation (SCA) approach is used to solve the subchannel and power allocation problems efficiently. Simulation results demonstrate that the proposed algorithm achieves better performance than a single-hop backhauling network and enhances capacity and coverage by configuring multi-hop backhauling.
Submitted 10 August, 2021;
originally announced August 2021.
-
Attention-based Reinforcement Learning for Real-Time UAV Semantic Communication
Authors:
Won Joon Yun,
Byungju Lim,
Soyi Jung,
Young-Chai Ko,
Jihong Park,
Joongheon Kim,
Mehdi Bennis
Abstract:
In this article, we study the problem of air-to-ground ultra-reliable and low-latency communication (URLLC) for a moving ground user. This is done by controlling multiple unmanned aerial vehicles (UAVs) in real time while avoiding inter-UAV collisions. To this end, we propose a novel multi-agent deep reinforcement learning (MADRL) framework, coined a graph attention exchange network (GAXNet). In GAXNet, each UAV constructs an attention graph locally measuring the level of attention to its neighboring UAVs, while exchanging the attention weights with other UAVs so as to reduce the attention mismatch between them. Simulation results corroborate that GAXNet achieves up to 4.5x higher rewards during training. At execution, without incurring inter-UAV collisions, GAXNet achieves 6.5x lower latency with the target 0.0000001 error rate, compared to a state-of-the-art baseline framework.
Submitted 22 May, 2021;
originally announced May 2021.
-
Misspelling Correction with Pre-trained Contextual Language Model
Authors:
Yifei Hu,
Xiaonan Jing,
Youlim Ko,
Julia Taylor Rayz
Abstract:
Spelling irregularities, now known as spelling mistakes, have existed for several centuries. As humans, we are able to understand most misspelled words based on their location in the sentence, perceived pronunciation, and context. Unlike humans, computer systems do not possess the convenient auto-complete functionality of which human brains are capable. While many programs provide spelling correction functionality, many systems do not take context into account. Moreover, Artificial Intelligence systems function according to the data on which they are trained. With many current Natural Language Processing (NLP) systems trained on grammatically correct text data, many are vulnerable to adversarial examples, yet correctly spelled text processing is crucial for learning. In this paper, we investigate how spelling errors can be corrected in context with the pre-trained language model BERT. We present two experiments, based on BERT and the edit distance algorithm, for ranking and selecting candidate corrections. The results of our experiments demonstrate that, when combined properly, the contextual word embeddings of BERT and edit distance are capable of effectively correcting spelling errors.
Submitted 8 January, 2021;
originally announced January 2021.
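The abstract above ranks candidate corrections using edit distance together with a contextual model. The following is a minimal sketch of the edit-distance half only, assuming the context scores are already available. The function names `edit_distance` and `rank_candidates` and the numeric scores are hypothetical placeholders, not the authors' method or real BERT outputs.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_candidates(misspelled, scored_candidates):
    """Sort (word, context_score) pairs by edit distance, then by higher score.

    The context scores are assumed inputs; in the paper's setting they would
    come from a contextual language model.
    """
    return sorted(
        scored_candidates,
        key=lambda pair: (edit_distance(misspelled, pair[0]), -pair[1]),
    )


# Hypothetical candidates with made-up context scores.
candidates = [("sapling", 0.05), ("spewing", 0.05), ("spelling", 0.90)]
ranked = rank_candidates("speling", candidates)
```

Here "spelling" and "spewing" are both one edit away from "speling", so the context score decides between them. That tie is the kind of case where edit distance alone is not enough.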
-
Hardness of Approximate Nearest Neighbor Search under L-infinity
Authors:
Young Kun Ko,
Min Jae Song
Abstract:
We show conditional hardness of Approximate Nearest Neighbor Search (ANN) under the $\ell_\infty$ norm with two simple reductions. Our first reduction shows that hardness of a special case of the Shortest Vector Problem (SVP), which captures many provably hard instances of SVP, implies a lower bound for ANN with polynomial preprocessing time under the same norm. Combined with a recent quantitative hardness result on SVP under $\ell_\infty$ (Bennett et al., FOCS 2017), our reduction implies that finding a $(1+\varepsilon)$-approximate nearest neighbor under $\ell_\infty$ with polynomial preprocessing requires near-linear query time, unless the Strong Exponential Time Hypothesis (SETH) is false. This complements the results of Rubinstein (STOC 2018), who showed hardness of ANN under $\ell_1$, $\ell_2$, and edit distance.
Further improving the approximation factor for hardness, we show that, assuming SETH, near-linear query time is required for any approximation factor less than $3$ under $\ell_\infty$. This shows a conditional separation between ANN under the $\ell_1/\ell_2$ norms and the $\ell_\infty$ norm, since there are sublinear time algorithms achieving better than $3$-approximation for the $\ell_1$ and $\ell_2$ norms. Lastly, we show that the approximation factor of $3$ is a barrier for any naive gadget reduction from the Orthogonal Vectors problem.
Submitted 11 November, 2020;
originally announced November 2020.
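For orientation, the baseline that the hardness result above is measured against is a plain linear scan, which answers each query in time linear in the number of points. The sketch below is illustrative only. The helper names are hypothetical, and it shows the problem definition, not any construction from the paper.

```python
def linf_distance(x, y):
    """Distance under the l-infinity norm: the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


def nearest_neighbor_linf(points, query):
    """Exact nearest neighbor by scanning every point (linear query time)."""
    return min(points, key=lambda p: linf_distance(p, query))


def is_c_approximate(points, query, answer, c):
    """Check whether `answer` is within a factor c of the true nearest distance."""
    best = min(linf_distance(p, query) for p in points)
    return linf_distance(answer, query) <= c * best


points = [(0, 0), (5, 1), (2, 2)]
query = (3, 3)
exact = nearest_neighbor_linf(points, query)
```

In this toy instance the point `(5, 1)` is at distance 2 while the true nearest neighbor is at distance 1, so it is a valid answer for approximation factor 3 but not for factor 1.5.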
-
Integrating LEO Satellites and Multi-UAV Reinforcement Learning for Hybrid FSO/RF Non-Terrestrial Networks
Authors:
Ju-Hyung Lee,
Jihong Park,
Mehdi Bennis,
Young-Chai Ko
Abstract:
A mega-constellation of low-altitude earth orbit (LEO) satellites (SATs) and burgeoning unmanned aerial vehicles (UAVs) are promising enablers for high-speed and long-distance communications in beyond fifth-generation (5G) systems. Integrating SATs and UAVs within a non-terrestrial network (NTN), in this article we investigate the problem of forwarding packets between two faraway ground terminals through SAT and UAV relays using either millimeter-wave (mmWave) radio-frequency (RF) or free-space optical (FSO) link. Towards maximizing the communication efficiency, the real-time associations with orbiting SATs and the moving trajectories of UAVs should be optimized with suitable FSO/RF links, which is challenging due to the time-varying network topology and a huge number of possible control actions. To overcome the difficulty, we lift this problem to multi-agent deep reinforcement learning (MARL) with a novel action dimensionality reduction technique. Simulation results corroborate that our proposed SAT-UAV integrated scheme achieves 1.99x higher end-to-end sum throughput compared to a benchmark scheme with fixed ground relays. While improving the throughput, our proposed scheme also aims to reduce the UAV control energy, yielding 2.25x higher energy efficiency than a baseline method only maximizing the throughput. Lastly, thanks to utilizing hybrid FSO/RF links, the proposed scheme achieves up to 62.56x higher peak throughput and 21.09x higher worst-case throughput than the cases utilizing either RF or FSO links, highlighting the importance of co-designing SAT-UAV associations, UAV trajectories, and hybrid FSO/RF links in beyond-5G NTNs.
Submitted 20 October, 2020;
originally announced October 2020.
-
Cheetah: Optimizing and Accelerating Homomorphic Encryption for Private Inference
Authors:
Brandon Reagen,
Wooseok Choi,
Yeongil Ko,
Vincent Lee,
Gu-Yeon Wei,
Hsien-Hsin S. Lee,
David Brooks
Abstract:
As the application of deep learning continues to grow, so does the amount of data used to make predictions. While traditionally, big-data deep learning was constrained by computing performance and off-chip memory bandwidth, a new constraint has emerged: privacy. One solution is homomorphic encryption (HE). Applying HE to the client-cloud model allows cloud services to perform inference directly on the client's encrypted data. While HE can meet privacy constraints, it introduces enormous computational challenges and remains impractically slow in current systems.
This paper introduces Cheetah, a set of algorithmic and hardware optimizations for HE DNN inference to achieve plaintext DNN inference speeds. Cheetah proposes HE-parameter tuning optimization and operator scheduling optimizations, which together deliver 79x speedup over the state-of-the-art. However, this still falls short of plaintext inference speeds by almost four orders of magnitude. To bridge the remaining performance gap, Cheetah further proposes an accelerator architecture that, when combined with the algorithmic optimizations, approaches plaintext DNN inference speeds. We evaluate several common neural network models (e.g., ResNet50, VGG16, and AlexNet) and show that plaintext-level HE inference for each is feasible with a custom accelerator consuming 30W and 545mm^2.
Submitted 8 October, 2020; v1 submitted 31 May, 2020;
originally announced June 2020.
-
Integrating LEO Satellite and UAV Relaying via Reinforcement Learning for Non-Terrestrial Networks
Authors:
Ju-Hyung Lee,
Jihong Park,
Mehdi Bennis,
Young-Chai Ko
Abstract:
A mega-constellation of low-earth orbit (LEO) satellites has the potential to enable long-range communication with low latency. Integrating this with burgeoning unmanned aerial vehicle (UAV) assisted non-terrestrial networks will be a disruptive solution for beyond 5G systems provisioning large scale three-dimensional connectivity. In this article, we study the problem of forwarding packets between two faraway ground terminals, through an LEO satellite selected from an orbiting constellation and a mobile high-altitude platform (HAP) such as a fixed-wing UAV. To maximize the end-to-end data rate, the satellite association and HAP location should be optimized, which is challenging due to a huge number of orbiting satellites and the resulting time-varying network topology. We tackle this problem using deep reinforcement learning (DRL) with a novel action dimension reduction technique. Simulation results corroborate that our proposed method achieves up to 5.74x higher average data rate compared to a direct communication baseline without SAT and HAP.
Submitted 26 May, 2020;
originally announced May 2020.
-
SGX-SSD: A Policy-based Versioning SSD with Intel SGX
Authors:
Jinwoo Ahn,
Seungjin Lee,
Jinhoon Lee,
Yungwoo Ko,
Donghyun Min,
Junghee Lee,
Youngjae Kim
Abstract:
This paper demonstrates that SSDs that perform device-level versioning can be exposed to data tampering attacks when the retention time of data is less than the malware's dwell time. To deal with that threat, we propose SGX-SSD, an SGX-based versioning SSD that selectively preserves file history based on a given policy. The proposed system adopts Intel SGX to implement a version policy management system that is safe from high-privileged malware. Based on the policy, only the necessary data is selectively preserved in the SSD, which prevents lower-priority files from wasting space and also ensures the integrity of important files.
Submitted 28 April, 2020; v1 submitted 28 April, 2020;
originally announced April 2020.
-
Deep Energy Autoencoder for Noncoherent Multicarrier MU-SIMO Systems
Authors:
Thien Van Luong,
Youngwook Ko,
Ngo Anh Vien,
Michail Matthaiou,
Hien Quoc Ngo
Abstract:
We propose a novel deep energy autoencoder (EA) for noncoherent multicarrier multiuser single-input multiple-output (MU-SIMO) systems under fading channels. In particular, a single-user noncoherent EA-based (NC-EA) system, based on the multicarrier SIMO framework, is first proposed, where both the transmitter and receiver are represented by deep neural networks (DNNs), known as the encoder and decoder of an EA. Unlike existing systems, the decoder of the NC-EA is fed only with the energy combined from all receive antennas, while its encoder outputs a real-valued vector whose elements stand for the subcarrier power levels. Using the NC-EA, we then develop two novel DNN structures for both uplink and downlink NC-EA multiple access (NC-EAMA) schemes, based on the multicarrier MU-SIMO framework. Note that NC-EAMA allows multiple users to share the same subcarriers, thus achieving higher performance gains than noncoherent orthogonal counterparts. With proper training, the proposed NC-EA and NC-EAMA can efficiently recover the transmitted data without any channel state information estimation. Simulation results clearly show the superiority of our schemes in terms of reliability, flexibility and complexity over baseline schemes.
Submitted 20 February, 2020;
originally announced February 2020.
-
Key Points Estimation and Point Instance Segmentation Approach for Lane Detection
Authors:
Yeongmin Ko,
Younkwan Lee,
Shoaib Azam,
Farzeen Munir,
Moongu Jeon,
Witold Pedrycz
Abstract:
Perception techniques for autonomous driving should be adaptive to various environments. In the case of traffic line detection, an essential perception module, many conditions should be considered, such as the number of traffic lines and the computing power of the target system. To address these problems, in this paper, we propose a traffic line detection method called Point Instance Network (PINet); the method is based on key point estimation and an instance segmentation approach. The PINet includes several stacked hourglass networks that are trained simultaneously. Therefore, the size of the trained models can be chosen according to the computing power of the target environment. We cast the clustering problem of the predicted key points as an instance segmentation problem; the PINet can be trained regardless of the number of traffic lines. The PINet achieves competitive accuracy and false positive rates on the TuSimple and CULane datasets, popular public datasets for lane detection. Our code is available at https://github.com/koyeongmin/PINet_new
Submitted 13 September, 2020; v1 submitted 16 February, 2020;
originally announced February 2020.
-
Throughput Maximization of Mixed FSO/RF UAV-aided Mobile Relaying with a Buffer
Authors:
Ju-Hyung Lee,
Ki-Hong Park,
Young-Chai Ko,
Mohamed-Slim Alouini
Abstract:
In this paper, we consider an unmanned aerial vehicle (UAV) aided mobile relaying system under a buffer constraint. We propose a new relaying protocol employing mixed free-space optical/radio frequency (FSO/RF) communication, i.e., the source-relay and relay-destination links utilize FSO and RF links, respectively, under the buffer constraint at the UAV relay node. Taking the imbalance in transmission rate between RF and FSO links into consideration, we study the trajectory optimization problem of the buffer-constrained UAV relay node in order to maximize the end-to-end data throughput. In particular, we classify two relaying transmission schemes according to the delay requirements, i.e., i) delay-limited transmission and ii) delay-tolerant transmission. We solve the locally optimal trajectory problem of the UAV to maximize the throughput of the ground user terminal. As a result, we propose an iterative algorithm that efficiently finds a locally optimal solution for the throughput maximization problems. Through this algorithm, we present the resulting trajectories over the atmospheric condition, the buffer size, and the delay requirement. Also, we show the optimum buffer size and the throughput-delay tradeoff for a given system. Our numerical results validate that the proposed buffer-aided mobile relaying scheme achieves a 65.55% throughput gain compared to a conventional static relaying scheme.
Submitted 6 July, 2020; v1 submitted 5 January, 2020;
originally announced January 2020.
-
QoS-aware energy-efficient workload routing and server speed control policy in data centers: a robust queueing theoretic approach
Authors:
Seung Min Baik,
Young Myoung Ko
Abstract:
Operating cloud service infrastructures requires high energy efficiency while ensuring a satisfactory service level. Motivated by data centers, we consider a workload routing and server speed control policy applicable to the system operating under fluctuating demands. Dynamic control algorithms are generally more energy-efficient than static ones. However, they often require frequent information exchanges between routers and servers, making the data centers' management hesitate to deploy these algorithms. This study presents a static routing and server speed control policy that could achieve energy efficiency similar to a dynamic algorithm and eliminate the necessity of frequent communication among resources. We take a robust queueing theoretic approach to response time constraints for the quality of service (QoS) conditions. Each server is modeled as a G/G/1 processor sharing queue, and the concept of uncertainty sets defines the domain of stochastic primitives. We derive an approximative upper bound of sojourn times from uncertainty sets and develop an approximative sojourn time quantile estimation method for QoS. Numerical experiments confirm the proposed static policy offers competitive solutions compared with the dynamic algorithm.
Submitted 3 March, 2023; v1 submitted 20 December, 2019;
originally announced December 2019.
-
A UAV-Mounted Free Space Optical Communication: Trajectory Optimization for Flight Time
Authors:
Ju-Hyung Lee,
Ki-Hong Park,
Young-Chai Ko,
Mohamed-Slim Alouini
Abstract:
In this work, we address the trajectory optimization of a fixed-wing unmanned aerial vehicle (UAV) using free space optical communication (FSOC). Here, we focus on maximizing the flight time of the UAV by considering practical constraints for wireless UAV communication, including limited propulsion energy and required data rates. We find optimized trajectories in various atmospheric environments (e.g., moderate-fog and heavy-fog conditions), while also considering the channel characteristics of FSOC. In addition to maximizing the flight time, we consider the energy efficiency maximization and operation-time minimization problem to find the suboptimal solutions required to meet those constraints. Furthermore, we introduce a low-complexity approach to the proposed framework. In order to address the optimization problem, we conduct a bisection method and sequential programming and introduce a new feasibility check algorithm. Although our design considers suboptimal solutions owing to the nonconvexity of the problems, our simulations indicate that the proposed scheme exhibits a gain of approximately 44.12% in terms of service time when compared to the conventional scheme.
Submitted 17 December, 2019;
originally announced December 2019.
-
Ethanos: Lightweight Bootstrapping for Ethereum
Authors:
Jae-Yun Kim,
Jun-Mo Lee,
Yeon-Jae Koo,
Sang-Hyeon Park,
Soo-Mook Moon
Abstract:
As the Ethereum blockchain has become popular, the number of users and transactions has skyrocketed, causing an explosive increase in its data size. As a result, ordinary clients using PCs or smartphones cannot easily bootstrap as a full node, but rely on other full nodes such as the miners to run or verify transactions. This may affect the security of Ethereum, so light bootstrapping techniques such as fast sync have been proposed to download only parts of the full data, yet the space overhead is still too high. One of the biggest space overheads that cannot easily be reduced is caused by saving the state of all accounts in the block's state trie. Fortunately, we found that more than 90% of accounts are inactive and that old transactions are hard to manipulate. Based on these observations, this paper proposes a novel optimization technique called Ethanos that can reduce bootstrapping cost by sweeping inactive accounts periodically and by not downloading old transactions. If an inactive account becomes active, Ethanos restores its state by running a restoration transaction. Also, Ethanos gives incentives for archive nodes to maintain the old transactions for possible re-verification. We implemented Ethanos by instrumenting the go-ethereum (geth) client and evaluated it with 113 million real transactions from 14 million accounts between the 7M-th and 8M-th blocks in Ethereum. Our experimental results show that Ethanos can reduce the size of the account state by half, which, if combined with removing old transactions, may reduce the storage size for bootstrapping to around 1GB. This would be reasonable enough for ordinary clients to bootstrap on their personal devices.
Submitted 14 November, 2019;
originally announced November 2019.
-
Word Sense Disambiguation using Knowledge-based Word Similarity
Authors:
Sunjae Kwon,
Dongsuk Oh,
Youngjoong Ko
Abstract:
In natural language processing, word-sense disambiguation (WSD) is an open problem concerned with identifying the correct sense of words in a particular context. To address this problem, we introduce a novel knowledge-based WSD system. We suggest the adoption of two methods in our system. First, we suggest a novel method to encode the word vector representation by considering the graphical semantic relationships from the lexical knowledge-base. Second, we propose a method for extracting the contextual words from the text for analyzing an ambiguous word based on the similarity of word vector representations. To validate the effectiveness of our WSD system, we conducted experiments on the five benchmark English WSD corpora (Senseval-02, Senseval-03, SemEval-07, SemEval-13, and SemEval-15). The obtained results demonstrated that the suggested methods significantly enhanced the WSD performance. Furthermore, our system outperformed the existing knowledge-based WSD systems and showed a performance comparable to that of the state-of-the-art supervised WSD systems.
Submitted 21 June, 2020; v1 submitted 10 November, 2019;
originally announced November 2019.
-
An Adaptive Step Toward the Multiphase Conjecture
Authors:
Young Kun Ko,
Omri Weinstein
Abstract:
In 2010, Pǎtraşcu proposed the following three-phase dynamic problem, as a candidate for proving polynomial lower bounds on the operational time of dynamic data structures:
I: Preprocess a collection of sets $\vec{S} = S_1, \ldots , S_k \subseteq [n]$, where $k=\operatorname{poly}(n)$.
II: A set $T\subseteq [n]$ is revealed, and the data structure updates its memory.
III: An index $i \in [k]$ is revealed, and the data structure must determine if $S_i \cap T \stackrel{?}{=} \emptyset$.
Pǎtraşcu conjectured that any data structure for the Multiphase problem must make $n^{\varepsilon}$ cell-probes in either Phase II or III, and showed that this would imply similar unconditional lower bounds on many important dynamic data structure problems. Alas, there has been almost no progress on this conjecture in the past decade since its introduction. We show an $\tilde{\Omega}(\sqrt{n})$ cell-probe lower bound on the Multiphase problem for data structures with general (adaptive) updates, and queries with unbounded but "layered" adaptivity. This result captures all known set-intersection data structures and significantly strengthens previous Multiphase lower bounds, which only captured non-adaptive data structures.
Our main technical result is a communication lower bound on a 4-party variant of Pǎtraşcu's Number-On-Forehead Multiphase game, using information complexity techniques. We also show that a lower bound on Pǎtraşcu's original NOF game would imply a polynomial ($n^{1+\varepsilon}$) lower bound on the number of wires of any constant-depth circuit with arbitrary gates computing a random $\tilde{O}(n)\times n$ linear operator $x \mapsto Ax$, a long-standing open problem in circuit complexity. This suggests that the NOF conjecture is much stronger than its data structure counterpart.
Submitted 29 October, 2019;
originally announced October 2019.
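To make the three phases above concrete, here is a minimal sketch of the trivial data structure for the Multiphase problem. The class name `NaiveMultiphase` and the bitmask encoding are illustrative assumptions, indices are 0-based rather than the abstract's $1, \ldots, k$, and Python integer operations do not correspond to cell-probe accounting.

```python
class NaiveMultiphase:
    """Trivial Multiphase data structure: store everything, intersect at query time."""

    def __init__(self, sets):
        # Phase I: preprocess each set S_i into a bitmask over the universe.
        self.masks = [sum(1 << x for x in set(s)) for s in sets]
        self.t_mask = 0

    def update(self, t):
        # Phase II: the set T is revealed; store it as a bitmask.
        self.t_mask = sum(1 << x for x in set(t))

    def query(self, i):
        # Phase III: report whether S_i and T share at least one element.
        return (self.masks[i] & self.t_mask) != 0


ds = NaiveMultiphase([{1, 2}, {3}, set()])
ds.update({2, 4})
answers = [ds.query(i) for i in range(3)]
```

This baseline reads a representation of all of $T$ during the update and the query. The conjecture concerns whether any data structure can make both phases substantially cheaper.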
-
Unconstrained Road Marking Recognition with Generative Adversarial Networks
Authors:
Younkwan Lee,
Juhyun Lee,
Yoojin Hong,
YeongMin Ko,
Moongu Jeon
Abstract:
Road marking recognition has achieved great success in the past few years along with the rapid development of deep learning. Although considerable advances have been made, existing methods are often over-dependent on unrepresentative datasets and constrained conditions. In this paper, to overcome these drawbacks, we propose an alternative method that achieves higher accuracy and generates high-quality samples as data augmentation. Our work makes two major contributions: 1) The proposed deblurring network can successfully recover a clean road marking from a blurred one by adopting generative adversarial networks (GANs). 2) The proposed data augmentation method, based on mutual information, can preserve and learn semantic context from the given dataset. We construct and train a class-conditional GAN to increase the size of the training set, which makes it suitable for recognizing targets. The experimental results show that our proposed framework generates deblurred clean samples from blurry ones and outperforms other methods even on unconstrained road marking datasets.
Submitted 9 October, 2019;
originally announced October 2019.