Classification of Goods Using Text Descriptions With Sentences Retrieval

Authors: Eunji Lee, Sundong Kim, Sihyun Kim, Sungwon Park, Meeyoung Cha, Soyeon Jung, Suyoung Yang, Yeonsoo Choi, Sungdae Ji, Minsoo Song, Heeja Kim

Abstract: The task of assigning and validating internationally accepted commodity code (HS code) to traded goods is one of the critical functions at the customs office. This decision is crucial to importers and exporters, as it determines the tariff rate. However, similar to court decisions made by judges, the task can be non-trivial even for experienced customs officers. The current paper proposes a deep learning model to assist this seemingly challenging HS code classification. Together with Korea Customs Service, we built a decision model based on KoELECTRA that suggests the most likely heading and subheadings (i.e., the first four and six digits) of the HS code. Evaluation on 129,084 past cases shows that the top-3 suggestions made by our model have an accuracy of 95.5% in classifying 265 subheadings. This promising result implies algorithms may reduce the time and effort taken by customs officers substantially by assisting the HS code classification task.

Submitted 2 November, 2021; originally announced November 2021.

arXiv:2108.04553 [pdf, other]

doi 10.1103/PhysRevLett.127.172701

Advancement of Photospheric Radius Expansion and Clocked Type-I X-Ray Burst Models with the New $^{22}$Mg$(α,p)^{25}$Al Reaction Rate Determined at Gamow Energy

Authors: J. Hu, H. Yamaguchi, Y. H. Lam, A. Heger, D. Kahl, A. M. Jacobs, Z. Johnston, S. W. Xu, N. T. Zhang, S. B. Ma, L. H. Ru, E. Q. Liu, T. Liu, S. Hayakawa, L. Yang, H. Shimizu, C. B. Hamill, A. St J. Murphy, J. Su, X. Fang, K. Y. Chae, M. S. Kwag, S. M. Cha, N. N. Duy, N. K. Uyen, et al. (12 additional authors not shown)

Abstract: We report the first (in)elastic scattering measurement of $^{25}\mathrm{Al}+p$ with the capability to select and measure in a broad energy range the proton resonances in $^{26}$Si contributing to the $^{22}$Mg$(α,p)$ reaction at type I x-ray burst energies. We measured spin-parities of four resonances above the $α$ threshold of $^{26}$Si that are found to strongly impact the $^{22}$Mg$(α,p)$ rate. The new rate advances a state-of-the-art model to remarkably reproduce light curves of the GS 1826$-$24 clocked burster with mean deviation $<9$ % and permits us to discover a strong correlation between the He abundance in the accreting envelope of photospheric radius expansion burster and the dominance of $^{22}$Mg$(α,p)$ branch.

Submitted 20 October, 2021; v1 submitted 10 August, 2021; originally announced August 2021.

Comments: accepted by Physical Review Letters on 5 August 2021, published 19 October 2021

Journal ref: Phys. Rev. Lett. 127 (2021) 172701

arXiv:2106.10147 [pdf, other]

Evaluating the Robustness of Trigger Set-Based Watermarks Embedded in Deep Neural Networks

Authors: Suyoung Lee, Wonho Song, Suman Jana, Meeyoung Cha, Sooel Son

Abstract: Trigger set-based watermarking schemes have gained emerging attention as they provide a means to prove ownership for deep neural network model owners. In this paper, we argue that state-of-the-art trigger set-based watermarking algorithms do not achieve their designed goal of proving ownership. We posit that this impaired capability stems from two common experimental flaws that the existing research practice has committed when evaluating the robustness of watermarking algorithms: (1) incomplete adversarial evaluation and (2) overlooked adaptive attacks. We conduct a comprehensive adversarial evaluation of 11 representative watermarking schemes against six of the existing attacks and demonstrate that each of these watermarking schemes lacks robustness against at least two non-adaptive attacks. We also propose novel adaptive attacks that harness the adversary's knowledge of the underlying watermarking algorithm of a target model. We demonstrate that the proposed attacks effectively break all of the 11 watermarking schemes, consequently allowing adversaries to obscure the ownership of any watermarked model. We encourage follow-up studies to consider our guidelines when evaluating the robustness of their watermarking schemes via conducting comprehensive adversarial evaluation that includes our adaptive attacks to demonstrate a meaningful upper bound of watermark robustness.

Submitted 19 January, 2023; v1 submitted 18 June, 2021; originally announced June 2021.

Comments: 15 pages, accepted at IEEE TDSC

arXiv:2105.04046 [pdf, other]

A likelihood approach to nonparametric estimation of a singular distribution using deep generative models

Authors: Minwoo Chae, Dongha Kim, Yongdai Kim, Lizhen Lin

Abstract: We investigate statistical properties of a likelihood approach to nonparametric estimation of a singular distribution using deep generative models. More specifically, a deep generative model is used to model high-dimensional data that are assumed to concentrate around some low-dimensional structure. Estimating the distribution supported on this low-dimensional structure, such as a low-dimensional manifold, is challenging due to its singularity with respect to the Lebesgue measure in the ambient space. In the considered model, a usual likelihood approach can fail to estimate the target distribution consistently due to the singularity. We prove that a novel and effective solution exists by perturbing the data with an instance noise, which leads to consistent estimation of the underlying distribution with desirable convergence rates. We also characterize the class of distributions that can be efficiently estimated via deep generative models. This class is sufficiently general to contain various structured distributions such as product distributions, classically smooth distributions and distributions supported on a low-dimensional manifold. Our analysis provides some insights on how deep generative models can avoid the curse of dimensionality for nonparametric distribution estimation. We conduct a thorough simulation study and real data analysis to empirically demonstrate that the proposed data perturbation technique improves the estimation performance significantly.

Submitted 28 March, 2023; v1 submitted 9 May, 2021; originally announced May 2021.

Comments: 42 pages, 13 figures, 1 table

MSC Class: 62G05 (Primary); 62G20 (Secondary)

arXiv:2104.10864 [pdf, other]

doi 10.1371/journal.pone.0263381

Misinformation, Believability, and Vaccine Acceptance Over 40 Countries: Takeaways From the Initial Phase of The COVID-19 Infodemic

Authors: Karandeep Singh, Gabriel Lima, Meeyoung Cha, Chiyoung Cha, Juhi Kulshrestha, Yong-Yeol Ahn, Onur Varol

Abstract: The COVID-19 pandemic has been damaging to the lives of people all around the world. Accompanied by the pandemic is an infodemic, an abundant and uncontrolled spreading of potentially harmful misinformation. The infodemic may severely change the pandemic's course by interfering with public health interventions such as wearing masks, social distancing, and vaccination. In particular, the impact of the infodemic on vaccination is critical because it holds the key to reverting to pre-pandemic normalcy. This paper presents findings from a global survey on the extent of worldwide exposure to the COVID-19 infodemic, assesses different populations' susceptibility to false claims, and analyzes its association with vaccine acceptance. Based on responses gathered from over 18,400 individuals from 40 countries, we find a strong association between perceived believability of misinformation and vaccination hesitancy. Additionally, our study shows that only half of the online users exposed to rumors might have seen the fact-checked information. Moreover, depending on the country, between 6% and 37% of individuals considered these rumors believable. Our survey also shows that poorer regions are more susceptible to encountering and believing COVID-19 misinformation. We discuss implications of our findings on public campaigns that proactively spread accurate information to countries that are more susceptible to the infodemic. We also highlight fact-checking platforms' role in better identifying and prioritizing claims that are perceived to be believable and have wide exposure. Our findings give insights into better handling of risk communication during the initial phase of a future pandemic.

Submitted 22 April, 2021; originally announced April 2021.

arXiv:2104.03613 [pdf, other]

Uncertainty-aware Remaining Useful Life predictor

Authors: Luca Biggio, Alexander Wieland, Manuel Arias Chao, Iason Kastanis, Olga Fink

Abstract: Remaining Useful Life (RUL) estimation is the problem of inferring how long a certain industrial asset can be expected to operate within its defined specifications. Deploying successful RUL prediction methods in real-life applications is a prerequisite for the design of intelligent maintenance strategies with the potential of drastically reducing maintenance costs and machine downtimes. In light of their superior performance in a wide range of engineering fields, Machine Learning (ML) algorithms are natural candidates to tackle the challenges involved in the design of intelligent maintenance systems. In particular, given the potentially catastrophic consequences or substantial costs associated with maintenance decisions that are either too late or too early, it is desirable that ML algorithms provide uncertainty estimates alongside their predictions. However, standard data-driven methods used for uncertainty estimation in RUL problems do not scale well to large datasets or are not sufficiently expressive to model the high-dimensional mapping from raw sensor data to RUL estimates. In this work, we consider Deep Gaussian Processes (DGPs) as possible solutions to the aforementioned limitations. We perform a thorough evaluation and comparison of several variants of DGPs applied to RUL predictions. The performance of the algorithms is evaluated on the N-CMAPSS (New Commercial Modular Aero-Propulsion System Simulation) dataset from NASA for aircraft engines. The results show that the proposed methods are able to provide very accurate RUL predictions along with sensible uncertainty estimates, providing more reliable solutions for (safety-critical) real-life industrial applications.

Submitted 8 April, 2021; originally announced April 2021.

arXiv:2103.15296 [pdf, other]

Elsa: Energy-based learning for semi-supervised anomaly detection

Authors: Sungwon Han, Hyeonho Song, Seungeon Lee, Sungwon Park, Meeyoung Cha

Abstract: Anomaly detection aims at identifying deviant instances from the normal data distribution. Many advances have been made in the field, including the innovative use of unsupervised contrastive learning. However, existing methods generally assume clean training data and are limited when the data contain unknown anomalies. This paper presents Elsa, a novel semi-supervised anomaly detection approach that unifies the concept of energy-based models with unsupervised contrastive learning. Elsa instills robustness against any data contamination by a carefully designed fine-tuning step based on the new energy function that forces the normal data to be divided into classes of prototypes. Experiments on multiple contamination scenarios show the proposed model achieves SOTA performance. Extensive analyses also verify the contribution of each component in the proposed model. Beyond the experiments, we also offer a theoretical interpretation of why contrastive learning alone cannot detect anomalies under data contamination.

Submitted 3 January, 2022; v1 submitted 28 March, 2021; originally announced March 2021.

Comments: Accepted and published at BMVC2021

arXiv:2103.04537 [pdf, other]

doi 10.1007/978-3-030-87196-3_26

Multimodal Representation Learning via Maximization of Local Mutual Information

Authors: Ruizhi Liao, Daniel Moyer, Miriam Cha, Keegan Quigley, Seth Berkowitz, Steven Horng, Polina Golland, William M. Wells

Abstract: We propose and demonstrate a representation learning approach by maximizing the mutual information between local features of images and text. The goal of this approach is to learn useful image representations by taking advantage of the rich information contained in the free text that describes the findings in the image. Our method trains image and text encoders by encouraging the resulting representations to exhibit high local mutual information. We make use of recent advances in mutual information estimation with neural network discriminators. We argue that the sum of local mutual information is typically a lower bound on the global mutual information. Our experimental results in the downstream image classification tasks demonstrate the advantages of using local features for image-text representation learning.

Submitted 14 December, 2021; v1 submitted 7 March, 2021; originally announced March 2021.

Comments: In Proceedings of International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021

Journal ref: In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 273-283. Springer, Cham, 2021

arXiv:2102.07650 [pdf, other]

Learning Student-Friendly Teacher Networks for Knowledge Distillation

Authors: Dae Young Park, Moon-Hyun Cha, Changwook Jeong, Dae Sin Kim, Bohyung Han

Abstract: We propose a novel knowledge distillation approach to facilitate the transfer of dark knowledge from a teacher to a student. Contrary to most of the existing methods that rely on effective training of student models given pretrained teachers, we aim to learn the teacher models that are friendly to students and, consequently, more appropriate for knowledge transfer. In other words, at the time of optimizing a teacher model, the proposed algorithm learns the student branches jointly to obtain student-friendly representations. Since the main goal of our approach lies in training teacher models and the subsequent knowledge distillation procedure is straightforward, most of the existing knowledge distillation methods can adopt this technique to improve the performance of diverse student models in terms of accuracy and convergence speed. The proposed algorithm demonstrates outstanding accuracy in several well-known knowledge distillation techniques with various combinations of teacher and student models even in the case that their architectures are heterogeneous and there is no prior knowledge about student models at the time of training teacher networks.

Submitted 23 January, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

Comments: Accepted by NeurIPS 2021

arXiv:2102.00625 [pdf, other]

doi 10.1145/3411764.3445260

Human Perceptions on Moral Responsibility of AI: A Case Study in AI-Assisted Bail Decision-Making

Authors: Gabriel Lima, Nina Grgić-Hlača, Meeyoung Cha

Abstract: How to attribute responsibility for autonomous artificial intelligence (AI) systems' actions has been widely debated across the humanities and social science disciplines. This work presents two experiments ($N$=200 each) that measure people's perceptions of eight different notions of moral responsibility concerning AI and human agents in the context of bail decision-making. Using real-life adapted vignettes, our experiments show that AI agents are held causally responsible and blamed similarly to human agents for an identical task. However, there was a meaningful difference in how people perceived these agents' moral responsibility; human agents were ascribed to a higher degree of present-looking and forward-looking notions of responsibility than AI agents. We also found that people expect both AI and human decision-makers and advisors to justify their decisions regardless of their nature. We discuss policy and HCI implications of these findings, such as the need for explainable AI in high-stakes scenarios.

Submitted 31 January, 2021; originally announced February 2021.

Comments: 17 Pages, 5 Figures, ACM CHI 2021

arXiv:2101.05957 [pdf, other]

Descriptive AI Ethics: Collecting and Understanding the Public Opinion

Authors: Gabriel Lima, Meeyoung Cha

Abstract: There is a growing need for data-driven research efforts on how the public perceives the ethical, moral, and legal issues of autonomous AI systems. The current debate on the responsibility gap posed by these systems is one such example. This work proposes a mixed AI ethics model that allows normative and descriptive research to complement each other, by aiding scholarly discussion with data gathered from the public. We discuss its implications on bridging the gap between optimistic and pessimistic views towards AI systems' deployment.

Submitted 14 January, 2021; originally announced January 2021.

Comments: Accepted to the Ethics in Design Workshop at ACM CSCW 2020 (https://ethicsindesignworkshop.wordpress.com/). 5 pages

arXiv:2101.00807 [pdf, other]

doi 10.1140/epjds/s13688-021-00278-7

Urban green space and happiness in developed countries

Authors: Oh-Hyun Kwon, Inho Hong, Jeasurk Yang, Donghee Yvette Wohn, Woo-Sung Jung, Meeyoung Cha

Abstract: Urban green space has been regarded as contributing to citizen happiness by promoting physical and mental health. However, how urban green space and happiness are related across many countries of different socioeconomic conditions has not been explained well. By measuring urban green space score (UGS) from high-resolution Sentinel-2 satellite imagery of 90 global cities that in total cover 179,168 km$^2$ and include 230 million people in 60 developed countries, we reveal that the amount of urban green space and the GDP can explain the happiness level of the country. More precisely, urban green space and GDP are each individually associated with happiness; happiness in the 30 wealthiest countries is explained only by urban green space, whereas GDP alone explains happiness in the 30 other countries in this study. Lastly, we further show that the relationship between urban green space and happiness is mediated by social support and that GDP moderates the relationship between social support and happiness, which underlines the importance of maintaining urban green space as a place for social cohesion in promoting people's happiness.

Submitted 4 January, 2021; originally announced January 2021.

Comments: 9 pages, 4 figures, supplementary information with 8 figures

Journal ref: EPJ Data Science 10, 28 (2021)

arXiv:2012.13593 [pdf]

An insight into acupoints and meridians in human body based on interstitial fluid circulation

Authors: Li Hongyi, Yin Yajun, Hu Jun, Li Hua, Wang Fang, Ji Fusui, Ma Chao

Abstract: The atlas of human acupoints and meridians has been utilized in clinical practice for almost a millennium although the anatomical structures and functions remain to be clarified. It has recently been reported that a long-distance interstitial fluid (ISF) circulatory pathway may originate from the acupoints in the extremities. As observed in living human subjects, cadavers and animals using magnetic resonance imaging and fluorescent tracers, the ISF flow pathways include at least 4 types of anatomical structures: the cutaneous-, perivenous-, periarterial-, and neural-pathways. Unlike the blood or lymphatic vessels, these ISF flow pathways are composed of highly ordered and topologically connected interstitial fibrous connective tissues that may work as guiderails for the ISF to flow actively over long distance under certain driving forces. Our experimental results demonstrated that most acupoints in the extremity endings connect with one or more ISF flow pathways and comprise a complex network of acupoint-ISF-pathways. We also found that this acupoint-ISF-pathway network can connect to visceral organs or tissues such as the pericardium and epicardium, even though the topographical geometry in human extremities does not totally match the meridian lines on the atlas that is currently used in traditional Chinese medicine. Based on our experimental data, the following working hypotheses are proposed. A comprehensive atlas will be constructed to systemically reveal the detailed anatomical structures of the acupoints-originated ISF circulation. Such an atlas may shed light on the mysteries shrouding the visceral correlations of acupoints and meridians, and inaugurate a new frontier for innovative medical applications.

Submitted 25 December, 2020; originally announced December 2020.

arXiv:2012.11150 [pdf, other]

Improving Unsupervised Image Clustering With Robust Learning

Authors: Sungwon Park, Sungwon Han, Sundong Kim, Danu Kim, Sungkyu Park, Seunghoon Hong, Meeyoung Cha

Abstract: Unsupervised image clustering methods often introduce alternative objectives to indirectly train the model and are subject to faulty predictions and overconfident results. To overcome these challenges, the current research proposes an innovative model RUC that is inspired by robust learning. RUC's novelty is at utilizing pseudo-labels of existing image clustering models as a noisy dataset that may include misclassified samples. Its retraining process can revise misaligned knowledge and alleviate the overconfidence problem in predictions. The model's flexible structure makes it possible to be used as an add-on module to other clustering methods and helps them achieve better performance on multiple datasets. Extensive experiments show that the proposed model can adjust the model confidence with better calibration and gain additional robustness against adversarial noise.

Submitted 29 March, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

Comments: Accepted at CVPR2021

arXiv:2010.14282 [pdf, other]

Active Learning for Human-in-the-Loop Customs Inspection

Authors: Sundong Kim, Tung-Duong Mai, Sungwon Han, Sungwon Park, Thi Nguyen Duc Khanh, Jaechan So, Karandeep Singh, Meeyoung Cha

Abstract: We study the human-in-the-loop customs inspection scenario, where an AI-assisted algorithm supports customs officers by recommending a set of imported goods to be inspected. If the inspected items are fraudulent, the officers can levy extra duties. The formed logs are then used as additional training data for successive iterations. Choosing to inspect suspicious items first leads to an immediate gain in customs revenue, yet such inspections may not bring new insights for learning dynamic traffic patterns. On the other hand, inspecting uncertain items can help acquire new knowledge, which will be used as a supplementary training resource to update the selection systems. Based on multiyear customs datasets obtained from three countries, we demonstrate that some degree of exploration is necessary to cope with domain shifts in trade data. The results show that a hybrid strategy of selecting likely fraudulent and uncertain items will eventually outperform the exploitation-only strategy.

Submitted 23 February, 2022; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: To Appear at IEEE TKDE

ACM Class: H.4.0

arXiv:2009.14605 [pdf, other]

Disruption in the Chinese E-Commerce During COVID-19

Authors: Yuan Yuan, Muzhi Guan, Zhilun Zhou, Sundong Kim, Meeyoung Cha, Depeng Jin, Yong Li

Abstract: The recent outbreak of the novel coronavirus (COVID-19) has infected millions of citizens worldwide and claimed many lives. This paper examines its impact on the Chinese e-commerce market by analyzing behavioral changes seen from a large online shopping platform. We first conduct a time series analysis to identify product categories that faced the most extensive disruptions. The time-lagged analysis shows that behavioral patterns seen in shopping actions are highly responsive to epidemic development. Based on these findings, we present a consumer demand prediction method by encompassing the epidemic statistics and behavioral features for COVID-19 related products. Experiment results demonstrate that our predictions outperform existing baselines and further extend to the long-term and province-level forecasts. We discuss how our market analysis and prediction can help better prepare for future pandemics by gaining an extra time to launch preventive steps.

Submitted 27 October, 2020; v1 submitted 22 July, 2020; originally announced September 2020.

Comments: 10 pages, 7 figures, 6 tables

MSC Class: 68T07 ACM Class: J.4

arXiv:2008.13174 [pdf, ps, other]

Bayesian High-dimensional Semi-parametric Inference beyond sub-Gaussian Errors

Authors: Kyoungjae Lee, Minwoo Chae, Lizhen Lin

Abstract: We consider a sparse linear regression model with unknown symmetric error under the high-dimensional setting. The true error distribution is assumed to belong to the locally $β$-Hölder class with an exponentially decreasing tail, which does not need to be sub-Gaussian. We obtain posterior convergence rates of the regression coefficient and the error density, which are nearly optimal and adaptive to the unknown sparsity level. Furthermore, we derive the semi-parametric Bernstein-von Mises (BvM) theorem to characterize asymptotic shape of the marginal posterior for regression coefficients. Under the sub-Gaussianity assumption on the true score function, strong model selection consistency for regression coefficients is also obtained, which eventually asserts the frequentist's validity of credible sets.

Submitted 30 August, 2020; originally announced August 2020.

arXiv:2008.12085 [pdf, ps, other]

doi 10.1007/978-3-030-66823-5_23

DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis

Authors: Juan Diego Ortega, Neslihan Kose, Paola Cañas, Min-An Chao, Alexander Unnervik, Marcos Nieto, Oihana Otaegui, Luis Salgado

Abstract: Vision is the richest and most cost-effective technology for Driver Monitoring Systems (DMS), especially after the recent success of Deep Learning (DL) methods. The lack of sufficiently large and comprehensive datasets is currently a bottleneck for the progress of DMS development, crucial for the transition of automated driving from SAE Level-2 to SAE Level-3. In this paper, we introduce the Driver Monitoring Dataset (DMD), an extensive dataset which includes real and simulated driving scenarios: distraction, gaze allocation, drowsiness, hands-wheel interaction and context data, in 41 hours of RGB, depth and IR videos from 3 cameras capturing face, body and hands of 37 drivers. A comparison with existing similar datasets is included, which shows the DMD is more extensive, diverse, and multi-purpose. The usage of the DMD is illustrated by extracting a subset of it, the dBehaviourMD dataset, containing 13 distraction activities, prepared to be used in DL training processes. Furthermore, we propose a robust and real-time driver behaviour recognition system targeting a real-world application that can run on cost-efficient CPU-only platforms, based on the dBehaviourMD. Its performance is evaluated with different types of fusion strategies, which all reach enhanced accuracy still providing real-time response.

Submitted 27 August, 2020; originally announced August 2020.

Comments: Accepted to ECCV 2020 workshop - Assistive Computer Vision and Robotics

arXiv:2008.01339 [pdf, other]

Collecting the Public Perception of AI and Robot Rights

Authors: Gabriel Lima, Changyeon Kim, Seungho Ryu, Chihyung Jeon, Meeyoung Cha

Abstract: Whether to give rights to artificial intelligence (AI) and robots has been a sensitive topic since the European Parliament proposed advanced robots could be granted "electronic personalities." Numerous scholars who favor or disfavor its feasibility have participated in the debate. This paper presents an experiment (N=1270) that 1) collects online users' first impressions of 11 possible rights that could be granted to autonomous electronic agents of the future and 2) examines whether debunking common misconceptions on the proposal modifies one's stance toward the issue. The results indicate that even though online users mainly disfavor AI and robot rights, they are supportive of protecting electronic agents from cruelty (i.e., favor the right against cruel treatment). Furthermore, people's perceptions became more positive when given information about rights-bearing non-human entities or myth-refuting statements. The style used to introduce AI and robot rights significantly affected how the participants perceived the proposal, similar to the way metaphors function in creating laws. For robustness, we repeated the experiment over a more representative sample of U.S. residents (N=164) and found that perceptions gathered from online users and those by the general population are similar.

Submitted 4 August, 2020; originally announced August 2020.

Comments: Conditionally Accepted to ACM CSCW 2020

arXiv:2006.16500 [pdf]

Method for the generation of depth images for view-based shape retrieval of 3D CAD model from partial point cloud

Authors: Hyungki Kim, Moohyun Cha, Duhwan Mun

Abstract: A laser scanner can easily acquire the geometric data of physical environments in the form of a point cloud. Recognizing objects from a point cloud is often required for industrial 3D reconstruction, which should include not only geometry information but also semantic information. However, recognition process is often a bottleneck in 3D reconstruction because it requires expertise on domain knowledge and intensive labor. To address this problem, various methods have been developed to recognize objects by retrieving the corresponding model in the database from an input geometry query. In recent years, the technique of converting geometric data into an image and applying view-based 3D shape retrieval has demonstrated high accuracy. Depth image which encodes depth value as intensity of pixel is frequently used for view-based 3D shape retrieval. However, geometric data collected from objects is often incomplete due to the occlusions and the limit of line of sight. Image generated by occluded point clouds lowers the performance of view-based 3D object retrieval due to loss of information. In this paper, we propose a method of viewpoint and image resolution estimation method for view-based 3D shape retrieval from point cloud query. Automatic selection of viewpoint and image resolution by calculating the data acquisition rate and density from the sampled viewpoints and image resolutions are proposed. The retrieval performance from the images generated by the proposed method is experimented and compared for various dataset. Additionally, view-based 3D shape retrieval performance with deep convolutional neural network has been experimented with the proposed method.

Submitted 29 June, 2020; originally announced June 2020.

Comments: 39 pages

arXiv:2006.12218 [pdf, other]

Risk Communication in Asian Countries: COVID-19 Discourse on Twitter

Authors: Sungkyu Park, Sungwon Han, Jeongwook Kim, Mir Majid Molaie, Hoang Dieu Vu, Karandeep Singh, Jiyoung Han, Wonjae Lee, Meeyoung Cha

Abstract: COVID-19 has become one of the most widely talked about topics on social media. This research characterizes risk communication patterns by analyzing the public discourse on the novel coronavirus from four Asian countries: South Korea, Iran, Vietnam, and India, which suffered the outbreak to different degrees. The temporal analysis shows that the official epidemic phases issued by governments do not match well with the online attention on COVID-19. This finding calls for a need to analyze the public discourse by new measures, such as topical dynamics. Here, we propose an automatic method to detect topical phase transitions and compare similarities in major topics across these countries over time. We examine the time lag difference between social media attention and confirmed patient counts. For dynamics, we find an inverse relationship between the tweet count and topical diversity.

Submitted 14 August, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

Comments: 10 pages (+ 12 pages of Appendices), 18 figures, 2 tables

arXiv:2006.08164 [pdf]

doi 10.37727/jkdas.2021.23.2.1045

COVID-19 Vaccine Acceptance in the US and UK in the Early Phase of the Pandemic: AI-Generated Vaccines Hesitancy for Minors, and the Role of Governments

Authors: Gabriel Lima, Meeyoung Cha, Chiyoung Cha, Hyeyoung Hwang

Abstract: This study presents survey results of the public's willingness to get vaccinated against COVID-19 during an early phase of the pandemic and examines factors that could influence vaccine acceptance based on a between-subjects design. A representative quota sample of 572 adults in the US and UK participated in an online survey. First, the participants' medical use tendencies and initial vaccine acceptance were assessed; then, short vignettes were provided to evaluate their changes in attitude towards COVID-19 vaccines. For data analysis, ANOVA and post hoc pairwise comparisons were used. The participants were more reluctant to vaccinate their children than themselves and the elderly. The use of artificial intelligence (AI) in vaccine development did not influence vaccine acceptance. Vignettes that explicitly stated the high effectiveness of vaccines led to an increase in vaccine acceptance. Our study suggests public policies emphasizing the vaccine effectiveness against the virus could lead to higher vaccination rates. We also discuss the public's expectations of governments concerning vaccine safety and present a series of implications based on our findings.

Submitted 23 June, 2021; v1 submitted 15 June, 2020; originally announced June 2020.

Comments: Published to the Journal of the Korean Data Analysis Society vol. 23 no. 3

arXiv:2006.04001 [pdf, other]

Real-Time Model Calibration with Deep Reinforcement Learning

Authors: Yuan Tian, Manuel Arias Chao, Chetan Kulkarni, Kai Goebel, Olga Fink

Abstract: The dynamic, real-time, and accurate inference of model parameters from empirical data is of great importance in many scientific and engineering disciplines that use computational models (such as a digital twin) for the analysis and prediction of complex physical processes. However, fast and accurate inference for processes with large and high dimensional datasets cannot easily be achieved with state-of-the-art methods under noisy real-world conditions. The primary reason is that the inference of model parameters with traditional techniques based on optimisation or sampling often suffers from computational and statistical challenges, resulting in a trade-off between accuracy and deployment time. In this paper, we propose a novel framework for inference of model parameters based on reinforcement learning. The contribution of the paper is twofold: 1) We reformulate the inference problem as a tracking problem with the objective of learning a policy that forces the response of the physics-based model to follow the observations; 2) We propose the constrained Lyapunov-based actor-critic (CLAC) algorithm to enable the robust and accurate inference of physics-based model parameters in real time under noisy real-world conditions. The proposed methodology is demonstrated and evaluated on two model-based diagnostics test cases utilizing two different physics-based models of turbofan engines. The performance of the methodology is compared to that of two alternative approaches: a state update method (unscented Kalman filter) and a supervised end-to-end mapping with deep neural networks. The experimental results demonstrate that the proposed methodology outperforms all other tested methods in terms of speed and robustness, with high inference accuracy.

Submitted 9 June, 2020; v1 submitted 6 June, 2020; originally announced June 2020.

Comments: 18 pages, 10 figures

arXiv:2004.14125 [pdf, ps, other]

doi 10.1103/PhysRevResearch.2.033109

Topologically ordered zigzag nanoribbon: $e/2$ fractional edge charge, spin-charge separation, and ground state degeneracy

Authors: S. -R. Eric Yang, Min-Chul Cha, Hye Jeong Lee, Young Heon Kim

Abstract: We numerically compute the density of states (DOS) of interacting disordered zigzag graphene nanoribbon (ZGNR) having midgap states showing $e/2$ fractional edge charges. The computed Hartree-Fock DOS is linear at the critical disorder strength where the gap vanishes. This implies an $I\mbox{-}V$ curve of $I\propto V^2$. Thus, $I\mbox{-}V$ curve measurement may yield evidence of fractional charges in interacting disordered ZGNR. We show that even a weak disorder potential acts as a singular perturbation on zigzag edge electronic states, producing drastic changes in the energy spectrum. Spin-charge separation and fractional charges play a key role in the reconstruction of edge antiferromagnetism. Our results show that an interacting disordered ZGNR is a topologically ordered Mott-Anderson insulator.

Submitted 22 July, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

Comments: 10 pages, 18 figures, Published version, Phys. Rev. Research 2, 033109 (2020)

arXiv:2004.11434 [pdf, ps, other]

Responsible AI and Its Stakeholders

Authors: Gabriel Lima, Meeyoung Cha

Abstract: Responsible Artificial Intelligence (AI) proposes a framework that holds all stakeholders involved in the development of AI to be responsible for their systems. It, however, fails to accommodate the possibility of holding AI responsible per se, which could close some legal and moral gaps concerning the deployment of autonomous and self-learning systems. We discuss three notions of responsibility (i.e., blameworthiness, accountability, and liability) for all stakeholders, including AI, and suggest the roles of jurisdiction and the general public in this matter.

Submitted 23 April, 2020; originally announced April 2020.

Comments: 4 pages, accepted to the Fair & Responsible AI Workshop at ACM CHI 2020

arXiv:2004.07500 [pdf, ps, other]

doi 10.1007/s00033-021-01485-y

Nonlocal adhesion models for two cancer cell phenotypes in a multidimensional bounded domain

Authors: Jaewook Ahn, Myeongju Chae, Jihoon Lee

Abstract: Cell-cell adhesion is an inherently nonlocal phenomenon. Numerous partial differential equation models with nonlocal term have been recently presented to describe this phenomenon, yet the mathematical properties of nonlocal adhesion model are not well understood. Here we consider a model with two kinds of nonlocal cell-cell adhesion, satisfying no-flux conditions in a multidimensional bounded domain. We show global-in-time well-posedness of the solution to this model and obtain the uniform boundedness of solution.

Submitted 28 January, 2021; v1 submitted 16 April, 2020; originally announced April 2020.

MSC Class: 92C17; 35Q92; 35K51

arXiv:2004.00105 [pdf, other]

"Trust me, I have a Ph.D.": A Propensity Score Analysis on the Halo Effect of Disclosing One's Offline Social Status in Online Communities

Authors: Kunwoo Park, Haewoon Kwak, Hyunho Song, Meeyoung Cha

Abstract: Online communities adopt various reputation schemes to measure content quality. This study analyzes the effect of a new reputation scheme that exposes one's offline social status, such as an education degree, within an online community. We study two Reddit communities that adopted this scheme, whereby posts include tags identifying education status referred to as flairs, and we examine how the "transferred" social status affects the interactions among the users. We computed propensity scores to test whether flairs give ad-hoc authority to the adopters while minimizing the effects of confounding variables such as topics of content. The results show that exposing academic degrees is likely to lead to higher audience votes as well as larger discussion size, compared to the users without the disclosed identities, in a community that covers peer-reviewed scientific articles. In another community with a focus on casual science topics, exposing mere academic degrees did not obtain such benefits. Still, the users with the highest degree (e.g., Ph.D. or M.D.) were likely to receive more feedback from the audience. These findings suggest that reputation schemes that link the offline and online worlds could induce halo effects on feedback behaviors differently depending upon the community culture. We discuss the implications of this research for the design of future reputation mechanisms.

Submitted 31 March, 2020; originally announced April 2020.

Comments: 12 pages, 3 figures, accepted at ICWSM-20 as a full paper

arXiv:2003.11459 [pdf, other]

doi 10.1007/978-3-030-42699-6

BaitWatcher: A lightweight web interface for the detection of incongruent news headlines

Authors: Kunwoo Park, Taegyun Kim, Seunghyun Yoon, Meeyoung Cha, Kyomin Jung

Abstract: In digital environments where substantial amounts of information are shared online, news headlines play essential roles in the selection and diffusion of news articles. Some news articles attract audience attention by showing exaggerated or misleading headlines. This study addresses the \textit{headline incongruity} problem, in which a news headline makes claims that are either unrelated or opposite to the contents of the corresponding article. We present \textit{BaitWatcher}, which is a lightweight web interface that guides readers in estimating the likelihood of incongruence in news articles before clicking on the headlines. BaitWatcher utilizes a hierarchical recurrent encoder that efficiently learns complex textual representations of a news headline and its associated body text. For training the model, we construct a million scale dataset of news articles, which we also release for broader research use. Based on the results of a focus group interview, we discuss the importance of developing an interpretable AI agent for the design of a better interface for mitigating the effects of online misinformation.

Submitted 23 March, 2020; originally announced March 2020.

Comments: 24 pages (single column), 7 figures. This research article is published as a book chapter of \textit{Fake News, Disinformation, and Misinformation in Social Media-Emerging Research Challenges and Opportunities}. Springer, 2020. arXiv admin note: text overlap with arXiv:1811.07066

MSC Class: 68U15

arXiv:2003.06507 [pdf, other]

doi 10.3389/frobt.2021.756242

The Conflict Between People's Urge to Punish AI and Legal Systems

Authors: Gabriel Lima, Meeyoung Cha, Chihyung Jeon, Kyungsin Park

Abstract: Regulating artificial intelligence (AI) has become necessary in light of its deployment in high-risk scenarios. This paper explores the proposal to extend legal personhood to AI and robots, which had not yet been examined through the lens of the general public. We present two studies (N = 3,559) to obtain people's views of electronic legal personhood vis-à-vis existing liability models. Our study reveals people's desire to punish automated agents even though these entities are not recognized any mental state. Furthermore, people did not believe automated agents' punishment would fulfill deterrence nor retribution and were unwilling to grant them legal punishment preconditions, namely physical independence and assets. Collectively, these findings suggest a conflict between the desire to punish automated agents and its perceived impracticability. We conclude by discussing how future design and legal decisions may influence how the public reacts to automated agents' wrongdoings.

Submitted 10 November, 2021; v1 submitted 13 March, 2020; originally announced March 2020.

Comments: Published at Frontiers in AI and Robots - Ethics in Robotics and Artificial Intelligence Section

arXiv:2003.05599 [pdf, other]

Posterior asymptotics in Wasserstein metrics on the real line

Authors: Minwoo Chae, Pierpaolo De Blasi, Stephen G. Walker

Abstract: In this paper, we use the class of Wasserstein metrics to study asymptotic properties of posterior distributions. Our first goal is to provide sufficient conditions for posterior consistency. In addition to the well-known Schwartz's Kullback--Leibler condition on the prior, the true distribution and most probability measures in the support of the prior are required to possess moments up to an order which is determined by the order of the Wasserstein metric. We further investigate convergence rates of the posterior distributions for which we need stronger moment conditions. The required tail conditions are sharp in the sense that the posterior distribution may be inconsistent or contract slowly to the true distribution without these conditions. Our study involves techniques that build on recent advances on Wasserstein convergence of empirical measures. We apply the results to density estimation with a Dirichlet process mixture prior and conduct a simulation study for further illustration.

Submitted 30 June, 2021; v1 submitted 11 March, 2020; originally announced March 2020.

Comments: 43 pages, 4 figures

MSC Class: 62F15; 62G20; 62G07

arXiv:2003.00732 [pdf, other]

Fusing Physics-based and Deep Learning Models for Prognostics

Authors: Manuel Arias Chao, Chetan Kulkarni, Kai Goebel, Olga Fink

Abstract: Physics-based and data-driven models for remaining useful lifetime (RUL) prediction typically suffer from two major challenges that limit their applicability to complex real-world domains: (1) incompleteness of physics-based models and (2) limited representativeness of the training dataset for data-driven models. Combining the advantages of these two directions while overcoming some of their limitations, we propose a novel hybrid framework for fusing the information from physics-based performance models with deep learning algorithms for prognostics of complex safety-critical systems under real-world scenarios. In the proposed framework, we use physics-based performance models to infer unobservable model parameters related to a system's components health solving a calibration problem. These parameters are subsequently combined with sensor readings and used as input to a deep neural network to generate a data-driven prognostics model with physics-augmented features. The performance of the hybrid framework is evaluated on an extensive case study comprising run-to-failure degradation trajectories from a fleet of nine turbofan engines under real flight conditions. The experimental results show that the hybrid framework outperforms purely data-driven approaches by extending the prediction horizon by nearly 127\%. Furthermore, it requires less training data and is less sensitive to the limited representativeness of the dataset compared to purely data-driven approaches.

Submitted 27 October, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

Comments: 18 pages, 8 figures, Under Review

arXiv:2002.12158 [pdf, other]

A Comprehensive Approach to Unsupervised Embedding Learning based on AND Algorithm

Authors: Sungwon Han, Yizhan Xu, Sungwon Park, Meeyoung Cha, Cheng-Te Li

Abstract: Unsupervised embedding learning aims to extract good representation from data without the need for any manual labels, which has been a critical challenge in many supervised learning tasks. This paper proposes a new unsupervised embedding approach, called Super-AND, which extends the current state-of-the-art model. Super-AND has its unique set of losses that can gather similar samples nearby within a low-density space while keeping invariant features intact against data augmentation. Super-AND outperforms all existing approaches and achieves an accuracy of 89.2% on the image classification task for CIFAR-10. We discuss the practical implications of this method in assisting semi-supervised tasks.

Submitted 26 February, 2020; originally announced February 2020.

arXiv:1912.12502 [pdf, other]

Implicit supervision for fault detection and segmentation of emerging fault types with Deep Variational Autoencoders

Authors: Manuel Arias Chao, Bryan T. Adey, Olga Fink

Abstract: Data-driven fault diagnostics of safety-critical systems often faces the challenge of a complete lack of labeled data associated with faulty system conditions (i.e., fault types) at training time. Since an unknown number and nature of fault types can arise during deployment, data-driven fault diagnostics in this scenario is an open-set learning problem. Most of the algorithms for open-set diagnostics are one-class classification and unsupervised algorithms that do not leverage all the available labeled and unlabeled data in the learning algorithm. As a result, their fault detection and segmentation performance (i.e., identifying and separating faults of different types) are sub-optimal. With this work, we propose training a variational autoencoder (VAE) with labeled and unlabeled samples while inducing implicit supervision on the latent representation of the healthy conditions. This, together with a modified sampling process of VAE, creates a compact and informative latent representation that allows good detection and segmentation of unseen fault types using existing one-class and clustering algorithms. We refer to the proposed methodology as "knowledge induced variational autoencoder with adaptive sampling" (KIL-AdaVAE). The fault detection and segmentation capabilities of the proposed methodology are demonstrated in a new simulated case study using the Advanced Geared Turbofan 30000 (AGTF30) dynamical model under real flight conditions. In an extensive comparison, we demonstrate that the proposed method outperforms other learning strategies (supervised learning, supervised learning with embedding and semi-supervised learning) and deep learning algorithms, yielding significant performance improvements on fault detection and fault segmentation.

Submitted 29 September, 2020; v1 submitted 28 December, 2019; originally announced December 2019.

Comments: 22 pages, 9 Figures

arXiv:1912.08197 [pdf, other]

Lightweight and Robust Representation of Economic Scales from Satellite Imagery

Authors: Sungwon Han, Donghyun Ahn, Hyunji Cha, Jeasurk Yang, Sungwon Park, Meeyoung Cha

Abstract: Satellite imagery has long been an attractive data source that provides a wealth of information on human-inhabited areas. While super resolution satellite images are rapidly becoming available, little study has focused on how to extract meaningful information about human habitation patterns and economic scales from such data. We present READ, a new approach for obtaining essential spatial representation for any given district from high-resolution satellite imagery based on deep neural networks. Our method combines transfer learning and embedded statistics to efficiently learn critical spatial characteristics of arbitrary size areas and represent them into a fixed-length vector with minimal information loss. Even with a small set of labels, READ can distinguish subtle differences between rural and urban areas and infer the degree of urbanization. An extensive evaluation demonstrates the model outperforms the state-of-the-art in predicting economic scales, such as population density for South Korea (R^2=0.9617), and shows a high potential use for developing countries where district-level economic scales are not known.

Submitted 18 December, 2019; originally announced December 2019.

Comments: Accepted for oral presentation at AAAI 2020

arXiv:1911.04835 [pdf]

doi 10.1111/cpr.12760

Active interfacial dynamic transport of fluid in fibrous connective tissues and a hypothesis of interstitial fluid circulatory system

Authors: Li Hongyi, Yin Yajun, Yang Chongqing, Chen Min, Wang Fang, Ma Chao, Li Hua, Kong Yiya, Ji Fusui, Hu Jun

Abstract: Fluid in interstitial spaces accounts for ~20% of an adult body weight. Does it circulate around the body like vascular circulations besides a diffusive and short-ranged transport? This bold conjecture has been debated for decades. As a conventional physiological concept, interstitial space was the space between cells and a micron-sized space. Fluid in interstitial spaces is thought to be entrapped within interstitial matrix. However, our serial data have further defined an interfacial transport zone on a solid fiber of interstitial matrix. Within this fine space that is probably nanosized, fluid can transport along a fiber under a driving power. Since 2006, our imaging data from volunteers and cadavers have revealed a long-distance extravascular pathway for interstitial fluid flow, comprising four types of anatomic distributions at least. The framework of each extravascular pathway contains the longitudinally assembled and oriented fibers, working as a fibrous guiderail for fluid flow. Interestingly, our data showed that the movement of fluid in a fibrous pathway is in response to a dynamic driving source and named as dynamotaxis. By analysis of some representative studies and our experimental results, a hypothesis of interstitial fluid circulatory system is proposed.

Submitted 25 November, 2019; v1 submitted 12 November, 2019; originally announced November 2019.

Comments: 15 pages, 2 figures, 18 conferences

Journal ref: Cell Proliferation. 2020;00:e12760

arXiv:1909.05417 [pdf, other]

Deep User Identification Model with Multiple Biometrics

Authors: Hyoung-Kyu Song, Ebrahim AlAlkeem, Jaewoong Yun, Tae-Ho Kim, Tae-Ho Kim, Hyerin Yoo, Dasom Heo, Chan Yeob Yeun, Myungsu Chae

Abstract: Identification using biometrics is an important yet challenging task. Abundant research has been conducted on identifying personal identity or gender using given signals. Various types of biometrics such as electrocardiogram (ECG), electroencephalogram (EEG), face, fingerprint, and voice have been used for these tasks. Most research has only focused on single modality or a single task, while the combination of input modality or tasks is yet to be investigated. In this paper, we propose deep identification and gender classification using multimodal biometrics. Our model uses ECG, fingerprint, and facial data. It then performs two tasks: gender identification and classification. By engaging multi-modality, a single model can handle various input domains without training each modality independently, and the correlation between domains can increase its generalization performance on the tasks.

Submitted 3 September, 2019; originally announced September 2019.

Comments: Accepted, CIKM 2019 Workshop on DTMBio

arXiv:1908.01529 [pdf, other]

Hybrid deep fault detection and isolation: Combining deep neural networks and system performance models

Authors: Manuel Arias Chao, Chetan Kulkarni, Kai Goebel, Olga Fink

Abstract: With the increased availability of condition monitoring data and the increased complexity of explicit system physics-based models, the application of data-driven approaches for fault detection and isolation has recently grown. While detection accuracy of such approaches is generally good, their performance on fault isolation often suffers from the fact that fault conditions affect a large portion of the measured signals thereby masking the fault source. To overcome this limitation and enable a more accurate fault detection, we propose a hybrid approach combining physical performance models with deep learning algorithms. Unobserved process variables are inferred with a physics-based performance model to enhance the input space of a data-driven diagnostics model. To validate the effectiveness of the proposed method, we generate a condition monitoring dataset of an advanced gas turbine during flight conditions under healthy and four faulty operative conditions based on the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) dynamical model. We evaluate the performance of the proposed method in combination with two different deep learning algorithms: feed forward neural networks and Variational Autoencoders, both of which demonstrate a significant improvement when applied within the hybrid fault detection and diagnostics framework. The proposed method is able to outperform pure data-driven solutions, particularly for systems with a high variability of operating conditions. It provides superior results both for fault detection as well as for fault isolation. For fault isolation, it overcomes the smearing effect that is observed in pure data-driven approaches and enables a precise isolation of the affected signal. We also demonstrate that deep learning algorithms provide a better performance on fault detection compared to the traditional machine learning algorithms.

Submitted 28 December, 2019; v1 submitted 5 August, 2019; originally announced August 2019.

Comments: 25 pages, 19 figures, submitted to the International Journal of Prognostics and Health Management (IJPHM), July 2019

arXiv:1907.12176 [pdf, other]

End-to-End Learning Deep CRF models for Multi-Object Tracking

Authors: Jun Xiang, Ma Chao, Guohan Xu, Jianhua Hou

Abstract: Existing deep multi-object tracking (MOT) approaches first learn a deep representation to describe target objects and then associate detection results by optimizing a linear assignment problem. Despite demonstrated successes, it is challenging to discriminate target objects under mutual occlusion or to reduce identity switches in crowded scenes. In this paper, we propose learning deep conditional random field (CRF) networks, aiming to model the assignment costs as unary potentials and the long-term dependencies among detection results as pairwise potentials. Specifically, we use a bidirectional long short-term memory (LSTM) network to encode the long-term dependencies. We pose the CRF inference as a recurrent neural network learning process using the standard gradient descent algorithm, where unary and pairwise potentials are jointly optimized in an end-to-end manner. Extensive experimental results on the challenging MOT datasets including MOT-2015 and MOT-2016, demonstrate that our approach achieves the state of the art performances in comparison with published works on both benchmarks.

Submitted 28 July, 2019; originally announced July 2019.

arXiv:1904.06940 [pdf, ps, other]

Global Well-posedness and Long Time Behaviors of Chemotaxis-Fluid System Modeling Coral Fertilization

Authors: Myeongju Chae, Kyungkeun Kang, Jihoon Lee

Abstract: We consider generalized models on coral broadcast spawning phenomena involving diffusion, advection, chemotaxis, and reactions when egg and sperm densities are different. We prove the global-in-time existence of the regular solutions of the models as well as their temporal decays in two and three dimensions. We also show that the total masses of egg and sperm density have positive lower bounds as time tends to infinity in three dimensions.

Submitted 15 April, 2019; originally announced April 2019.

Comments: 27 pages

MSC Class: 35Q30; 35K57; 76Dxx; 76Bxx

arXiv:1904.00603 [pdf, other]

doi 10.1103/PhysRevC.99.035805

New $γ$-ray Transitions Observed in $^{19}$Ne with Implications for the $^{15}$O($α$,$γ$)$^{19}$Ne Reaction Rate

Authors: M. R. Hall, D. W. Bardayan, T. Baugher, A. Lepailleur, S. D. Pain, A. Ratkiewicz, S. Ahn, J. M. Allen, J. T. Anderson, A. D. Ayangeakaa, J. C. Blackmon, S. Burcher, M. P. Carpenter, S. M. Cha, K. Y. Chae, K. A. Chipps, J. A. Cizewski, M. Febbraro, O. Hall, J. Hu, C. L. Jiang, K. L. Jones, E. J. Lee, P. D. O'Malley, S. Ota , et al. (12 additional authors not shown)

Abstract: The $^{15}$O($α$,$γ$)$^{19}$Ne reaction is responsible for breakout from the hot CNO cycle in Type I x-ray bursts. Understanding the properties of resonances between $E_x = 4$ and 5 MeV in $^{19}$Ne is crucial in the calculation of this reaction rate. The spins and parities of these states are well known, with the exception of the 4.14- and 4.20-MeV states, which have adopted spin-parities of 9/2$^-$ and 7/2$^-$, respectively. Gamma-ray transitions from these states were studied using triton-$γ$-$γ$ coincidences from the $^{19}$F($^{3}$He,$tγ$)$^{19}$Ne reaction measured with GODDESS (Gammasphere ORRUBA Dual Detectors for Experimental Structure Studies) at Argonne National Laboratory. The observed transitions from the 4.14- and 4.20-MeV states provide strong evidence that the $J^π$ values are actually 7/2$^-$ and 9/2$^-$, respectively. These assignments are consistent with the values in the $^{19}$F mirror nucleus and in contrast to previously accepted assignments.

Submitted 1 April, 2019; originally announced April 2019.

Journal ref: Phys. Rev. C 99, 035805, 2019

arXiv:1903.04372 [pdf, other]

Nonlinear stability of planar traveling waves in a chemotaxis model of tumor angiogenesis with chemical diffusion

Authors: Myeongju Chae, Kyudong Choi

Abstract: We consider a simplified chemotaxis model of tumor angiogenesis, described by a Keller-Segel system on the two dimensional infinite cylindrical domain $(x, y) \in \mathbb{R} \times {\mathbf S^λ}$, where $ \mathbf S^λ$ is the circle of perimeter $λ>0$. The domain models a virtual channel where newly generated blood vessels toward the vascular endothelial growth factor will be located. The system is known to allow planar traveling wave solutions of an invading type. In this paper, we establish the nonlinear stability of these traveling invading waves when chemical diffusion is present if $λ$ is sufficiently small. The same result for the corresponding system in one dimension was obtained by Li-Li-Wang (2014) [16]. Our result solves the problem that remained open in [3], where only linear stability of the waves was obtained under a certain artificial assumption.

Submitted 11 March, 2019; originally announced March 2019.

Comments: 38 pages, 1 figure

MSC Class: 92B05; 35K45

arXiv:1902.04224 [pdf, other]

Effective Network Compression Using Simulation-Guided Iterative Pruning

Authors: Dae-Woong Jeong, Jaehun Kim, Youngseok Kim, Tae-Ho Kim, Myungsu Chae

Abstract: Existing high-performance deep learning models require very intensive computing. For this reason, it is difficult to embed a deep learning model into a system with limited resources. In this paper, we propose the novel idea of the network compression as a method to solve this limitation. The principle of this idea is to make iterative pruning more effective and sophisticated by simulating the reduced network. A simple experiment was conducted to evaluate the method; the results showed that the proposed method achieved higher performance than existing methods at the same pruning level.

Submitted 11 February, 2019; originally announced February 2019.

Comments: Submitted to NIPS 2018 MLPCD2

MSC Class: 68T05

arXiv:1902.00106 [pdf, other]

doi 10.1103/PhysRevLett.122.052701

Key $^{19}$Ne states identified affecting $γ$-ray emission from $^{18}$F in novae

Authors: M. R. Hall, D. W. Bardayan, T. Baugher, A. Lepailleur, S. D. Pain, A. Ratkiewicz, S. Ahn, J. M. Allen, J. T. Anderson, A. D. Ayangeakaa, J. C. Blackmon, S. Burcher, M. P. Carpenter, S. M. Cha, K. Y. Chae, K. A. Chipps, J. A. Cizewski, M. Febbraro, O. Hall, J. Hu, C. L. Jiang, K. L. Jones, E. J. Lee, P. D. O'Malley, S. Ota , et al. (12 additional authors not shown)

Abstract: Detection of nuclear-decay $γ$ rays provides a sensitive thermometer of nova nucleosynthesis. The most intense $γ$-ray flux is thought to be annihilation radiation from the $β^+$ decay of $^{18}$F, which is destroyed prior to decay by the $^{18}$F($p$,$α$)$^{15}$O reaction. Estimates of $^{18}$F production had been uncertain, however, because key near-threshold levels in the compound nucleus, $^{19}$Ne, had yet to be identified. This Letter reports the first measurement of the $^{19}$F($^{3}$He,$tγ$)$^{19}$Ne reaction, in which the placement of two long-sought 3/2$^+$ levels is suggested via triton-$γ$-$γ$ coincidences. The precise determination of their resonance energies reduces the upper limit of the rate by a factor of $1.5-17$ at nova temperatures and reduces the average uncertainty on the nova detection probability by a factor of 2.1.

Submitted 31 January, 2019; originally announced February 2019.

Comments: 6 pages, 4 figures

Journal ref: Phys. Rev. Lett. 122, 052701, 2019

arXiv:1812.08997 [pdf, other]

Stochastic Doubly Robust Gradient

Authors: Kanghoon Lee, Jihye Choi, Moonsu Cha, Jung-Kwon Lee, Taeyoon Kim

Abstract: When training a machine learning model with observational data, it is often encountered that some values are systematically missing. Learning from the incomplete data in which the missingness depends on some covariates may lead to biased estimation of parameters and even harm the fairness of decision outcome. This paper proposes how to adjust the causal effect of covariates on the missingness when training models using stochastic gradient descent (SGD). Inspired by the design of doubly robust estimator and its theoretical property of double robustness, we introduce stochastic doubly robust gradient (SDRG) consisting of two models: weight-corrected gradients for inverse propensity score weighting and per-covariate control variates for regression adjustment. Also, we identify the connection between double robustness and variance reduction in SGD by demonstrating the SDRG algorithm with a unifying framework for variance reduced SGD. The performance of our approach is empirically tested by showing the convergence in training image classifiers with several examples of missing data.

Submitted 21 December, 2018; originally announced December 2018.

Comments: 9 pages, 2 figures

arXiv:1812.05083 [pdf, other]

Adversarial Learning of Semantic Relevance in Text to Image Synthesis

Authors: Miriam Cha, Youngjune L. Gwon, H. T. Kung

Abstract: We describe a new approach that improves the training of generative adversarial nets (GANs) for synthesizing diverse images from a text input. Our approach is based on the conditional version of GANs and expands on previous work leveraging an auxiliary task in the discriminator. Our generated images are not limited to certain classes and do not suffer from mode collapse while semantically matching the text input. A key to our training methods is how to form positive and negative training examples with respect to the class label of a given image. Instead of selecting random training examples, we perform negative sampling based on the semantic distance from a positive example in the class. We evaluate our approach using the Oxford-102 flower dataset, adopting the inception score and multi-scale structural similarity index (MS-SSIM) metrics to assess discriminability and diversity of the generated images. The empirical results indicate greater diversity in the generated images, especially when we gradually select more negative training examples closer to a positive example in the semantic space.

Submitted 5 February, 2019; v1 submitted 12 December, 2018; originally announced December 2018.

arXiv:1812.02853 [pdf, other]

doi 10.1088/1361-648X/ab146b

Soliton fractional charge of disordered graphene nanoribbon

Authors: Y. H. Jeong, S. -R. Eric Yang, M. -C. Cha

Abstract: We investigate the properties of the gap-edge states of half-filled interacting disordered zigzag graphene nanoribbons. We find that the midgap states can display the quantized fractional charge of 1/2. These gap-edge states can be represented by topological kinks with their site probability distribution divided between the opposite zigzag edges with different chiralities. In addition, there are numerous spin-split gap-edge states, similar to those in a Mott-Anderson insulator.

Submitted 24 April, 2019; v1 submitted 6 December, 2018; originally announced December 2018.

Comments: 11 pages, 19 figures; J. Phys.: Condens. Matter 31 (2019) 265601; estimate of quantum charge fluctuations is included; more references are added

Journal ref: J. Phys.: Condens. Matter 31 265601 (2019)

arXiv:1811.07066 [pdf, other]

Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder

Authors: Seunghyun Yoon, Kunwoo Park, Joongbo Shin, Hongjun Lim, Seungpil Won, Meeyoung Cha, Kyomin Jung

Abstract: Some news headlines mislead readers with overrated or false information, and identifying them in advance will better assist readers in choosing proper news stories to consume. This research introduces a million-scale dataset of news headline and body text pairs with incongruity labels, which can uniquely be utilized for detecting news stories with misleading headlines. On this dataset, we develop two neural networks with hierarchical architectures that model a complex textual representation of news articles and measure the incongruity between the headline and the body text. We also present a data augmentation method that dramatically reduces the text input size a model handles by independently investigating each paragraph of news stories, which further boosts the performance. Our experiments and qualitative evaluations demonstrate that the proposed methods outperform existing approaches and efficiently detect news stories with misleading headlines in the real world.

Submitted 7 February, 2019; v1 submitted 16 November, 2018; originally announced November 2018.

Comments: 10 pages, Accepted as a conference paper at AAAI 2019

arXiv:1811.03344 [pdf, other]

doi 10.1103/PhysRevB.98.235161

Finite entanglement properties in the matrix product states of the one-dimensional Hubbard model

Authors: Min-Chul Cha

Abstract: We study the effects due to limited entanglement in the one-dimensional Hubbard model by representing the ground states in the form of the matrix product states. Finite-entanglement scaling behavior over a wide range is observed at half-filling. The critical exponents characterizing the length scale in terms of the size of matrices used are obtained, confirming the theoretical prediction that the values of the exponents are solely determined by the central charge. The entanglement spectrum shows that a global double degeneracy occurs in the ground states with a charge gap. We also find that the Mott transition, tuned by changing the chemical potential, always occurs through a first-order transition and the metallic phase has a few conducting states, including the states with the mean-field nature close to the critical point, as expected in variational matrix product states with a finite amount of entanglement.

Submitted 8 November, 2018; originally announced November 2018.

Comments: 8 pages, 10 figures

Journal ref: Phys. Rev. B 98, 235161 (2018)

arXiv:1809.00758 [pdf]

End-to-end Multimodal Emotion and Gender Recognition with Dynamic Joint Loss Weights

Authors: Myungsu Chae, Tae-Ho Kim, Young Hoon Shin, June-Woo Kim, Soo-Young Lee

Abstract: Multi-task learning is a method for improving the generalizability of multiple tasks. In order to perform multiple classification tasks with one neural network model, the losses of each task should be combined. Previous studies have mostly focused on multiple prediction tasks using joint loss with static weights for training models, choosing the weights between tasks without making sufficient considerations by setting them uniformly or empirically. In this study, we propose a method to calculate joint loss using dynamic weights to improve the total performance, instead of the individual performance, of tasks. We apply this method to design an end-to-end multimodal emotion and gender recognition model using audio and video data. This approach provides proper weights for the loss of each task when the training process ends. In our experiments, emotion and gender recognition with the proposed method yielded a lower joint loss, which is computed as the negative log-likelihood, than using static weights for joint loss. Moreover, our proposed model has better generalizability than other models. To the best of our knowledge, this research is the first to demonstrate the strength of using dynamic weights for joint loss for maximizing overall performance in emotion and gender recognition tasks.

Submitted 2 October, 2018; v1 submitted 3 September, 2018; originally announced September 2018.

Comments: IROS 2018 Workshop on Crossmodal Learning for Intelligent Robotics

MSC Class: 68T05

arXiv:1806.06927 [pdf, other]

Auto-Meta: Automated Gradient Based Meta Learner Search

Authors: Jaehong Kim, Sangyeul Lee, Sungwan Kim, Moonsu Cha, Jung Kwon Lee, Youngduck Choi, Yongseok Choi, Dong-Yeon Cho, Jiwon Kim

Abstract: Fully automating machine learning pipelines is one of the key challenges of current artificial intelligence research, since practical machine learning often requires costly and time-consuming human-powered processes such as model design, algorithm development, and hyperparameter tuning. In this paper, we verify that automated architecture search synergizes with the effect of gradient-based meta learning. We adopt the progressive neural architecture search \cite{liu:pnas_google:DBLP:journals/corr/abs-1712-00559} to find optimal architectures for meta-learners. The gradient based meta-learner whose architecture was automatically found achieved state-of-the-art results on the 5-shot 5-way Mini-ImageNet classification problem with $74.65\%$ accuracy, which is $11.54\%$ improvement over the result obtained by the first gradient-based meta-learner called MAML \cite{finn:maml:DBLP:conf/icml/FinnAL17}. To our best knowledge, this work is the first successful neural architecture search implementation in the context of meta learning.

Submitted 10 December, 2018; v1 submitted 11 June, 2018; originally announced June 2018.

Comments: Presented at NIPS 2018 Workshop on Meta-Learning (MetaLearn 2018)

Showing 51–100 of 180 results for author: Chae, M