Search | arXiv e-print repository

arXiv:2407.00087 [pdf, other]

ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback

Authors: Ju-Seung Byun, Jiyun Chun, Jihyung Kil, Andrew Perrault

Abstract: Large Multimodal Models (LMMs) excel at comprehending human instructions and demonstrate remarkable results across a broad spectrum of tasks. Reinforcement Learning from Human Feedback (RLHF) and AI Feedback (RLAIF) further refine LLMs by aligning them with specific preferences. These methods primarily use ranking-based feedback for entire generations. With advanced AI models (Teacher), such as GPT-4 and Claude 3 Opus, we can request various types of detailed feedback that are expensive for humans to provide. We propose a two-stage algorithm, ARES, that Alternates REinforcement Learning (RL) and Supervised Fine-Tuning (SFT). First, we request the Teacher to score how much each sentence contributes to solving the problem in a Chain-of-Thought (CoT). This sentence-level feedback allows us to consider individual valuable segments, providing more granular rewards for the RL procedure. Second, we ask the Teacher to correct the wrong reasoning after the RL stage. The RL procedure requires massive effort for hyperparameter tuning and often generates errors like repetitive words and incomplete sentences. With the correction feedback, we stabilize the RL fine-tuned model through SFT. We conduct experiments on the multi-modal datasets ScienceQA and A-OKVQA to demonstrate the effectiveness of our proposal. ARES rationale reasoning achieves a win rate of around 70% against baseline models, as judged by GPT-4o. Additionally, we observe that the improved rationale reasoning leads to a 2.5% increase in inference answer accuracy on average for the multi-modal datasets.

Submitted 25 June, 2024; originally announced July 2024.

arXiv:2406.09188 [pdf, ps, other]

Reducing Task Discrepancy of Text Encoders for Zero-Shot Composed Image Retrieval

Authors: Jaeseok Byun, Seokhyeon Jeong, Wonjae Kim, Sanghyuk Chun, Taesup Moon

Abstract: Composed Image Retrieval (CIR) aims to retrieve a target image based on a reference image and conditioning text, enabling controllable searches. Due to the expensive dataset construction cost for CIR triplets, a zero-shot (ZS) CIR setting has been actively studied to eliminate the need for human-collected triplet datasets. The mainstream of ZS-CIR employs an efficient projection module that projects a CLIP image embedding to the CLIP text token embedding space, while fixing the CLIP encoders. Using the projected image embedding, these methods generate image-text composed features by using the pre-trained text encoder. However, their CLIP image and text encoders suffer from the task discrepancy between the pre-training task (text $\leftrightarrow$ image) and the target CIR task (image + text $\leftrightarrow$ image). Conceptually, we need expensive triplet samples to reduce the discrepancy, but we use cheap text triplets instead and update the text encoder. To that end, we introduce the Reducing Task Discrepancy of text encoders for Composed Image Retrieval (RTD), a plug-and-play training scheme for the text encoder that enhances its capability using a novel target-anchored text contrastive learning. We also propose two additional techniques to improve the proposed learning scheme: a hard negatives-based refined batch sampling strategy and a sophisticated concatenation scheme. Integrating RTD into the state-of-the-art projection-based ZS-CIR methods significantly improves performance across various datasets and backbones, demonstrating its efficiency and generalizability.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 17 pages

arXiv:2405.20042 [pdf, other]

CycleFormer : TSP Solver Based on Language Modeling

Authors: Jieun Yook, Junpyo Seo, Joon Huh, Han Joon Byun, Byung-Ro Moon

Abstract: We propose a new transformer model for the Traveling Salesman Problem (TSP) called CycleFormer. We identified distinctive characteristics that need to be considered when applying a conventional transformer model to TSP and aimed to fully incorporate these elements into the TSP-specific transformer. Unlike the token sets in typical language models, which are limited and static, the token (node) set in TSP is unlimited and dynamic. To exploit this fact to the fullest, we equated the encoder output with the decoder linear layer and directly connected the context vector of the encoder to the decoder encoding. Additionally, we added a positional encoding to the encoder tokens that reflects the two-dimensional nature of TSP, and devised a circular positional encoding for the decoder tokens that considers the cyclic properties of a tour. By incorporating these ideas, CycleFormer outperforms state-of-the-art (SOTA) transformer models for TSP from TSP-50 to TSP-500. Notably, on TSP-500, the optimality gap was reduced by approximately 2.8 times, from 3.09% to 1.10%, compared to the existing SOTA. The code will be made available at https://github.com/Giventicket/CycleFormer.

Submitted 31 May, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.17618 [pdf, other]

Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales

Authors: Ju-Seung Byun, Andrew Perrault

Abstract: Reinforcement learning (RL) training is inherently unstable due to factors such as moving targets and high gradient variance. Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from AI Feedback (RLAIF) can introduce additional difficulty. Differing preferences can complicate the alignment process, and prediction errors in a trained reward model can become more severe as the LLM generates unseen outputs. To enhance training robustness, RL has adopted techniques from supervised learning, such as ensembles and layer normalization. In this work, we improve the stability of RL training by adapting the reverse cross entropy (RCE) from supervised learning for noisy data to define a symmetric RL loss. We demonstrate performance improvements across various tasks and scales. We conduct experiments in discrete action tasks (Atari games) and continuous action space tasks (MuJoCo benchmark and Box2D) using Symmetric A2C (SA2C) and Symmetric PPO (SPPO), with and without added noise, with especially notable performance in SPPO across different hyperparameters. Furthermore, we validate the benefits of the symmetric RL loss when using SPPO for large language models through improved performance in RLHF tasks, such as IMDB positive sentiment and TL;DR summarization tasks.

Submitted 29 May, 2024; v1 submitted 27 May, 2024; originally announced May 2024.
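
The reverse cross entropy mentioned in the abstract above can be illustrated in its original supervised-learning form. The sketch below is not the paper's RL loss: how RCE is adapted to A2C and PPO is not described in the abstract, and the weights `alpha` and `beta` and the constant `log_zero` (used in place of log 0 for one-hot targets) are illustrative assumptions.

```python
import math


def cross_entropy(pred, label):
    """Standard CE for a one-hot target: -log of the probability of the true class."""
    return -math.log(pred[label])


def reverse_cross_entropy(pred, label, log_zero=-4.0):
    """RCE swaps the roles of prediction and target: -sum_k pred[k] * log(target[k]).

    For a one-hot target, log(target[k]) is 0 for the true class and undefined
    elsewhere, so log(0) is replaced by the finite constant `log_zero` (assumed value).
    """
    return -sum(p * (0.0 if k == label else log_zero) for k, p in enumerate(pred))


def symmetric_loss(pred, label, alpha=1.0, beta=1.0, log_zero=-4.0):
    """Weighted sum of CE and RCE; alpha and beta are hypothetical weights."""
    return (alpha * cross_entropy(pred, label)
            + beta * reverse_cross_entropy(pred, label, log_zero))


if __name__ == "__main__":
    probs = [0.7, 0.2, 0.1]
    print(cross_entropy(probs, 0), reverse_cross_entropy(probs, 0), symmetric_loss(probs, 0))
```

For a one-hot target the RCE term reduces to `-log_zero * (1 - pred[label])`, so it is bounded and grows only linearly as the true-class probability drops, which is one reason it is used when labels are noisy.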

arXiv:2405.14632 [pdf, other]

Reinforcement Learning for Fine-tuning Text-to-speech Diffusion Models

Authors: Jingyi Chen, Ju-Seung Byun, Micha Elsner, Andrew Perrault

Abstract: Recent advancements in generative models have sparked significant interest within the machine learning community. Particularly, diffusion models have demonstrated remarkable capabilities in synthesizing images and speech. Studies such as those by Lee et al. [19], Black et al. [4], Wang et al. [36], and Fan et al. [8] illustrate that Reinforcement Learning with Human Feedback (RLHF) can enhance diffusion models for image synthesis. However, due to architectural differences between these models and those employed in speech synthesis, it remains uncertain whether RLHF could similarly benefit speech synthesis models. In this paper, we explore the practical application of RLHF to diffusion-based text-to-speech synthesis, leveraging the mean opinion score (MOS) as predicted by the UTokyo-SaruLab MOS prediction system [29] as a proxy loss. We introduce diffusion model loss-guided RL policy optimization (DLPO) and compare it against other RLHF approaches, employing the NISQA speech quality and naturalness assessment model [21] and human preference experiments for further evaluation. Our results show that RLHF can enhance diffusion-based text-to-speech synthesis models, and, moreover, DLPO can better improve diffusion models in generating natural and high-quality speech audio.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.01361 [pdf, other]

Haptic-Based Bilateral Teleoperation of Aerial Manipulator for Extracting Wedged Object with Compensation of Human Reaction Time

Authors: Jeonghyun Byun, Dohyun Eom, H. Jin Kim

Abstract: Bilateral teleoperation of an aerial manipulator facilitates the execution of industrial missions thanks to the combination of the aerial platform's maneuverability and the ability to conduct complex tasks with human supervision. Heretofore, research on such operations has focused on flying without any physical interaction or exerting a pushing force on a contact surface that does not involve abrupt changes in the interaction force. In this paper, we propose a human reaction time compensating haptic-based bilateral teleoperation strategy for an aerial manipulator extracting a wedged object from a static structure (i.e., plug-pulling), which incurs an abrupt decrease in the interaction force and causes additional difficulty for an aerial platform. A haptic device composed of a 4-degree-of-freedom robotic arm and a gripper is made for the teleoperation of aerial wedged object-extracting tasks, and a haptic-based teleoperation method to execute the aerial manipulator by the haptic device is introduced. We detect the extraction of the object by the estimation of the external force exerted on the aerial manipulator and generate reference trajectories for both the aerial manipulator and the haptic device after the extraction. As an example of the extraction of a wedged object, we conduct comparative plug-pulling experiments with a quadrotor-based aerial manipulator. The results validate that the proposed bilateral teleoperation method reduces the overshoot in the aerial manipulator's position and ensures fast recovery to its initial position after extracting the wedged object.

Submitted 2 May, 2024; originally announced May 2024.

Comments: to be presented in 2024 IEEE International Conference on Unmanned Aircraft Systems (ICUAS), Chania, Crete, Greece, 2024

arXiv:2404.11310 [pdf, other]

Autonomous aerial perching and unperching using omnidirectional tiltrotor and switching controller

Authors: Dongjae Lee, Sunwoo Hwang, Jeonghyun Byun, Seung Jae Lee, H. Jin Kim

Abstract: Aerial unperching of multirotors has received little attention compared with perching, which has been investigated as a way to extend operation time. This study presents a new aerial robot capable of both perching and unperching autonomously on/from a ferromagnetic surface during flight, and a switching controller to avoid rotor saturation and mitigate overshoot during transition between free-flight and perching. To enable stable perching and unperching maneuvers on/from a vertical surface, a lightweight ($\approx 1$ kg), fully actuated tiltrotor that can hover at $90^\circ$ pitch angle is first developed. We design a perching/unperching module composed of a single servomotor and a magnet, which is then mounted on the tiltrotor. A switching controller including exclusive control modes for transitions between free-flight and perching is proposed. Lastly, we propose a simple yet effective strategy to ensure robust perching in the presence of measurement and control errors and avoid collisions with the perching site immediately after unperching. We validate the proposed framework in experiments where the tiltrotor successfully performs perching and unperching on/from a vertical surface during flight. We further show the effectiveness of the proposed transition mode in the switching controller by ablation studies where large overshoot and even collision with a perching site occur. To the best of the authors' knowledge, this work presents the first autonomous aerial unperching framework using a fully actuated tiltrotor.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 7 pages, 10 figures, 2024 IEEE International Conference on Robotics and Automation (ICRA) accepted

arXiv:2401.16396 [pdf, other]

Ovarian Cancer Diagnostics using Wavelet Packet Scaling Descriptors

Authors: Raymond J. Hinton Jr., Jihyun Byun, Dixon Vimalajeewa, Brani Vidakovic

Abstract: Detecting early-stage ovarian cancer accurately and efficiently is crucial for timely treatment. Various methods for early diagnosis have been explored, including a focus on features derived from protein mass spectra, but these tend to overlook the complex interplay across protein expression levels. We propose an innovative method to automate the search for diagnostic features in these spectra by analyzing their inherent scaling characteristics. We compare two techniques for estimating the self-similarity in a signal using the scaling behavior of its wavelet packet decomposition. The methods are applied to the mass spectra using a rolling window approach, yielding a collection of self-similarity indexes that capture protein interactions, potentially indicative of ovarian cancer. Then, the most discriminatory scaling descriptors from this collection are selected for use in classification algorithms. To assess their effectiveness for early diagnosis of ovarian cancer, the techniques are applied to two datasets from the American National Cancer Institute. Comparative evaluation against an existing wavelet-based method shows that one wavelet packet-based technique led to improved diagnostic performance for one of the analyzed datasets (95.67% vs. 96.78% test accuracy, respectively). This highlights the potential of wavelet packet-based methods to capture novel diagnostic information related to ovarian cancer. This innovative approach offers promise for better early detection and improved patient outcomes in ovarian cancer.

Submitted 29 January, 2024; originally announced January 2024.

Comments: 18 pages, 8 figures

arXiv:2312.06112 [pdf, other]

MAFA: Managing False Negatives for Vision-Language Pre-training

Authors: Jaeseok Byun, Dohoon Kim, Taesup Moon

Abstract: We consider a critical issue of false negatives in Vision-Language Pre-training (VLP), a challenge that arises from the inherent many-to-many correspondence of image-text pairs in large-scale web-crawled datasets. The presence of false negatives can impede achieving optimal performance and even lead to a significant performance drop. To address this challenge, we propose MAFA (MAnaging FAlse negatives), which consists of two pivotal components building upon the recently developed GRouped mIni-baTch sampling (GRIT) strategy: 1) an efficient connection mining process that identifies and converts false negatives into positives, and 2) label smoothing for the image-text contrastive (ITC) loss. Our comprehensive experiments verify the effectiveness of MAFA across multiple downstream tasks, emphasizing the crucial role of addressing false negatives in VLP, potentially even surpassing the importance of addressing false positives. In addition, the compatibility of MAFA with the recent BLIP-family model is also demonstrated. Code is available at https://github.com/jaeseokbyun/MAFA.

Submitted 12 June, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

Comments: CVPR 2024 camera ready version
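
The second MAFA component, label smoothing for the image-text contrastive (ITC) loss, can be sketched generically as follows. This is a minimal illustration of label smoothing applied to a contrastive cross-entropy, not the authors' implementation: the function names, the smoothing value `eps`, the uniform spread of the smoothed mass over all texts in the batch, and the restriction to the image-to-text direction are all assumptions.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one row of similarity scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]


def smoothed_itc_loss(sim, eps=0.1):
    """Image-to-text contrastive loss with label smoothing.

    sim[i][j] is the (temperature-scaled) similarity between image i and text j;
    the matched pair sits on the diagonal. Instead of a one-hot target, the
    positive gets weight (1 - eps) + eps / n and every other text gets eps / n,
    so the other texts in the batch are not treated as fully negative.
    """
    n = len(sim)
    total = 0.0
    for i, row in enumerate(sim):
        logp = log_softmax(row)
        for j in range(n):
            target = eps / n + ((1.0 - eps) if i == j else 0.0)
            total -= target * logp[j]
    return total / n


if __name__ == "__main__":
    sim = [[5.0, 1.0, 0.5],
           [0.8, 4.0, 3.9],   # text 2 is nearly as similar as the labelled positive
           [0.2, 0.7, 6.0]]
    print(smoothed_itc_loss(sim, eps=0.0), smoothed_itc_loss(sim, eps=0.1))
```

With `eps=0.0` the function reduces to the ordinary one-hot contrastive cross-entropy; a symmetric text-to-image term would apply the same computation to the transposed similarity matrix.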

arXiv:2310.01405 [pdf, other]

Representation Engineering: A Top-Down Approach to AI Transparency

Authors: Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks

Abstract: In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience. RepE places population-level representations, rather than neurons or circuits, at the center of analysis, equipping us with novel methods for monitoring and manipulating high-level cognitive phenomena in deep neural networks (DNNs). We provide baselines and an initial analysis of RepE techniques, showing that they offer simple yet effective solutions for improving our understanding and control of large language models. We showcase how these methods can provide traction on a wide range of safety-relevant problems, including honesty, harmlessness, power-seeking, and more, demonstrating the promise of top-down transparency research. We hope that this work catalyzes further exploration of RepE and fosters advancements in the transparency and safety of AI systems.

Submitted 10 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: Code is available at https://github.com/andyzoujm/representation-engineering

arXiv:2309.16007 [pdf, ps, other]

Sums of Powers of Primes in Arithmetic Progression

Authors: Muhammet Boran, John Byun, Zhangze Li, Steven J. Miller, Stephanie Reyes

Abstract: Gerard and Washington proved that, for $k > -1$, the number of primes less than $x^{k+1}$ can be well approximated by summing the $k$-th powers of all primes up to $x$. We extend this result to primes in arithmetic progressions: we prove that the number of primes $p\equiv n \pmod m$ less than $x^{k+1}$ is asymptotic to the sum of $k$-th powers of all primes $p\equiv n \pmod m$ up to $x$. We prove that the prime power sum approximation tends to be an underestimate for positive $k$ and an overestimate for negative $k$, and quantify for different values of $k$ how well the approximation works for $x$ between $10^4$ and $10^8$.

Submitted 27 September, 2023; originally announced September 2023.

Comments: 19 pages, 16 tables

MSC Class: (Primary) 11N13; (Secondary) 11N05

Journal ref: The PUMP Journal of Undergraduate Research (2024), Volume 7, 29-50
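
The approximation described in the abstract above, $\pi(x^{k+1}) \approx \sum_{p \le x} p^k$, can be checked numerically for small inputs. The sketch below is only an illustration with integer $k$ and a basic sieve; the function names and the choice $x = 100$, $k = 1$ are assumptions made for the example and do not come from the paper.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Mark every multiple of i starting at i*i as composite.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def in_progression(p, residue, modulus):
    """True if no progression is requested, or if p ≡ residue (mod modulus)."""
    return modulus is None or p % modulus == residue


def power_sum(x, k, residue=None, modulus=None):
    """Sum of p**k over primes p <= x, optionally restricted to a progression."""
    return sum(p ** k for p in primes_up_to(x) if in_progression(p, residue, modulus))


def prime_count(limit, residue=None, modulus=None):
    """Number of primes p <= limit, optionally restricted to a progression."""
    return sum(1 for p in primes_up_to(limit) if in_progression(p, residue, modulus))


if __name__ == "__main__":
    x, k = 100, 1
    print("sum of p^k for p <= x:      ", power_sum(x, k))
    print("number of primes <= x^(k+1):", prime_count(x ** (k + 1)))
    print("same, for p ≡ 1 (mod 4):    ",
          power_sum(x, k, 1, 4), prime_count(x ** (k + 1), 1, 4))
```

For $x = 100$ and $k = 1$ the power sum is 1060 while there are 1229 primes below $10^4$, which agrees with the abstract's statement that the approximation tends to underestimate for positive $k$.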

arXiv:2305.18739 [pdf, other]

doi 10.1109/ICASSP49357.2023.10095881

An empirical study on speech restoration guided by self supervised speech representation

Authors: Jaeuk Byun, Youna Ji, Soo Whan Chung, Soyeon Choe, Min Seok Choi

Abstract: Enhancing speech quality is an indispensable yet difficult task as it is often complicated by a range of degradation factors. In addition to additive noise, reverberation, clipping, and speech attenuation can all adversely affect speech quality. Speech restoration aims to recover speech components from these distortions. This paper focuses on exploring the impact of self-supervised speech representation learning on the speech restoration task. Specifically, we employ speech representation in various speech restoration networks and evaluate their performance under complicated distortion scenarios. Our experiments demonstrate that the contextual information provided by the self-supervised speech representation can enhance speech restoration performance in various distortion scenarios, while also increasing robustness against the duration of speech attenuation and mismatched test conditions.

Submitted 30 May, 2023; originally announced May 2023.

Comments: To be presented at ICASSP 2023

arXiv:2305.14846 [pdf, other]

Introducing Competition to Boost the Transferability of Targeted Adversarial Examples through Clean Feature Mixup

Authors: Junyoung Byun, Myung-Joon Kwon, Seungju Cho, Yoonji Kim, Changick Kim

Abstract: Deep neural networks are widely known to be susceptible to adversarial examples, which can cause incorrect predictions through subtle input modifications. These adversarial examples tend to be transferable between models, but targeted attacks still have lower attack success rates due to significant variations in decision boundaries. To enhance the transferability of targeted adversarial examples, we propose introducing competition into the optimization process. Our idea is to craft adversarial perturbations in the presence of two new types of competitor noises: adversarial perturbations towards different target classes and friendly perturbations towards the correct class. With these competitors, even if an adversarial example deceives a network to extract specific features leading to the target class, this disturbance can be suppressed by other competitors. Therefore, within this competition, adversarial examples should take different attack strategies by leveraging more diverse features to overwhelm their interference, leading to improving their transferability to different models. Considering the computational complexity, we efficiently simulate various interference from these two types of competitors in feature space by randomly mixing up stored clean features in the model inference and named this method Clean Feature Mixup (CFM). Our extensive experimental results on the ImageNet-Compatible and CIFAR-10 datasets show that the proposed method outperforms the existing baselines with a clear margin. Our code is available at https://github.com/dreamflake/CFM.

Submitted 24 May, 2023; originally announced May 2023.

Comments: CVPR 2023 camera-ready

arXiv:2304.13886 [pdf, other]

Improving the Utility of Differentially Private Clustering through Dynamical Processing

Authors: Junyoung Byun, Yujin Choi, Jaewook Lee

Abstract: This study aims to alleviate the trade-off between utility and privacy in the task of differentially private clustering. Existing works focus on simple clustering methods, which show poor clustering performance for non-convex clusters. By utilizing Morse theory, we hierarchically connect the Gaussian sub-clusters to fit complex cluster distributions. Because differentially private sub-clusters are obtained through the existing methods, the proposed method causes little or no additional privacy loss. We provide a theoretical background that implies that the proposed method is inductive and can achieve any desired number of clusters. Experiments on various datasets show that our framework achieves better clustering performance at the same privacy level, compared to the existing methods.

Submitted 26 April, 2023; originally announced April 2023.

arXiv:2303.08818 [pdf, other]

Boosting Convolutional Neural Networks' Protein Binding Site Prediction Capacity Using SE(3)-invariant transformers, Transfer Learning and Homology-based Augmentation

Authors: Daeseok Lee, Jeunghyun Byun, Bonggun Shin

Abstract: Figuring out small molecule binding sites in target proteins, at the resolution of either pocket or residue, is critical in many virtual and real drug-discovery scenarios. Since it is not always easy to find such binding sites based on domain knowledge or traditional methods, different deep learning methods that predict binding sites from protein structures have been developed in recent years. Here we present a new deep learning algorithm of this kind that significantly outperformed all state-of-the-art baselines at both resolutions: pocket and residue. This good performance was also demonstrated in a case study involving the protein human serum albumin and its binding sites. Our algorithm included new ideas both in the model architecture and in the training method. For the model architecture, it incorporated SE(3)-invariant geometric self-attention layers that operate on top of residue-level CNN outputs. This residue-level processing of the model allowed transfer learning between the two resolutions, which turned out to significantly improve the binding pocket prediction. Moreover, we developed a novel augmentation method based on protein homology, which prevented our model from over-fitting. Overall, we believe that our contribution to the literature is twofold. First, we provided a new computational method for binding site prediction that is relevant to real-world applications, as shown by the good performance on different benchmarks and the case study. Second, the novel ideas in our method (the model architecture, transfer learning and the homology augmentation) would serve as useful components in future works.

Submitted 18 April, 2023; v1 submitted 20 February, 2023; originally announced March 2023.

Comments: Updates in version 2: author order change (making it clear that Bonggun Shin is the corresponding author)

arXiv:2301.08078 [pdf, other]

Stable Contact Guaranteeing Motion/Force Control for an Aerial Manipulator on an Arbitrarily Tilted Surface

Authors: Jeonghyun Byun, Byeongjun Kim, Changhyeon Kim, Donggeon David Oh, H. Jin Kim

Abstract: This study aims to design a motion/force controller for an aerial manipulator which guarantees the tracking of time-varying motion/force trajectories as well as the stability during the transition between free and contact motions. To this end, we model the force exerted on the end-effector with the Kelvin-Voigt linear model and estimate its parameters with a recursive least-squares estimator. Then, the gains of the disturbance-observer (DOB)-based motion/force controller are calculated based on the stability conditions considering both the model uncertainties in the dynamic equation and switching between the free and contact motions. To validate the proposed controller, we conducted time-varying motion/force tracking experiments with different approach speeds and orientations of the surface. The results show that our controller enables the aerial manipulator to track the time-varying motion/force trajectories.

Submitted 19 January, 2023; originally announced January 2023.

Comments: to be presented in 2023 IEEE International Conference on Robotics and Automation (ICRA), London, United Kingdom, 2023
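
Note on the entry above: the abstract mentions estimating Kelvin-Voigt contact parameters with a recursive least-squares (RLS) estimator. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes the contact force follows f = k*x + b*v (stiffness k, damping b, penetration x, penetration rate v); the function name, the forgetting factor, and the initial covariance `p0` are hypothetical choices made for illustration.

```python
def rls_kelvin_voigt(samples, forgetting=1.0, p0=1e4):
    """Estimate (k, b) in f = k*x + b*v by recursive least squares.

    samples: iterable of (x, v, f) tuples (assumed measurement layout).
    forgetting: forgetting factor in (0, 1]; 1.0 weights all samples equally.
    p0: initial covariance scale (a large value means a weak prior at zero).
    """
    theta = [0.0, 0.0]                 # current estimate [k, b]
    P = [[p0, 0.0], [0.0, p0]]         # 2x2 covariance matrix (symmetric)
    for x, v, f in samples:
        phi = (x, v)                   # regressor vector
        # P @ phi
        p_phi = [P[0][0] * phi[0] + P[0][1] * phi[1],
                 P[1][0] * phi[0] + P[1][1] * phi[1]]
        denom = forgetting + phi[0] * p_phi[0] + phi[1] * p_phi[1]
        gain = [p_phi[0] / denom, p_phi[1] / denom]
        # Prediction error with the current parameters
        err = f - (theta[0] * phi[0] + theta[1] * phi[1])
        theta = [theta[0] + gain[0] * err, theta[1] + gain[1] * err]
        # Covariance update: P <- (P - gain * (phi^T P)) / forgetting,
        # where phi^T P equals p_phi^T because P is symmetric.
        P = [[(P[i][j] - gain[i] * p_phi[j]) / forgetting for j in range(2)]
             for i in range(2)]
    return theta
```

With noise-free samples generated from known k and b, the estimate approaches the true values as more samples arrive; with noisy data, a forgetting factor slightly below 1 lets the estimate track slowly varying surfaces.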

arXiv:2301.01751 [pdf, other]

Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes

Authors: Justin Reppert, Ben Rachbach, Charlie George, Luke Stebbing, Jungwon Byun, Maggie Appleton, Andreas Stuhlmüller

Abstract: Language models (LMs) can perform complex reasoning either end-to-end, with hidden latent state, or compositionally, with transparent intermediate state. Composition offers benefits for interpretability and safety, but may need workflow support and infrastructure to remain competitive. We describe iterated decomposition, a human-in-the-loop workflow for developing and refining compositional LM programs. We improve the performance of compositions by zooming in on failing components and refining them through decomposition, additional context, chain of thought, etc. To support this workflow, we develop ICE, an open-source tool for visualizing the execution traces of LM programs. We apply iterated decomposition to three real-world tasks and improve the accuracy of LM programs over less compositional baselines: describing the placebo used in a randomized controlled trial (25% to 65%), evaluating participant adherence to a medical intervention (53% to 70%), and answering NLP questions on the Qasper dataset (38% to 69%). These applications serve as case studies for a workflow that, if automated, could keep ML systems interpretable and safe even as they scale to increasingly complex tasks.

Submitted 4 January, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

arXiv:2212.07026 [pdf, other]

Improving group robustness under noisy labels using predictive uncertainty

Authors: Dongpin Oh, Dae Lee, Jeunghyun Byun, Bonggun Shin

Abstract: The standard empirical risk minimization (ERM) can underperform on certain minority groups (e.g., waterbirds on land or landbirds on water) due to the spurious correlation between the input and its label. Several studies have improved the worst-group accuracy by focusing on the high-loss samples. The hypothesis behind this is that such high-loss samples are spurious-cue-free (SCF) samples. However, these approaches can be problematic since the high-loss samples may also be samples with noisy labels in real-world scenarios. To resolve this issue, we utilize the predictive uncertainty of a model to improve the worst-group accuracy under noisy labels. To motivate this, we theoretically show that the high-uncertainty samples are the SCF samples in the binary classification problem. This theoretical result implies that the predictive uncertainty is an adequate indicator to identify SCF samples in a noisy label setting. Motivated by this, we propose a novel ENtropy based Debiasing (END) framework that prevents models from learning the spurious cues while being robust to the noisy labels. In the END framework, we first train the identification model to obtain the SCF samples from a training set using its predictive uncertainty. Then, another model is trained on the dataset augmented with an oversampled SCF set. The experimental results show that our END framework outperforms other strong baselines on several real-world benchmarks that consider both the noisy labels and the spurious cues.

Submitted 13 December, 2022; originally announced December 2022.
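
Note on the entry above: the abstract describes selecting spurious-cue-free samples by predictive uncertainty. One common uncertainty measure is the Shannon entropy of the predicted class probabilities. The sketch below illustrates only that measure and a simple threshold rule; the function names and the threshold are hypothetical, and the abstract does not specify how the END framework actually computes or thresholds uncertainty.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_high_uncertainty(prob_rows, threshold):
    """Return indices of samples whose predictive entropy is >= threshold.

    prob_rows: list of per-sample class-probability lists (assumed to sum to 1).
    threshold: hypothetical cut-off chosen by the user.
    """
    return [i for i, probs in enumerate(prob_rows)
            if predictive_entropy(probs) >= threshold]
```

For a binary classifier the entropy peaks at ln 2 when both classes are equally likely and falls to zero for a fully confident prediction, so a threshold near ln 2 keeps only the most ambiguous samples.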

arXiv:2210.17327 [pdf, other]

Diffusion-based Generative Speech Source Separation

Authors: Robin Scheibler, Youna Ji, Soo-Whan Chung, Jaeuk Byun, Soyeon Choe, Min-Seok Choi

Abstract: We propose DiffSep, a new single channel source separation method based on score-matching of a stochastic differential equation (SDE). We craft a tailored continuous time diffusion-mixing process starting from the separated sources and converging to a Gaussian distribution centered on their mixture. This formulation lets us apply the machinery of score-based generative modelling. First, we train a neural network to approximate the score function of the marginal probabilities of the diffusion-mixing process. Then, we use it to solve the reverse time SDE that progressively separates the sources starting from their mixture. We propose a modified training strategy to handle model mismatch and source permutation ambiguity. Experiments on the WSJ0 2mix dataset demonstrate the potential of the method. Furthermore, the method is also suitable for speech enhancement and shows performance competitive with prior work on the VoiceBank-DEMAND dataset.

Submitted 2 November, 2022; v1 submitted 31 October, 2022; originally announced October 2022.

Comments: 5 pages, 3 figures, 2 tables. Submitted to ICASSP 2023

arXiv:2209.02573 [pdf, other]

S-BORM: Reliability-based optimization of general systems using buffered optimization and reliability method

Authors: Ji-Eun Byun, Welington de Oliveira, Johannes O. Royset

Abstract: Reliability-based optimization (RBO) is crucial for identifying optimal risk-informed decisions for designing and operating engineering systems. However, its computation remains challenging as it requires a concurrent task of optimization and reliability analysis. Moreover, computation becomes even more complicated when considering performance of a general system, whose failure event is represented as a link-set of cut-sets. This is because even when component events have smooth and convex limit-state functions, the system limit-state function has neither property, except in trivial cases. To address the challenge, this study develops an efficient algorithm to solve RBO problems of general system events. We employ the buffered optimization and reliability method (BORM), which utilizes, instead of the conventional failure probability definition, the buffered failure probability. The proposed algorithm solves a sequence of difference-of-convex RBO models iteratively by employing a proximal bundle method. For demonstration, we design three numerical examples with increasing complexity that includes up to 108 cut-sets, which are solved by the proposed algorithm within a minute with high accuracy. We also demonstrate its robustness by performing extensive parametric studies.

Submitted 7 October, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

Comments: Codes and data are available at https://github.com/jieunbyun/sborm

arXiv:2208.13125 [pdf, other]

Normality-Guided Distributional Reinforcement Learning for Continuous Control

Authors: Ju-Seung Byun, Andrew Perrault

Abstract: Learning a predictive model of the mean return, or value function, plays a critical role in many reinforcement learning algorithms. Distributional reinforcement learning (DRL) has been shown to improve performance by modeling the value distribution, not just the mean. We study the value distribution in several continuous control tasks and find that the learned value distribution is empirically quite close to normal. We design a method that exploits this property, employing variances predicted from a variance network, along with returns, to analytically compute target quantile bars representing a normal for our distributional value function. In addition, we propose a policy update strategy based on the correctness as measured by structural characteristics of the value distribution not present in the standard value function. The approach we outline is compatible with many DRL structures. We use two representative on-policy algorithms, PPO and TRPO, as testbeds. Our method yields statistically significant improvements in 10 out of 16 continuous task settings, while utilizing a reduced number of weights and achieving faster training time compared to an ensemble-based method for quantifying value distribution uncertainty.

Submitted 17 January, 2024; v1 submitted 27 August, 2022; originally announced August 2022.
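
Note on the entry above: the abstract describes analytically computing target quantiles of a normal distribution from a return and a predicted variance. A minimal sketch of that computation follows, using only the Python standard library. The midpoint quantile levels tau_i = (2i + 1) / (2N) are an assumption borrowed from common quantile-regression practice, and the function name is hypothetical; the abstract does not state which levels the authors use.

```python
from statistics import NormalDist


def normal_quantile_targets(mean_return, variance, n_quantiles):
    """Quantiles of N(mean_return, variance) at midpoint levels.

    The levels tau_i = (2i + 1) / (2 * n_quantiles) are an assumed convention.
    """
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    if n_quantiles < 1:
        raise ValueError("n_quantiles must be at least 1")
    dist = NormalDist(mu=mean_return, sigma=variance ** 0.5)
    taus = [(2 * i + 1) / (2 * n_quantiles) for i in range(n_quantiles)]
    # inv_cdf maps each probability level to the matching quantile value
    return [dist.inv_cdf(tau) for tau in taus]
```

Because the normal distribution is symmetric, the resulting targets are symmetric around the mean return, and with an odd number of quantiles the middle target coincides with the mean.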

arXiv:2208.04060 [pdf, other]

GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training

Authors: Jaeseok Byun, Taebaek Hwang, Jianlong Fu, Taesup Moon

Abstract: Most of the currently existing vision and language pre-training (VLP) methods have mainly focused on how to extract and align vision and text features. In contrast to the mainstream VLP methods, we highlight that two routinely applied steps during pre-training have crucial impact on the performance of the pre-trained model: in-batch hard negative sampling for image-text matching (ITM) and assigning the large masking probability for the masked language modeling (MLM). After empirically showing the unexpected effectiveness of the above two steps, we systematically devise our GRIT-VLP, which adaptively samples mini-batches for more effective mining of hard negative samples for ITM while maintaining the computational cost for pre-training. Our method consists of three components: 1) GRouped mIni-baTch sampling (GRIT) strategy that collects similar examples in a mini-batch, 2) ITC consistency loss for improving the mining ability, and 3) enlarged masking probability for MLM. Consequently, we show our GRIT-VLP achieves a new state-of-the-art performance on various downstream tasks with much less computational cost. Furthermore, we demonstrate that our model is essentially on par with ALBEF, the previous state-of-the-art, only with one-third of training epochs on the same training data. Code is available at https://github.com/jaeseokbyun/GRIT-VLP.

Submitted 8 August, 2022; originally announced August 2022.

arXiv:2205.04579 [pdf, other]

doi 10.1093/mnras/stac2313

Modal compression of the redshift-space galaxy bispectrum

Authors: Joyce Byun, Elisabeth Krause

Abstract: We extend the modal decomposition method, previously applied to compress the information in the real-space bispectrum, to the anisotropic redshift-space galaxy bispectrum. In the modal method approach, the bispectrum is expanded on a basis of smooth functions of triangles and their orientations, such that a set of modal expansion coefficients can capture the information in the bispectrum. We assume a reference survey and compute Fisher forecasts for the compressed modal bispectrum and two other basis decompositions of the redshift-space bispectrum in the literature, one based on (single) spherical harmonics and another based on tripolar spherical harmonics. In each case, we compare the forecasted constraints from the compressed statistic with forecasted constraints from the full, uncompressed bispectrum which includes all triangles and orientations. Our main result is that all three compression methods achieve good recovery of the full information content of the bispectrum, but the modal decomposition approach achieves this the most efficiently: only 14 (42) modal expansion coefficients are necessary to obtain constraints that are within 10% (2%) of the full bispectrum result. The next most efficient decomposition is the one based on tripolar spherical harmonics, while the spherical harmonic multipoles are the least efficient.

Submitted 9 May, 2022; originally announced May 2022.

Comments: 17 pages, 5 figures, 2 tables. To be submitted to MNRAS. Comments welcome

arXiv:2204.14001 [pdf, other]

doi 10.23919/ASCC56756.2022.9828175

Machine Learning-Based GPS Multipath Detection Method Using Dual Antennas

Authors: Sanghyun Kim, Jungyun Byun, Kwansik Park

Abstract: In urban areas, global navigation satellite system (GNSS) signals are often reflected or blocked by buildings, thus resulting in large positioning errors. In this study, we proposed a machine learning approach for global positioning system (GPS) multipath detection that uses dual antennas. A machine learning model that could classify GPS signal reception conditions was trained with several GPS measurements selected as suggested features. We applied five features for machine learning, including a feature obtained from the dual antennas, and evaluated the classification performance of the model, after applying four machine learning algorithms: gradient boosting decision tree (GBDT), random forest, decision tree, and K-nearest neighbor (KNN). It was found that a classification accuracy of 82%-96% was achieved when the test data set was collected at the same locations as those of the training data set. However, when the test data set was collected at locations different from those of the training data, a classification accuracy of 44%-77% was obtained.

Submitted 6 April, 2022; originally announced April 2022.

Comments: Submitted to ASCC 2022

arXiv:2203.09123 [pdf, other]

Improving the Transferability of Targeted Adversarial Examples through Object-Based Diverse Input

Authors: Junyoung Byun, Seungju Cho, Myung-Joon Kwon, Hee-Seon Kim, Changick Kim

Abstract: The transferability of adversarial examples allows the deception of black-box models, and transfer-based targeted attacks have attracted a lot of interest due to their practical applicability. To maximize the transfer success rate, adversarial examples should avoid overfitting to the source model, and image augmentation is one of the primary approaches for this. However, prior works utilize simple image transformations such as resizing, which limits input diversity. To tackle this limitation, we propose the object-based diverse input (ODI) method that draws an adversarial image on a 3D object and induces the rendered image to be classified as the target class. Our motivation comes from humans' superior perception of an image printed on a 3D object. If the image is clear enough, humans can recognize the image content in a variety of viewing conditions. Likewise, if an adversarial example looks like the target class to the model, the model should also classify the rendered image of the 3D object as the target class. The ODI method effectively diversifies the input by leveraging an ensemble of multiple source objects and randomizing viewing conditions. In our experimental results on the ImageNet-Compatible dataset, this method boosts the average targeted attack success rate from 28.3% to 47.0% compared to the state-of-the-art methods. We also demonstrate the applicability of the ODI method to adversarial examples on the face verification task and its superior performance improvement. Our code is available at https://github.com/dreamflake/ODI.

Submitted 17 March, 2022; originally announced March 2022.

Comments: Accepted at CVPR 2022

arXiv:2111.15164 [pdf, other]

WALK-VIO: Walking-motion-Adaptive Leg Kinematic Constraint Visual-Inertial Odometry for Quadruped Robots

Authors: Hyunjun Lim, Byeongho Yu, Yeeun Kim, Joowoong Byun, Soonpyo Kwon, Haewon Park, Hyun Myung

Abstract: In this paper, WALK-VIO, a novel visual-inertial odometry (VIO) with walking-motion-adaptive leg kinematic constraints that change with body motion for localization of quadruped robots, is proposed. Quadruped robots primarily use VIO because they require fast localization for control and path planning. However, since quadruped robots are mainly used outdoors, extraneous features extracted from the sky or ground cause tracking failures. In addition, the quadruped robots' walking motion causes wobbling, which lowers the localization accuracy due to the camera and inertial measurement unit (IMU). To overcome these limitations, many researchers use VIO with leg kinematic constraints. However, since the quadruped robot's walking motion varies according to the controller, gait, quadruped robots' velocity, and so on, these factors should be considered in the process of adding leg kinematic constraints. We propose VIO that can be used regardless of walking motion by adjusting the leg kinematic constraint factor. In order to evaluate WALK-VIO, we create and publish datasets of quadruped robots that move with various types of walking motion in a simulation environment. In addition, we verified the validity of WALK-VIO through comparison with current state-of-the-art algorithms.

Submitted 30 November, 2021; originally announced November 2021.

arXiv:2111.04371 [pdf, other]

Geometrically Adaptive Dictionary Attack on Face Recognition

Authors: Junyoung Byun, Hyojun Go, Changick Kim

Abstract: CNN-based face recognition models have brought remarkable performance improvement, but they are vulnerable to adversarial perturbations. Recent studies have shown that adversaries can fool the models even if they can only access the models' hard-label output. However, since many queries are needed to find imperceptible adversarial noise, reducing the number of queries is crucial for these attacks. In this paper, we point out two limitations of existing decision-based black-box attacks. We observe that they waste queries for background noise optimization, and they do not take advantage of adversarial perturbations generated for other images. We exploit 3D face alignment to overcome these limitations and propose a general strategy for query-efficient black-box attacks on face recognition named Geometrically Adaptive Dictionary Attack (GADA). Our core idea is to create an adversarial perturbation in the UV texture map and project it onto the face in the image. It greatly improves query efficiency by limiting the perturbation search space to the facial area and effectively recycling previous perturbations. We apply the GADA strategy to two existing attack methods and show overwhelming performance improvement in the experiments on the LFW and CPLFW datasets. Furthermore, we also present a novel attack strategy that can circumvent query similarity-based stateful detection that identifies the process of query-based black-box attacks.

Submitted 8 November, 2021; originally announced November 2021.

Comments: Accepted at WACV 2022

arXiv:2110.04357 [pdf, other]

Training Transition Policies via Distribution Matching for Complex Tasks

Authors: Ju-Seung Byun, Andrew Perrault

Abstract: Humans decompose novel complex tasks into simpler ones to exploit previously learned skills. Analogously, hierarchical reinforcement learning seeks to leverage lower-level policies for simple tasks to solve complex ones. However, because each lower-level policy induces a different distribution of states, transitioning from one lower-level policy to another may fail due to an unexpected starting state. We introduce transition policies that smoothly connect lower-level policies by producing a distribution of states and actions that matches what is expected by the next policy. Training transition policies is challenging because the natural reward signal -- whether the next policy can execute its subtask successfully -- is sparse. By training transition policies via adversarial inverse reinforcement learning to match the distribution of expected states and actions, we avoid relying on task-based reward. To further improve performance, we use deep Q-learning with a binary action space to determine when to switch from a transition policy to the next pre-trained policy, using the success or failure of the next subtask as the reward. Although the reward is still sparse, the problem is less severe due to the simple binary action space. We demonstrate our method on continuous bipedal locomotion and arm manipulation tasks that require diverse skills. We show that it smoothly connects the lower-level policies, achieving higher success rates than previous methods that search for successful trajectories based on a reward function, but do not match the state distribution.

Submitted 11 March, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

arXiv:2109.05391 [pdf, ps, other]

Gradients and Subgradients of Buffered Failure Probability

Authors: Johannes O. Royset, Ji-Eun Byun

Abstract: Gradients and subgradients are central to optimization and sensitivity analysis of buffered failure probabilities. We furnish a characterization of subgradients based on subdifferential calculus in the case of finite probability distributions and, under additional assumptions, also a gradient expression for general distributions. Several examples illustrate the application of the results, especially in the context of optimality conditions.

Submitted 22 October, 2021; v1 submitted 11 September, 2021; originally announced September 2021.

arXiv:2107.11176 [pdf]

Data-driven optimization of reliability using buffered failure probability

Authors: Ji-Eun Byun, Johannes O. Royset

Abstract: Design and operation of complex engineering systems rely on reliability optimization. Such optimization requires us to account for uncertainties expressed in terms of complicated, high-dimensional probability distributions, for which only samples or data might be available. However, using data or samples often degrades the computational efficiency, particularly as the conventional failure probability is estimated using the indicator function whose gradient is not defined at zero. To address this issue, by leveraging the buffered failure probability, the paper develops the buffered optimization and reliability method (BORM) for efficient, data-driven optimization of reliability. The proposed formulations, algorithms, and strategies greatly improve the computational efficiency of the optimization and thereby address the needs of high-dimensional and nonlinear problems. In addition, an analytical formula is developed to estimate the reliability sensitivity, a subject fraught with difficulty when using the conventional failure probability. The buffered failure probability is thoroughly investigated in the context of many different distributions, leading to a novel measure of tail-heaviness called the buffered tail index. The efficiency and accuracy of the proposed optimization methodology are demonstrated by three numerical examples, which underline the unique advantages of the buffered failure probability for data-driven reliability analysis.

Submitted 21 September, 2021; v1 submitted 22 July, 2021; originally announced July 2021.

Comments: 32 pages

MSC Class: 90C15; 93E20

ACM Class: G.3; I.2.8; J.6
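
Note on the entry above: the abstract contrasts the conventional failure probability with the buffered failure probability when only samples are available. The sketch below shows one simple sample-based estimate of each, assuming the convention that failure corresponds to a limit-state value g > 0 and that the buffered failure probability is the tail fraction whose conditional mean equals zero. The function names are hypothetical, the estimator does not interpolate between samples, and this is not the BORM algorithm itself.

```python
def failure_probability(samples):
    """Fraction of limit-state samples with g > 0 (assumed failure convention)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return sum(1 for g in samples if g > 0) / len(samples)


def buffered_failure_probability(samples):
    """Largest tail fraction whose average limit-state value is still >= 0.

    Samples are sorted from worst (largest g) to best. The running mean of the
    top-k samples is non-increasing in k, so the scan can stop at the first k
    whose running sum drops below zero.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples, reverse=True)
    running_sum = 0.0
    count = 0
    for k, g in enumerate(ordered, start=1):
        running_sum += g
        if running_sum < 0.0:
            break
        count = k
    return count / len(ordered)
```

The buffered estimate is never smaller than the conventional one, because every tail made up only of failing samples has a non-negative mean; it also accounts for how far the worst samples exceed the threshold rather than only counting them.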

arXiv:2107.02366 [pdf, other]

Real-Time Motion Planning of a Hydraulic Excavator using Trajectory Optimization and Model Predictive Control

Authors: Dongjae Lee, Inkyu Jang, Jeonghyun Byun, Hoseong Seo, H. Jin Kim

Abstract: Automation of excavation tasks requires real-time trajectory planning satisfying various constraints. To guarantee both constraint feasibility and real-time trajectory re-plannability, we present an integrated framework for real-time optimization-based trajectory planning of a hydraulic excavator. The proposed framework is composed of two main modules: a global planner and a real-time local planner. The global planner computes the entire global trajectory considering excavation volume and energy minimization while the local counterpart tracks the global trajectory in a receding horizon manner, satisfying dynamic feasibility, physical constraints, and disturbance-awareness. We validate the proposed planning algorithm in a simulation environment where two types of operations are conducted in the presence of emulated disturbance from hydraulic friction and soil-bucket interaction: shallow and deep excavation. The optimized global trajectories are obtained on the order of a second and are tracked by the local planner at faster than 30 Hz. To the best of our knowledge, this work presents the first real-time motion planning framework that satisfies constraints of a hydraulic excavator, such as force/torque, power, cylinder displacement, and flow rate limits.

Submitted 7 July, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

Comments: 8 pages, 8 figures, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) accepted

arXiv:2107.00353 [pdf, other]

Stability and Robustness Analysis of Plug-Pulling using an Aerial Manipulator

Authors: Jeonghyun Byun, Dongjae Lee, Hoseong Seo, Inkyu Jang, Jeongjun Choi, H. Jin Kim

Abstract: In this paper, an autonomous aerial manipulation task of pulling a plug out of an electric socket is conducted, where maintaining the stability and robustness is challenging due to sudden disappearance of a large interaction force. The abrupt change in the dynamical model before and after the separation of the plug can cause destabilization or mission failure. To accomplish aerial plug-pulling, we employ the concept of hybrid automata to divide the task into three operative modes, i.e., wire-pulling, stabilizing, and free-flight. Also, a strategy for trajectory generation and a design of disturbance-observer-based controllers for each operative mode are presented. Furthermore, the theory of hybrid automata is used to prove the stability and robustness during the mode transition. We validate the proposed trajectory generation and control method by an actual wire-pulling experiment with a multirotor-based aerial manipulator.

Submitted 5 July, 2021; v1 submitted 1 July, 2021; originally announced July 2021.

Comments: to be presented in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 2021

arXiv:2105.10967 [pdf, other]

FBI-Denoiser: Fast Blind Image Denoiser for Poisson-Gaussian Noise

Authors: Jaeseok Byun, Sungmin Cha, Taesup Moon

Abstract: We consider the challenging blind denoising problem for Poisson-Gaussian noise, in which no additional information about clean images or noise level parameters is available. Particularly, when only "single" noisy images are available for training a denoiser, the denoising performance of existing methods was not satisfactory. Recently, the blind pixelwise affine image denoiser (BP-AIDE) was proposed and significantly improved the performance in the above setting, to the extent that it is competitive with denoisers which utilized additional information. However, BP-AIDE seriously suffered from slow inference time due to the inefficiency of the noise level estimation procedure and that of the blind-spot network (BSN) architecture it used. To that end, we propose Fast Blind Image Denoiser (FBI-Denoiser) for Poisson-Gaussian noise, which consists of two neural network models: 1) PGE-Net that estimates Poisson-Gaussian noise parameters 2000 times faster than the conventional methods and 2) FBI-Net that realizes a much more efficient BSN for pixelwise affine denoiser in terms of the number of parameters and inference speed. Consequently, we show that our FBI-Denoiser blindly trained solely based on single noisy images can achieve the state-of-the-art performance on several real-world noisy image benchmark datasets with much faster inference time (×10), compared to BP-AIDE. The official code of our method is available at https://github.com/csm9493/FBI-Denoiser.

Submitted 23 May, 2021; originally announced May 2021.

Comments: CVPR 2021 camera ready version

arXiv:2101.04829 [pdf, other]

On the Effectiveness of Small Input Noise for Defending Against Query-based Black-Box Attacks

Authors: Junyoung Byun, Hyojun Go, Changick Kim

Abstract: While deep neural networks show unprecedented performance in various tasks, the vulnerability to adversarial examples hinders their deployment in safety-critical systems. Many studies have shown that attacks are also possible even in a black-box setting where an adversary cannot access the target model's internal information. Most black-box attacks are based on queries, each of which obtains the target model's output for an input, and many recent studies focus on reducing the number of required queries. In this paper, we pay attention to an implicit assumption of query-based black-box adversarial attacks that the target model's output exactly corresponds to the query input. If some randomness is introduced into the model, it can break the assumption, and thus, query-based attacks may have tremendous difficulty in both gradient estimation and local search, which are the core of their attack process. From this motivation, we observe even a small additive input noise can neutralize most query-based attacks and name this simple yet effective approach Small Noise Defense (SND). We analyze how SND can defend against query-based black-box attacks and demonstrate its effectiveness against eight state-of-the-art attacks with CIFAR-10 and ImageNet datasets. Even with strong defense ability, SND almost maintains the original classification accuracy and computational speed. SND is readily applicable to pre-trained models by adding only one line of code at the inference.

Submitted 8 November, 2021; v1 submitted 12 January, 2021; originally announced January 2021.

Comments: Accepted at WACV 2022
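
The idea in the abstract above, perturbing every query with a small additive noise before the model sees it, can be sketched in a few lines. This is a minimal illustration and not the authors' code: the wrapper name `snd_predict`, the Gaussian noise distribution, the default `sigma`, and the toy model are all assumptions made for the example.

```python
import random


def snd_predict(model, x, sigma=0.01, rng=random):
    """Query `model` on a copy of `x` perturbed by small additive noise.

    Assumptions (not from the abstract): the noise is zero-mean Gaussian
    with standard deviation `sigma`, and `x` is a flat list of floats.
    """
    # The "one line" of defense: add fresh noise to the input on every query.
    noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
    return model(noisy)


def toy_model(v):
    """Hypothetical stand-in for a pre-trained network: returns one score."""
    return sum(v)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.5] * 8
    # Two identical queries now return slightly different outputs, which is
    # what breaks the attacker's assumption of a deterministic response.
    print(snd_predict(toy_model, x, rng=rng))
    print(snd_predict(toy_model, x, rng=rng))
```

Because the noise is redrawn on each call, finite-difference gradient estimates built from repeated queries become unreliable, while a small `sigma` keeps each individual output close to the clean one.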

arXiv:2012.01700 [pdf, other]

doi 10.1109/MIS.2022.3151466

Robust Federated Learning with Noisy Labels

Authors: Seunghan Yang, Hyoungseob Park, Junyoung Byun, Changick Kim

Abstract: Federated learning is a paradigm that enables local devices to jointly train a server model while keeping the data decentralized and private. In federated learning, since local data are collected by clients, it is hardly guaranteed that the data are correctly annotated. Although a lot of studies have been conducted to train the networks robust to these noisy data in a centralized setting, these algorithms still suffer from noisy labels in federated learning. Compared to the centralized setting, clients' data can have different noise distributions due to variations in their labeling systems or background knowledge of users. As a result, local models form inconsistent decision boundaries and their weights severely diverge from each other, which are serious problems in federated learning. To solve these problems, we introduce a novel federated learning scheme that the server cooperates with local models to maintain consistent decision boundaries by interchanging class-wise centroids. These centroids are central features of local data on each device, which are aligned by the server every communication round. Updating local models with the aligned centroids helps to form consistent decision boundaries among local models, although the noise distributions in clients' data are different from each other. To improve local model performance, we introduce a novel approach to select confident samples that are used for updating the model with given labels. Furthermore, we propose a global-guided pseudo-labeling method to update labels of unconfident samples by exploiting the global model. Our experimental results on the noisy CIFAR-10 dataset and the Clothing1M dataset show that our approach is noticeably effective in federated learning with noisy labels.

Submitted 2 December, 2020; originally announced December 2020.

Journal ref: IEEE Intelligent Systems, 2022

arXiv:2010.09933 [pdf, other]

Proximal Policy Gradient: PPO with Policy Gradient

Authors: Ju-Seung Byun, Byungmoon Kim, Huamin Wang

Abstract: In this paper, we propose a new algorithm PPG (Proximal Policy Gradient), which is close to both VPG (vanilla policy gradient) and PPO (proximal policy optimization). The PPG objective is a partial variation of the VPG objective and the gradient of the PPG objective is exactly same as the gradient of the VPG objective. To increase the number of policy update iterations, we introduce the advantage-policy plane and design a new clipping strategy. We perform experiments in OpenAI Gym and Bullet robotics environments for ten random seeds. The performance of PPG is comparable to PPO, and the entropy decays slower than PPG. Thus we show that performance similar to PPO can be obtained by using the gradient formula from the original policy gradient theorem.

Submitted 19 October, 2020; originally announced October 2020.

Comments: 7 pages

arXiv:2010.09579 [pdf, other]

doi 10.1088/1475-7516/2021/03/105

Towards cosmological constraints from the compressed modal bispectrum: a robust comparison of real-space bispectrum estimators

Authors: Joyce Byun, Andrea Oddo, Cristiano Porciani, Emiliano Sefusatti

Abstract: Higher-order clustering statistics, like the galaxy bispectrum, can add complementary cosmological information to what is accessible with two-point statistics, like the power spectrum. While the standard way of measuring the bispectrum involves estimating a bispectrum value in a large number of Fourier triangle bins, the compressed modal bispectrum approximates the bispectrum as a linear combination of basis functions and estimates the expansion coefficients on the chosen basis. In this work, we compare the two estimators by using parallel pipelines to analyze the real-space halo bispectrum measured in a suite of $N$-body simulations corresponding to a total volume of $\sim 1{,}000 \,h^{-3}\,{\rm Gpc}^3$, with covariance matrices estimated from 10,000 mock halo catalogs. We find that the modal bispectrum yields constraints that are consistent and competitive with the standard bispectrum analysis: for the halo bias and shot noise parameters within the tree-level halo bispectrum model up to $k_{\rm max} \approx 0.06 \, (0.10) \,h\,{\rm Mpc}^{-1}$, only 6 (10) modal expansion coefficients are necessary to obtain constraints equivalent to the standard bispectrum estimator using $\sim$ 20 to 1,600 triangle bins, depending on the bin width. For this work, we have implemented a modal estimator pipeline using Markov Chain Monte Carlo simulations for the first time, and we discuss in detail how the parameter posteriors and modal expansion are robust to, or sensitive to, several user settings within the modal bispectrum pipeline. The combination of the highly efficient compression that is achieved and the large number of mock catalogs available allows us to quantify how our modal bispectrum constraints depend on the number of mocks that are used to estimate covariance matrices and the functional form of the likelihood.

Submitted 19 October, 2020; originally announced October 2020.

Comments: 51 pages, 20 figures, 2 tables. To be submitted to JCAP. Comments are welcome

arXiv:2005.06325 [pdf, other]

doi 10.1093/mnras/staa2020

Constraining the growth rate of structure with phase correlations

Authors: Joyce Byun, Felipe Oliveira Franco, Cullan Howlett, Camille Bonvin, Danail Obreschkow

Abstract: We show that correlations between the phases of the galaxy density field in redshift space provide additional information about the growth rate of large-scale structure that is complementary to the power spectrum multipoles. In particular, we consider the multipoles of the line correlation function (LCF), which correlates phases between three collinear points, and use the Fisher forecasting method to show that the LCF multipoles can break the degeneracy between the measurement of the growth rate of structure $f$ and the amplitude of perturbations $σ_8$ that is present in the power spectrum multipoles at large scales. This leads to an improvement in the measurement of $f$ and $σ_8$ by up to 220 per cent for $k_{\rm max} = 0.15 \, h\mathrm{Mpc}^{-1}$ and up to 50 per cent for $k_{\rm max} = 0.30 \, h\mathrm{Mpc}^{-1}$ at redshift $z=0.25$, with respect to power spectrum measurements alone for the upcoming generation of galaxy surveys like DESI and Euclid. The average improvements in the constraints on $f$ and $σ_8$ for $k_{\rm max} = 0.15 \, h\mathrm{Mpc}^{-1}$ are $\sim 90$ per cent for the DESI BGS sample with mean redshift $\overline{z}=0.25$, $\sim 40$ per cent for the DESI ELG sample with $\overline{z}=1.25$, and $\sim 40$ per cent for the Euclid H$α$ galaxies with $\overline{z}=1.3$. For $k_{\rm max} = 0.30 \, h\mathrm{Mpc}^{-1}$, the average improvements are $\sim 40$ per cent for the DESI BGS sample and $\sim 20$ per cent for both the DESI ELG and Euclid H$α$ galaxies.

Submitted 7 July, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

Comments: 28 pages, 13 figures, 2 tables. v2 has additional discussion on model-independence of the forecasts. v3 matches the MNRAS accepted version

Journal ref: MNRAS 497, 1765 (2020)

arXiv:2003.02974 [pdf, other]

A flow disturbance estimation and rejection strategy for multirotors with round-trip trajectories

Authors: Jaeseung Byun, Simo A. Mäkiharju, Mark W. Mueller

Abstract: This paper presents a round-trip strategy of multirotors subject to unknown flow disturbances. During the outbound flight, the vehicle immediately utilizes the wind disturbance estimations in feedback control, as an attempt to reduce the tracking error. During this phase, the disturbance estimations with respect to the position are also recorded for future use. For the return flight, the disturbances previously collected are then routed through a feedforward controller. The major assumption here is that the disturbances may vary over space, but not over time during the same mission. We demonstrate the effectiveness of this feedforward strategy via experiments with two different types of wind flows; a simple jet flow and a more complex flow. To use as a baseline case, a cascaded PD controller with an additional feedback loop for disturbance estimation was employed for outbound flights. To display our contributions regarding the additional feedforward approach, an additional feedforward correction term obtained via prerecorded data was integrated for the return flight. Compared to the baseline controller, the feedforward controller was observed to produce 43% less RMSE position error at a vehicle ground velocity of 1 m/s with 6 m/s of environmental wind velocity. This feedforward approach also produced 14% less RMSE position error for the complex flows as well.

Submitted 9 June, 2021; v1 submitted 5 March, 2020; originally announced March 2020.

Comments: Experimental validation video can be found here: https://youtu.be/lHJLIt3Ul5U

arXiv:1910.04397 [pdf, other]

BitNet: Learning-Based Bit-Depth Expansion

Authors: Junyoung Byun, Kyu** Shim, Changick Kim

Abstract: Bit-depth is the number of bits for each color channel of a pixel in an image. Although many modern displays support unprecedented higher bit-depth to show more realistic and natural colors with a high dynamic range, most media sources are still in bit-depth of 8 or lower. Since insufficient bit-depth may generate annoying false contours or lose detailed visual appearance, bit-depth expansion (BDE) from low bit-depth (LBD) images to high bit-depth (HBD) images becomes more and more important. In this paper, we adopt a learning-based approach for BDE and propose a novel CNN-based bit-depth expansion network (BitNet) that can effectively remove false contours and restore visual details at the same time. We have carefully designed our BitNet based on an encoder-decoder architecture with dilated convolutions and a novel multi-scale feature integration. We have performed various experiments with four different datasets including MIT-Adobe FiveK, Kodak, ESPL v2, and TESTIMAGES, and our proposed BitNet has achieved state-of-the-art performance in terms of PSNR and SSIM among other existing BDE methods and famous CNN-based image processing networks. Unlike previous methods that separately process each color channel, we treat all RGB channels at once and have greatly improved color restoration. In addition, our network has shown the fastest computational speed in near real-time.

Submitted 10 October, 2019; originally announced October 2019.

Comments: Accepted by ACCV 2018, Authors Byun and Shim contributed equally
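
For readers unfamiliar with bit-depth expansion, the sketch below shows two simple non-learned ways to map a low bit-depth sample to a higher bit-depth: zero padding and bit replication. It is not BitNet and not taken from the paper; the function names and the per-sample integer interface are assumptions chosen for illustration. Mappings like these work on each sample in isolation, so quantization steps in a smooth gradient remain visible as false contours, which is the artifact the abstract says a learned method aims to remove.

```python
def expand_zero_padding(value, low_bits, high_bits):
    """Expand by shifting left and filling the new low-order bits with zeros."""
    _check(value, low_bits, high_bits)
    return value << (high_bits - low_bits)


def expand_bit_replication(value, low_bits, high_bits):
    """Expand by repeating the sample's bit pattern until high_bits are filled."""
    _check(value, low_bits, high_bits)
    out, filled = 0, 0
    while filled < high_bits:
        take = min(low_bits, high_bits - filled)
        # Append the `take` most significant bits of the original sample.
        out = (out << take) | (value >> (low_bits - take))
        filled += take
    return out


def _check(value, low_bits, high_bits):
    """Validate that the sample fits in low_bits and that we are expanding."""
    if low_bits <= 0 or high_bits < low_bits:
        raise ValueError("need 0 < low_bits <= high_bits")
    if not 0 <= value < (1 << low_bits):
        raise ValueError("value does not fit in low_bits")


if __name__ == "__main__":
    # A 4-bit white sample (15) expanded to 8 bits.
    print(expand_zero_padding(15, 4, 8))     # 240: cannot reach full white
    print(expand_bit_replication(15, 4, 8))  # 255: full range is preserved
```

Bit replication maps the largest low bit-depth value to the largest high bit-depth value, whereas zero padding leaves the top of the range unused.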

arXiv:1909.02504 [pdf, other]

doi 10.1088/1475-7516/2020/02/025

Modelling the matter bispectrum at small scales in modified gravity

Authors: Benjamin Bose, Joyce Byun, Fabien Lacasa, Azadeh Moradinezhad Dizgah, Lucas Lombriser

Abstract: Future large-scale structure surveys will measure three-point statistics with high statistical significance. This will offer significant improvements on our understanding of gravity, provided we can model these statistics accurately. We assess the performance of several schemes for theoretical modelling of the matter bispectrum, including halo-model based approaches and fitting formulae. We compare the model predictions against N-body simulations, considering scales up to $k_{\rm max} = 4 h/{\rm Mpc}$, well into non-linear regime of structure formation. Focusing on the equilateral configuration, we conduct this analysis for three theories of gravity: general relativity, $f(R)$ gravity, and the DGP braneworld model. Additionally, we compute the lensing convergence bispectrum for these models. We find that all current modelling prescriptions in modified gravity, in particular for theories with scale-dependent linear growth, fail to attain the accuracy required by the precision of the Stage IV surveys such as \emph{Euclid}. Among these models, we find that a halo-model corrected fitting formula achieves the best overall performance.

Submitted 17 February, 2020; v1 submitted 5 September, 2019; originally announced September 2019.

Comments: 44 pages, 16 figures, 4 tables. JCAP accepted version

arXiv:1906.05556 [pdf, ps, other]

doi 10.1186/s13662-019-2405-9

Global Stability of an SEIR Epidemic Model where Empirical Distribution of Incubation Period has Approximated by Coxian Distribution

Authors: Sungchan Kim, Jong Hyuk Byun, Il Hyo Jung

Abstract: In this work, we have developed a Coxian distributed SEIR model in incorporating an empirical incubation period. We show that the global dynamics are completely determined by a basic reproduction number. An application of the Coxian distributed SEIR model using data of an empirical incubation period is explored. The model may be useful for resolving causing the realistic intrinsic parts in classical epidemic models since Coxian distribution approximately converges to any distribution.

Submitted 19 August, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

Journal ref: Adv Differ Equ (2019) 2019: 469

arXiv:1905.09396 [pdf, other]

Predictive Control for Chasing a Ground Vehicle using a UAV

Authors: Jaeseung Byun, Karan P. Jain, Siddharth H. Nair, Haoyun Xu, Jiaming Zha

Abstract: We propose a high-level planner for a multirotor to chase a ground vehicle, while simultaneously respecting various state and input constraints. Assuming a minimal kinematic model for the ground vehicle, we use data collected online to generate predictions for our planner within a model predictive control framework. Our solution is demonstrated, both via simulations and experiments on a stable quadcopter platform.

Submitted 22 May, 2019; originally announced May 2019.

arXiv:1903.08518 [pdf, ps, other]

doi 10.1093/mnras/staa734

Suppressing cosmic variance with paired-and-fixed cosmological simulations: average properties and covariances of dark matter clustering statistics

Authors: Anatoly Klypin, Francisco Prada, Joyce Byun

Abstract: Making cosmological inferences from the observed galaxy clustering requires accurate predictions for the mean clustering statistics and their covariances. Those are affected by cosmic variance -- the statistical noise due to the finite number of harmonics. The cosmic variance can be suppressed by fixing the amplitudes of the harmonics instead of drawing them from a Gaussian distribution predicted by the inflation models. Initial realizations also can be generated in pairs with 180 degrees flipped phases to further reduce the variance. Here, we compare the consequences of using paired-and-fixed vs Gaussian initial conditions on the average dark matter clustering and covariance matrices predicted from N-body simulations. As in previous studies, we find no measurable differences between paired-and-fixed and Gaussian simulations for the average density distribution function, power spectrum and bispectrum. Yet, the covariances from paired-and-fixed simulations are suppressed in a complicated scale- and redshift-dependent way. The situation is particularly problematic on the scales of Baryon Acoustic Oscillations where the covariance matrix of the power spectrum is lower by only 20% compared to the Gaussian realizations, implying that there is not much of a reduction of the cosmic variance. The non-trivial suppression, combined with the fact that paired-and-fixed covariances are noisier than from Gaussian simulations, suggests that there is no path towards obtaining accurate covariance matrices from paired-and-fixed simulations. Because the covariances are crucial for the observational estimates of galaxy clustering statistics and cosmological parameters, paired-and-fixed simulations, though useful for some applications, cannot be used for the production of mock galaxy catalogs.

Submitted 20 March, 2019; originally announced March 2019.

Comments: Submitted to MNRAS

arXiv:1812.00240 [pdf, other]

Fast and Accurate Reconstruction of Pan-Tilt RGB-D Scans via Axis Bound Registration

Authors: Jung-Hyun Byun, Tack-Don Han

Abstract: A fast and accurate algorithm is presented for registering scans from an RGB-D camera on a pan-tilt platform. The pan-tilt RGB-D camera rotates and scans the entire scene in an automated fashion. The proposed algorithm exploits the movement of the camera that is bound by the two rotation axes of the servo motors so as to realize fast and accurate registration of acquired point clouds. The rotation parameters, including the rotation axes, pan-tilt transformations and the servo control mechanism, are calibrated beforehand. Subsequently, fast global registration can be performed during online operation with transformation matrices formed by the calibrated rotation axes and angles. In local registration, features are extracted and matched between two scenes. False-positive correspondences, whose distances to the rotation trajectories exceed a threshold, are rejected. Then, a more accurate registration can be achieved by minimizing the residual distances between corresponding points, while transformations are bound to the rotation axes. Finally, the preliminary alignment result is input to the iterative closed point algorithm to compute the final transformation. Results of comparative experiments validate that the proposed method outperforms state-of-the-art algorithms of various approaches based on camera calibration, global registration, and simultaneous-localization-and-mapping in terms of root-mean-square error and computation time.

Submitted 10 June, 2019; v1 submitted 1 December, 2018; originally announced December 2018.

Comments: in submission

arXiv:1812.00233 [pdf, other]

doi 10.2312/egsh.20171001

AIR: Anywhere Immersive Reality with User-Perspective Projection

Authors: JungHyun Byun, SeungHo Chae, YoonSik Yang, TackDon Han

Abstract: Projection-based augmented reality (AR) has much potential, but is limited in that it requires burdensome installations and prone to geometric distortions on display surface. To overcome these limitations, we propose AIR. It can be carried and placed anywhere to project AR using pan/tilting motors, while providing the user with distortion-free projection of a correct 3D view.

Submitted 1 December, 2018; originally announced December 2018.

Comments: Presented at EUROGRAPHICS 2017 as Short Paper

arXiv:1812.00232 [pdf, other]

doi 10.2312/egs.20181044

Accurate control of a pan-tilt system based on parameterization of rotational motion

Authors: JungHyun Byun, SeungHo Chae, TackDon Han

Abstract: A pan-tilt camera system has been adopted by a variety of fields since it can cover a wide range of region compared to a single fixated camera setup. Yet many studies rely on factory-assembled and calibrated platforms and assume an ideal rotation where rotation axes are perfectly aligned with the optical axis of the local camera. However, in a user-created setup where a pan-tilting mechanism is arbitrarily assembled, the kinematic configurations may be inaccurate or unknown, violating ideal rotation. These discrepancies in the model with the real physics result in erroneous servo manipulation of the pan-tilting system. In this paper, we propose an accurate control mechanism for arbitrarily-assembled pan-tilt camera systems. The proposed method formulates pan-tilt rotations as motion along great circle trajectories and calibrates its model parameters, such as positions and vectors of rotation axes, in 3D space. Then, one can accurately servo pan-tilt rotations with pose estimation from inverse kinematics of their transformation. The comparative experiment demonstrates out-performance of the proposed method, in terms of accurately localizing target points in world coordinates, after being rotated from their captured camera frames.

Submitted 1 December, 2018; originally announced December 2018.

Comments: Presented at EUROGRAPHICS 2018 as Short Paper

arXiv:1805.10178 [pdf, other]

doi 10.1103/PhysRevD.99.103530

Probing redshift-space distortions with phase correlations

Authors: Felipe O. Franco, Camille Bonvin, Danail Obreschkow, Kamran Ali, Joyce Byun

Abstract: Redshift-space distortions are a sensitive probe of the growth of large-scale structure. In the linear regime, redshift-space distortions are fully described by the multipoles of the two-point correlation function. In the nonlinear regime, however, higher-order statistics are needed to capture the full information of the galaxy density field. In this paper, we show that the redshift-space line correlation function--which is a measure of Fourier phase correlations--is sensitive to the nonlinear growth of the density and velocity fields and to the nonlinear mapping between real and redshift space. We expand the line correlation function in multipoles, and we show that almost all of the information is encoded in the monopole, quadrupole, and hexadecapole. We argue that these multipoles are highly complementary to the multipoles of the two-point correlation function: first, because they are directly sensitive to the difference between the density and the velocity coupling kernels, which is a purely nonlinear quantity; and second, because the multipoles are proportional to different combinations of $f$ and $σ_8$. Measured in conjunction with the two-point correlation function and the bispectrum, the multipoles of the line correlation function could therefore allow us to disentangle efficiently these two quantities and to test modified theories of gravity.

Submitted 10 June, 2019; v1 submitted 25 May, 2018; originally announced May 2018.

Comments: 17 pages, 8 figures

Journal ref: Phys. Rev. D 99, 103530 (2019)

arXiv:1705.04392 [pdf, other]

doi 10.1093/mnras/stx1681

Towards optimal cosmological parameter recovery from compressed bispectrum statistics

Authors: Joyce Byun, Alexander Eggemeier, Donough Regan, David Seery, Robert E. Smith

Abstract: Over the next decade, improvements in cosmological parameter constraints will be driven by surveys of large-scale structure. Its inherent non-linearity suggests that significant information will be embedded in higher correlations beyond the two-point function. Extracting this information is extremely challenging: it requires accurate theoretical modelling and significant computational resources to estimate the covariance matrix describing correlations between different Fourier configurations. We investigate whether it is possible to reduce the covariance matrix without significant loss of information by using a proxy that aggregates the bispectrum over a subset of Fourier configurations. Specifically, we study the constraints on $Λ$CDM parameters from combining the power spectrum with (a) the modal bispectrum decomposition, (b) the line correlation function and (c) the integrated bispectrum. We forecast the error bars achievable on $Λ$CDM parameters using these proxies in a future galaxy survey and compare them to those obtained from measurements of the Fourier bispectrum, including simple estimates of their degradation in the presence of shot noise. Our results demonstrate that the modal bispectrum performs as well as the Fourier bispectrum, even with considerably fewer modes than Fourier configurations. The line correlation function has good performance but does not match the modal bispectrum. The integrated bispectrum is comparatively insensitive to changes in the background cosmology. We find that adding bispectrum data can improve constraints on bias parameters and the normalization $σ_8$ by up to 5 compared to power spectrum measurements alone. For other parameters, improvements of up to $\sim$ 20% are possible. Finally, we use a range of theoretical models to explore how the sophistication required for realistic predictions varies with each proxy. (abridged)

Submitted 11 May, 2017; originally announced May 2017.

Comments: 38 pages, 16 figures. Supporting dataset made available at https://doi.org/10.5281/ZENODO.438187

arXiv:1604.07912 [pdf]

doi 10.1038/ncomms13422

Trapped charge driven degradation of perovskite solar cells

Authors: Namyoung Ahn, Kwisung Kwak, Min Seok Jang, Heetae Yoon, Byung Yang Lee, Jong-Kwon Lee, Peter V. Pikhitsa, Junseop Byun, Mansoo Choi

Abstract: Perovskite solar cells have shown fast deterioration during actual operation even with encapsulation, but its mechanism has been elusive. We found the fundamental mechanism for irreversible degradation of perovskite materials in which trapped charges regardless of the polarity play a decisive role. A novel experimental setup utilizing different polarity ions revealed that the moisture induced irreversible dissociation of perovskite materials is triggered by charges trapped along grain boundaries. Our finding clearly explained the intriguing observations why light soaking induces irreversible degradation while in the dark, moisture only causes reversible hydration, and why degradation begins from different side of interface for different charge extraction layers. The deprotonation of organic cations by trapped charge induced local electric field is attributed to the initiation of irreversible decomposition.

Submitted 26 April, 2016; originally announced April 2016.

Showing 1–50 of 57 results for author: Byun, J