-
Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation
Authors:
Yeonguk Yu,
Sungho Shin,
Seunghyeok Back,
Minhwan Ko,
Sangjun Noh,
Kyoobin Lee
Abstract:
Test-time adaptation (TTA) aims to adapt a pre-trained model to a new test domain without access to source data after deployment. Existing approaches typically rely on self-training with pseudo-labels since ground-truth cannot be obtained from test data. Although the quality of pseudo-labels is important for stable and accurate long-term adaptation, it has not been previously addressed. In this work, we propose DPLOT, a simple yet effective TTA framework that consists of two components: (1) domain-specific block selection and (2) pseudo-label generation using paired-view images. Specifically, we select blocks that involve domain-specific feature extraction and train these blocks by entropy minimization. After the blocks are adjusted for the current test domain, we generate pseudo-labels by averaging given test images and corresponding flipped counterparts. By simply using flip augmentation, we prevent a decrease in the quality of the pseudo-labels, which can be caused by the domain gap resulting from strong augmentation. Our experimental results demonstrate that DPLOT outperforms previous TTA methods on the CIFAR10-C, CIFAR100-C, and ImageNet-C benchmarks, reducing error by up to 5.4%, 9.1%, and 2.9%, respectively. We also provide an extensive analysis to demonstrate the effectiveness of our framework. Code is available at https://github.com/gist-ailab/domain-specific-block-selection-and-paired-view-pseudo-labeling-for-online-TTA.
Submitted 7 May, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
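The paired-view pseudo-labeling step described in the DPLOT abstract above can be illustrated with a minimal sketch. It assumes that "averaging" means averaging the class probabilities predicted for each test image and its horizontally flipped copy; `toy_model`, `paired_view_pseudo_labels`, and `mean_entropy` are hypothetical names for illustration and are not taken from the authors' repository.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def paired_view_pseudo_labels(model, images):
    """Average predictions over each image and its horizontal flip.

    images: array of shape (N, H, W, C); model maps images to logits (N, K).
    Assumption: the pseudo-label is the mean of the two softmax outputs.
    """
    flipped = images[:, :, ::-1, :]  # flip along the width axis
    probs = softmax(model(images))
    probs_flipped = softmax(model(flipped))
    return 0.5 * (probs + probs_flipped)

def mean_entropy(probs, eps=1e-12):
    # Mean Shannon entropy of the predictions; entropy minimization
    # would use this quantity as the training objective.
    return float(-(probs * np.log(np.clip(probs, eps, 1.0))).sum(axis=1).mean())

# Hypothetical stand-in for a network: a linear classifier on flattened pixels.
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * 4 * 3, 5))

def toy_model(x):
    return x.reshape(len(x), -1) @ W

images = rng.normal(size=(2, 4, 4, 3))
pseudo = paired_view_pseudo_labels(toy_model, images)
print(pseudo.shape, mean_entropy(pseudo))
```

Because the two views are averaged, the resulting pseudo-label is the same whether the original or the flipped image is presented first, which is one way to see why a weak augmentation such as flipping keeps the targets close to the model's own predictions.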
-
Fast and Accurate Unknown Object Instance Segmentation through Error-Informed Refinement
Authors:
Seunghyeok Back,
Sangbeom Lee,
Kangmin Kim,
Joosoon Lee,
Sungho Shin,
Jemo Maeng,
Kyoobin Lee
Abstract:
Accurate perception of unknown objects is essential for autonomous robots, particularly when manipulating novel items in unstructured environments. However, existing unknown object instance segmentation (UOIS) methods often have over-segmentation and under-segmentation problems, resulting in inaccurate instance boundaries and failures in subsequent robotic tasks such as grasping and placement. To address this challenge, this article introduces INSTA-BEER, a fast and accurate model-agnostic refinement method that enhances the UOIS performance. The model adopts an error-informed refinement approach, which first predicts pixel-wise errors in the initial segmentation and then refines the segmentation guided by these error estimates. We introduce the quad-metric boundary error, which quantifies pixel-wise true positives, true negatives, false positives, and false negatives at the boundaries of object instances, effectively capturing both fine-grained and instance-level segmentation errors. Additionally, the Error Guidance Fusion (EGF) module explicitly integrates error information into the refinement process, further improving segmentation quality. In comprehensive evaluations conducted on three widely used benchmark datasets, INSTA-BEER outperformed state-of-the-art models in both accuracy and inference time. Moreover, a real-world robotic experiment demonstrated the practical applicability of our method in improving the performance of target object grasping tasks in cluttered environments.
Submitted 30 April, 2024; v1 submitted 28 June, 2023;
originally announced June 2023.
-
Learning to Place Unseen Objects Stably using a Large-scale Simulation
Authors:
Sangjun Noh,
Raeyoung Kang,
Taewon Kim,
Seunghyeok Back,
Seongho Bak,
Kyoobin Lee
Abstract:
Object placement is a fundamental task for robots, yet it remains challenging for partially observed objects. Existing methods for object placement have limitations, such as the requirement for a complete 3D model of the object or the inability to handle complex shapes and novel objects that restrict the applicability of robots in the real world. Herein, we focus on addressing the Unseen Object Placement (UOP) problem. We tackled the UOP problem using two methods: (1) UOP-Sim, a large-scale dataset to accommodate various shapes and novel objects, and (2) UOP-Net, a point cloud segmentation-based approach that directly detects the most stable plane from partial point clouds. Our UOP approach enables robots to place objects stably, even when the object's shape and properties are not fully known, thus providing a promising solution for object placement in various environments. We verify our approach through simulation and real-world robot experiments, demonstrating state-of-the-art performance for placing single-view and partial objects. Robot demos, code, and dataset are available at https://gistailab.github.io/uop/
Submitted 11 September, 2023; v1 submitted 15 March, 2023;
originally announced March 2023.
-
SleePyCo: Automatic Sleep Scoring with Feature Pyramid and Contrastive Learning
Authors:
Seongju Lee,
Yeonguk Yu,
Seunghyeok Back,
Hogeon Seo,
Kyoobin Lee
Abstract:
Automatic sleep scoring is essential for the diagnosis and treatment of sleep disorders and enables longitudinal sleep tracking in home environments. Conventionally, learning-based automatic sleep scoring on single-channel electroencephalogram (EEG) is actively studied because obtaining multi-channel signals during sleep is difficult. However, learning representation from raw EEG signals is challenging owing to the following issues: 1) sleep-related EEG patterns occur on different temporal and frequency scales and 2) sleep stages share similar EEG patterns. To address these issues, we propose a deep learning framework named SleePyCo that incorporates 1) a feature pyramid and 2) supervised contrastive learning for automatic sleep scoring. For the feature pyramid, we propose a backbone network named SleePyCo-backbone to consider multiple feature sequences on different temporal and frequency scales. Supervised contrastive learning allows the network to extract class discriminative features by minimizing the distance between intra-class features and simultaneously maximizing that between inter-class features. Comparative analyses on four public datasets demonstrate that SleePyCo consistently outperforms existing frameworks based on single-channel EEG. Extensive ablation experiments show that SleePyCo exhibits enhanced overall performance, with significant improvements in discrimination between the N1 and rapid eye movement (REM) stages.
Submitted 20 September, 2022;
originally announced September 2022.
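The SleePyCo abstract above describes supervised contrastive learning as pulling intra-class features together while pushing inter-class features apart. The sketch below shows the general form of a supervised contrastive loss on a batch of feature vectors; it is an illustration under assumed choices (cosine similarity, a temperature of 0.1, the hypothetical function name `supervised_contrastive_loss`), not SleePyCo's actual implementation.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch.

    features: (N, D) array of embeddings; labels: length-N class ids.
    Anchors with no same-class partner in the batch are skipped.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    sim = sim - sim.max(axis=1, keepdims=True)  # stability; leaves log-softmax unchanged

    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Log-probability of each other sample, normalized over all non-self samples.
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))

    # Positives: same label as the anchor, excluding the anchor itself.
    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    valid = n_pos > 0
    per_anchor = -(log_prob * positives).sum(axis=1)[valid] / n_pos[valid]
    return float(per_anchor.mean())

# Two tight clusters: labels that match the clusters give a lower loss
# than labels that cut across them.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_matched = supervised_contrastive_loss(feats, [0, 0, 1, 1])
loss_mixed = supervised_contrastive_loss(feats, [0, 1, 0, 1])
print(loss_matched, loss_mixed)
```

Minimizing this loss rewards embeddings in which same-class samples are the nearest neighbours of each anchor, which is the class-discriminative behaviour the abstract refers to.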
-
Automatic Detection of Injection and Press Mold Parts on 2D Drawing Using Deep Neural Network
Authors:
Junseok Lee,
Jongwon Kim,
Jumi Park,
Seunghyeok Back,
Seongho Bak,
Kyoobin Lee
Abstract:
This paper proposes a method to automatically detect the key feature parts in CAD drawings of commercial TVs and monitors using a deep neural network. We developed a deep learning pipeline that can detect injection parts such as hook, boss, and undercut, and press parts such as DPS, Embo-Screwless, Embo-Burring, and EMBO, in 2D CAD drawing images. We first cropped the drawing to a specific size for the training efficiency of a deep neural network. Then, we used Cascade R-CNN to find the position of injection and press parts and ResNet-50 to predict the orientation of the parts. Finally, we converted the position of the parts found in the cropped image to the position in the original image. As a result, we obtained detection accuracy for injection and press parts of 84.1% AP (Average Precision) and 91.2% AR (Average Recall), and 72.0% AP and 87.0% AR, respectively, and orientation accuracy for injection and press parts of 94.4% and 92.0%, which can facilitate faster industrial product design.
Submitted 22 October, 2021;
originally announced October 2021.
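The final step in the abstract above, mapping detections from a cropped image back to the original drawing, amounts to a translation by the crop's offset. A minimal sketch follows, assuming axis-aligned boxes in `(x1, y1, x2, y2)` pixel coordinates and a crop identified by its top-left corner; the function name and box format are hypothetical and not taken from the paper.

```python
def crop_box_to_original(box, crop_origin):
    """Translate a box detected inside a crop into original-image coordinates.

    box: (x1, y1, x2, y2) relative to the crop's top-left corner.
    crop_origin: (x, y) of the crop's top-left corner in the original image.
    """
    x1, y1, x2, y2 = box
    ox, oy = crop_origin
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy)

# A part detected at (10, 20)-(30, 40) inside a crop whose top-left
# corner sits at (512, 1024) in the full drawing.
original_box = crop_box_to_original((10, 20, 30, 40), (512, 1024))
print(original_box)
```

Box width and height are unchanged by the mapping, so no rescaling is needed as long as the crop is taken at the drawing's native resolution.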
-
Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling
Authors:
Seunghyeok Back,
Joosoon Lee,
Taewon Kim,
Sangjun Noh,
Raeyoung Kang,
Seongho Bak,
Kyoobin Lee
Abstract:
Instance-aware segmentation of unseen objects is essential for a robotic system in an unstructured environment. Although previous works achieved encouraging results, they were limited to segmenting only the visible regions of unseen objects. For robotic manipulation in a cluttered scene, amodal perception is required to handle the occluded objects behind others. This paper addresses Unseen Object Amodal Instance Segmentation (UOAIS) to detect 1) visible masks, 2) amodal masks, and 3) occlusions on unseen object instances. For this, we propose a Hierarchical Occlusion Modeling (HOM) scheme designed to reason about the occlusion by assigning a hierarchy to a feature fusion and prediction order. We evaluated our method on three benchmarks (tabletop, indoors, and bin environments) and achieved state-of-the-art (SOTA) performance. Robot demos for picking up occluded objects, code, and datasets are available at https://sites.google.com/view/uoais
Submitted 28 February, 2022; v1 submitted 22 September, 2021;
originally announced September 2021.
-
The SNO+ Experiment
Authors:
SNO+ Collaboration,
V. Albanese,
R. Alves,
M. R. Anderson,
S. Andringa,
L. Anselmo,
E. Arushanova,
S. Asahi,
M. Askins,
D. J. Auty,
A. R. Back,
S. Back,
F. Barão,
Z. Barnard,
A. Barr,
N. Barros,
D. Bartlett,
R. Bayes,
C. Beaudoin,
E. W. Beier,
G. Berardi,
A. Bialek,
S. D. Biller,
E. Blucher
, et al. (229 additional authors not shown)
Abstract:
The SNO+ experiment is located 2 km underground at SNOLAB in Sudbury, Canada. A low background search for neutrinoless double beta ($0νββ$) decay will be conducted using 780 tonnes of liquid scintillator loaded with 3.9 tonnes of natural tellurium, corresponding to 1.3 tonnes of $^{130}$Te. This paper provides a general overview of the SNO+ experiment, including detector design, construction of process plants, commissioning efforts, electronics upgrades, data acquisition systems, and calibration techniques. The SNO+ collaboration is reusing the acrylic vessel, PMT array, and electronics of the SNO detector, having made a number of experimental upgrades and essential adaptations for use with the liquid scintillator. With low backgrounds and a low energy threshold, the SNO+ collaboration will also pursue a rich physics program beyond the search for $0νββ$ decay, including studies of geo- and reactor antineutrinos, supernova and solar neutrinos, and exotic physics such as the search for invisible nucleon decay. The SNO+ approach to the search for $0νββ$ decay is scalable: a future phase with high $^{130}$Te-loading is envisioned to probe an effective Majorana mass in the inverted mass ordering region.
Submitted 25 August, 2021; v1 submitted 23 April, 2021;
originally announced April 2021.
-
Object Detection for Understanding Assembly Instruction Using Context-aware Data Augmentation and Cascade Mask R-CNN
Authors:
Joosoon Lee,
Seongju Lee,
Seunghyeok Back,
Sungho Shin,
Kyoobin Lee
Abstract:
Understanding assembly instructions has the potential to enhance a robot's task planning ability and enable advanced robotic applications. To recognize the key components in a 2D assembly instruction image, we mainly focus on segmenting the speech bubble area, which contains much of the information about the instructions. For this, we applied Cascade Mask R-CNN and developed a context-aware data augmentation scheme for speech bubble segmentation, which randomly combines image cuts by considering the context of assembly instructions. We showed that the proposed augmentation scheme achieves better segmentation performance than the existing augmentation algorithm by increasing the diversity of trainable data while considering the distribution of component locations. Also, we showed that deep learning can be useful for understanding assembly instructions by detecting the essential objects in them, such as tools and parts.
Submitted 7 January, 2021; v1 submitted 7 January, 2021;
originally announced January 2021.
-
Segmenting Unseen Industrial Components in a Heavy Clutter Using RGB-D Fusion and Synthetic Data
Authors:
Seunghyeok Back,
Jongwon Kim,
Raeyoung Kang,
Seungjun Choi,
Kyoobin Lee
Abstract:
Segmentation of unseen industrial parts is essential for autonomous industrial systems. However, industrial components are texture-less, reflective, and often found in cluttered and unstructured environments with heavy occlusion, which makes it more challenging to deal with unseen objects. To tackle this problem, we present a synthetic data generation pipeline that randomizes textures via domain randomization to focus on the shape information. In addition, we propose an RGB-D Fusion Mask R-CNN with a confidence map estimator, which exploits reliable depth information in multiple feature levels. We transferred the trained model to real-world scenarios and evaluated its performance by making comparisons with baselines and ablation studies. We demonstrate that our methods, which use only synthetic data, could be effective solutions for unseen industrial components segmentation.
Submitted 1 June, 2020; v1 submitted 9 February, 2020;
originally announced February 2020.
-
Intra- and Inter-epoch Temporal Context Network (IITNet) Using Sub-epoch Features for Automatic Sleep Scoring on Raw Single-channel EEG
Authors:
Hogeon Seo,
Seunghyeok Back,
Seongju Lee,
Deokhwan Park,
Tae Kim,
Kyoobin Lee
Abstract:
A deep learning model, named IITNet, is proposed to learn intra- and inter-epoch temporal contexts from raw single-channel EEG for automatic sleep scoring. To classify the sleep stage from half-minute EEG, called an epoch, sleep experts investigate sleep-related events and consider the transition rules between the found events. Similarly, IITNet extracts representative features at a sub-epoch level by a residual neural network and captures intra- and inter-epoch temporal contexts from the sequence of the features via bidirectional LSTM. The performance was investigated for three datasets as the sequence length (L) increased from one to ten. IITNet achieved performance comparable to other state-of-the-art results. The best accuracy, MF1, and Cohen's kappa ($κ$) were 83.9%, 77.6%, 0.78 for SleepEDF (L=10), 86.5%, 80.7%, 0.80 for MASS (L=9), and 86.7%, 79.8%, 0.81 for SHHS (L=10), respectively. Even when using only four epochs, the performance was still comparable. Compared to using a single epoch, on average, accuracy and MF1 increased by 2.48%p and 4.90%p and F1 of N1, N2, and REM increased by 16.1%p, 1.50%p, and 6.42%p, respectively. Above four epochs, the performance improvement was not significant. The results support that considering the latest two-minute raw single-channel EEG can be a reasonable choice for sleep scoring via deep neural networks with efficiency and reliability. Furthermore, the experiments with the baselines showed that introducing intra-epoch temporal context learning with a deep residual network contributes to the improvement in overall performance and has a positive synergistic effect with inter-epoch temporal context learning.
Submitted 10 June, 2020; v1 submitted 18 February, 2019;
originally announced February 2019.
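The IITNet abstract above reports accuracy, macro-averaged F1 (MF1), and Cohen's kappa. As a reminder of how these three metrics relate, the sketch below computes them from a confusion matrix using their standard definitions. The function name and the 2x2 example matrix are illustrative only and contain no data from the paper; the code assumes rows are true classes, columns are predicted classes, and every class is both present and predicted at least once.

```python
import numpy as np

def scoring_metrics(confusion):
    """Return (accuracy, macro F1, Cohen's kappa) for a confusion matrix.

    confusion[i, j] counts samples of true class i predicted as class j.
    """
    conf = np.asarray(confusion, dtype=float)
    total = conf.sum()
    true_pos = np.diag(conf)

    accuracy = true_pos.sum() / total

    precision = true_pos / conf.sum(axis=0)  # per predicted class
    recall = true_pos / conf.sum(axis=1)     # per true class
    f1 = 2 * precision * recall / (precision + recall)
    macro_f1 = f1.mean()

    # Agreement expected by chance from the row and column marginals.
    expected = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / total**2
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, macro_f1, kappa

# Made-up two-class example.
acc, mf1, kappa = scoring_metrics([[40, 10], [5, 45]])
print(round(acc, 3), round(mf1, 3), round(kappa, 3))
```

Kappa discounts the agreement that the class marginals alone would produce, which is why it is reported alongside accuracy for imbalanced label sets such as sleep stages.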
-
Catalyst design using actively learned machine with non-ab initio input features towards CO2 reduction reactions
Authors:
Juhwan Noh,
Jaehoon Kim,
Seoin Back,
Yousung Jung
Abstract:
In the conventional chemisorption model, the d-band center theory (augmented sometimes with the upper edge of the d-band for improved accuracy) plays a central role in predicting adsorption energies and catalytic activity as a function of the d-band center of the solid surfaces, but it requires density functional calculations that can be quite costly for large-scale screening of materials. In this work, we propose to use the d-band width of the muffin-tin orbital theory (to account for local coordination environment) plus electronegativity (to account for adsorbate renormalization) as a simple set of alternative descriptors for chemisorption, which do not demand ab initio calculations. This pair of descriptors is then combined with machine learning methods, namely, artificial neural network (ANN) and kernel ridge regression (KRR), to allow large-scale materials screening. We show, for a toy set of 263 alloy systems, that the CO adsorption energy can be predicted with a remarkably small mean absolute deviation error of 0.05 eV, a significantly improved result as compared to 0.13 eV obtained with descriptors including costly d-band center calculations in the literature. We achieved this high accuracy by utilizing an active learning algorithm, without which the error was 0.18 eV. As a practical application of this machine, we identified Cu3Y@Cu as a highly active and cost-effective electrochemical CO2 reduction catalyst to produce CO with an overpotential 0.37 V lower than that of the Au catalyst.
Submitted 13 September, 2017;
originally announced September 2017.
-
Tuning Optical Conductivity of Large-Scale CVD Graphene by Strain Engineering
Authors:
Guang-Xin Ni,
Hong-Zhi Yang,
Wei Ji,
Seung-Jae Baeck,
Chee-Tat Toh,
Jong-Hyun Ahn,
Vitor M. Pereira,
Barbaros Özyilmaz
Abstract:
Strain engineering has been recently recognized as an effective way to tailor the electrical properties of graphene. In the optical domain, effects such as strain-induced anisotropic absorption add an appealing functionality to graphene, opening the prospect for atomically thin optical elements. Indeed, graphene is currently one of the notable players in the intense drive towards bendable, thin, and portable electronic displays, where its intrinsically metallic, optically transparent, and mechanically robust nature are major advantages. Given that the intrinsic transparency of a graphene monolayer is 97.7%, any small, reproducible, controllable, and potentially reversible modulation of transparency can have a significant impact for graphene as a viable transparent conducting electrode, even more so if the degree of modulation is polarization dependent. Here we show that the transparency in the visible range of graphene pre-strained on a polyethylene terephthalate (PET) substrate exhibits a periodic modulation (0.1%) as a function of polarization direction, which we interpret as strain-induced optical anisotropy. The degree of anisotropy is varied by reversible external manipulation of the level of pre-strain. The magnitude of strain is monitored independently by optical absorption and Raman spectroscopy, and the experimental observations are consistent with the theoretically expected modification of the optical conductivity of graphene arising from the strain-induced changes in the electronic dispersion of graphene. The strain sensitivity of the optical response of graphene demonstrated in this study can be potentially utilized towards novel ultra-thin optical devices and strain sensing applications.
Submitted 29 December, 2013;
originally announced December 2013.