-
Enhancing Ship Classification in Optical Satellite Imagery: Integrating Convolutional Block Attention Module with ResNet for Improved Performance
Authors:
Ryan Donghan Kwon,
Gangjoo Robin Nam,
Jisoo Tak,
Junseob Shin,
Hyerin Cha,
Yeom Hyeok,
Seung Won Lee
Abstract:
This study presents an advanced Convolutional Neural Network (CNN) architecture for ship classification from optical satellite imagery, significantly enhancing performance through the integration of the Convolutional Block Attention Module (CBAM) and additional architectural innovations. Building upon the foundational ResNet50 model, we first incorporated a standard CBAM to direct the model's focus towards more informative features, achieving an accuracy of 87% compared to the baseline ResNet50's 85%. Further augmentations involved multi-scale feature integration, depthwise separable convolutions, and dilated convolutions, culminating in the Enhanced ResNet Model with Improved CBAM. This model demonstrated a remarkable accuracy of 95%, with precision, recall, and f1-scores all witnessing substantial improvements across various ship classes. The bulk carrier and oil tanker classes, in particular, showcased nearly perfect precision and recall rates, underscoring the model's enhanced capability in accurately identifying and classifying ships. Attention heatmap analyses further validated the improved model's efficacy, revealing a more focused attention on relevant ship features, regardless of background complexities. These findings underscore the potential of integrating attention mechanisms and architectural innovations in CNNs for high-resolution satellite imagery classification. The study navigates through the challenges of class imbalance and computational costs, proposing future directions towards scalability and adaptability in new or rare ship type recognition. This research lays a groundwork for the application of advanced deep learning techniques in the domain of remote sensing, offering insights into scalable and efficient satellite image classification.
Submitted 8 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Lipsum-FT: Robust Fine-Tuning of Zero-Shot Models Using Random Text Guidance
Authors:
Giung Nam,
Byeongho Heo,
Juho Lee
Abstract:
Large-scale contrastive vision-language pre-trained models provide zero-shot models that achieve competitive performance across a range of image classification tasks without requiring training on downstream data. Recent works have confirmed that while additional fine-tuning of the zero-shot model on the reference data results in enhanced downstream performance, it compromises the model's robustness against distribution shifts. Our investigation begins by examining the conditions required to achieve the goals of robust fine-tuning, employing descriptions based on feature distortion theory and joint energy-based models. Subsequently, we propose a novel robust fine-tuning algorithm, Lipsum-FT, that effectively utilizes the language modeling aspect of the vision-language pre-trained models. Extensive experiments conducted on distribution shift scenarios in DomainNet and ImageNet confirm the superiority of our proposed Lipsum-FT approach over existing robust fine-tuning methods.
Submitted 31 March, 2024;
originally announced April 2024.
-
InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion
Authors:
Jihyun Lee,
Shunsuke Saito,
Giljoo Nam,
Minhyuk Sung,
Tae-Kyun Kim
Abstract:
We present InterHandGen, a novel framework that learns the generative prior of two-hand interaction. Sampling from our model yields plausible and diverse two-hand shapes in close interaction with or without an object. Our prior can be incorporated into any optimization or learning methods to reduce ambiguity in an ill-posed setup. Our key observation is that directly modeling the joint distribution of multiple instances imposes high learning complexity due to its combinatorial nature. Thus, we propose to decompose the modeling of joint distribution into the modeling of factored unconditional and conditional single instance distribution. In particular, we introduce a diffusion model that learns the single-hand distribution unconditional and conditional to another hand via conditioning dropout. For sampling, we combine anti-penetration and classifier-free guidance to enable plausible generation. Furthermore, we establish the rigorous evaluation protocol of two-hand synthesis, where our method significantly outperforms baseline generative models in terms of plausibility and diversity. We also demonstrate that our diffusion prior can boost the performance of two-hand reconstruction from monocular in-the-wild images, achieving new state-of-the-art accuracy.
Submitted 26 March, 2024;
originally announced March 2024.
-
VIGFace: Virtual Identity Generation Model for Face Image Synthesis
Authors:
Minsoo Kim,
Min-Cheol Sagong,
Gi Pyo Nam,
Junghyun Cho,
Ig-Jae Kim
Abstract:
Deep learning-based face recognition continues to face challenges due to its reliance on huge datasets obtained from web crawling, which can be costly to gather and raise significant real-world privacy concerns. To address this issue, we propose VIGFace, a novel framework capable of generating synthetic facial images. Initially, we train the face recognition model using a real face dataset and create a feature space for both real and virtual IDs where virtual prototypes are orthogonal to other prototypes. Subsequently, we generate synthetic images by using the diffusion model based on the feature space. Our proposed framework provides two significant benefits. Firstly, it allows for creating virtual facial images without concerns about portrait rights, guaranteeing that the generated virtual face images are clearly differentiated from existing individuals. Secondly, it serves as an effective augmentation method by incorporating real existing images. Further experiments demonstrate the efficacy of our framework, achieving state-of-the-art results from both perspectives without any external data.
Submitted 13 March, 2024;
originally announced March 2024.
-
IG-FIQA: Improving Face Image Quality Assessment through Intra-class Variance Guidance robust to Inaccurate Pseudo-Labels
Authors:
Minsoo Kim,
Gi Pyo Nam,
Haksub Kim,
Haesol Park,
Ig-Jae Kim
Abstract:
In the realm of face image quality assessment (FIQA), methods based on sample relative classification have shown impressive performance. However, the quality scores used as pseudo-labels assigned from images of classes with low intra-class variance could be unrelated to the actual quality in these methods. To address this issue, we present IG-FIQA, a novel approach to guide FIQA training, introducing a weight parameter to alleviate the adverse impact of these classes. This method involves estimating sample intra-class variance at each iteration during training, ensuring minimal computational overhead and straightforward implementation. Furthermore, this paper proposes an on-the-fly data augmentation methodology for improved generalization performance in FIQA. On various benchmark datasets, our proposed method, IG-FIQA, achieved new state-of-the-art (SOTA) performance.
Submitted 13 March, 2024;
originally announced March 2024.
-
Enhancing Transfer Learning with Flexible Nonparametric Posterior Sampling
Authors:
Hyungi Lee,
Giung Nam,
Edwin Fong,
Juho Lee
Abstract:
Transfer learning has recently shown significant performance across various tasks involving deep neural networks. In these transfer learning scenarios, the prior distribution for downstream data becomes crucial in Bayesian model averaging (BMA). While previous works proposed the prior over the neural network parameters centered around the pre-trained solution, such strategies have limitations when dealing with distribution shifts between upstream and downstream data. This paper introduces nonparametric transfer learning (NPTL), a flexible posterior sampling method to address the distribution shift issue within the context of nonparametric learning. The nonparametric learning (NPL) method is a recent approach that employs a nonparametric prior for posterior sampling, efficiently accounting for model misspecification scenarios, which is suitable for transfer learning scenarios that may involve the distribution shift between upstream and downstream tasks. Through extensive empirical validations, we demonstrate that our approach surpasses other baselines in BMA performance.
Submitted 11 March, 2024;
originally announced March 2024.
-
A converse of dynamical Mordell--Lang conjecture in positive characteristic
Authors:
Jungin Lee,
Gyeonghyeon Nam
Abstract:
In this paper, we prove the converse of the dynamical Mordell--Lang conjecture in positive characteristic: For every subset $S \subseteq \mathbb{N}_0$ which is a union of finitely many arithmetic progressions along with finitely many $p$-sets of the form $\left \{ \sum_{j=1}^{m} c_j p^{k_jn_j} : n_j \in \mathbb{N}_0 \right \}$ ($c_j \in \mathbb{Q}$, $k_j \in \mathbb{N}_0$), there exist a split torus $X = \mathbb{G}_m^k$ defined over $K=\overline{\mathbb{F}_p}(t)$, an endomorphism $Φ$ of $X$, $α\in X(K)$ and a closed subvariety $V \subseteq X$ such that $\left \{ n \in \mathbb{N}_0 : Φ^n(α) \in V(K) \right \} = S$.
Submitted 8 March, 2024;
originally announced March 2024.
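For readers unfamiliar with the $p$-sets named in the abstract above, the following is a minimal illustrative sketch, not taken from the paper, that enumerates a finite truncation of $\{ \sum_{j} c_j p^{k_j n_j} : n_j \in \mathbb{N}_0 \}$. The function name `p_set_elements` and the cutoff `max_n` are assumptions made purely for illustration; the actual set is infinite.

```python
from fractions import Fraction
from itertools import product


def p_set_elements(p, coeffs, exps, max_n):
    """Return the sorted values sum_j c_j * p**(k_j * n_j) with each n_j in 0..max_n.

    Hypothetical helper: `coeffs` holds the rational c_j, `exps` the integers k_j.
    Only a finite window of the (infinite) p-set is produced.
    """
    values = set()
    for ns in product(range(max_n + 1), repeat=len(coeffs)):
        total = sum(c * p ** (k * n) for c, k, n in zip(coeffs, exps, ns))
        values.add(total)
    return sorted(values)


# Example: p = 3, m = 2, c_1 = c_2 = 1, k_1 = k_2 = 1, n_j in {0, 1, 2}.
elements = p_set_elements(p=3, coeffs=[Fraction(1), Fraction(1)], exps=[1, 1], max_n=2)
print([int(v) for v in elements])
```

With these inputs the values are the sums $3^{n_1} + 3^{n_2}$ for $n_1, n_2 \le 2$; whether a given truncation lands in $\mathbb{N}_0$ depends on the chosen $c_j$.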
-
AIRS-assisted Vehicular Networks with Rate-Splitting SWIPT Receivers: Joint Trajectory and Communication Design
Authors:
Gyoungyoon Nam,
Seokhyun Lee,
Seongah Jeong
Abstract:
In this correspondence, we propose to use an intelligent reflective surface (IRS) installed on an unmanned aerial vehicle (UAV), referred to as an aerial IRS (AIRS), for vehicular networks in which the battery-limited vehicles are equipped with simultaneous wireless information and power transfer (SWIPT) receivers that allow concurrent information decoding (ID) and energy harvesting (EH). To efficiently support multiple moving vehicles, we adopt the rate-splitting multiple access (RSMA) technique. With the aim of maximizing the sum rate of the vehicles, we jointly optimize the trajectory and phase shift design of the AIRS, the transmit power and rate allocation for RSMA, and the power splitting ratio for SWIPT implementation. Simulations validate the superior performance of the proposed algorithm compared to conventional partial optimizations.
Submitted 22 January, 2024;
originally announced January 2024.
-
Unital algebras being Morita equivalent to weighted Leavitt path algebras
Authors:
Roozbeh Hazrat,
Tran Giang Nam
Abstract:
In this article, we describe the endomorphism ring of a finitely generated progenerator module of a weighted Leavitt path algebra $L_{K}(E, w)$ of a finite vertex weighted graph $(E, w)$. Contrary to the case of Leavitt path algebras, we show that a (full) corner of a weighted Leavitt path algebra is, in general, not isomorphic to a weighted Leavitt path algebra. However, using the above result, we show that for every full idempotent $ε$ in $L_{K}(E, w)$, there exists a positive integer $n$ such that $M_n(εL_{K}(E, w) ε)$ is isomorphic to the weighted Leavitt path algebra of a weighted graph explicitly constructed from $(E, w)$. We then completely describe unital algebras being Morita equivalent to weighted Leavitt path algebras of vertex weighted graphs. In particular, we characterize unital algebras being Morita equivalent to sandpile algebras.
Submitted 25 December, 2023;
originally announced December 2023.
-
Saxl conjecture and the tensor square of unipotent characters of GL(n,q)
Authors:
Emmanuel Letellier,
GyeongHyeon Nam
Abstract:
We know from Letellier that if for some triple of partitions the corresponding Kronecker coefficient is non-zero, then the corresponding multiplicity for unipotent characters of GL(n,q) is also non-zero. A conjecture of Saxl says that the tensor square of an irreducible character of the symmetric group corresponding to a staircase partition contains all the irreducible characters. Therefore the Saxl conjecture implies its analogue for unipotent characters. In this paper we prove the analogue of the Saxl conjecture for unipotent characters, and we describe conjecturally the set of all partitions for which the tensor square of the associated unipotent character contains all the unipotent characters.
Submitted 1 February, 2024; v1 submitted 14 December, 2023;
originally announced December 2023.
-
A Local Appearance Model for Volumetric Capture of Diverse Hairstyle
Authors:
Ziyan Wang,
Giljoo Nam,
Aljaz Bozic,
Chen Cao,
Jason Saragih,
Michael Zollhoefer,
Jessica Hodgins
Abstract:
Hair plays a significant role in personal identity and appearance, making it an essential component of high-quality, photorealistic avatars. Existing approaches either focus on modeling the facial region only or rely on personalized models, limiting their generalizability and scalability. In this paper, we present a novel method for creating high-fidelity avatars with diverse hairstyles. Our method leverages the local similarity across different hairstyles and learns a universal hair appearance prior from multi-view captures of hundreds of people. This prior model takes 3D-aligned features as input and generates dense radiance fields conditioned on a sparse point cloud with color. As our model splits different hairstyles into local primitives and builds prior at that level, it is capable of handling various hair topologies. Through experiments, we demonstrate that our model captures a diverse range of hairstyles and generalizes well to challenging new hairstyles. Empirical results show that our method improves the state-of-the-art approaches in capturing and generating photorealistic, personalized avatars with complete hair.
Submitted 14 December, 2023;
originally announced December 2023.
-
Relightable Gaussian Codec Avatars
Authors:
Shunsuke Saito,
Gabriel Schwartz,
Tomas Simon,
Junxuan Li,
Giljoo Nam
Abstract:
The fidelity of relighting is bounded by both geometry and appearance representations. For geometry, both mesh and volumetric approaches have difficulty modeling intricate structures like 3D hair geometry. For appearance, existing relighting models are limited in fidelity and often too slow to render in real-time with high-resolution continuous environments. In this work, we present Relightable Gaussian Codec Avatars, a method to build high-fidelity relightable head avatars that can be animated to generate novel expressions. Our geometry model based on 3D Gaussians can capture 3D-consistent sub-millimeter details such as hair strands and pores on dynamic face sequences. To support diverse materials of human heads such as the eyes, skin, and hair in a unified manner, we present a novel relightable appearance model based on learnable radiance transfer. Together with global illumination-aware spherical harmonics for the diffuse components, we achieve real-time relighting with all-frequency reflections using spherical Gaussians. This appearance model can be efficiently relit under both point light and continuous illumination. We further improve the fidelity of eye reflections and enable explicit gaze control by introducing relightable explicit eye models. Our method outperforms existing approaches without compromising real-time performance. We also demonstrate real-time relighting of avatars on a tethered consumer VR headset, showcasing the efficiency and fidelity of our avatars.
Submitted 27 May, 2024; v1 submitted 6 December, 2023;
originally announced December 2023.
-
Automorphisms of Leavitt path algebras: Zhang twist and irreducible representations
Authors:
Tran Giang Nam,
Ashish K. Srivastava,
Nguyen Thi Vien
Abstract:
In this article, we construct (graded) automorphisms fixing all vertices of Leavitt path algebras of arbitrary graphs in terms of general linear groups over corners of these algebras. As an application, we study Zhang twist of Leavitt path algebras and describe new classes of irreducible representations of Leavitt path algebras of the rose graphs $R_n$ with $n$ petals.
Submitted 27 November, 2023;
originally announced November 2023.
-
PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization
Authors:
Junhyeong Cho,
Gilhyun Nam,
Sungyeon Kim,
Hunmin Yang,
Suha Kwak
Abstract:
In a joint vision-language space, a text feature (e.g., from "a photo of a dog") could effectively represent its relevant image features (e.g., from dog photos). Also, a recent study has demonstrated the cross-modal transferability phenomenon of this joint space. From these observations, we propose PromptStyler which simulates various distribution shifts in the joint space by synthesizing diverse styles via prompts without using any images to deal with source-free domain generalization. The proposed method learns to generate a variety of style features (from "a S* style of a") via learnable style word vectors for pseudo-words S*. To ensure that learned styles do not distort content information, we force style-content features (from "a S* style of a [class]") to be located nearby their corresponding content features (from "[class]") in the joint vision-language space. After learning style word vectors, we train a linear classifier using synthesized style-content features. PromptStyler achieves the state of the art on PACS, VLCS, OfficeHome and DomainNet, even though it does not require any images for training.
Submitted 15 August, 2023; v1 submitted 27 July, 2023;
originally announced July 2023.
-
Differentiable Display Photometric Stereo
Authors:
Seokjun Choi,
Seungwoo Yoon,
Giljoo Nam,
Seungyong Lee,
Seung-Hwan Baek
Abstract:
Photometric stereo leverages variations in illumination conditions to reconstruct surface normals. Display photometric stereo, which employs a conventional monitor as an illumination source, has the potential to overcome limitations often encountered in bulky and difficult-to-use conventional setups. In this paper, we present differentiable display photometric stereo (DDPS), addressing an often overlooked challenge in display photometric stereo: the design of display patterns. Departing from using heuristic display patterns, DDPS learns the display patterns that yield accurate normal reconstruction for a target system in an end-to-end manner. To this end, we propose a differentiable framework that couples basis-illumination image formation with analytic photometric-stereo reconstruction. The differentiable framework facilitates the effective learning of display patterns via auto-differentiation. Also, for training supervision, we propose to use 3D printing for creating a real-world training dataset, enabling accurate reconstruction on the target real-world setup. Finally, we exploit that conventional LCD monitors emit polarized light, which allows for the optical separation of diffuse and specular reflections when combined with a polarization camera, leading to accurate normal reconstruction. Extensive evaluation of DDPS shows improved normal-reconstruction accuracy compared to heuristic patterns and demonstrates compelling properties such as robustness to pattern initialization, calibration errors, and simplifications in image formation and reconstruction.
Submitted 12 March, 2024; v1 submitted 23 June, 2023;
originally announced June 2023.
-
Traversing Between Modes in Function Space for Fast Ensembling
Authors:
EungGu Yun,
Hyungi Lee,
Giung Nam,
Juho Lee
Abstract:
Deep ensemble is a simple yet powerful way to improve the performance of deep neural networks. Under this motivation, recent works on mode connectivity have shown that parameters of ensembles are connected by low-loss subspaces, and one can efficiently collect ensemble parameters in those subspaces. While this provides a way to efficiently train ensembles, for inference, multiple forward passes should still be executed using all the ensemble parameters, which often becomes a serious bottleneck for real-world deployment. In this work, we propose a novel framework to reduce such costs. Given a low-loss subspace connecting two modes of a neural network, we build an additional neural network that predicts the output of the original neural network evaluated at a certain point in the low-loss subspace. The additional neural network, which we call a "bridge", is a lightweight network that takes minimal features from the original network and predicts outputs for the low-loss subspace without forward passes through the original network. We empirically demonstrate that we can indeed train such bridge networks and significantly reduce inference costs with the help of bridge networks.
Submitted 20 June, 2023;
originally announced June 2023.
-
Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning
Authors:
Moonseok Choi,
Hyungi Lee,
Giung Nam,
Juho Lee
Abstract:
Given the ever-increasing size of modern neural networks, the significance of sparse architectures has surged due to their accelerated inference speeds and minimal memory demands. When it comes to global pruning techniques, Iterative Magnitude Pruning (IMP) still stands as a state-of-the-art algorithm despite its simple nature, particularly in extremely sparse regimes. In light of the recent finding that the two successive matching IMP solutions are linearly connected without a loss barrier, we propose Sparse Weight Averaging with Multiple Particles (SWAMP), a straightforward modification of IMP that achieves performance comparable to an ensemble of two IMP solutions. For every iteration, we concurrently train multiple sparse models, referred to as particles, using different batch orders yet the same matching ticket, and then weight average such models to produce a single mask. We demonstrate that our method consistently outperforms existing baselines across different sparsities through extensive experiments on various data and neural network structures.
Submitted 26 April, 2024; v1 submitted 24 May, 2023;
originally announced May 2023.
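The averaging-then-masking step described in the SWAMP abstract above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: weights are flat Python lists, training of the particles is omitted entirely, and the helper names `average_particles` and `magnitude_mask` are hypothetical.

```python
def average_particles(particles):
    """Element-wise mean of several weight vectors, one vector per particle."""
    n = len(particles)
    return [sum(ws) / n for ws in zip(*particles)]


def magnitude_mask(weights, sparsity):
    """Binary mask that zeroes the `sparsity` fraction of smallest-magnitude weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value (stable sort).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


# Two stand-in "particles" (assumed to come from different batch orders).
particles = [
    [0.9, -0.1, 0.4, 0.05],
    [1.1, 0.1, 0.6, -0.05],
]
averaged = average_particles(particles)
mask = magnitude_mask(averaged, sparsity=0.5)
print(mask)
```

In this toy case the second and fourth weights cancel on averaging, so they are the ones pruned; a single mask is produced from the averaged weights rather than one mask per particle.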
-
Martingale Posterior Neural Processes
Authors:
Hyungi Lee,
Eunggu Yun,
Giung Nam,
Edwin Fong,
Juho Lee
Abstract:
A Neural Process (NP) estimates a stochastic process implicitly defined with neural networks given a stream of data, rather than pre-specifying priors already known, such as Gaussian processes. An ideal NP would learn everything from data without any inductive biases, but in practice, we often restrict the class of stochastic processes for the ease of estimation. One such restriction is the use of a finite-dimensional latent variable accounting for the uncertainty in the functions drawn from NPs. Some recent works show that this can be improved with a more "data-driven" source of uncertainty such as bootstrapping. In this work, we take a different approach based on the martingale posterior, a recently developed alternative to Bayesian inference. For the martingale posterior, instead of specifying prior-likelihood pairs, a predictive distribution for future data is specified. Under specific conditions on the predictive distribution, it can be shown that the uncertainty in the generated future data actually corresponds to the uncertainty of the implicitly defined Bayesian posteriors. Based on this result, instead of assuming any form of the latent variables, we equip an NP with a predictive distribution implicitly defined with neural networks and use the corresponding martingale posteriors as the source of uncertainty. The resulting model, which we name the Martingale Posterior Neural Process (MPNP), is demonstrated to outperform baselines on various tasks.
Submitted 19 April, 2023;
originally announced April 2023.
-
Decoupled Training for Long-Tailed Classification With Stochastic Representations
Authors:
Giung Nam,
Sunguk Jang,
Juho Lee
Abstract:
Decoupling representation learning and classifier learning has been shown to be effective in classification with long-tailed data. There are two main ingredients in constructing a decoupled learning scheme; 1) how to train the feature extractor for representation learning so that it provides generalizable representations and 2) how to re-train the classifier that constructs proper decision boundaries by handling class imbalances in long-tailed data. In this work, we first apply Stochastic Weight Averaging (SWA), an optimization technique for improving the generalization of deep neural networks, to obtain better generalizing feature extractors for long-tailed classification. We then propose a novel classifier re-training algorithm based on stochastic representation obtained from the SWA-Gaussian, a Gaussian perturbed SWA, and a self-distillation strategy that can harness the diverse stochastic representations based on uncertainty estimates to build more robust classifiers. Extensive experiments on CIFAR10/100-LT, ImageNet-LT, and iNaturalist-2018 benchmarks show that our proposed method improves upon previous methods both in terms of prediction accuracy and uncertainty estimation.
Submitted 19 April, 2023;
originally announced April 2023.
-
Good Neighbors Are All You Need for Chinese Grapheme-to-Phoneme Conversion
Authors:
Jungjun Kim,
Changjin Han,
Gyuhyeon Nam,
Gyeongsu Chae
Abstract:
Most Chinese Grapheme-to-Phoneme (G2P) systems employ a three-stage framework that first transforms input sequences into character embeddings, obtains linguistic information using language models, and then predicts the phonemes based on global context about the entire input sequence. However, linguistic knowledge alone is often inadequate. Language models frequently encode overly general structures of a sentence and fail to cover specific cases needed to use phonetic knowledge. Also, a handcrafted post-processing system is needed to address the problems relevant to the tone of the characters. However, the system exhibits inconsistency in the segmentation of word boundaries which consequently degrades the performance of the G2P system. To address these issues, we propose the Reinforcer that provides strong inductive bias for language models by emphasizing the phonological information between neighboring characters to help disambiguate pronunciations. Experimental results show that the Reinforcer boosts the cutting-edge architectures by a large margin. We also combine the Reinforcer with a large-scale pre-trained model and demonstrate the validity of using neighboring context in knowledge transfer scenarios.
Submitted 14 March, 2023;
originally announced March 2023.
-
Event Fusion Photometric Stereo Network
Authors:
Wonjeong Ryoo,
Giljoo Nam,
Jae-Sang Hyun,
Sangpil Kim
Abstract:
We present a novel method to estimate the surface normal of an object in an ambient light environment using RGB and event cameras. Modern photometric stereo methods rely on an RGB camera, mainly in a dark room, to avoid ambient illumination. To alleviate the limitations of the darkroom environment and to use essential light information, we employ an event camera with a high dynamic range and low latency. This is the first study that uses an event camera for the photometric stereo task, which works on continuous light sources and in ambient light environments. In this work, we also curate a novel photometric stereo dataset that is constructed by capturing objects with event and RGB cameras under numerous ambient light environments. Additionally, we propose a novel framework named Event Fusion Photometric Stereo Network (EFPS-Net), which estimates the surface normals of an object using both RGB frames and event signals. Our proposed method interpolates event observation maps that generate light information with sparse event signals to acquire fluent light information. Subsequently, the event-interpolated observation maps are fused with the RGB observation maps. Our numerous experiments showed that EFPS-Net outperforms state-of-the-art methods on a dataset captured in the real world where ambient lights exist. Consequently, we demonstrate that incorporating additional modalities with EFPS-Net alleviates the limitations arising from ambient illumination.
Submitted 11 March, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
-
Weakly-Supervised Deep Learning Model for Prostate Cancer Diagnosis and Gleason Grading of Histopathology Images
Authors:
Mohammad Mahdi Behzadi,
Mohammad Madani,
Hanzhang Wang,
Jun Bai,
Ankit Bhardwaj,
Anna Tarakanova,
Harold Yamase,
Ga Hie Nam,
Sheida Nabavi
Abstract:
Prostate cancer is the most common cancer in men worldwide and the second leading cause of cancer death in the United States. One of the prognostic features in prostate cancer is the Gleason grading of histopathology images. The Gleason grade is assigned based on tumor architecture on Hematoxylin and Eosin (H&E) stained whole slide images (WSI) by pathologists. This process is time-consuming and has known interobserver variability. In the past few years, deep learning algorithms have been used to analyze histopathology images, delivering promising results for grading prostate cancer. However, most of the algorithms rely on fully annotated datasets, which are expensive to generate. In this work, we propose a novel weakly-supervised algorithm to classify prostate cancer grades. The proposed algorithm consists of three steps: (1) extracting discriminative areas in a histopathology image by employing the Multiple Instance Learning (MIL) algorithm based on Transformers, (2) representing the image by constructing a graph using the discriminative patches, and (3) classifying the image into its Gleason grades by developing a Graph Convolutional Neural Network (GCN) based on the gated attention mechanism. We evaluated our algorithm using publicly available datasets, including TCGAPRAD, PANDA, and Gleason 2019 challenge datasets. We also cross-validated the algorithm on an independent dataset. Results show that the proposed model achieved state-of-the-art performance in the Gleason grading task in terms of accuracy, F1 score, and Cohen's kappa. The code is available at https://github.com/NabaviLab/Prostate-Cancer.
Submitted 24 December, 2022;
originally announced December 2022.
-
3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models
Authors:
Gimin Nam,
Mariem Khlifi,
Andrew Rodriguez,
Alberto Tono,
Linqi Zhou,
Paul Guerrero
Abstract:
Diffusion models have shown great promise for image generation, beating GANs in terms of generation diversity, with comparable image quality. However, their application to 3D shapes has been limited to point or voxel representations that in practice cannot accurately represent a 3D surface. We propose a diffusion model for neural implicit representations of 3D shapes that operates in the latent space of an auto-decoder. This allows us to generate diverse and high-quality 3D surfaces. We additionally show that we can condition our model on images or text to enable image-to-3D generation and text-to-3D generation using CLIP embeddings. Furthermore, adding noise to the latent codes of existing shapes allows us to explore shape variations.
Submitted 15 December, 2022; v1 submitted 1 December, 2022;
originally announced December 2022.
-
NeuWigs: A Neural Dynamic Model for Volumetric Hair Capture and Animation
Authors:
Ziyan Wang,
Giljoo Nam,
Tuur Stuyck,
Stephen Lombardi,
Chen Cao,
Jason Saragih,
Michael Zollhoefer,
Jessica Hodgins,
Christoph Lassner
Abstract:
The capture and animation of human hair are two of the major challenges in the creation of realistic avatars for virtual reality. Both problems are highly challenging, because hair has complex geometry and appearance and exhibits challenging motion. In this paper, we present a two-stage approach that models hair independently from the head to address these challenges in a data-driven manner. The first stage, state compression, learns a low-dimensional latent space of 3D hair states containing motion and appearance, via a novel autoencoder-as-a-tracker strategy. To better disentangle the hair and head in appearance learning, we employ multi-view hair segmentation masks in combination with a differentiable volumetric renderer. The second stage learns a novel hair dynamics model that performs temporal hair transfer based on the discovered latent codes. To enforce higher stability while driving our dynamics model, we employ the 3D point-cloud autoencoder from the compression stage for de-noising of the hair state. Our model outperforms the state of the art in novel view synthesis and is capable of creating novel hair animations without having to rely on hair observations as a driving signal. The project page is at https://ziyanw1.github.io/neuwigs/.
Submitted 11 October, 2023; v1 submitted 1 December, 2022;
originally announced December 2022.
-
On Leavitt path algebras of Hopf graphs
Authors:
T. G. Nam,
N. T. Phuc
Abstract:
In this paper, we provide the structure of Hopf graphs associated to pairs $(G, \mathfrak{r})$ consisting of groups $G$ together with ramification data $\mathfrak{r}$ and their Leavitt path algebras. Consequently, we characterize the Gelfand-Kirillov dimension, the stable rank, the purely infinite simplicity and the existence of a nonzero finite dimensional representation of the Leavitt path algebra of a Hopf graph via properties of the ramification data $\mathfrak{r}$ and $G$.
Submitted 1 June, 2023; v1 submitted 8 November, 2022;
originally announced November 2022.
-
OOOE: Only-One-Object-Exists Assumption to Find Very Small Objects in Chest Radiographs
Authors:
Gunhee Nam,
Taesoo Kim,
Sanghyup Lee,
Thijs Kooi
Abstract:
The accurate localization of inserted medical tubes and parts of human anatomy is a common problem when analyzing chest radiographs and something deep neural networks could potentially automate. However, many foreign objects like tubes and various anatomical structures are small in comparison to the entire chest X-ray, which leads to severely unbalanced data and makes training deep neural networks difficult. In this paper, we present a simple yet effective 'Only-One-Object-Exists' (OOOE) assumption to improve the deep network's ability to localize small landmarks in chest radiographs. The OOOE assumption enables us to recast the localization problem as a classification problem, so that we can replace commonly used continuous regression techniques with a multi-class discrete objective. We validate our approach using a large-scale proprietary dataset of over 100K radiographs as well as the publicly available RANZCR-CLiP Kaggle Challenge dataset and show that our method consistently outperforms commonly used regression-based detection models as well as commonly used pixel-wise classification methods. Additionally, we find that the method using the OOOE assumption generalizes to multiple detection problems in chest X-rays and the resulting model shows state-of-the-art performance on detecting various tube tips inserted into the patient as well as patient anatomy.
Submitted 13 October, 2022;
originally announced October 2022.
-
Arithmetic geometry of character varieties with regular monodromy, I
Authors:
Masoud Kamgarpour,
GyeongHyeon Nam,
Anna Puskás
Abstract:
We study character varieties arising as moduli of representations of an orientable surface group into a reductive group $G$. We first show that if $G/Z$ acts freely on the representation variety, then both the representation variety and the character variety are smooth and equidimensional. Next, we count points on a family of smooth character varieties; namely, those involving both regular semisimple and regular unipotent monodromy. In particular, we show that these varieties are polynomial count and obtain an explicit expression for their $E$-polynomials. Finally, by analysing the $E$-polynomial, we determine certain topological invariants of these varieties such as the Euler characteristic and the number of connected components. As an application, we give an example of a cohomologically rigid representation which is not physically rigid.
Submitted 14 September, 2023; v1 submitted 5 September, 2022;
originally announced September 2022.
-
GCISG: Guided Causal Invariant Learning for Improved Syn-to-real Generalization
Authors:
Gilhyun Nam,
Gyeongjae Choi,
Kyungmin Lee
Abstract:
Training a deep learning model with artificially generated data can be an alternative when training data are scarce, yet it suffers from poor generalization performance due to a large domain gap. In this paper, we characterize the domain gap by using a causal framework for data generation. We assume that the real and synthetic data have common content variables but different style variables. Thus, a model trained on a synthetic dataset might generalize poorly because the model learns the nuisance style variables. To that end, we propose causal invariance learning, which encourages the model to learn a style-invariant representation that enhances syn-to-real generalization. Furthermore, we propose a simple yet effective feature distillation method that prevents catastrophic forgetting of semantic knowledge of the real domain. In sum, we refer to our method as Guided Causal Invariant Syn-to-real Generalization, which effectively improves the performance of syn-to-real generalization. We empirically verify the validity of the proposed methods, and in particular, our method achieves state-of-the-art performance on visual syn-to-real domain generalization tasks such as image classification and semantic segmentation.
Submitted 18 February, 2023; v1 submitted 21 August, 2022;
originally announced August 2022.
-
Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images
Authors:
Radu Alexandru Rosu,
Shunsuke Saito,
Ziyan Wang,
Chenglei Wu,
Sven Behnke,
Giljoo Nam
Abstract:
We present Neural Strands, a novel learning framework for modeling accurate hair geometry and appearance from multi-view image inputs. The learned hair model can be rendered in real-time from any viewpoint with high-fidelity view-dependent effects. Our model achieves intuitive shape and style control unlike volumetric counterparts. To enable these properties, we propose a novel hair representation based on a neural scalp texture that encodes the geometry and appearance of individual strands at each texel location. Furthermore, we introduce a novel neural rendering framework based on rasterization of the learned hair strands. Our neural rendering is strand-accurate and anti-aliased, making the rendering view-consistent and photorealistic. Combining appearance with a multi-view geometric prior, we enable, for the first time, the joint learning of appearance and explicit hair geometry from a multi-view setup. We demonstrate the efficacy of our approach in terms of fidelity and efficiency for various hairstyles.
Submitted 28 July, 2022;
originally announced July 2022.
-
ORA3D: Overlap Region Aware Multi-view 3D Object Detection
Authors:
Wonseok Roh,
Gyusam Chang,
Seokha Moon,
Giljoo Nam,
Chanyoung Kim,
Younghyun Kim,
Jinkyu Kim,
Sangpil Kim
Abstract:
Current multi-view 3D object detection methods often fail to detect objects in the overlap region properly, and the networks' understanding of the scene is often limited to that of a monocular detection network. Moreover, objects in the overlap region are often largely occluded or suffer from deformation due to camera distortion, causing a domain shift. To mitigate this issue, we propose using the following two main modules: (1) Stereo Disparity Estimation for Weak Depth Supervision and (2) Adversarial Overlap Region Discriminator. The former utilizes the traditional stereo disparity estimation method to obtain reliable disparity information from the overlap region. Given the disparity estimates as supervision, we propose regularizing the network to fully utilize the geometric potential of binocular images and improve the overall detection accuracy accordingly. Further, the latter module minimizes the representational gap between non-overlap and overlapping regions. We demonstrate the effectiveness of the proposed method with the nuScenes large-scale multi-view 3D object detection data. Our experiments show that our proposed method outperforms current state-of-the-art models, i.e., DETR3D and BEVDet.
Submitted 29 June, 2023; v1 submitted 2 July, 2022;
originally announced July 2022.
-
Improving Ensemble Distillation With Weight Averaging and Diversifying Perturbation
Authors:
Giung Nam,
Hyungi Lee,
Byeongho Heo,
Juho Lee
Abstract:
Ensembles of deep neural networks have demonstrated superior performance, but their heavy computational cost hinders their application in resource-limited environments. This motivates distilling knowledge from the ensemble teacher into a smaller student network, and there are two important design choices for this ensemble distillation: 1) how to construct the student network, and 2) what data should be shown during training. In this paper, we propose a weight averaging technique where a student with multiple subnetworks is trained to absorb the functional diversity of ensemble teachers, but then those subnetworks are properly averaged for inference, giving a single student network with no additional inference cost. We also propose a perturbation strategy that seeks inputs from which the diversities of teachers can be better transferred to the student. Combining these two, our method significantly improves upon previous methods on various image classification tasks.
Submitted 30 June, 2022;
originally announced June 2022.
-
Long-term memory and synapse-like dynamics in two-dimensional nanofluidic channels
Authors:
P. Robin,
T. Emmerich,
A. Ismail,
A. Niguès,
Y. You,
G. -H. Nam,
A. Keerthi,
A. Siria,
A. K. Geim,
B. Radha,
L. Bocquet
Abstract:
Fine-tuned ion transport across nanoscale pores is key to many biological processes such as neurotransmission. Recent advances have enabled the confinement of water and ions to two dimensions, unveiling transport properties unreachable at larger scales and triggering hopes to reproduce the ionic machinery of biological systems. Here we report experiments demonstrating the emergence of memory in the transport of aqueous electrolytes across (sub)nanoscale channels. We unveiled two types of nanofluidic memristors, depending on channel material and confinement, with memory from minutes to hours. We explained how large timescales could emerge from interfacial processes like ionic self-assembly or surface adsorption. Such behavior allowed us to implement Hebbian learning with nanofluidic systems. This result lays the ground for biomimetic computations on aqueous electrolytic chips.
Submitted 17 January, 2023; v1 submitted 16 May, 2022;
originally announced May 2022.
-
Liquid-activated quantum emission from pristine hexagonal boron nitride for nanofluidic sensing
Authors:
Nathan Ronceray,
Yi You,
Evgenii Glushkov,
Martina Lihter,
Benjamin Rehl,
Tzu-Heng Chen,
Gwang-Hyeon Nam,
Fanny Borza,
Kenji Watanabe,
Takashi Taniguchi,
Sylvie Roke,
Ashok Keerthi,
Jean Comtet,
Boya Radha,
Aleksandra Radenovic
Abstract:
Liquids confined down to the atomic scale can show radically new properties. However, only indirect and ensemble measurements operate in such extreme confinement, calling for novel optical approaches enabling direct imaging at the molecular level. Here, we harness fluorescence originating from single-photon emitters at the surface of hexagonal boron nitride (hBN) for molecular imaging and sensing in nanometrically confined liquids. The emission originates from the chemisorption of organic solvent molecules onto native surface defects, revealing single-molecule dynamics at the interface through spatially correlated activation of neighboring defects. Emitter spectra further offer a direct readout of local dielectric properties, unveiling increasing dielectric order under nanometer-scale confinement. Liquid-activated native hBN defects bridge the gap between solid-state nanophotonics and nanofluidics, opening new avenues for nanoscale sensing and optofluidics.
Submitted 22 August, 2023; v1 submitted 13 April, 2022;
originally announced April 2022.
-
Products of commutator ideals of some Lie-admissible algebras
Authors:
Ivan Kaygorodov,
Farukh Mashurov,
Tran Giang Nam,
Zerui Zhang
Abstract:
In this article, we mainly study the products of commutator ideals of Lie-admissible algebras such as Novikov algebras, bicommutative algebras, and assosymmetric algebras. More precisely, we first study the properties of the lower central chains for Novikov algebras and bicommutative algebras. Then we show that for every Lie nilpotent Novikov algebra or Lie nilpotent bicommutative algebra $\mathcal{A}$, the ideal of $\mathcal{A}$ generated by the set $\{ab - ba\mid a, b\in \mathcal{A}\}$ is nilpotent. Finally, we study properties of the lower central chains for assosymmetric algebras, study the products of commutator ideals of assosymmetric algebras, and show that the products of commutator ideals have a property similar to that for associative algebras.
Submitted 1 October, 2022; v1 submitted 1 April, 2022;
originally announced April 2022.
-
Angstrofluidics: walking to the limit
Authors:
Yi You,
Abdulghani Ismail,
Gwang-Hyeon Nam,
Solleti Goutham,
Ashok Keerthi,
Boya Radha
Abstract:
Angstrom-scale fluidic channels are ubiquitous in nature, and play an important role in regulating cellular traffic, signaling, and responding to stimuli. Synthetic channels are now a reality with the emergence of several cutting-edge bottom-up and top-down fabrication methods. In particular, the use of atomically thin two-dimensional (2D) materials and nanotubes as components to build fluidic conduits has pushed the limits of fabrication to the Angstrom scale. Here, we provide an overview of the recent developments in the fabrication methods for nano- and angstrofluidic channels while categorizing them on the basis of dimensionality (0D pores, 1D tubes, 2D slits), along with the latest advances in measurement techniques. We discuss the ionic transport governed by various stimuli in these channels and compare ionic mobility, streaming and osmotic power across varying pore sizes and all the dimensionalities. Towards the end of the review, we highlight the unique future opportunities in the development of smart ionic devices.
Submitted 24 March, 2022;
originally announced March 2022.
-
HVH: Learning a Hybrid Neural Volumetric Representation for Dynamic Hair Performance Capture
Authors:
Ziyan Wang,
Giljoo Nam,
Tuur Stuyck,
Stephen Lombardi,
Michael Zollhoefer,
Jessica Hodgins,
Christoph Lassner
Abstract:
Capturing and rendering life-like hair is particularly challenging due to its fine geometric structure, the complex physical interaction and its non-trivial visual appearance. Yet, hair is a critical component for believable avatars. In this paper, we address the aforementioned problems: 1) we use a novel, volumetric hair representation that is composed of thousands of primitives. Each primitive can be rendered efficiently, yet realistically, by building on the latest advances in neural rendering. 2) To have a reliable control signal, we present a novel way of tracking hair on the strand level. To keep the computational effort manageable, we use guide hairs and classic techniques to expand those into a dense hood of hair. 3) To better enforce temporal consistency and generalization ability of our model, we further optimize the 3D scene flow of our representation with multi-view optical flow, using volumetric ray marching. Our method can not only create realistic renders of recorded multi-view sequences, but also create renderings for new hair configurations by providing new control signals. We compare our method with existing work on viewpoint synthesis and drivable animation and achieve state-of-the-art results. Please check out our project website at https://ziyanw1.github.io/hvh/.
Submitted 19 December, 2021; v1 submitted 13 December, 2021;
originally announced December 2021.
-
Diversity Matters When Learning From Ensembles
Authors:
Giung Nam,
Jongmin Yoon,
Yoonho Lee,
Juho Lee
Abstract:
Deep ensembles excel in large-scale image classification tasks both in terms of prediction accuracy and calibration. Despite being simple to train, the computation and memory cost of deep ensembles limits their practicability. While some recent works propose to distill an ensemble model into a single model to reduce such costs, there is still a performance gap between the ensemble and distilled models. We propose a simple approach for reducing this gap, i.e., making the distilled performance close to the full ensemble. Our key assumption is that a distilled model should absorb as much function diversity inside the ensemble as possible. We first empirically show that the typical distillation procedure does not effectively transfer such diversity, especially for complex models that achieve near-zero training error. To fix this, we propose a perturbation strategy for distillation that reveals diversity by seeking inputs for which ensemble member outputs disagree. We empirically show that a model distilled with such perturbed samples indeed exhibits enhanced diversity, leading to improved performance.
Submitted 26 October, 2021;
originally announced October 2021.
-
On the ideals of ultragraph Leavitt path algebras
Authors:
T. T. H. Duyen,
D. Gonçalves,
T. G. Nam
Abstract:
In this article, we provide an explicit description of a set of generators for any ideal of an ultragraph Leavitt path algebra. We provide several additional consequences of this description, including information about generating sets for graded ideals, the graded uniqueness and Cuntz-Krieger theorems, the semiprimeness and the semiprimitivity of ultragraph Leavitt path algebras, and a complete characterization of the prime and primitive ideals of an ultragraph Leavitt path algebra. We also show that every primitive ideal of an ultragraph Leavitt path algebra is exactly the annihilator of a Chen simple module. Consequently, we prove Exel's Effros-Hahn conjecture on primitive ideals in the ultragraph Leavitt path algebra setting (a conclusion that is also new in the context of Leavitt path algebras of graphs).
Submitted 21 September, 2021;
originally announced September 2021.
-
Congruence-simplicity of Steinberg algebras of non-Hausdorff ample groupoids over semifields
Authors:
Tran Giang Nam,
Jens Zumbrägel
Abstract:
We investigate the algebra of an ample groupoid, introduced by Steinberg, over a semifield S. In particular, we obtain a complete characterization of congruence-simpleness for Steinberg algebras of second-countable ample groupoids, extending the well-known characterizations when S is a field. We apply our congruence-simplicity results to tight groupoids of inverse semigroup representations associated to self-similar graphs.
Submitted 9 September, 2021; v1 submitted 3 September, 2021;
originally announced September 2021.
-
Polygonal Point Set Tracking
Authors:
Gunhee Nam,
Miran Heo,
Seoung Wug Oh,
Joon-Young Lee,
Seon Joo Kim
Abstract:
In this paper, we propose a novel learning-based polygonal point set tracking method. Compared to existing video object segmentation (VOS) methods that propagate pixel-wise object mask information, we propagate a polygonal point set over frames. Specifically, the set is defined as a subset of points in the target contour, and our goal is to track corresponding points on the target contour. Those outputs enable us to apply various visual effects such as motion tracking, part deformation, and texture mapping. To this end, we propose a new method to track the corresponding points between frames by the global-local alignment with delicately designed losses and regularization terms. We also introduce a novel learning strategy using synthetic and VOS datasets that makes it possible to tackle the problem without developing the point correspondence dataset. Since the existing datasets are not suitable to validate our method, we build a new polygonal point set tracking dataset and demonstrate the superior performance of our method over the baselines and existing contour-based VOS methods. In addition, we present visual-effects applications of our method on part distortion and text mapping.
Submitted 30 May, 2021;
originally announced May 2021.
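To make the notion of a polygonal point set concrete, the sketch below samples a fixed number of points along a closed polygon contour. Spacing the points evenly by arc length is an assumption chosen for illustration; the abstract above only states that the set is a subset of points on the target contour, and the function name `resample_contour` is hypothetical.

```python
import math

def resample_contour(vertices, k):
    """Return k points spaced evenly by arc length along a closed polygon."""
    n = len(vertices)
    # Edges of the closed polygon (the last vertex connects back to the first).
    edges = [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]
    lengths = [math.dist(a, b) for a, b in edges]
    perimeter = sum(lengths)
    points = []
    edge_idx, walked = 0, 0.0  # walked = arc length at the start of the current edge
    for j in range(k):
        target = perimeter * j / k
        # Advance to the edge that contains the target arc length.
        while edge_idx < n - 1 and walked + lengths[edge_idx] < target:
            walked += lengths[edge_idx]
            edge_idx += 1
        (ax, ay), (bx, by) = edges[edge_idx]
        # Fraction of the way along the current edge (0.0 for a degenerate edge).
        t = (target - walked) / lengths[edge_idx] if lengths[edge_idx] > 0 else 0.0
        points.append((ax + t * (bx - ax), ay + t * (by - ay)))
    return points

# Unit square sampled at 8 points: the corners and the edge midpoints.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
point_set = resample_contour(square, 8)
```

A tracker in this style would then predict, for each sampled point, its corresponding location on the contour in the next frame, so that point identity is preserved over time rather than only the mask.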
-
KoDF: A Large-scale Korean DeepFake Detection Dataset
Authors:
Patrick Kwon,
Jaeseong You,
Gyuhyeon Nam,
Sungwoo Park,
Gyeongsu Chae
Abstract:
A variety of effective face-swap and face-reenactment methods have been publicized in recent years, democratizing the face synthesis technology to a great extent. Videos generated as such have come to be called deepfakes with a negative connotation, for various social problems they have caused. Facing the emerging threat of deepfakes, we have built the Korean DeepFake Detection Dataset (KoDF), a large-scale collection of synthesized and real videos focused on Korean subjects. In this paper, we provide a detailed description of methods used to construct the dataset, experimentally show the discrepancy between the distributions of KoDF and existing deepfake detection datasets, and underline the importance of using multiple datasets for real-world generalization. KoDF is publicly available at https://moneybrain-research.github.io/kodf in its entirety (i.e. real clips, synthesized clips, clips with adversarial attack, and metadata).
Submitted 23 August, 2021; v1 submitted 18 March, 2021;
originally announced March 2021.
-
GAN Vocoder: Multi-Resolution Discriminator Is All You Need
Authors:
Jaeseong You,
Dalhyun Kim,
Gyuhyeon Nam,
Geumbyeol Hwang,
Gyeongsu Chae
Abstract:
Several of the latest GAN-based vocoders show remarkable achievements, outperforming autoregressive and flow-based competitors in both qualitative and quantitative measures while synthesizing orders of magnitude faster. In this work, we hypothesize that the common factor underlying their success is the multi-resolution discriminating framework, not the minute details in architecture, loss function, or training strategy. We experimentally test the hypothesis by evaluating six different generators paired with one shared multi-resolution discriminating framework. For all evaluative measures with respect to text-to-speech syntheses and for all perceptual metrics, their performances are not distinguishable from one another, which supports our hypothesis.
Submitted 23 August, 2021; v1 submitted 9 March, 2021;
originally announced March 2021.
-
K-FACE: A Large-Scale KIST Face Database in Consideration with Unconstrained Environments
Authors:
Yeji Choi,
Hyunjung Park,
Gi Pyo Nam,
Haksub Kim,
Heeseung Choi,
Junghyun Cho,
Ig-Jae Kim
Abstract:
In this paper, we introduce a new large-scale face database from KIST, denoted as K-FACE, and describe a novel capturing device specifically designed to obtain the data. The K-FACE database contains more than 1 million high-quality images of 1,000 subjects selected by considering the ratio of gender and age groups. It includes a variety of attributes, including 27 poses, 35 lighting conditions, three expressions, and occlusions by the combination of five types of accessories. As the K-FACE database is systematically constructed through a hemispherical capturing system with elaborate lighting control and multiple cameras, it is possible to accurately analyze the effects of factors that cause performance degradation, such as poses, lighting changes, and accessories. We consider not only the balance of external environmental factors, such as pose and lighting, but also the balance of personal characteristics such as gender and age group. The gender ratio is the same, while the age groups of subjects are uniformly distributed from the 20s to 50s for both genders. The K-FACE database can be extensively utilized in various vision tasks, such as face recognition, face frontalization, illumination normalization, face age estimation, and three-dimensional face model generation. We expect systematic diversity and uniformity of the K-FACE database to promote these research fields.
Submitted 3 March, 2021;
originally announced March 2021.
-
A 3D model-based approach for fitting masks to faces in the wild
Authors:
Je Hyeong Hong,
Hanjo Kim,
Minsoo Kim,
Gi Pyo Nam,
Junghyun Cho,
Hyeong-Seok Ko,
Ig-Jae Kim
Abstract:
Face recognition now requires a large number of labelled masked face images in the era of this unprecedented COVID-19 pandemic. Unfortunately, the rapid spread of the virus has left us little time to prepare such a dataset in the wild. To circumvent this issue, we present a 3D model-based approach called WearMask3D for augmenting face images of various poses to the masked face counterparts. Our method proceeds by first fitting a 3D morphable model on the input image, second overlaying the mask surface onto the face model and warping the respective mask texture, and last projecting the 3D mask back to 2D. The mask texture is adapted based on the brightness and resolution of the input image. By working in 3D, our method can produce more natural masked faces of diverse poses from a single mask texture. To compare precisely between different augmentation approaches, we have constructed a dataset comprising masked and unmasked faces with labels called MFW-mini. Experimental results demonstrate WearMask3D produces more realistic masked faces, and utilizing these images for training leads to state-of-the-art recognition accuracy for masked faces.
Submitted 1 August, 2021; v1 submitted 1 March, 2021;
originally announced March 2021.
-
Anick type automorphisms and new irreducible representations of Leavitt path algebras
Authors:
Shigeru Kuroda,
Tran Giang Nam
Abstract:
In this article, we give a new class of automorphisms of Leavitt path algebras of arbitrary graphs. Consequently, we obtain Anick type automorphisms of these Leavitt path algebras and new irreducible representations of Leavitt algebras of type $(1, n)$.
Submitted 28 February, 2021;
originally announced March 2021.
-
Testing the Turbulent Origin of the Stellar Initial Mass Function
Authors:
D. G. Nam,
C. Federrath,
M. R. Krumholz
Abstract:
Supersonic turbulence in the interstellar medium (ISM) is closely linked to the formation of stars, and hence many theories connect the stellar initial mass function (IMF) with the turbulent properties of molecular clouds. Here we test three turbulence-based IMF models (by Padoan & Nordlund 2002, Hennebelle & Chabrier 2008, and Hopkins 2012), which predict the relation between the high-mass slope ($Γ$) of the IMF, $\mathrm{d} N/\mathrm{d} \log M \propto M^Γ$ and the exponent n of the velocity power spectrum of turbulence, $E_v(k)\propto k^{-n} $, where $n\approx 2$ corresponds to typical ISM turbulence. Using hydrodynamic simulations, we drive turbulence with an unusual index of $n\approx 1$, measure $Γ$, and compare the results with $n\approx 2$. We find that reducing $n$ from 2 to 1 primarily changes the high-mass region of the IMF (beyond the median mass), where we measure high-mass slopes within the 95 per cent confidence interval of $-1.5<Γ<-1$ for $n \approx 1$ and $-3.7<Γ<-2.4$ for $n\approx 2$, respectively. Thus, we find that $n=1$ results in a significantly flatter high-mass slope of the IMF, with more massive stars formed than for $n \approx 2$. We compare these simulations with the predictions of the three IMF theories. We find that while the Padoan & Nordlund theory matches our simulations with fair accuracy, the other theories either fail to reproduce the main qualitative outcome of the simulations or require some modifications. We conclude that turbulence plays a key role in shaping the IMF, with a shallower turbulence power spectrum producing a shallower high-mass IMF, and hence more massive stars.
Submitted 16 February, 2021;
originally announced February 2021.
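The power-law relation quoted in the abstract above, $\mathrm{d} N/\mathrm{d} \log M \propto M^Γ$, can be turned into a quick arithmetic check of what a flatter slope means. The sketch below is only an illustration under stated assumptions: it treats the high-mass tail as a pure power law, ignores normalization, and uses the endpoints of the quoted 95 per cent confidence intervals as sample slopes; the helper name `relative_count` is hypothetical.

```python
def relative_count(mass_ratio, gamma):
    """Ratio of dN/dlogM at mass_ratio * M_ref to its value at M_ref,
    assuming a pure power law dN/dlogM proportional to M**gamma."""
    return mass_ratio ** gamma

# Interval endpoints quoted in the abstract (used here only as sample slopes).
slopes = {
    "n~1, gamma=-1.0": -1.0,
    "n~1, gamma=-1.5": -1.5,
    "n~2, gamma=-2.4": -2.4,
    "n~2, gamma=-3.7": -3.7,
}

# Stars per logarithmic mass bin at ten times the reference mass,
# relative to the reference mass itself.
for label, gamma in slopes.items():
    print(f"{label}: {relative_count(10.0, gamma):.2e}")
```

Because the ratio is $10^Γ$, every slope in the $n \approx 1$ interval leaves more stars per logarithmic bin at ten times the reference mass than any slope in the $n \approx 2$ interval, which is the sense in which the flatter slope yields more massive stars.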
-
Axial Residual Networks for CycleGAN-based Voice Conversion
Authors:
Jaeseong You,
Gyuhyeon Nam,
Dalhyun Kim,
Gyeongsu Chae
Abstract:
We propose a novel architecture and improved training objectives for non-parallel voice conversion. Our proposed CycleGAN-based model performs a shape-preserving transformation directly on a high frequency-resolution magnitude spectrogram, converting its style (i.e. speaker identity) while preserving the speech content. Throughout the entire conversion process, the model does not resort to compressed intermediate representations of any sort (e.g. mel spectrogram, low resolution spectrogram, decomposed network feature). We propose an efficient axial residual block architecture to support this expensive procedure and various modifications to the CycleGAN losses to stabilize the training process. We demonstrate via experiments that our proposed model outperforms Scyclone and shows a comparable or better performance to that of CycleGAN-VC2 even without employing a neural vocoder.
Submitted 24 August, 2021; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Lie nilpotent Novikov algebras and Lie solvable Leavitt path algebras
Authors:
Zerui Zhang,
Tran Giang Nam
Abstract:
In this paper, we first study properties of the lower central chains for Novikov algebras. Then we show that for every Lie nilpotent Novikov algebra $\mathcal{N}$, the ideal of $\mathcal{N}$ generated by the set $\{ab - ba\mid a, b\in \mathcal{N}\}$ is nilpotent. We secondly provide necessary and sufficient conditions on the graph $E$ and the field $K$ for which the Leavitt path algebra $L_K(E)$ is Lie solvable. Consequently, we obtain a complete description of Lie nilpotent Leavitt path algebras, and show that the Lie solvability of $L_K(E)$ and the Lie nilpotency of $[L_K(E),L_K(E)]$ are the same.
Submitted 21 December, 2020;
originally announced December 2020.
-
Realizing ultragraph Leavitt path algebras as Steinberg algebras
Authors:
R. Hazrat,
T. G. Nam
Abstract:
In this article, we realize ultragraph Leavitt path algebras as Steinberg algebras. This realization allows us to use the groupoid approach to obtain structural results about these algebras. Using skew product groupoid, we show that ultragraph Leavitt path algebras are graded von Neumann regular rings. We characterize strongly graded ultragraph Leavitt path algebras and show that every ultragraph Leavitt path algebra is semiprimitive. Moreover, we characterize irreducible representations of ultragraph Leavitt path algebras. We also show that ultragraph Leavitt path algebras can be realized as Cuntz-Pimsner rings.
Submitted 11 August, 2020;
originally announced August 2020.
-
Purely infinite simple ultragraph Leavitt path algebras
Authors:
Tran Giang Nam,
Nguyen Dinh Nam
Abstract:
In this article, we give necessary and sufficient conditions under which the Leavitt path algebra $L_K(\mathcal{G})$ of an ultragraph $\mathcal{G}$ over a field $K$ is purely infinite simple and that it is von Neumann regular. Consequently, we obtain that every graded simple ultragraph Leavitt path algebra is either a locally matricial algebra, or a full matrix ring over $K[x, x^{-1}]$, or a purely infinite simple algebra.
Submitted 16 July, 2020;
originally announced July 2020.