-
PosterLlama: Bridging Design Ability of Language Model to Contents-Aware Layout Generation
Authors:
Jaejung Seol,
Seojun Kim,
Jaejun Yoo
Abstract:
Visual layout plays a critical role in graphic design fields such as advertising, posters, and web UI design. The recent trend towards content-aware layout generation through generative models has shown promise, yet it often overlooks the semantic intricacies of layout design by treating it as a simple numerical optimization. To bridge this gap, we introduce PosterLlama, a network designed for generating visually and textually coherent layouts by reformatting layout elements into HTML code and leveraging the rich design knowledge embedded within language models. Furthermore, we enhance the robustness of our model with a unique depth-based poster augmentation strategy. This ensures our generated layouts remain not only semantically rich but also visually appealing, even with limited data. Our extensive evaluations across several benchmarks demonstrate that PosterLlama outperforms existing methods in producing authentic and content-aware layouts. It supports an unparalleled range of conditions, including but not limited to unconditional layout generation, element-conditional layout generation, and layout completion, serving as a highly versatile user manipulation tool.
Submitted 2 April, 2024; v1 submitted 1 April, 2024;
originally announced April 2024.
-
Origin of chirality in transition-metal dichalcogenides
Authors:
Kwangrae Kim,
Hyun-Woo J. Kim,
Seunghyeok Ha,
Hoon Kim,
Jin-Kwang Kim,
Jaehwon Kim,
Hyunsung Kim,
Junyoung Kwon,
Jihoon Seol,
Saegyeol Jung,
Changyoung Kim,
Ahmet Alatas,
Ayman Said,
Michael Merz,
Matthieu Le Tacon,
Jin Mo Bok,
Ki-Seok Kim,
B. J. Kim
Abstract:
Chirality is a ubiquitous phenomenon in which a symmetry between left- and right-handed objects is broken, examples in nature ranging from subatomic particles and molecules to living organisms. In particle physics, the weak force is responsible for the symmetry breaking and parity violation in beta decay, but in condensed matter systems interactions that lead to chirality remain poorly understood. Here, we unravel the mechanism of chiral charge density wave formation in the transition-metal dichalcogenide 1T-TiSe2. Using representation analysis, we show that charge density modulations and ionic displacements, which transform as a continuous scalar field and a vector field on a discrete lattice, respectively, follow different irreducible representations of the space group, despite the fact that they propagate with the same wave-vectors and are strongly coupled to each other. This charge-lattice symmetry frustration is resolved by further breaking of all symmetries not common to both sectors through induced lattice distortions, thus leading to chirality. Our theory is verified using Raman spectroscopy and inelastic x-ray scattering, which reveal that all but translation symmetries are broken at a level not resolved by state-of-the-art diffraction techniques.
Submitted 19 December, 2023;
originally announced December 2023.
-
Proxy-based Item Representation for Attribute and Context-aware Recommendation
Authors:
Jinseok Seol,
Minseok Gang,
Sang-goo Lee,
Jaehui Park
Abstract:
Neural network approaches in recommender systems have shown remarkable success by representing a large set of items as a learnable vector embedding table. However, infrequent items may suffer from inadequate training opportunities, making it difficult to learn meaningful representations. We observe that in attribute and context-aware settings, the poorly learned embeddings of infrequent items impair the recommendation accuracy. To address this issue, we propose a proxy-based item representation that allows each item to be expressed as a weighted sum of learnable proxy embeddings. Here, the proxy weight is determined by the attributes and context of each item and may incorporate bias terms in the case of frequent items to further reflect collaborative signals. The proxy-based method calculates the item representations compositionally, ensuring each representation resides inside a well-trained simplex and, thus, acquires guaranteed quality. Additionally, because the proxy embeddings are shared across all items, infrequent items can borrow training signals from frequent items in a unified model structure and in an end-to-end manner. Our proposed method is a plug-and-play model that can replace the item encoding layer of any neural network-based recommendation model, while consistently improving the recommendation performance with much smaller parameter usage. Experiments conducted on real-world recommendation benchmark datasets demonstrate that our proposed model outperforms state-of-the-art models in terms of recommendation accuracy by up to 17% while using only 10% of the parameters.
Submitted 11 December, 2023;
originally announced December 2023.
-
Exploiting Session Information in BERT-based Session-aware Sequential Recommendation
Authors:
Jinseok Seol,
Youngrok Ko,
Sang-goo Lee
Abstract:
In recommendation systems, utilizing the user interaction history as sequential information has resulted in great performance improvement. However, in many online services, user interactions are commonly grouped by sessions that presumably share preferences, which requires a different approach from ordinary sequence representation techniques. To this end, sequence representation models with a hierarchical structure or various viewpoints have been developed but with a rather complex network structure. In this paper, we propose three methods to improve recommendation performance by exploiting session information while minimizing additional parameters in a BERT-based sequential recommendation model: using session tokens, adding session segment embeddings, and a time-aware self-attention. We demonstrate the feasibility of the proposed methods through experiments on widely used recommendation datasets.
Submitted 19 May, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Technologies for AI-Driven Fashion Social Networking Service with E-Commerce
Authors:
Jinseok Seol,
Seongjae Kim,
Sungchan Park,
Holim Lim,
Hyunsoo Na,
Eunyoung Park,
Dohee Jung,
Soyoung Park,
Kangwoo Lee,
Sang-goo Lee
Abstract:
The rapid growth of the online fashion market brought demands for innovative fashion services and commerce platforms. With the recent success of deep learning, many applications employ AI technologies such as visual search and recommender systems to provide novel and beneficial services. In this paper, we describe applied technologies for AI-driven fashion social networking service that incorporate fashion e-commerce. In the application, people can share and browse their outfit-of-the-day (OOTD) photos, while AI analyzes them and suggests similar style OOTDs and related products. To this end, we trained deep learning based AI models for fashion and integrated them to build a fashion visual search system and a recommender system for OOTD. With aforementioned technologies, the AI-driven fashion SNS platform, iTOO, has been successfully launched.
Submitted 10 March, 2022;
originally announced March 2022.
-
False Negative Distillation and Contrastive Learning for Personalized Outfit Recommendation
Authors:
Seongjae Kim,
Jinseok Seol,
Holim Lim,
Sang-goo Lee
Abstract:
Personalized outfit recommendation has recently been in the spotlight with the rapid growth of the online fashion industry. However, recommending outfits has two significant challenges that should be addressed. The first challenge is that outfit recommendation often requires a complex and large model that utilizes visual information, incurring huge memory and time costs. One natural way to mitigate this problem is to compress such a cumbersome model with knowledge distillation (KD) techniques that leverage knowledge from a pretrained teacher model. However, it is hard to apply existing KD approaches in recommender systems (RS) to outfit recommendation because they require the ranking of all possible outfits while the number of outfits grows exponentially with the number of constituent clothing items. Therefore, we propose a new KD framework for outfit recommendation, called False Negative Distillation (FND), which exploits false-negative information from the teacher model while not requiring the ranking of all candidates. The second challenge is that the explosive number of outfit candidates amplifies the data sparsity problem, often leading to poor outfit representation. To tackle this issue, inspired by the recent success of contrastive learning (CL), we introduce a CL framework for outfit representation learning with two proposed data augmentation methods. Quantitative and qualitative experiments on outfit recommendation datasets demonstrate the effectiveness and soundness of our proposed methods.
Submitted 13 October, 2021;
originally announced October 2021.
-
Masked Contrastive Learning for Anomaly Detection
Authors:
Hyunsoo Cho,
Jinseok Seol,
Sang-goo Lee
Abstract:
Detecting anomalies is a fundamental aspect of safety-critical software systems; however, it remains a long-standing problem. Numerous lines of work have been proposed to address it and have demonstrated their effectiveness. In particular, methods based on self-supervised learning are attracting interest due to their capability of learning diverse representations without additional labels. Among self-supervised learning approaches, contrastive learning is one framework that has demonstrated its superiority in various fields, including anomaly detection. However, the primary objective of contrastive learning is to learn task-agnostic features without any labels, which is not entirely suited to discerning anomalies. In this paper, we propose a task-specific variant of contrastive learning named masked contrastive learning, which is better suited to anomaly detection. Moreover, we propose a new inference method dubbed self-ensemble inference that further boosts performance by leveraging the ability learned through auxiliary self-supervision tasks. By combining our models, we can outperform previous state-of-the-art methods by a significant margin on various benchmark datasets.
Submitted 30 January, 2023; v1 submitted 18 May, 2021;
originally announced May 2021.
-
Contrastive Learning for Unsupervised Image-to-Image Translation
Authors:
Hanbit Lee,
Jinseok Seol,
Sang-goo Lee
Abstract:
Image-to-image translation aims to learn a mapping between different groups of visually distinguishable images. While recent methods have shown impressive ability to change even intricate appearance of images, they still rely on domain labels in training a model to distinguish between distinct visual features. Such dependency on labels often significantly limits the scope of applications since consistent and high-quality labels are expensive. Instead, we wish to capture visual features from images themselves and apply them to enable realistic translation without human-generated labels. To this end, we propose an unsupervised image-to-image translation method based on contrastive learning. The key idea is to learn a discriminator that differentiates between distinctive styles and let the discriminator supervise a generator to transfer those styles across images. During training, we randomly sample a pair of images and train the generator to change the appearance of one towards another while keeping the original structure. Experimental results show that our method outperforms the leading unsupervised baselines in terms of visual quality and translation accuracy.
Submitted 7 May, 2021;
originally announced May 2021.
-
Characterization of Pd and Pd@Au core-shell nanoparticles using atom probe tomography and field evaporation simulation
Authors:
Se-Ho Kim,
Kyuseon Jang,
Phil Woong Kang,
Jae-Pyoung Ahn,
Jae-Bok Seol,
Chang-Min Kwak,
Constantinos Hatzoglou,
Francois Vurpillot,
Pyuck-Pa Choi
Abstract:
We report on atom probe tomography analyses of Pd and Pd@Au nanoparticles embedded in a Ni matrix and the effects of local evaporation field variations on the atom probe data. In order to assess the integrity of the reconstructed atom maps, we performed numerical simulations of the field evaporation processes and compared the simulated datasets with experimentally acquired data. The distortions seen in the atom maps for both Pd and Pd@Au nanoparticles could be mostly ascribed to local variations in chemical composition and elemental evaporation fields. The evaporation field values for Pd and Ni, taken from the image hump model and assumed in the simulations, yielded a good agreement between experimental and simulation results. In contrast, the evaporation field for Au, as predicted from the image hump model, appeared to be substantially overestimated and resulted in a large discrepancy between experiments and simulations.
Submitted 10 February, 2021;
originally announced February 2021.
-
Field Evaluations of A Deep Learning-based Intelligent Spraying Robot with Flow Control for Pear Orchards
Authors:
Jaehwi Seol,
Jeongeun Kim,
Hyoung Il Son
Abstract:
This paper proposes a variable flow control system in real time with deep learning using the segmentation of fruit trees in a pear orchard. The flow rate control in real time, undesired pressure fluctuation and theoretical modeling may differ from those in the real world. Therefore, two types of preliminary experiments were designed to examine the linear relationship of the flow rate modeling. Through a preliminary experiment, the parameters of the pulse width modulation (PWM) controller were optimized, and an actual field experiment was conducted to confirm the performance of the variable flow rate control system. As a result of the field experiment, the performance of the proposed system was satisfactory, as it showed that it could reduce pesticide use and the risk of pesticide exposure. Especially, since the field experiment was conducted in an unstructured environment, the proposed variable flow control system is expected to be sufficiently applicable to other orchards.
Submitted 14 February, 2021;
originally announced February 2021.
-
Style2Vec: Representation Learning for Fashion Items from Style Sets
Authors:
Hanbit Lee,
Jinseok Seol,
Sang-goo Lee
Abstract:
With the rapid growth of the online fashion market, demand for effective fashion recommendation systems has never been greater. In fashion recommendation, the ability to find items that go well with a few other items based on style is more important than picking a single item based on the user's entire purchase history. Since the same user may have purchased dress suits in one month and casual denims in another, it is impossible to learn the latent style features of those items using only the user ratings. If we were able to represent the style features of fashion items in a reasonable way, we would be able to recommend new items that conform to some small subset of pre-purchased items that make up a coherent style set. We propose Style2Vec, a vector representation model for fashion items. Based on the intuition of distributional semantics used in word embeddings, Style2Vec learns the representation of a fashion item using other items in matching outfits as context. Two different convolutional neural networks are trained to maximize the probability of item co-occurrences. For evaluation, a fashion analogy test is conducted to show that the resulting representation connotes diverse fashion-related semantics like shapes, colors, patterns and even latent styles. We also perform style classification using Style2Vec features and show that our method outperforms other baselines.
Submitted 14 August, 2017;
originally announced August 2017.
-
A Syllable-based Technique for Word Embeddings of Korean Words
Authors:
Sanghyuk Choi,
Taeuk Kim,
Jinseok Seol,
Sang-goo Lee
Abstract:
Word embedding has become a fundamental component of many NLP tasks such as named entity recognition and machine translation. However, popular models that learn such embeddings are unaware of the morphology of words, so they are not directly applicable to highly agglutinative languages such as Korean. We propose a syllable-based learning model for Korean using a convolutional neural network, in which a word representation is composed of trained syllable vectors. Our model successfully produces morphologically meaningful representations of Korean words compared to the original Skip-gram embeddings. The results also show that it is quite robust to the Out-of-Vocabulary problem.
Submitted 5 August, 2017;
originally announced August 2017.
-
Distributed FD-MIMO: Cellular Evolution for 5G and Beyond
Authors:
Yeqing Hu,
Boon Loong Ng,
Young-Han Nam,
Jin Yuan,
Gary Xu,
Ji-Yun Seol,
Jianzhong Zhang
Abstract:
This paper presents the next evolution of FD-MIMO technology for beyond 5G, where antennas of the FD-MIMO system are placed in a distributed manner throughout the cell in a multi-cell deployment scenario. This system, referred to as Distributed FD-MIMO (D-FD-MIMO) system, is capable of providing higher cell average throughput as well as more uniform user experience compared to the conventional FD-MIMO system. System level simulations are performed to evaluate performance. Our results show that the proposed D-FD-MIMO system achieves 1.4-2 times cell average throughput gain compared to the FD-MIMO system. The insights of performance gain are provided. Hardware implementation challenges and potential standards impact are also presented.
Submitted 3 April, 2017;
originally announced April 2017.
-
Role of explosive instabilities in high-$β$ disruptions in tokamaks
Authors:
A. Y. Aydemir,
H. H. Lee,
S. G. Lee,
J. Seol,
B. H. Park,
Y. K. In
Abstract:
Intrinsically explosive growth of a ballooning finger is demonstrated in nonlinear magnetohydrodynamic calculations of high-$β$ disruptions in tokamaks. The explosive finger is formed by an ideally unstable n=1 mode, dominated by an m/n=2/1 component. The quadrupole geometry of the 2/1 perturbed pressure field provides a generic mechanism for the formation of the initial ballooning finger and its subsequent transition from exponential to explosive growth, without relying on secondary processes. The explosive ejection of the hot plasma from the core and stochastization of the magnetic field occur in Alfvénic time scales, accounting for the extremely fast growth of the precursor oscillations and the rapidity of the thermal quench in some high-$β$ disruptions.
Submitted 3 March, 2016;
originally announced March 2016.
-
Kernel convolution model for decoding sounds from time-varying neural responses
Authors:
Ali Faisal,
Anni Nora,
Jaeho Seol,
Hanna Renvall,
Riitta Salmelin
Abstract:
In this study we present a kernel-based convolution model to characterize neural responses to natural sounds by decoding their time-varying acoustic features. The model makes it possible to decode natural sounds from high-dimensional neural recordings, such as magnetoencephalography (MEG), that track timing and location of human cortical signalling noninvasively across multiple channels. We used the MEG responses recorded from subjects listening to acoustically different environmental sounds. By decoding the stimulus frequencies from the responses, our model was able to distinguish between two different sounds that it had never encountered before with 70% accuracy. Convolution models typically decode frequencies that appear at a certain time point in the sound signal by using neural responses from that time point until a certain fixed duration of the response. Using our model, we evaluated several fixed durations (time-lags) of the neural responses and observed auditory MEG responses to be most sensitive to the spectral content of the sounds at time-lags of 250 ms to 500 ms. The proposed model should be useful for determining what aspects of natural sounds are represented by high-dimensional neural responses and may reveal novel properties of neural signals.
Submitted 21 July, 2015;
originally announced July 2015.
-
Exploiting the Preferred Domain of FDD Massive MIMO Systems with Uniform Planar Arrays
Authors:
Junil Choi,
Taeyoung Kim,
David J. Love,
Ji-yun Seol
Abstract:
Massive multiple-input multiple-output (MIMO) systems hold the potential to be an enabling technology for 5G cellular. Uniform planar array (UPA) antenna structures are a focus of much commercial discussion because of their ability to enable a large number of antennas in a relatively small area. With UPA antenna structures, the base station can control the beam direction in both the horizontal and vertical domains simultaneously. However, channel conditions may dictate that one dimension requires higher channel state information (CSI) accuracy than the other. We propose the use of an additional one bit of feedback information sent from the user to the base station to indicate the preferred domain on top of the feedback overhead of CSI quantization in frequency division duplexing (FDD) massive MIMO systems. Combined with variable-rate CSI quantization schemes, the numerical studies show that the additional one bit of feedback can increase the quality of CSI significantly for UPA antenna structures.
Submitted 2 February, 2015;
originally announced February 2015.
-
Antenna Grouping based Feedback Compression for FDD-based Massive MIMO Systems
Authors:
Byungju Lee,
Junil Choi,
Ji-yun Seol,
David J. Love,
Byonghyo Shim
Abstract:
Recent works on massive multiple-input multiple-output (MIMO) have shown that a potential breakthrough in capacity gains can be achieved by deploying a very large number of antennas at the basestation. In order to achieve the performance that massive MIMO systems promise, accurate transmit-side channel state information (CSI) should be available at the basestation. While transmit-side CSI can be obtained by employing channel reciprocity in time division duplexing (TDD) systems, explicit feedback of CSI from the user terminal to the basestation is needed for frequency division duplexing (FDD) systems. In this paper, we propose an antenna grouping based feedback reduction technique for FDD-based massive MIMO systems. The proposed algorithm, dubbed antenna group beamforming (AGB), maps multiple correlated antenna elements to a single representative value using pre-designed patterns. The proposed method modifies the feedback packet by introducing the concept of a header to select a suitable group pattern and a payload to quantize the reduced dimension channel vector. Simulation results show that the proposed method achieves significant feedback overhead reduction over the conventional approach of vector-quantizing the whole channel vector under the same target sum rate requirement.
Submitted 21 July, 2015; v1 submitted 26 August, 2014;
originally announced August 2014.