
Showing 1–50 of 80 results for author: Nam, S

Searching in archive cs.
  1. arXiv:2407.01034  [pdf, other]

    cs.CV cs.GR

    Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert

    Authors: Han EunGi, Oh Hyun-Bin, Kim Sung-Bin, Corentin Nivelet Etcheberry, Suekyeong Nam, Janghoon Joo, Tae-Hyun Oh

    Abstract: Speech-driven 3D facial animation has recently garnered attention due to its cost-effective usability in multimedia production. However, most current advances overlook the intelligibility of lip movements, limiting the realism of facial expressions. In this paper, we introduce a method for speech-driven 3D facial animation to generate accurate lip movements, proposing an audio-visual multimodal pe…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: INTERSPEECH 2024

  2. arXiv:2406.14272  [pdf, other]

    cs.CV cs.GR

    MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset

    Authors: Kim Sung-Bin, Lee Chae-Yeon, Gihun Son, Oh Hyun-Bin, Janghoon Ju, Suekyeong Nam, Tae-Hyun Oh

    Abstract: Recent studies in speech-driven 3D talking head generation have achieved convincing results in verbal articulations. However, generating accurate lip-syncs degrades when applied to input speech in other languages, possibly due to the lack of datasets covering a broad spectrum of facial movements across languages. In this work, we introduce a novel task to generate 3D talking heads from speeches of…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024

  3. arXiv:2406.13251  [pdf, other]

    cs.CV cs.GR eess.IV

    Freq-Mip-AA : Frequency Mip Representation for Anti-Aliasing Neural Radiance Fields

    Authors: Youngin Park, Seungtae Nam, Cheul-hee Hahm, Eunbyung Park

    Abstract: Neural Radiance Fields (NeRF) have shown remarkable success in representing 3D scenes and generating novel views. However, they often struggle with aliasing artifacts, especially when rendering images from different camera distances from the training views. To address the issue, Mip-NeRF proposed using volumetric frustums to render a pixel and suggested integrated positional encoding (IPE). While…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted to ICIP 2024, 7 pages, 3 figures

  4. arXiv:2406.12904  [pdf, other]

    cs.LG physics.comp-ph physics.optics

    Meent: Differentiable Electromagnetic Simulator for Machine Learning

    Authors: Yongha Kim, Anthony W. Jung, Sanmun Kim, Kevin Octavian, Doyoung Heo, Chaejin Park, Jeongmin Shin, Sunghyun Nam, Chanhyung Park, Juho Park, Sangjun Han, Jinmyoung Lee, Seolho Kim, Min Seok Jang, Chan Y. Park

    Abstract: Electromagnetic (EM) simulation plays a crucial role in analyzing and designing devices with sub-wavelength scale structures such as solar cells, semiconductor devices, image sensors, future displays and integrated photonic devices. Specifically, optics problems such as estimating semiconductor device structures and designing nanophotonic devices provide intriguing research topics with far-reachin…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: under review

  5. arXiv:2404.11810  [pdf, other]

    cs.GR

    Holographic Parallax Improves 3D Perceptual Realism

    Authors: Dongyeon Kim, Seung-Woo Nam, Suyeon Choi, Jong-Mo Seo, Gordon Wetzstein, Yoonchan Jeong

    Abstract: Holographic near-eye displays are a promising technology to solve long-standing challenges in virtual and augmented reality display systems. Over the last few years, many different computer-generated holography (CGH) algorithms have been proposed that are supervised by different types of target content, such as 2.5D RGB-depth maps, 3D focal stacks, and 4D light fields. It is unclear, however, what…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 33 pages, 34 figures

  6. arXiv:2404.01878  [pdf, other]

    cs.CV cs.AI

    Real, fake and synthetic faces -- does the coin have three sides?

    Authors: Shahzeb Naeem, Ramzi Al-Sharawi, Muhammad Riyyan Khan, Usman Tariq, Abhinav Dhall, Hasan Al-Nashash

    Abstract: With the ever-growing power of generative artificial intelligence, deepfake and artificially generated (synthetic) media have continued to spread online, which creates various ethical and moral concerns regarding their usage. To tackle this, we thus present a novel exploration of the trends and patterns observed in real, deepfake and synthetic facial images. The proposed analysis is done in two pa…

    Submitted 2 April, 2024; originally announced April 2024.

  7. arXiv:2404.01745  [pdf, other]

    cs.CV cs.AI

    Unleash the Potential of CLIP for Video Highlight Detection

    Authors: Donghoon Han, Seunghyeon Seo, Eunhwan Park, Seong-Uk Nam, Nojun Kwak

    Abstract: Multimodal and large language models (LLMs) have revolutionized the utilization of open-world knowledge, unlocking novel potentials across various tasks and applications. Among these domains, the video domain has notably benefited from their capabilities. In this paper, we present Highlight-CLIP (HL-CLIP), a method designed to excel in the video highlight detection task by leveraging the pre-train…

    Submitted 2 April, 2024; originally announced April 2024.

  8. arXiv:2404.01438  [pdf]

    cs.CV cs.AI

    Generation and Detection of Sign Language Deepfakes -- A Linguistic and Visual Analysis

    Authors: Shahzeb Naeem, Muhammad Riyyan Khan, Usman Tariq, Abhinav Dhall, Carlos Ivan Colon, Hasan Al-Nashash

    Abstract: A question in the realm of deepfakes is slowly emerging pertaining to whether we can go beyond facial deepfakes and whether it would be beneficial to society. Therefore, this research presents a positive application of deepfake technology in upper body generation, while performing sign-language for the Deaf and Hard of Hearing (DHoH) community. The resulting videos are later vetted with a sign lan…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 13 pages, 13 figures, Computer Vision and Image Understanding Journal

  9. arXiv:2403.19254  [pdf, other]

    cs.CV

    Imperceptible Protection against Style Imitation from Diffusion Models

    Authors: Namhyuk Ahn, Wonhyuk Ahn, KiYoon Yoo, Daesik Kim, Seung-Hun Nam

    Abstract: Recent progress in diffusion models has profoundly enhanced the fidelity of image generation. However, this has raised concerns about copyright infringements. While prior methods have introduced adversarial perturbations to prevent style imitation, most are accompanied by the degradation of artworks' visual quality. Recognizing the importance of maintaining this, we develop a visually improved pro…

    Submitted 28 March, 2024; originally announced March 2024.

  10. arXiv:2403.14264  [pdf, other]

    cs.CV cs.AI

    A Framework for Portrait Stylization with Skin-Tone Awareness and Nudity Identification

    Authors: Seungkwon Kim, Sangyeon Kim, Seung-Hun Nam

    Abstract: Portrait stylization is a challenging task involving the transformation of an input portrait image into a specific style while preserving its inherent characteristics. The recent introduction of Stable Diffusion (SD) has significantly improved the quality of outcomes in this field. However, a practical stylization framework that can effectively filter harmful input content and preserve the distinc…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted to ICASSP 2024

  11. arXiv:2402.14196  [pdf, other]

    cs.CV cs.GR

    Mip-Grid: Anti-aliased Grid Representations for Neural Radiance Fields

    Authors: Seungtae Nam, Daniel Rho, Jong Hwan Ko, Eunbyung Park

    Abstract: Despite the remarkable achievements of neural radiance fields (NeRF) in representing 3D scenes and generating novel view images, the aliasing issue, rendering "jaggies" or "blurry" images at varying camera distances, remains unresolved in most existing approaches. The recently proposed mip-NeRF has addressed this challenge by rendering conical frustums instead of rays. However, it relies on MLP ar…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted to NeurIPS 2023

  12. arXiv:2402.11597  [pdf, other]

    cs.CL

    Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?

    Authors: Guijin Son, Sangwon Baek, Sangdae Nam, Ilgyun Jeong, Seungone Kim

    Abstract: Large language models (LLMs) are typically prompted to follow a single instruction per inference call. In this work, we analyze whether LLMs also hold the capability to handle multiple instructions simultaneously, denoted as Multi-Task Inference. For this purpose, we introduce the MTI Bench(Multi-Task Inference Benchmark), a comprehensive evaluation benchmark encompassing 5,000 instances across 25…

    Submitted 6 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: ACL 2024 (main)

  13. arXiv:2402.00863  [pdf, other]

    cs.CV

    Geometry Transfer for Stylizing Radiance Fields

    Authors: Hyunyoung Jung, Seonghyeon Nam, Nikolaos Sarafianos, Sungjoo Yoo, Alexander Sorkine-Hornung, Rakesh Ranjan

    Abstract: Shape and geometric patterns are essential in defining stylistic identity. However, current 3D style transfer methods predominantly focus on transferring colors and textures, often overlooking geometric aspects. In this paper, we introduce Geometry Transfer, a novel method that leverages geometric deformation for 3D style transfer. This technique employs depth maps to extract a style guide, subseq…

    Submitted 6 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: CVPR 2024. Project page: https://hyblue.github.io/geo-srf/

  14. arXiv:2401.15313  [pdf, other]

    cs.RO cs.CV eess.SY math.OC

    Multi-Robot Relative Pose Estimation in SE(2) with Observability Analysis: A Comparison of Extended Kalman Filtering and Robust Pose Graph Optimization

    Authors: Kihoon Shin, Hyunjae Sim, Seungwon Nam, Yonghee Kim, Jae Hu, Kwang-Ki K. Kim

    Abstract: In this study, we address multi-robot localization issues, with a specific focus on cooperative localization and observability analysis of relative pose estimation. Cooperative localization involves enhancing each robot's information through a communication network and message passing. If odometry data from a target robot can be transmitted to the ego robot, observability of their relative pose es…

    Submitted 4 February, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

    Comments: 20 pages, 21 figures

    MSC Class: 93C85; 93E11; 93E24; 90C26; 93E10; 62M20;

  15. arXiv:2401.03079  [pdf, other]

    cs.RO

    Integrating Open-World Shared Control in Immersive Avatars

    Authors: Patrick Naughton, James Seungbum Nam, Andrew Stratton, Kris Hauser

    Abstract: Teleoperated avatar robots allow people to transport their manipulation skills to environments that may be difficult or dangerous to work in. Current systems are able to give operators direct control of many components of the robot to immerse them in the remote environment, but operators still struggle to complete tasks as competently as they could in person. We present a framework for incorporati…

    Submitted 5 January, 2024; originally announced January 2024.

  16. arXiv:2311.14993  [pdf, other]

    cs.CV

    Coordinate-Aware Modulation for Neural Fields

    Authors: Joo Chan Lee, Daniel Rho, Seungtae Nam, Jong Hwan Ko, Eunbyung Park

    Abstract: Neural fields, mapping low-dimensional input coordinates to corresponding signals, have shown promising results in representing various signals. Numerous methodologies have been proposed, and techniques employing MLPs and grid representations have achieved substantial success. MLPs allow compact and high expressibility, yet often suffer from spectral bias and slow convergence speed. On the other h…

    Submitted 25 November, 2023; originally announced November 2023.

    Comments: Project page: http://maincold2.github.io/cam/

  17. arXiv:2311.05881  [pdf, other]

    physics.app-ph cs.ET cs.NE

    Programmable Superconducting Optoelectronic Single-Photon Synapses with Integrated Multi-State Memory

    Authors: Bryce A. Primavera, Saeed Khan, Richard P. Mirin, Sae Woo Nam, Jeffrey M. Shainline

    Abstract: The co-location of memory and processing is a core principle of neuromorphic computing. A local memory device for synaptic weight storage has long been recognized as an enabling element for large-scale, high-performance neuromorphic hardware. In this work, we demonstrate programmable superconducting synapses with integrated memories for use in superconducting optoelectronic neural systems. Superco…

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: 16 pages, 11 figures

  18. arXiv:2311.00994  [pdf, other]

    cs.CV cs.GR

    LaughTalk: Expressive 3D Talking Head Generation with Laughter

    Authors: Kim Sung-Bin, Lee Hyun, Da Hye Hong, Suekyeong Nam, Janghoon Ju, Tae-Hyun Oh

    Abstract: Laughter is a unique expression, essential to affirmative social interactions of humans. Although current 3D talking head generation methods produce convincing verbal articulations, they often fail to capture the vitality and subtleties of laughter and smiles despite their importance in social context. In this paper, we introduce a novel task to generate 3D talking heads capable of both articulate…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: Accepted to WACV2024

  19. arXiv:2310.11005  [pdf, ps, other]

    cs.IT cs.CR

    Optimal Private Discrete Distribution Estimation with One-bit Communication

    Authors: Seung-Hyun Nam, Vincent Y. F. Tan, Si-Hyeon Lee

    Abstract: We consider a private discrete distribution estimation problem with one-bit communication constraint. The privacy constraints are imposed with respect to the local differential privacy and the maximal leakage. The estimation error is quantified by the worst-case mean squared error. We completely characterize the first-order asymptotics of this privacy-utility trade-off under the one-bit communicat…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: 13 pages, 5 figures, and 1 page of supplementary material

  20. arXiv:2310.03205  [pdf, other]

    cs.CV cs.AI

    A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization

    Authors: Kim Youwang, Lee Hyun, Kim Sung-Bin, Suekyeong Nam, Janghoon Ju, Tae-Hyun Oh

    Abstract: We propose NeuFace, a 3D face mesh pseudo annotation method on videos via neural re-parameterized optimization. Despite the huge progress in 3D face reconstruction methods, generating reliable 3D face labels for in-the-wild dynamic videos remains challenging. Using NeuFace optimization, we annotate the per-view/-frame accurate and consistent face meshes on large-scale face videos, called the NeuFa…

    Submitted 6 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: 9 pages, 7 figures, and 3 tables for the main paper. 8 pages, 6 figures and 3 tables for the appendix

  21. arXiv:2309.14668  [pdf]

    physics.optics cs.GR eess.IV physics.app-ph physics.comp-ph

    Depolarized Holography with Polarization-multiplexing Metasurface

    Authors: Seung-Woo Nam, Youngjin Kim, Dongyeon Kim, Yoonchan Jeong

    Abstract: The evolution of computer-generated holography (CGH) algorithms has prompted significant improvements in the performances of holographic displays. Nonetheless, they start to encounter a limited degree of freedom in CGH optimization and physical constraints stemming from the coherent nature of holograms. To surpass the physical limitations, we consider polarization as a new degree of freedom by uti…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: 15 pages, 13 figures, to be published in SIGGRAPH Asia 2023

  22. arXiv:2309.06933  [pdf, other]

    cs.CV

    DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models

    Authors: Namhyuk Ahn, Junsoo Lee, Chunggi Lee, Kunhee Kim, Daesik Kim, Seung-Hun Nam, Kibeom Hong

    Abstract: Recent progresses in large-scale text-to-image models have yielded remarkable accomplishments, finding various applications in art domain. However, expressing unique characteristics of an artwork (e.g. brushwork, colortone, or composition) with text prompts alone may encounter limitations due to the inherent constraints of verbal description. To this end, we introduce DreamStyler, a novel framewor…

    Submitted 18 December, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: AAAI 2024

  23. arXiv:2307.03962  [pdf, ps, other]

    cs.IT cs.CR

    Achieving the Exactly Optimal Privacy-Utility Trade-Off with Low Communication Cost via Shared Randomness

    Authors: Seung-Hyun Nam, Hyun-Young Park, Si-Hyeon Lee

    Abstract: We consider a discrete distribution estimation problem under a local differential privacy (LDP) constraint in the presence of shared randomness. By exploiting the shared randomness, we suggest a new method for constructing LDP schemes which achieve the exactly optimal privacy-utility trade-off (PUT) with the communication cost of less than or equal to the input data size for any privacy regime. Th…

    Submitted 8 July, 2023; originally announced July 2023.

    Comments: 11 pages and 1 figure. This manuscript was submitted to IEEE Transactions on Information Theory

  24. arXiv:2306.15969  [pdf, other]

    cs.LG cs.AI

    Separable Physics-Informed Neural Networks

    Authors: Junwoo Cho, Seungtae Nam, Hyunmo Yang, Seok-Bae Yun, Youngjoon Hong, Eunbyung Park

    Abstract: Physics-informed neural networks (PINNs) have recently emerged as promising data-driven PDE solvers showing encouraging results on various PDEs. However, there is a fundamental limitation of training PINNs to solve multi-dimensional PDEs and approximate highly complex solution functions. The number of training points (collocation points) required on these challenging PDEs grows substantially, but…

    Submitted 31 October, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: To appear in NeurIPS 2023 (28 pages, 13 figures). workshop paper: arXiv:2211.08761

  25. arXiv:2305.01261  [pdf, other]

    cs.CR

    Exactly Optimal and Communication-Efficient Private Estimation via Block Designs

    Authors: Hyun-Young Park, Seung-Hyun Nam, Si-Hyeon Lee

    Abstract: In this paper, we propose a new class of local differential privacy (LDP) schemes based on combinatorial block designs for discrete distribution estimation. This class not only recovers many known LDP schemes in a unified framework of combinatorial block design, but also suggests a novel way of finding new schemes achieving the exactly optimal (or near-optimal) privacy-utility trade-off with lower…

    Submitted 17 October, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: 26 pages, 1 figure, and 1 table. A short version of this manuscript was presented at 2023 IEEE International Symposium on Information Theory

  26. arXiv:2304.12809  [pdf]

    cs.HC cs.AI cs.CL cs.CY cs.SD

    Can Voice Assistants Sound Cute? Towards a Model of Kawaii Vocalics

    Authors: Katie Seaborn, Somang Nam, Julia Keckeis, Tatsuya Itagaki

    Abstract: The Japanese notion of "kawaii" or expressions of cuteness, vulnerability, and/or charm is a global cultural export. Work has explored kawaii-ness as a design feature and factor of user experience in the visual appearance, nonverbal behaviour, and sound of robots and virtual characters. In this initial work, we consider whether voices can be kawaii by exploring the vocal qualities of voice assista…

    Submitted 21 April, 2023; originally announced April 2023.

    Comments: 7 pages

    Report number: Article 63

    Journal ref: In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI EA '23). Association for Computing Machinery, New York, NY, USA, Article 63, 1-7

  27. arXiv:2304.10537  [pdf, other]

    cs.CV cs.GR

    Learning Neural Duplex Radiance Fields for Real-Time View Synthesis

    Authors: Ziyu Wan, Christian Richardt, Aljaž Božič, Chao Li, Vijay Rengarajan, Seonghyeon Nam, Xiaoyu Xiang, Tuotuo Li, Bo Zhu, Rakesh Ranjan, Jing Liao

    Abstract: Neural radiance fields (NeRFs) enable novel view synthesis with unprecedented visual quality. However, to render photorealistic images, NeRFs require hundreds of deep multilayer perceptron (MLP) evaluations - for each pixel. This is prohibitively expensive and makes real-time rendering infeasible, even on powerful modern GPUs. In this paper, we propose a novel approach to distill and bake NeRFs in…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: CVPR 2023. Project page: http://raywzy.com/NDRF

  28. Multiplexed gradient descent: Fast online training of modern datasets on hardware neural networks without backpropagation

    Authors: Adam N. McCaughan, Bakhrom G. Oripov, Natesh Ganesh, Sae Woo Nam, Andrew Dienstfrey, Sonia M. Buckley

    Abstract: We present multiplexed gradient descent (MGD), a gradient descent framework designed to easily train analog or digital neural networks in hardware. MGD utilizes zero-order optimization techniques for online training of hardware neural networks. We demonstrate its ability to train neural networks on modern machine learning datasets, including CIFAR-10 and Fashion-MNIST, and compare its performance…

    Submitted 5 March, 2023; originally announced March 2023.

    Journal ref: APL Machine Learning 1, 026118 (2023)

  29. arXiv:2212.09069  [pdf, other]

    cs.CV cs.GR

    Masked Wavelet Representation for Compact Neural Radiance Fields

    Authors: Daniel Rho, Byeonghyeon Lee, Seungtae Nam, Joo Chan Lee, Jong Hwan Ko, Eunbyung Park

    Abstract: Neural radiance fields (NeRF) have demonstrated the potential of coordinate-based neural representation (neural fields or implicit neural representation) in neural rendering. However, using a multi-layer perceptron (MLP) to represent a 3D scene or object requires enormous computational resources and time. There have been recent studies on how to reduce these computational inefficiencies by using a…

    Submitted 21 March, 2023; v1 submitted 18 December, 2022; originally announced December 2022.

    Comments: Accepted to CVPR 2023

  30. arXiv:2212.03961  [pdf, other]

    cs.CV

    FSID: Fully Synthetic Image Denoising via Procedural Scene Generation

    Authors: Gyeongmin Choe, Beibei Du, Seonghyeon Nam, Xiaoyu Xiang, Bo Zhu, Rakesh Ranjan

    Abstract: For low-level computer vision and image processing ML tasks, training on large datasets is critical for generalization. However, the standard practice of relying on real-world images primarily from the Internet comes with image quality, scalability, and privacy issues, especially in commercial contexts. To address this, we have developed a procedural synthetic data generation pipeline and dataset…

    Submitted 7 December, 2022; originally announced December 2022.

  31. arXiv:2211.08761  [pdf, other]

    cs.LG

    Separable PINN: Mitigating the Curse of Dimensionality in Physics-Informed Neural Networks

    Authors: Junwoo Cho, Seungtae Nam, Hyunmo Yang, Seok-Bae Yun, Youngjoon Hong, Eunbyung Park

    Abstract: Physics-informed neural networks (PINNs) have emerged as new data-driven PDE solvers for both forward and inverse problems. While promising, the expensive computational costs to obtain solutions often restrict their broader applicability. We demonstrate that the computations in automatic differentiation (AD) can be significantly reduced by leveraging forward-mode AD when training PINN. However, a…

    Submitted 2 November, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: To appear in NeurIPS 2022 Workshop on The Symbiosis of Deep Learning and Differential Equations (DLDE) - II, 12 pages, 5 figures, full paper: arXiv:2306.15969

  32. arXiv:2208.12014  [pdf, ps, other]

    cs.IT eess.SP

    Technical Report: Development of an Ultrahigh Bandwidth Software-defined Radio Platform

    Authors: Sung Sik Nam, Changseok Yoon, Ki-Hong Park, Mohamed-Slim Alouini

    Abstract: For the development of new digital signal processing systems and services, the rapid, easy, and convenient prototyping of ideas and the rapid time-to-market of products are becoming important with advances in technology. Conventionally, for the development stage, particularly when confirming the feasibility or performance of a new system or service, an idea is first confirmed through a computerbas…

    Submitted 26 August, 2022; v1 submitted 25 August, 2022; originally announced August 2022.

  33. arXiv:2207.09663  [pdf, other]

    cs.CV

    Streamable Neural Fields

    Authors: Junwoo Cho, Seungtae Nam, Daniel Rho, Jong Hwan Ko, Eunbyung Park

    Abstract: Neural fields have emerged as a new data representation paradigm and have shown remarkable success in various signal representations. Since they preserve signals in their network parameters, the data transfer by sending and receiving the entire model parameters prevents this emerging technology from being used in many practical scenarios. We propose streamable neural fields, a single model that co…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: To appear in ECCV 2022

  34. arXiv:2207.00588  [pdf, other]

    cs.CV

    CoVA: Exploiting Compressed-Domain Analysis to Accelerate Video Analytics

    Authors: Jinwoo Hwang, Minsu Kim, Daeun Kim, Seungho Nam, Yoonsung Kim, Dohee Kim, Hardik Sharma, Jongse Park

    Abstract: Modern retrospective analytics systems leverage cascade architecture to mitigate bottleneck for computing deep neural networks (DNNs). However, the existing cascades suffer two limitations: (1) decoding bottleneck is either neglected or circumvented, paying significant compute and storage cost for pre-processing; and (2) the systems are specialized for temporal queries and lack spatial query suppo…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: ATC 2022

  35. arXiv:2206.11860  [pdf]

    cs.CL cs.AI cs.LG

    Exploiting Transliterated Words for Finding Similarity in Inter-Language News Articles using Machine Learning

    Authors: Sameea Naeem, Dr. Arif ur Rahman, Syed Mujtaba Haider, Abdul Basit Mughal

    Abstract: Finding similarities between two inter-language news articles is a challenging problem of Natural Language Processing (NLP). It is difficult to find similar news articles in a different language other than the native language of user, there is a need for a Machine Learning based automatic system to find the similarity between two inter-language news articles. In this article, we propose a Machine…

    Submitted 29 May, 2022; originally announced June 2022.

    Comments: 12 Pages, 13 Figures

  36. arXiv:2206.10878  [pdf, other]

    cs.CV

    Feature Re-calibration based Multiple Instance Learning for Whole Slide Image Classification

    Authors: Philip Chikontwe, Soo Jeong Nam, Heounjeong Go, Meejeong Kim, Hyun Jung Sung, Sang Hyun Park

    Abstract: Whole slide image (WSI) classification is a fundamental task for the diagnosis and treatment of diseases; but, curation of accurate labels is time-consuming and limits the application of fully-supervised methods. To address this, multiple instance learning (MIL) is a popular method that poses classification as a weakly supervised learning task with slide-level labels only. While current MIL method…

    Submitted 21 July, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

    Comments: MICCAI 2022

  37. arXiv:2206.01813  [pdf, other]

    cs.CV eess.IV

    Learning sRGB-to-Raw-RGB De-rendering with Content-Aware Metadata

    Authors: Seonghyeon Nam, Abhijith Punnappurath, Marcus A. Brubaker, Michael S. Brown

    Abstract: Most camera images are rendered and saved in the standard RGB (sRGB) format by the camera's hardware. Due to the in-camera photo-finishing routines, nonlinear sRGB images are undesirable for computer vision tasks that assume a direct relationship between pixel values and scene radiance. For such applications, linear raw-RGB sensor images are preferred. Saving images in their raw-RGB format is stil…

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: CVPR 2022 (GitHub: https://github.com/SamsungLabs/content-aware-metadata)

  38. arXiv:2204.09885  [pdf, other]

    cs.CL

    An Attention-Based Model for Predicting Contextual Informativeness and Curriculum Learning Applications

    Authors: Sungjin Nam, David Jurgens, Gwen Frishkoff, Kevyn Collins-Thompson

    Abstract: Both humans and machines learn the meaning of unknown words through contextual information in a sentence, but not all contexts are equally helpful for learning. We introduce an effective method for capturing the level of contextual informativeness with respect to a given target word. Our study makes three main contributions. First, we develop models for estimating contextual informativeness, focus…

    Submitted 9 November, 2023; v1 submitted 21 April, 2022; originally announced April 2022.

  39. arXiv:2204.09665  [pdf, other]

    physics.app-ph cond-mat.supr-con cs.AI cs.NE physics.ins-det

    Demonstration of Superconducting Optoelectronic Single-Photon Synapses

    Authors: Saeed Khan, Bryce A. Primavera, Jeff Chiles, Adam N. McCaughan, Sonia M. Buckley, Alexander N. Tait, Adriana Lita, John Biesecker, Anna Fox, David Olaya, Richard P. Mirin, Sae Woo Nam, Jeffrey M. Shainline

    Abstract: Superconducting optoelectronic hardware is being explored as a path towards artificial spiking neural networks with unprecedented scales of complexity and computational ability. Such hardware combines integrated-photonic components for few-photon, light-speed communication with superconducting circuits for fast, energy-efficient computation. Monolithic integration of superconducting and photonic d…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: 23 pages, 20 figures

  40. arXiv:2204.07848  [pdf, other]

    cs.CL cs.SD eess.AS

    STRATA: Word Boundaries & Phoneme Recognition From Continuous Urdu Speech using Transfer Learning, Attention, & Data Augmentation

    Authors: Saad Naeem, Omer Beg

    Abstract: Phoneme recognition is a largely unsolved problem in NLP, especially for low-resource languages like Urdu. The systems that try to extract the phonemes from audio speech require hand-labeled phonetic transcriptions. This requires expert linguists to annotate speech data with its relevant phonetic representation which is both an expensive and a tedious task. In this paper, we propose STRATA, a fram…

    Submitted 16 April, 2022; originally announced April 2022.

  41. arXiv:2108.12947  [pdf, other]

    eess.IV cs.CV cs.LG cs.MM

    Learning JPEG Compression Artifacts for Image Manipulation Detection and Localization

    Authors: Myung-Joon Kwon, Seung-Hun Nam, In-Jae Yu, Heung-Kyu Lee, Changick Kim

    Abstract: Detecting and localizing image manipulation are necessary to counter malicious use of image editing techniques. Accordingly, it is essential to distinguish between authentic and tampered regions by analyzing intrinsic statistics in an image. We focus on JPEG compression artifacts left during image acquisition and editing. We propose a convolutional neural network (CNN) that uses discrete cosine tr…

    Submitted 25 May, 2022; v1 submitted 29 August, 2021; originally announced August 2021.

    Comments: The version of record of this article, published in the International Journal of Computer Vision (IJCV), is available online at Publisher's website: https://link.springer.com/article/10.1007/s11263-022-01617-5 ; Code is available at: https://github.com/mjkwon2021/CAT-Net

    Journal ref: International Journal of Computer Vision (IJCV), 2022

  42. arXiv:2108.01199  [pdf, other]

    cs.CV

    Neural Image Representations for Multi-Image Fusion and Layer Separation

    Authors: Seonghyeon Nam, Marcus A. Brubaker, Michael S. Brown

    Abstract: We propose a framework for aligning and fusing multiple images into a single view using neural image representations (NIRs), also known as implicit or coordinate-based neural representations. Our framework targets burst images that exhibit camera ego motion and potential changes in the scene. We describe different strategies for alignment depending on the nature of the scene motion -- namely, pers…

    Submitted 21 July, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: Project page: https://shnnam.github.io/research/nir

  43. arXiv:2107.08939  [pdf, other]

    cs.MM cs.AI cs.CV

    DHNet: Double MPEG-4 Compression Detection via Multiple DCT Histograms

    Authors: Seung-Hun Nam, Wonhyuk Ahn, Myung-Joon Kwon, Jihyeon Kang, In-Jae Yu

    Abstract: In this article, we aim to detect the double compression of MPEG-4, a universal video codec that is built into surveillance systems and shooting devices. Double compression is accompanied by various types of video manipulation, and its traces can be exploited to determine whether a video is a forgery. To this end, we present a neural network-based approach with discriminant features for capturing…

    Submitted 15 April, 2022; v1 submitted 19 July, 2021; originally announced July 2021.

    Comments: Accepted to IEEE MultiMedia

  44. arXiv:2105.03386  [pdf, other]

    cs.DM cs.CG

    Topology and Routing Problems: The Circular Frame

    Authors: Rak-Kyeong Seong, Chanho Min, Sang-Hoon Han, Jaeho Yang, Seungwoo Nam, Kyusam Oh

    Abstract: In this work, we solve the problem of finding non-intersecting paths between points on a plane with a new approach by borrowing ideas from geometric topology, in particular, from the study of polygonal schema in mathematics. We use a topological transformation on the 2-dimensional planar routing environment that simplifies the routing problem into a problem of connecting points on a circle with st…

    Submitted 7 May, 2021; originally announced May 2021.

    Comments: 15 pages, 10 figures

  45. Temporally smooth online action detection using cycle-consistent future anticipation

    Authors: Young Hwi Kim, Seonghyeon Nam, Seon Joo Kim

    Abstract: Many video understanding tasks work in the offline setting by assuming that the input video is given from the start to the end. However, many real-world problems require the online setting, making a decision immediately using only the current and the past frames of videos such as in autonomous driving and surveillance systems. In this paper, we present a novel solution for online action detection…

    Submitted 16 April, 2021; originally announced April 2021.

    Comments: Accepted by Pattern Recognition

    Journal ref: Pattern Recognition, Volume 116, August 2021, 107954

  46. arXiv:2103.13674  [pdf, other]

    cs.MM cs.AI cs.CV

    Frame-rate Up-conversion Detection Based on Convolutional Neural Network for Learning Spatiotemporal Features

    Authors: Minseok Yoon, Seung-Hun Nam, In-Jae Yu, Wonhyuk Ahn, Myung-Joon Kwon, Heung-Kyu Lee

    Abstract: With the advance in user-friendly and powerful video editing tools, anyone can easily manipulate videos without leaving prominent visual traces. Frame-rate up-conversion (FRUC), a representative temporal-domain operation, increases the motion continuity of videos with a lower frame-rate and is used by malicious counterfeiters in video tampering such as generating fake frame-rate video without impr…

    Submitted 25 March, 2021; originally announced March 2021.

    Comments: preprint; under review

  47. arXiv:2103.01152  [pdf, other]

    cond-mat.supr-con cs.GR

    PHIDL: Python CAD layout and geometry creation for nanolithography

    Authors: A. N. McCaughan, A. M. Tait, S. M. Buckley, D. M. Oh, J. T. Chiles, J. M. Shainline, S. W. Nam

    Abstract: Computer-aided design (CAD) has become a critical element in the creation of nanopatterned structures and devices. In particular, with the increased adoption of easy-to-learn programming languages like Python there has been a significant rise in the amount of lithographic geometries generated through scripting and programming. However, there are currently unaddressed gaps in usability for open-sou…

    Submitted 1 March, 2021; originally announced March 2021.

    Journal ref: J. Vac. Sci. Technol. B 39, 062601 (2021)

  48. arXiv:2008.06255  [pdf, other]

    cs.MM cs.CR cs.CV

    WAN: Watermarking Attack Network

    Authors: Seung-Hun Nam, In-Jae Yu, Seung-Min Mun, Daesik Kim, Wonhyuk Ahn

    Abstract: Multi-bit watermarking (MW) has been developed to improve robustness against signal processing operations and geometric distortions. To this end, benchmark tools that test robustness by applying simulated attacks on watermarked images are available. However, limitations in these general attacks exist since they cannot exploit specific characteristics of the targeted MW. In addition, these attacks…

    Submitted 20 October, 2021; v1 submitted 14 August, 2020; originally announced August 2020.

    Comments: Accepted to BMVC 2021

  49. arXiv:2007.08786  [pdf, other]

    cs.CV

    Cross-Identity Motion Transfer for Arbitrary Objects through Pose-Attentive Video Reassembling

    Authors: Subin Jeon, Seonghyeon Nam, Seoung Wug Oh, Seon Joo Kim

    Abstract: We propose attention-based networks for transferring motions between arbitrary objects. Given a source image(s) and a driving video, our networks animate the subject in the source images according to the motion in the driving video. In our attention mechanism, dense similarities between the learned keypoints in the source and the driving images are computed in order to retrieve the appearance i…

    Submitted 17 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  50. Deep Convolutional Neural Network for Identifying Seam-Carving Forgery

    Authors: Seung-Hun Nam, Wonhyuk Ahn, In-Jae Yu, Myung-Joon Kwon, Minseok Son, Heung-Kyu Lee

    Abstract: Seam carving is a representative content-aware image retargeting approach to adjust the size of an image while preserving its visually prominent content. To maintain visually important content, seam-carving algorithms first calculate the connected path of pixels, referred to as the seam, according to a defined cost function and then adjust the size of an image by removing and duplicating repeatedl…

    Submitted 7 July, 2020; v1 submitted 5 July, 2020; originally announced July 2020.