Showing 1–45 of 45 results for author: Kim, J W

Searching in archive cs.

  1. arXiv:2405.02066  [pdf, other]

    cs.CV eess.IV

    WateRF: Robust Watermarks in Radiance Fields for Protection of Copyrights

    Authors: Youngdong Jang, Dong In Lee, MinHyuk Jang, Jong Wook Kim, Feng Yang, Sangpil Kim

    Abstract: The advances in the Neural Radiance Fields (NeRF) research offer extensive applications in diverse domains, but protecting their copyrights has not yet been researched in depth. Recently, NeRF watermarking has been considered one of the pivotal solutions for safely deploying NeRF-based 3D representations. However, existing methods are designed to apply only to implicit or explicit NeRF representat…

    Submitted 27 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  2. arXiv:2403.14111  [pdf, other]

    cs.CR cs.LG

    HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption

    Authors: Seewoo Lee, Garam Lee, Jung Woo Kim, Junbum Shin, Mun-Kyu Lee

    Abstract: Transfer learning is a de facto standard method for efficiently training machine learning models for data-scarce problems by adding and fine-tuning new classification layers to a model pre-trained on large datasets. Although numerous previous studies proposed to use homomorphic encryption to resolve the data privacy issue in transfer learning in the machine learning as a service setting, most of t…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: ICML 2023, Appendix D includes some updates after official publication

    Journal ref: PMLR 202:19010-19035, 2023

  3. arXiv:2403.08187  [pdf, other]

    cs.CL cs.SD eess.AS

    Automatic Speech Recognition (ASR) for the Diagnosis of pronunciation of Speech Sound Disorders in Korean children

    Authors: Taekyung Ahn, Yeonjung Hong, Younggon Im, Do Hyung Kim, Dayoung Kang, Joo Won Jeong, Jae Won Kim, Min Jung Kim, Ah-ra Cho, Dae-Hyun Jang, Hosung Nam

    Abstract: This study presents a model of automatic speech recognition (ASR) designed to diagnose pronunciation issues in children with speech sound disorders (SSDs) to replace manual transcriptions in clinical procedures. Since ASR models trained for general purposes primarily predict input speech into real words, employing a well-known high-performance ASR model for evaluating pronunciation in children wit…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 12 pages, 2 figures

    ACM Class: I.2.7

  4. arXiv:2403.05949  [pdf, other]

    cs.CV cs.LG q-bio.TO

    General surgery vision transformer: A video pre-trained foundation model for general surgery

    Authors: Samuel Schmidgall, Ji Woong Kim, Jeffrey Jopling, Axel Krieger

    Abstract: The absence of openly accessible data and specialized foundation models is a major barrier for computational research in surgery. Toward this, (i) we open-source the largest dataset of general surgery videos to-date, consisting of 680 hours of surgical videos, including data from robotic and laparoscopic techniques across 28 procedures; (ii) we propose a technique for video pre-training a general…

    Submitted 12 April, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  5. arXiv:2402.08113  [pdf, other]

    cs.CL cs.HC

    Addressing cognitive bias in medical language models

    Authors: Samuel Schmidgall, Carl Harris, Ime Essien, Daniel Olshvang, Tawsifur Rahman, Ji Woong Kim, Rojin Ziaei, Jason Eshraghian, Peter Abadir, Rama Chellappa

    Abstract: There is increasing interest in the application of large language models (LLMs) to the medical field, in part because of their impressive performance on medical exam questions. While promising, exam questions do not reflect the complexity of real patient-doctor interactions. In reality, physicians' decisions are shaped by many complex factors, such as patient compliance, personal experience, ethical…

    Submitted 20 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  6. arXiv:2401.18006  [pdf, other]

    q-bio.QM cs.LG eess.SP

    EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation

    Authors: Jonathan W. Kim, Ahmed Alaa, Danilo Bernardo

    Abstract: In conventional machine learning (ML) approaches applied to electroencephalography (EEG), there is often a limited focus, isolating specific brain activities occurring across disparate temporal scales (from transient spikes in milliseconds to seizures lasting minutes) and spatial scales (from localized high-frequency oscillations to global sleep activity). This siloed approach limits the developmen…

    Submitted 3 February, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

  7. Machine learning for industrial sensing and control: A survey and practical perspective

    Authors: Nathan P. Lawrence, Seshu Kumar Damarla, Jong Woo Kim, Aditya Tulsyan, Faraz Amjad, Kai Wang, Benoit Chachuat, Jong Min Lee, Biao Huang, R. Bhushan Gopaluni

    Abstract: With the rise of deep learning, there has been renewed interest within the process industries to utilize data on large-scale nonlinear sensing and control problems. We identify key statistical and machine learning techniques that have seen practical success in the process industries. To do so, we start with hybrid modeling to provide a methodological framework underlying core application areas: so…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 48 pages

    Journal ref: Control Engineering Practice 2024

  8. arXiv:2401.00678  [pdf, other]

    cs.RO cs.LG q-bio.TO

    General-purpose foundation models for increased autonomy in robot-assisted surgery

    Authors: Samuel Schmidgall, Ji Woong Kim, Alan Kuntz, Ahmed Ezzat Ghazi, Axel Krieger

    Abstract: The dominant paradigm for end-to-end robot learning focuses on optimizing task-specific objectives that solve a single robotic problem such as picking up an object or reaching a target position. However, recent work on high-capacity models in robotics has shown promise toward being trained on large collections of diverse and task-agnostic datasets of video demonstrations. These models have shown i…

    Submitted 1 January, 2024; originally announced January 2024.

  9. arXiv:2312.01631  [pdf, other]

    cs.RO

    Cooperative vs. Teleoperation Control of the Steady Hand Eye Robot with Adaptive Sclera Force Control: A Comparative Study

    Authors: Mojtaba Esfandiari, Ji Woong Kim, Botao Zhao, Golchehr Amirkhani, Muhammad Hadi, Peter Gehlbach, Russell H. Taylor, Iulian Iordachita

    Abstract: A surgeon's physiological hand tremor can significantly impact the outcome of delicate and precise retinal surgery, such as retinal vein cannulation (RVC) and epiretinal membrane peeling. Robot-assisted eye surgery technology provides ophthalmologists with advanced capabilities such as hand tremor cancellation, hand motion scaling, and safety constraints that enable them to perform these otherwise…

    Submitted 4 December, 2023; originally announced December 2023.

  10. arXiv:2309.02706  [pdf, other]

    cs.CL

    HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models

    Authors: Guijin Son, Hanwool Lee, Suwan Kim, Huiseo Kim, Jaecheol Lee, Je Won Yeom, Jihyu Jung, Jung Woo Kim, Songseong Kim

    Abstract: Large language models (LLMs) trained on massive corpora demonstrate impressive capabilities in a wide range of tasks. While there are ongoing efforts to adapt these models to languages beyond English, the attention given to their evaluation methodologies remains limited. Current multilingual benchmarks often rely on back translations or re-implementations of English tests, limiting their capacity…

    Submitted 20 March, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: Accepted at LREC-COLING 2024

  11. arXiv:2306.17421  [pdf, other]

    cs.RO

    Micromanipulation in Surgery: Autonomous Needle Insertion Inside the Eye for Targeted Drug Delivery

    Authors: Ji Woong Kim, Peiyao Zhang, Peter Gehlbach, Iulian Iordachita, Marin Kobilarov

    Abstract: We consider a micromanipulation problem in eye surgery, specifically retinal vein cannulation (RVC). RVC involves inserting a microneedle into a retinal vein for the purpose of targeted drug delivery. The procedure requires accurately guiding a needle to a target vein and inserting it while avoiding damage to the surrounding tissues. RVC can be considered similar to the reach or push task studied…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: Experiment-oriented Locomotion and Manipulation Research, RSS 2023 workshop. arXiv admin note: text overlap with arXiv:2306.10133

  12. arXiv:2306.10133  [pdf, other]

    cs.RO

    Deep Learning Guided Autonomous Surgery: Guiding Small Needles into Sub-Millimeter Scale Blood Vessels

    Authors: Ji Woong Kim, Peiyao Zhang, Peter Gehlbach, Iulian Iordachita, Marin Kobilarov

    Abstract: We propose a general strategy for autonomous guidance and insertion of a needle into a retinal blood vessel. The main challenges underpinning this task are the accurate placement of the needle-tip on the target vein and a careful needle insertion maneuver to avoid double-puncturing the vein, while dealing with challenging kinematic constraints and depth-estimation uncertainty. Following how surgeo…

    Submitted 16 June, 2023; originally announced June 2023.

  13. arXiv:2306.10127  [pdf, other]

    cs.RO

    Towards Deep Learning Guided Autonomous Eye Surgery Using Microscope and iOCT Images

    Authors: Ji Woong Kim, Shuwen Wei, Peiyao Zhang, Peter Gehlbach, Jin U. Kang, Iulian Iordachita, Marin Kobilarov

    Abstract: Recent advancements in retinal surgery have paved the way for a modern operating room equipped with a surgical robot, a microscope, and intraoperative optical coherence tomography (iOCT), a depth sensor widely used in retinal surgery. Integrating these tools raises the fundamental question of how to effectively combine them to enable surgical autonomy. In this work, we tackle this question by deve…

    Submitted 27 July, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: pending submission to a journal

  14. arXiv:2306.06461  [pdf]

    eess.AS cs.SD

    Semi-supervsied Learning-based Sound Event Detection using Freuqency Dynamic Convolution with Large Kernel Attention for DCASE Challenge 2023 Task 4

    Authors: Ji Won Kim, Sang Won Son, Yoonah Song, Hong Kook Kim, Il Hoon Song, Jeong Eun Lim

    Abstract: This report proposes a frequency dynamic convolution (FDY) with a large kernel attention (LKA)-convolutional recurrent neural network (CRNN) with a pre-trained bidirectional encoder representation from audio transformers (BEATs) embedding-based sound event detection (SED) model that employs a mean-teacher and pseudo-label approach to address the challenge of limited labeled data for DCASE 2023 Tas…

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: DCASE 2023 Challenge Task 4A, 5 pages

  15. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  16. Autonomous Needle Navigation in Retinal Microsurgery: Evaluation in ex vivo Porcine Eyes

    Authors: Peiyao Zhang, Ji Woong Kim, Peter Gehlbach, Iulian Iordachita, Marin Kobilarov

    Abstract: Important challenges in retinal microsurgery include prolonged operating time, inadequate force feedback, and poor depth perception due to a constrained top-down view of the surgery. The introduction of robot-assisted technology could potentially deal with such challenges and improve the surgeon's performance. Motivated by such challenges, this work develops a strategy for autonomous needle naviga…

    Submitted 27 January, 2023; originally announced January 2023.

  17. arXiv:2301.02064  [pdf, other]

    cs.CV cs.AI

    Single-round Self-supervised Distributed Learning using Vision Transformer

    Authors: Sangjoon Park, Ik-Jae Lee, Jun Won Kim, Jong Chul Ye

    Abstract: Despite the recent success of deep learning in the field of medicine, the issue of data scarcity is exacerbated by concerns about privacy and data ownership. Distributed learning approaches, including federated learning, have been investigated to address these issues. However, they are hindered by the need for cumbersome communication overheads and weaknesses in privacy protection. To tackle these…

    Submitted 15 April, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

  18. arXiv:2212.04356  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Robust Speech Recognition via Large-Scale Weak Supervision

    Authors: Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, Ilya Sutskever

    Abstract: We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual and multitask supervision, the resulting models generalize well to standard benchmarks and are often competitive with prior fully supervised results but in a zero-shot transfer setting without the need for any fine-tuni…

    Submitted 6 December, 2022; originally announced December 2022.

  19. Modern Machine Learning Tools for Monitoring and Control of Industrial Processes: A Survey

    Authors: R. Bhushan Gopaluni, Aditya Tulsyan, Benoit Chachuat, Biao Huang, Jong Min Lee, Faraz Amjad, Seshu Kumar Damarla, Jong Woo Kim, Nathan P. Lawrence

    Abstract: Over the last ten years, we have seen a significant increase in industrial data, tremendous improvement in computational power, and major theoretical advances in machine learning. This opens up an opportunity to use modern machine learning tools on large-scale nonlinear monitoring and control problems. This article provides a survey of recent results with applications in the process industry.

    Submitted 22 September, 2022; originally announced September 2022.

    Comments: IFAC World Congress 2020

  20. arXiv:2209.01083  [pdf, other]

    cs.LG

    When Bioprocess Engineering Meets Machine Learning: A Survey from the Perspective of Automated Bioprocess Development

    Authors: Nghia Duong-Trung, Stefan Born, Jong Woo Kim, Marie-Therese Schermeyer, Katharina Paulick, Maxim Borisyak, Mariano Nicolas Cruz-Bournazou, Thorben Werner, Randolf Scholz, Lars Schmidt-Thieme, Peter Neubauer, Ernesto Martinez

    Abstract: Machine learning (ML) is becoming increasingly crucial in many fields of engineering but has not yet played out its full potential in bioprocess engineering. While experimentation has been accelerated by increasing levels of lab automation, experimental planning and data modeling still largely depend on human intervention. ML can be seen as a set of tools that contribute to the automation of…

    Submitted 1 November, 2022; v1 submitted 2 September, 2022; originally announced September 2022.

  21. arXiv:2208.09183  [pdf]

    cs.CV cs.AI

    Improved Image Classification with Token Fusion

    Authors: Keong Hun Choi, Jin Woo Kim, Yao Wang, Jong Eun Ha

    Abstract: In this paper, we propose a method using the fusion of CNN and transformer structure to improve image classification performance. In the case of CNN, information about a local area on an image can be extracted well, but there is a limit to the extraction of global information. On the other hand, the transformer has an advantage in relatively global extraction, but has a disadvantage in that it req…

    Submitted 19 August, 2022; originally announced August 2022.

  22. arXiv:2206.02222  [pdf, other]

    math.OC cs.GT cs.MA eess.SY

    How does a Rational Agent Act in an Epidemic?

    Authors: S. Yagiz Olmez, Shubham Aggarwal, Jin Won Kim, Erik Miehling, Tamer Başar, Matthew West, Prashant G. Mehta

    Abstract: Evolution of disease in a large population is a function of the top-down policy measures from a centralized planner, as well as the self-interested decisions (to be socially active) of individual agents in a large heterogeneous population. This paper is concerned with understanding the latter based on a mean-field type optimal control model. Specifically, the model is used to investigate the role…

    Submitted 5 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: text overlap with arXiv:2111.10422

  23. arXiv:2205.06468  [pdf, other]

    cs.CV

    Monocular Human Digitization via Implicit Re-projection Networks

    Authors: Min-Gyu Park, Ju-Mi Kang, Je Woo Kim, Ju Hong Yoon

    Abstract: We present an approach to generating 3D human models from images. The key to our framework is that we predict double-sided orthographic depth maps and color images from a single perspective projected image. Our framework consists of three networks. The first network predicts normal maps to recover geometric details such as wrinkles in the clothes and facial regions. The second network predicts sha…

    Submitted 15 May, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

    Comments: Presented at CVRRW (AI for Content Creation workshop) 2022

  24. arXiv:2202.03245  [pdf, other]

    cs.CR

    Scalable Multi-Party Privacy-Preserving Gradient Tree Boosting over Vertically Partitioned Dataset with Outsourced Computations

    Authors: Kennedy Edemacu, Beakcheol Jang, Jong Wook Kim

    Abstract: Due to privacy concerns, multi-party gradient tree boosting algorithms have become widely popular amongst machine learning researchers and practitioners. However, limited existing works have focused on vertically partitioned datasets, and the few existing works are either not scalable or tend to leak information. Thus, in this work, we propose SSXGB which is a scalable and secure multi-party gradi…

    Submitted 7 February, 2022; originally announced February 2022.

  25. arXiv:2201.10005  [pdf, other]

    cs.CL cs.LG

    Text and Code Embeddings by Contrastive Pre-Training

    Authors: Arvind Neelakantan, Tao Xu, Raul Puri, Alec Radford, Jesse Michael Han, Jerry Tworek, Qiming Yuan, Nikolas Tezak, Jong Wook Kim, Chris Hallacy, Johannes Heidecke, Pranav Shyam, Boris Power, Tyna Eloundou Nekoul, Girish Sastry, Gretchen Krueger, David Schnurr, Felipe Petroski Such, Kenny Hsu, Madeleine Thompson, Tabarak Khan, Toki Sherbakov, Joanne Jang, Peter Welinder, Lilian Weng

    Abstract: Text embeddings are useful features in many applications such as semantic search and computing text similarity. Previous work typically trains models customized for different use cases, varying in dataset choice, training objective and model architecture. In this work, we show that contrastive pre-training on unsupervised data at scale leads to high quality vector representations of text and code.…

    Submitted 24 January, 2022; originally announced January 2022.

  26. arXiv:2111.10422  [pdf, ps, other]

    math.OC cs.GT

    Modeling Presymptomatic Spread in Epidemics via Mean-Field Games

    Authors: S. Yagiz Olmez, Shubham Aggarwal, Jin Won Kim, Erik Miehling, Tamer Başar, Matthew West, Prashant G. Mehta

    Abstract: This paper is concerned with developing mean-field game models for the evolution of epidemics. Specifically, an agent's decision -- to be socially active in the midst of an epidemic -- is modeled as a mean-field game with health-related costs and activity-related rewards. By considering the fully and partially observed versions of this problem, the role of information in guiding an agent's rationa…

    Submitted 19 November, 2021; originally announced November 2021.

  27. arXiv:2109.01903  [pdf, other]

    cs.CV cs.LG

    Robust fine-tuning of zero-shot models

    Authors: Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gontijo-Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, Ludwig Schmidt

    Abstract: Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning methods substantially improve accuracy on a given target distribution, they often reduce robustness to distribution shifts. We address this tension by introducing a simple a…

    Submitted 21 June, 2022; v1 submitted 4 September, 2021; originally announced September 2021.

    Comments: CVPR 2022

  28. arXiv:2108.02818  [pdf, other]

    cs.CV cs.AI cs.CY

    Evaluating CLIP: Towards Characterization of Broader Capabilities and Downstream Implications

    Authors: Sandhini Agarwal, Gretchen Krueger, Jack Clark, Alec Radford, Jong Wook Kim, Miles Brundage

    Abstract: Recently, there have been breakthroughs in computer vision ("CV") models that are more generalizable with the advent of models such as CLIP and ALIGN. In this paper, we analyze CLIP and highlight some of the challenges such models pose. CLIP reduces the need for task specific training data, potentially opening up many niche tasks to automation. CLIP also allows its users to flexibly specify image…

    Submitted 5 August, 2021; originally announced August 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2103.00020

  29. arXiv:2103.00020  [pdf, other]

    cs.CV cs.LG

    Learning Transferable Visual Models From Natural Language Supervision

    Authors: Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever

    Abstract: State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstr…

    Submitted 26 February, 2021; originally announced March 2021.

  30. arXiv:2101.05504  [pdf, other]

    cs.LG

    Reliability Check via Weight Similarity in Privacy-Preserving Multi-Party Machine Learning

    Authors: Kennedy Edemacu, Beakcheol Jang, Jong Wook Kim

    Abstract: Multi-party machine learning is a paradigm in which multiple participants collaboratively train a machine learning model to achieve a common learning objective without sharing their privately owned data. The paradigm has recently received a lot of attention from the research community aimed at addressing its associated privacy concerns. In this work, we focus on addressing the concerns of data pri…

    Submitted 14 January, 2021; originally announced January 2021.

    Journal ref: Information Sciences, Volume 574, October 2021, Pages 51-65

  31. arXiv:2011.07785  [pdf, other]

    cs.RO cs.AI

    Autonomously Navigating a Surgical Tool Inside the Eye by Learning from Demonstration

    Authors: Ji Woong Kim, Changyan He, Muller Urias, Peter Gehlbach, Gregory D. Hager, Iulian Iordachita, Marin Kobilarov

    Abstract: A fundamental challenge in retinal surgery is safely navigating a surgical tool to a desired goal position on the retinal surface while avoiding damage to surrounding tissues, a procedure that typically requires tens-of-microns accuracy. In practice, the surgeon relies on depth-estimation skills to localize the tool-tip with respect to the retina in order to perform the tool-navigation task, which…

    Submitted 16 November, 2020; originally announced November 2020.

    Comments: Accepted to ICRA 2020

  32. arXiv:2011.07778  [pdf, other]

    cs.RO

    Towards Autonomous Eye Surgery by Combining Deep Imitation Learning with Optimal Control

    Authors: Ji Woong Kim, Peiyao Zhang, Peter Gehlbach, Iulian Iordachita, Marin Kobilarov

    Abstract: During retinal microsurgery, precise manipulation of the delicate retinal tissue is required for positive surgical outcome. However, accurate manipulation and navigation of surgical tools remain difficult due to a constrained workspace and the top-down view during the surgery, which limits the surgeon's ability to estimate depth. To alleviate such difficulty, we propose to automate the tool-naviga…

    Submitted 16 November, 2020; originally announced November 2020.

    Comments: Accepted to Conference on Robot Learning (CoRL) 2020

  33. arXiv:2005.00341  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Jukebox: A Generative Model for Music

    Authors: Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever

    Abstract: We introduce Jukebox, a model that generates music with singing in the raw audio domain. We tackle the long context of raw audio using a multi-scale VQ-VAE to compress it to discrete codes, and modeling those using autoregressive Transformers. We show that the combined model at scale can generate high-fidelity and diverse songs with coherence up to multiple minutes. We can condition on artist and…

    Submitted 30 April, 2020; originally announced May 2020.

  34. arXiv:1910.08705  [pdf]

    eess.IV cs.CV

    Attention Guided Metal Artifact Correction in MRI using Deep Neural Networks

    Authors: Jee Won Kim, Kinam Kwon, Byungjai Kim, HyunWook Park

    Abstract: An attention guided scheme for metal artifact correction in MRI using deep neural network is proposed in this paper. The inputs of the networks are two distorted images obtained with dual-polarity readout gradients. With MR image generation module and the additional data consistency loss to the previous work [1], the network is trained to estimate the frequency-shift map, off-resonance map, and at…

    Submitted 19 October, 2019; originally announced October 2019.

    Comments: 6 pages, 5 figures

    Journal ref: ICCV 2019 Workshop on Interpreting and Explaining Visual Artificial Intelligence Models

  35. Successive Point-of-Interest Recommendation with Local Differential Privacy

    Authors: Jong Seon Kim, Jong Wook Kim, Yon Dohn Chung

    Abstract: A point-of-interest (POI) recommendation system performs an important role in location-based services because it can help people to explore new locations and promote advertisers to launch advertisements at appropriate locations. The existing POI recommendation systems require raw check-in history of users, which might cause location privacy violations. Although there have been several matrix facto…

    Submitted 9 May, 2021; v1 submitted 26 August, 2019; originally announced August 2019.

    Comments: This paper has been accepted to IEEE Access

  36. arXiv:1908.09203  [pdf]

    cs.CL cs.AI cs.CY

    Release Strategies and the Social Impacts of Language Models

    Authors: Irene Solaiman, Miles Brundage, Jack Clark, Amanda Askell, Ariel Herbert-Voss, Jeff Wu, Alec Radford, Gretchen Krueger, Jong Wook Kim, Sarah Kreps, Miles McCain, Alex Newhouse, Jason Blazakis, Kris McGuffie, Jasmine Wang

    Abstract: Large language models have a range of beneficial uses: they can assist in prose, poetry, and programming; analyze dataset biases; and more. However, their flexibility and generative capabilities also raise misuse concerns. This report discusses OpenAI's work related to the release of its GPT-2 language model. It discusses staged release, which allows time between model releases to conduct risk and…

    Submitted 12 November, 2019; v1 submitted 24 August, 2019; originally announced August 2019.

    Comments: 71 pages, report

    ACM Class: I.2; I.2.7; K.4

  37. arXiv:1906.08512  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Adversarial Learning for Improved Onsets and Frames Music Transcription

    Authors: Jong Wook Kim, Juan Pablo Bello

    Abstract: Automatic music transcription is considered to be one of the hardest problems in music information retrieval, yet recent deep learning approaches have achieved substantial improvements on transcription performance. These approaches commonly employ supervised learning models that predict various time-frequency representations, by minimizing element-wise losses such as the cross entropy function. Ho…

    Submitted 20 June, 2019; originally announced June 2019.

  38. arXiv:1905.11684  [pdf, other]

    cs.CL

    On Measuring Gender Bias in Translation of Gender-neutral Pronouns

    Authors: Won Ik Cho, Ji Won Kim, Seok Min Kim, Nam Soo Kim

    Abstract: Ethics regarding social bias has recently thrown striking issues in natural language processing. Especially for gender-related topics, the need for a system that reduces the model bias has grown in areas such as image captioning, content recommendation, and automated employment. However, detection and evaluation of gender bias in the machine translation systems are not yet thoroughly investigated,…

    Submitted 28 May, 2019; originally announced May 2019.

    Comments: Accepted to 1st ACL Workshop on Gender Bias for Natural Language Processing (GeBNLP 2019)

  39. arXiv:1811.00223  [pdf, other]

    cs.SD eess.AS stat.ML

    Neural Music Synthesis for Flexible Timbre Control

    Authors: Jong Wook Kim, Rachel Bittner, Aparna Kumar, Juan Pablo Bello

    Abstract: The recent success of raw audio waveform synthesis models like WaveNet motivates a new approach for music synthesis, in which the entire process --- creating audio samples from a score and instrument information --- is modeled using generative neural networks. This paper describes a neural music synthesis model with flexible timbre controls, which consists of a recurrent neural network conditioned…

    Submitted 1 November, 2018; originally announced November 2018.

  40. Giving Space to Your Message: Assistive Word Segmentation for the Electronic Typing of Digital Minorities

    Authors: Won Ik Cho, Sung Jun Cheon, Woo Hyun Kang, Ji Won Kim, Nam Soo Kim

    Abstract: For readability and disambiguation of the written text, appropriate word segmentation is recommended for documentation, and it also holds for the digitized texts. If the language is agglutinative while far from scriptio continua, for instance in the Korean language, the problem becomes more significant. However, some device users these days find it challenging to communicate via key stroking, not…

    Submitted 4 May, 2021; v1 submitted 31 October, 2018; originally announced October 2018.

    Comments: DIS 2021 Camera-ready

  41. arXiv:1806.05805  [pdf, other]

    cs.LG stat.ML

    Molecular generative model based on conditional variational autoencoder for de novo molecular design

    Authors: Jaechang Lim, Seongok Ryu, Jin Woo Kim, Woo Youn Kim

    Abstract: We propose a molecular generative model based on the conditional variational autoencoder for de novo molecular design. It is specialized to control multiple molecular properties simultaneously by imposing them on a latent space. As a proof of concept, we demonstrate that it can be used to generate drug-like molecules with five target properties. We were also able to adjust a single property withou…

    Submitted 15 June, 2018; originally announced June 2018.

  42. arXiv:1802.06182  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    CREPE: A Convolutional Representation for Pitch Estimation

    Authors: Jong Wook Kim, Justin Salamon, Peter Li, Juan Pablo Bello

    Abstract: The task of estimating the fundamental frequency of a monophonic sound recording, also known as pitch tracking, is fundamental to audio processing with multiple applications in speech processing and music information retrieval. To date, the best performing techniques, such as the pYIN algorithm, are based on a combination of DSP pipelines and heuristics. While such techniques perform very well on…

    Submitted 16 February, 2018; originally announced February 2018.

    Comments: ICASSP 2018

  43. arXiv:1712.00010  [pdf, ps, other]

    cs.LG stat.ML

    Highrisk Prediction from Electronic Medical Records via Deep Attention Networks

    Authors: You Jin Kim, Yun-Geun Lee, Jeong Whun Kim, Jin Joo Park, Borim Ryu, Jung-Woo Ha

    Abstract: Predicting highrisk vascular diseases is a significant issue in the medical domain. Most predicting methods predict the prognosis of patients from pathological and radiological measurements, which are expensive and require much time to be analyzed. Here we propose deep attention models that predict the onset of the high risky vascular disease from symbolic medical histories sequence of hypertensio…

    Submitted 30 November, 2017; originally announced December 2017.

    Comments: Accepted poster at NIPS 2017 Workshop on Machine Learning for Health (https://ml4health.github.io/2017/)

  44. arXiv:1709.09625  [pdf, other]

    math.OC cs.LG

    How regularization affects the critical points in linear networks

    Authors: Amirhossein Taghvaei, Jin W. Kim, Prashant G. Mehta

    Abstract: This paper is concerned with the problem of representing and learning a linear transformation using a linear neural network. In recent years, there has been a growing interest in the study of such networks in part due to the successes of deep learning. The main question of this body of research and also of this paper pertains to the existence and optimality properties of the critical points of the…

    Submitted 27 September, 2017; originally announced September 2017.

  45. arXiv:1503.00477  [pdf]

    cs.SI cs.CY

    Behavioral Aspects of Social Network Analysis

    Authors: Sung Joo Park, Jong Woo Kim, Hong Joo Lee, Hyun Jung Park, Peter Gloor

    Abstract: Contrary to the structural aspect of conventional social network analysis, a new method in behavioral analysis is proposed. We define behavioral measures including self-loops and multiple links and illustrate the behavioral analysis with the networks of Wikipedia editing. Behavioral social network analysis provides an explanation of human behavior that may be further extended to the explanation of…

    Submitted 2 March, 2015; originally announced March 2015.

    Comments: Proceedings of the 5th International Conference on Collaborative Innovation Networks COINs15, Tokyo, Japan March 12-14, 2015 (arXiv:1502.01142)

    Report number: coins15/2015/34