Showing 1–38 of 38 results for author: Ko, M

  1. arXiv:2406.19502  [pdf, other]

    cs.CL cs.AI

    Investigating How Large Language Models Leverage Internal Knowledge to Perform Complex Reasoning

    Authors: Miyoung Ko, Sue Hyun Park, Joonsuk Park, Minjoon Seo

    Abstract: Despite significant advancements, there is a limited understanding of how large language models (LLMs) utilize knowledge for reasoning. To address this, we propose a method that deconstructs complex real-world questions into a graph, representing each question as a node with parent nodes of background knowledge needed to solve the question. We develop the DepthQA dataset, deconstructing questions…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Work in progress; code is available at https://github.com/kaistAI/knowledge-reasoning

  2. arXiv:2406.05761  [pdf, other]

    cs.CL

    The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

    Authors: Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Cho, Sheikh Shafayat, Jinheon Baek, Sue Hyun Park, Hyeonbin Hwang, Jinkyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang, et al. (7 additional authors not shown)

    Abstract: As language models (LMs) become capable of handling a wide range of tasks, their evaluation is becoming as challenging as their development. Most generation benchmarks currently assess LMs using abstract evaluation criteria like helpfulness and harmlessness, which often lack the flexibility and granularity of human assessment. Additionally, these benchmarks tend to focus disproportionately on spec…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Work in Progress

  3. arXiv:2405.01974  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Multitask Extension of Geometrically Aligned Transfer Encoder

    Authors: Sung Moon Ko, Sumin Lee, Dae-Woong Jeong, Hyunseung Kim, Chanhui Lee, Soorin Yim, Sehui Han

    Abstract: Molecular datasets often suffer from a lack of data. It is well-known that gathering data is difficult due to the complexity of experimentation or simulation involved. Here, we leverage mutual information across different tasks in molecular data to address this issue. We extend an algorithm that utilizes the geometric characteristics of the encoding space, known as the Geometrically Aligned Transf…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 7 pages, 3 figures, 2 tables

  4. arXiv:2404.13286  [pdf, other]

    cs.SD cs.IR eess.AS

    Track Role Prediction of Single-Instrumental Sequences

    Authors: Changheon Han, Suhyun Lee, Minsam Ko

    Abstract: In the composition process, selecting appropriate single-instrumental music sequences and assigning their track-role is an indispensable task. However, manually determining the track-role for a myriad of music samples can be time-consuming and labor-intensive. This study introduces a deep learning model designed to automatically predict the track-role of single-instrumental music sequences. Our ev…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: ISMIR LBD 2023

  5. arXiv:2404.10966  [pdf, other]

    cs.CV

    Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation

    Authors: Yeonguk Yu, Sungho Shin, Seunghyeok Back, Minhwan Ko, Sangjun Noh, Kyoobin Lee

    Abstract: Test-time adaptation (TTA) aims to adapt a pre-trained model to a new test domain without access to source data after deployment. Existing approaches typically rely on self-training with pseudo-labels since ground-truth cannot be obtained from test data. Although the quality of pseudo labels is important for stable and accurate long-term adaptation, it has not been previously addressed. In this wo…

    Submitted 7 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR 2024

  6. arXiv:2402.18923  [pdf, other]

    cs.CL cs.SD eess.AS

    Inappropriate Pause Detection In Dysarthric Speech Using Large-Scale Speech Recognition

    Authors: Jeehyun Lee, Yerin Choi, Tae-Jin Song, Myoung-Wan Koo

    Abstract: Dysarthria, a common issue among stroke patients, severely impacts speech intelligibility. Inappropriate pauses are crucial indicators in severity assessment and speech-language therapy. We propose to extend a large-scale speech recognition model for inappropriate pause detection in dysarthric speech. To this end, we propose task design, labeling strategy, and a speech recognition model with an in…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024

  7. arXiv:2402.08922  [pdf, other]

    cs.LG stat.ML

    The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes

    Authors: Myeongseob Ko, Feiyang Kang, Weiyan Shi, Ming Jin, Zhou Yu, Ruoxi Jia

    Abstract: Large-scale black-box models have become ubiquitous across numerous applications. Understanding the influence of individual training data sources on predictions made by these models is crucial for improving their trustworthiness. Current influence estimation techniques involve computing gradients for every training point or repeated training on different subsets. These approaches face obvious comp…

    Submitted 19 June, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Journal ref: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024

  8. arXiv:2401.14635  [pdf, other]

    cs.CR cs.SE

    Signing in Four Public Software Package Registries: Quantity, Quality, and Influencing Factors

    Authors: Taylor R Schorlemmer, Kelechi G Kalu, Luke Chigges, Kyung Myung Ko, Eman Abu Isghair, Saurabh Baghi, Santiago Torres-Arias, James C Davis

    Abstract: Many software applications incorporate open-source third-party packages distributed by public package registries. Guaranteeing authorship along this supply chain is a challenge. Package maintainers can guarantee package authorship through software signing. However, it is unclear how common this practice is, and whether the resulting signatures are created properly. Prior work has provided raw data…

    Submitted 14 April, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted at IEEE Security & Privacy 2024 (S&P'24)

  9. arXiv:2312.02531  [pdf, other]

    cs.RO cs.AI

    PolyFit: A Peg-in-hole Assembly Framework for Unseen Polygon Shapes via Sim-to-real Adaptation

    Authors: Geonhyup Lee, Joosoon Lee, Sangjun Noh, Minhwan Ko, Kangmin Kim, Kyoobin Lee

    Abstract: The study addresses the foundational and challenging task of peg-in-hole assembly in robotics, where misalignments caused by sensor inaccuracies and mechanical errors often result in insertion failures or jamming. This research introduces PolyFit, representing a paradigm shift by transitioning from a reinforcement learning approach to a supervised learning methodology. PolyFit is a Force/Torque (F…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 8 pages, 8 figures, 3 tables

  10. arXiv:2311.08329  [pdf, other]

    cs.CL

    KTRL+F: Knowledge-Augmented In-Document Search

    Authors: Hanseok Oh, Haebin Shin, Miyoung Ko, Hyunji Lee, Minjoon Seo

    Abstract: We introduce a new problem KTRL+F, a knowledge-augmented in-document search task that necessitates real-time identification of all semantic targets within a document with the awareness of external sources through a single natural query. KTRL+F addresses following unique challenges for in-document search: 1)utilizing knowledge outside the document for extended use of additional information about ta…

    Submitted 18 April, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

  11. arXiv:2310.06369  [pdf, other]

    cs.AI cs.LG

    Geometrically Aligned Transfer Encoder for Inductive Transfer in Regression Tasks

    Authors: Sung Moon Ko, Sumin Lee, Dae-Woong Jeong, Woohyung Lim, Sehui Han

    Abstract: Transfer learning is a crucial technique for handling a small amount of data that is potentially related to other abundant data. However, most of the existing methods are focused on classification tasks using images and language datasets. Therefore, in order to expand the transfer learning scheme to regression tasks, we propose a novel transfer technique based on differential geometry, namely the…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: 12+11 pages, 6+1 figures, 0+7 tables

  12. arXiv:2310.00108  [pdf, other]

    cs.LG cs.CV

    Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study

    Authors: Myeongseob Ko, Ming Jin, Chenguang Wang, Ruoxi Jia

    Abstract: Membership inference attacks (MIAs) aim to infer whether a data point has been used to train a machine learning model. These attacks can be employed to identify potential privacy vulnerabilities and detect unauthorized use of personal data. While MIAs have been traditionally studied for simple classification models, recent advancements in multi-modal pre-training, such as CLIP, have demonstrated r…

    Submitted 29 September, 2023; originally announced October 2023.

    Comments: International Conference on Computer Vision (ICCV) 2023

  13. arXiv:2309.04062  [pdf, other]

    cs.LG cs.AI physics.chem-ph

    3D Denoisers are Good 2D Teachers: Molecular Pretraining via Denoising and Cross-Modal Distillation

    Authors: Sungjun Cho, Dae-Woong Jeong, Sung Moon Ko, Jinwoo Kim, Sehui Han, Seunghoon Hong, Honglak Lee, Moontae Lee

    Abstract: Pretraining molecular representations from large unlabeled data is essential for molecular property prediction due to the high cost of obtaining ground-truth labels. While there exist various 2D graph-based molecular pretraining approaches, these methods struggle to show statistically significant gains in predictive performance. Recent work have thus instead proposed 3D conformer-based pretraining…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: 16 pages, 5 figures

  14. arXiv:2308.04709  [pdf, other]

    cs.CL

    A Comparative Study of Open-Source Large Language Models, GPT-4 and Claude 2: Multiple-Choice Test Taking in Nephrology

    Authors: Sean Wu, Michael Koo, Lesley Blum, Andy Black, Liyo Kao, Fabien Scalzo, Ira Kurtz

    Abstract: In recent years, there have been significant breakthroughs in the field of natural language processing, particularly with the development of large language models (LLMs). These LLMs have showcased remarkable capabilities on various benchmarks. In the healthcare field, the exact role LLMs and other future AI models will play remains unclear. There is a potential for these models in the future to be…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: 7 pages, 3 figures, 1 table

  15. arXiv:2308.01573  [pdf]

    cs.SD cs.LG eess.AS

    Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS

    Authors: Myeongjin Ko, Yong-Hoon Choi

    Abstract: The diffusion model is capable of generating high-quality data through a probabilistic approach. However, it suffers from the drawback of slow generation speed due to the requirement of a large number of time steps. To address this limitation, recent models such as denoising diffusion implicit models (DDIM) focus on generating samples without directly modeling the probability distribution, while m…

    Submitted 3 August, 2023; originally announced August 2023.

    Journal ref: IEEE Open Journal of Signal Processing, vol. 5, pp. 577-587, 2024

  16. arXiv:2306.09020  [pdf, other]

    math.OC cs.PF

    Distributionally Robust Stratified Sampling for Stochastic Simulations with Multiple Uncertain Input Models

    Authors: Seung Min Baik, Eunshin Byon, Young Myoung Ko

    Abstract: This paper presents a robust version of the stratified sampling method when multiple uncertain input models are considered for stochastic simulation. Various variance reduction techniques have demonstrated their superior performance in accelerating simulation processes. Nevertheless, they often use a single input model and further assume that the input model is exactly known and fixed. We consider…

    Submitted 15 June, 2023; originally announced June 2023.

  17. arXiv:2305.19567  [pdf, other]

    cs.SD cs.CL eess.AS

    DC CoMix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer

    Authors: Yerin Choi, Myoung-Wan Koo

    Abstract: Despite the huge successes made in neutral TTS, content-leakage remains a challenge. In this paper, we propose a new input representation and simple architecture to achieve improved prosody modeling. Inspired by the recent success in the use of discrete code in TTS, we introduce discrete code to the input of the reference encoder. Specifically, we leverage the vector quantizer from the audio compr…

    Submitted 28 June, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted in Interspeech 2023

  18. arXiv:2305.02468  [pdf, other]

    cs.CL

    Task-Optimized Adapters for an End-to-End Task-Oriented Dialogue System

    Authors: Namo Bang, Jeehyun Lee, Myoung-Wan Koo

    Abstract: Task-Oriented Dialogue (TOD) systems are designed to carry out specific tasks by tracking dialogue states and generating appropriate responses to help users achieve defined goals. Recently, end-to-end dialogue models pre-trained based on large datasets have shown promising performance in the conversational system. However, they share the same parameters to train tasks of the dialogue system (NLU,…

    Submitted 31 May, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL2023

  19. arXiv:2305.00054  [pdf, other]

    cs.LG cs.AI stat.ML

    LAVA: Data Valuation without Pre-Specified Learning Algorithms

    Authors: Hoang Anh Just, Feiyang Kang, Jiachen T. Wang, Yi Zeng, Myeongseob Ko, Ming Jin, Ruoxi Jia

    Abstract: Traditionally, data valuation (DV) is posed as a problem of equitably splitting the validation performance of a learning algorithm among the training data. As a result, the calculated data values depend on many design choices of the underlying learning algorithm. However, this dependence is undesirable for many DV use cases, such as setting priorities over different data sources in a data acquisit…

    Submitted 19 December, 2023; v1 submitted 28 April, 2023; originally announced May 2023.

    Comments: ICLR 2023 Spotlight Latest Updated Version: 2023/12/19

  20. arXiv:2301.09789  [pdf, other]

    cs.SE

    A Qualitative Study on the Implementation Design Decisions of Developers

    Authors: Jenny T. Liang, Maryam Arab, Minhyuk Ko, Amy J. Ko, Thomas D. LaToza

    Abstract: Decision-making is a key software engineering skill. Developers constantly make choices throughout the software development process, from requirements to implementation. While prior work has studied developer decision-making, the choices made while choosing what solution to write in code remain understudied. In this mixed-methods study, we examine the phenomenon where developers select one specifi…

    Submitted 23 January, 2023; originally announced January 2023.

  21. Grouping-matrix based Graph Pooling with Adaptive Number of Clusters

    Authors: Sung Moon Ko, Sungjun Cho, Dae-Woong Jeong, Sehui Han, Moontae Lee, Honglak Lee

    Abstract: Graph pooling is a crucial operation for encoding hierarchical structures within graphs. Most existing graph pooling approaches formulate the problem as a node clustering task which effectively captures the graph topology. Conventional methods ask users to specify an appropriate number of clusters as a hyperparameter, then assume that all input graphs share the same number of clusters. In inductiv…

    Submitted 7 September, 2022; originally announced September 2022.

    Comments: 10 pages, 3 figures

  22. arXiv:2208.06882  [pdf, other]

    cs.CV

    CoShNet: A Hybrid Complex Valued Neural Network using Shearlets

    Authors: Manny Ko, Ujjawal K. Panchal, Héctor Andrade-Loarca, Andres Mendez-Vazquez

    Abstract: In a hybrid neural network, the expensive convolutional layers are replaced by a non-trainable fixed transform with a great reduction in parameters. In previous works, good results were obtained by replacing the convolutions with wavelets. However, wavelet based hybrid network inherited wavelet's lack of vanishing moments along curves and its axis-bias. We propose to use Shearlets with its robust…

    Submitted 29 October, 2022; v1 submitted 14 August, 2022; originally announced August 2022.

    Comments: 16 pages, 11 figures

  23. arXiv:2205.12221  [pdf, other]

    cs.CL

    ClaimDiff: Comparing and Contrasting Claims on Contentious Issues

    Authors: Miyoung Ko, Ingyu Seong, Hwaran Lee, Joonsuk Park, Minsuk Chang, Minjoon Seo

    Abstract: With the growing importance of detecting misinformation, many studies have focused on verifying factual claims by retrieving evidence. However, canonical fact verification tasks do not apply to catching subtle differences in factually consistent claims, which might still bias the readers, especially on contentious political or economic issues. Our underlying assumption is that among the trusted so…

    Submitted 11 June, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: published at Findings of ACL 2023

  24. arXiv:2105.10477  [pdf]

    cs.CV eess.IV q-bio.QM

    Towards Realization of Augmented Intelligence in Dermatology: Advances and Future Directions

    Authors: Roxana Daneshjou, Carrie Kovarik, Justin M Ko

    Abstract: Artificial intelligence (AI) algorithms using deep learning have advanced the classification of skin disease images; however these algorithms have been mostly applied "in silico" and not validated clinically. Most dermatology AI algorithms perform binary classification tasks (e.g. malignancy versus benign lesions), but this task is not representative of dermatologists' diagnostic range. The Americ…

    Submitted 21 May, 2021; originally announced May 2021.

    Comments: 5 pages, no figures

  25. IMPULSE: A 65nm Digital Compute-in-Memory Macro with Fused Weights and Membrane Potential for Spike-based Sequential Learning Tasks

    Authors: Amogh Agrawal, Mustafa Ali, Minsuk Koo, Nitin Rathi, Akhilesh Jaiswal, Kaushik Roy

    Abstract: The inherent dynamics of the neuron membrane potential in Spiking Neural Networks (SNNs) allows processing of sequential learning tasks, avoiding the complexity of recurrent neural networks. The highly-sparse spike-based computations in such spatio-temporal data can be leveraged for energy-efficiency. However, the membrane potential incurs additional memory access bottlenecks in current SNN hardwa…

    Submitted 17 May, 2021; originally announced May 2021.

  26. arXiv:2102.01932  [pdf, other]

    cs.RO

    Roughly Collected Dataset for Contact Force Sensing Catheter

    Authors: Seunghyuk Cho, Minsoo Koo, Dongwoo Kim, Juyong Lee, Yeonwoo Jung, Kibyung Nam, Changmo Hwang

    Abstract: With rise of interventional cardiology, Catheter Ablation Therapy (CAT) has established itself as a first-line solution to treat cardiac arrhythmia. Although CAT is a promising technique, cardiologist lacks vision inside the body during the procedure, which may cause serious clinical syndromes. To support accurate clinical procedure, Contact Force Sensing (CFS) system is developed to find a positi…

    Submitted 3 February, 2021; originally announced February 2021.

    Comments: 7 pages, 6 figures

  27. arXiv:2010.02086  [pdf, other]

    cs.CV cs.CY cs.LG eess.SP

    TrueImage: A Machine Learning Algorithm to Improve the Quality of Telehealth Photos

    Authors: Kailas Vodrahalli, Roxana Daneshjou, Roberto A Novoa, Albert Chiou, Justin M Ko, James Zou

    Abstract: Telehealth is an increasingly critical component of the health care ecosystem, especially due to the COVID-19 pandemic. Rapid adoption of telehealth has exposed limitations in the existing infrastructure. In this paper, we study and highlight photo quality as a major challenge in the telehealth workflow. We focus on teledermatology, where photo quality is particularly important; the framework prop…

    Submitted 1 October, 2020; originally announced October 2020.

    Comments: 12 pages, 5 figures, Preprint of an article published in Pacific Symposium on Biocomputing © 2020 World Scientific Publishing Co., Singapore, http://psb.stanford.edu/

  28. arXiv:2007.09610  [pdf, other]

    eess.IV cs.CV cs.LG

    Self-similarity Student for Partial Label Histopathology Image Segmentation

    Authors: Hsien-Tzu Cheng, Chun-Fu Yeh, Po-Chen Kuo, Andy Wei, Keng-Chi Liu, Mong-Chi Ko, Kuan-Hua Chao, Yu-Ching Peng, Tyng-Luh Liu

    Abstract: Delineation of cancerous regions in gigapixel whole slide images (WSIs) is a crucial diagnostic procedure in digital pathology. This process is time-consuming because of the large search space in the gigapixel WSIs, causing chances of omission and misinterpretation at indistinct tumor lesions. To tackle this, the development of an automated cancerous region segmentation method is imperative. We fr…

    Submitted 19 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  29. arXiv:2006.15830  [pdf, other]

    cs.CL cs.AI cs.LG

    Answering Questions on COVID-19 in Real-Time

    Authors: Jinhyuk Lee, Sean S. Yi, Minbyul Jeong, Mujeen Sung, Wonjin Yoon, Yonghwa Choi, Miyoung Ko, Jaewoo Kang

    Abstract: The recent outbreak of the novel coronavirus is wreaking havoc on the world and researchers are struggling to effectively combat it. One reason why the fight is difficult is due to the lack of information and knowledge. In this work, we outline our effort to contribute to shrinking this knowledge vacuum by creating covidAsk, a question answering (QA) system that combines biomedical text mining and…

    Submitted 9 October, 2020; v1 submitted 29 June, 2020; originally announced June 2020.

    Comments: 10 pages, EMNLP NLP-COVID Workshop 2020

  30. arXiv:2004.14602  [pdf, other]

    cs.CL

    Look at the First Sentence: Position Bias in Question Answering

    Authors: Miyoung Ko, Jinhyuk Lee, Hyunjae Kim, Gangwoo Kim, Jaewoo Kang

    Abstract: Many extractive question answering models are trained to predict start and end positions of answers. The choice of predicting answers as positions is mainly due to its simplicity and effectiveness. In this study, we hypothesize that when the distribution of the answer positions is highly skewed in the training set (e.g., answers lie only in the k-th sentence of each passage), QA models predicting…

    Submitted 8 March, 2021; v1 submitted 30 April, 2020; originally announced April 2020.

    Comments: 13 pages, EMNLP 2020

  31. arXiv:2004.12786  [pdf, other]

    eess.IV cs.CV cs.LG

    A Cascaded Learning Strategy for Robust COVID-19 Pneumonia Chest X-Ray Screening

    Authors: Chun-Fu Yeh, Hsien-Tzu Cheng, Andy Wei, Hsin-Ming Chen, Po-Chen Kuo, Keng-Chi Liu, Mong-Chi Ko, Ray-Jade Chen, Po-Chang Lee, Jen-Hsiang Chuang, Chi-Mai Chen, Yi-Chang Chen, Wen-Jeng Lee, Ning Chien, Jo-Yu Chen, Yu-Sen Huang, Yu-Chien Chang, Yu-Cheng Huang, Nai-Kuan Chou, Kuan-Hua Chao, Yi-Chin Tu, Yeun-Chung Chang, Tyng-Luh Liu

    Abstract: We introduce a comprehensive screening platform for the COVID-19 (a.k.a., SARS-CoV-2) pneumonia. The proposed AI-based system works on chest x-ray (CXR) images to predict whether a patient is infected with the COVID-19 disease. Although the recent international joint effort on making the availability of all sorts of open data, the public collection of CXR images is still relatively small for relia…

    Submitted 30 April, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: 14 pages, 6 figures

  32. arXiv:2002.11163  [pdf, other]

    cs.ET cs.AR

    sBSNN: Stochastic-Bits Enabled Binary Spiking Neural Network with On-Chip Learning for Energy Efficient Neuromorphic Computing at the Edge

    Authors: Minsuk Koo, Gopalakrishnan Srinivasan, Yong Shim, Kaushik Roy

    Abstract: In this work, we propose stochastic Binary Spiking Neural Network (sBSNN) composed of stochastic spiking neurons and binary synapses (stochastic only during training) that computes probabilistically with one-bit precision for power-efficient and memory-compressed neuromorphic computing. We present an energy-efficient implementation of the proposed sBSNN using 'stochastic bit' as the core computati…

    Submitted 25 February, 2020; originally announced February 2020.

  33. QoS-aware energy-efficient workload routing and server speed control policy in data centers: a robust queueing theoretic approach

    Authors: Seung Min Baik, Young Myoung Ko

    Abstract: Operating cloud service infrastructures requires high energy efficiency while ensuring a satisfactory service level. Motivated by data centers, we consider a workload routing and server speed control policy applicable to the system operating under fluctuating demands. Dynamic control algorithms are generally more energy-efficient than static ones. However, they often require frequent information e…

    Submitted 3 March, 2023; v1 submitted 20 December, 2019; originally announced December 2019.

    Journal ref: IISE Transactions, 2023

  34. arXiv:1905.13130  [pdf, other]

    cs.IR cs.LG stat.ML

    SAIN: Self-Attentive Integration Network for Recommendation

    Authors: Seoungjun Yun, Raehyun Kim, Miyoung Ko, Jaewoo Kang

    Abstract: With the growing importance of personalized recommendation, numerous recommendation models have been proposed recently. Among them, Matrix Factorization (MF) based models are the most widely used in the recommendation field due to their high performance. However, MF based models suffer from cold start problems where user-item interactions are sparse. To deal with this problem, content based recomm…

    Submitted 6 November, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: SIGIR 2019

  35. arXiv:1901.07031  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

    Authors: Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, Silviana Ciurea-Ilcus, Chris Chute, Henrik Marklund, Behzad Haghgoo, Robyn Ball, Katie Shpanskaya, Jayne Seekins, David A. Mong, Safwan S. Halabi, Jesse K. Sandberg, Ricky Jones, David B. Larson, Curtis P. Langlotz, Bhavik N. Patel, Matthew P. Lungren, Andrew Y. Ng

    Abstract: Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We invest…

    Submitted 21 January, 2019; originally announced January 2019.

    Comments: Published in AAAI 2019

  36. Stabilizing the virtual response time in single-server processor sharing queues with slowly time-varying arrival rates

    Authors: Yongkyu Cho, Young Myoung Ko

    Abstract: Motivated by the work of Whitt, who studied stabilization of the mean virtual waiting time (excluding service time) in a $GI_t/GI_t/1/FCFS$ queue, this paper investigates the stabilization of the mean virtual response time in a single-server processor sharing (PS) queueing system with a time-varying arrival rate and a service rate control (a $GI_t/GI_t/1/PS$ queue). We propose and compare a modifi…

    Submitted 5 November, 2018; originally announced November 2018.

    Journal ref: Annals of Operations Research, 293 (2020), 27-55

  37. arXiv:1810.00494  [pdf, other]

    cs.CL cs.LG

    Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering

    Authors: Jinhyuk Lee, Seongjun Yun, Hyunjae Kim, Miyoung Ko, Jaewoo Kang

    Abstract: Recently, open-domain question answering (QA) has been combined with machine comprehension models to find answers in a large knowledge source. As open-domain QA requires retrieving relevant documents from text corpora to answer questions, its performance largely depends on the performance of document retrievers. However, since traditional information retrieval systems are not effective in obtainin…

    Submitted 30 September, 2018; originally announced October 2018.

    Comments: EMNLP 2018

  38. An Efficient Two-Stage Sparse Representation Method

    Authors: Chengyu Peng, Hong Cheng, Manchor Ko

    Abstract: There are a large number of methods for solving under-determined linear inverse problem. Many of them have very high time complexity for large datasets. We propose a new method called Two-Stage Sparse Representation (TSSR) to tackle this problem. We decompose the representing space of signals into two parts, the measurement dictionary and the sparsifying basis. The dictionary is designed to approx…

    Submitted 25 July, 2014; v1 submitted 3 April, 2014; originally announced April 2014.

    Comments: 21 pages, 2 figures, 4 tables

    ACM Class: G.1.6; I.4.10