Showing 1–27 of 27 results for author: Sim, J

  1. arXiv:2406.19707  [pdf, other]

    cs.LG cs.DC

    InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management

    Authors: Wonbeom Lee, Jungi Lee, Junghwan Seo, Jaewoong Sim

    Abstract: Transformer-based large language models (LLMs) demonstrate impressive performance across various natural language processing tasks. Serving LLM inference for generating long contents, however, poses a challenge due to the enormous memory footprint of the transient state, known as the key-value (KV) cache, which scales with the sequence length and batch size. In this paper, we present InfiniGen, a…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: OSDI 2024

  2. arXiv:2406.12930  [pdf, other]

    cs.LG cs.AR

    Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization

    Authors: Jungi Lee, Wonbeom Lee, Jaewoong Sim

    Abstract: Large language models (LLMs) demonstrate outstanding performance in various tasks in machine learning and have thus become one of the most important workloads in today's computing landscape. However, deploying LLM inference poses challenges due to the high compute and memory requirements stemming from the enormous model size and the difficulty of running it in the integer pipelines. In this paper,…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: To appear at the 51st International Symposium on Computer Architecture (ISCA 2024)

  3. arXiv:2405.18832  [pdf, other]

    cs.LG cs.AI cs.AR

    MoNDE: Mixture of Near-Data Experts for Large-Scale Sparse Models

    Authors: Taehyun Kim, Kwanseok Choi, Youngmock Cho, Jaehoon Cho, Hyuk-Jae Lee, Jaewoong Sim

    Abstract: Mixture-of-Experts (MoE) large language models (LLMs) have memory requirements that often exceed the GPU memory capacity, requiring costly parameter movement from secondary memories to the GPU for expert computation. In this work, we present Mixture of Near-Data Experts (MoNDE), a near-data computing solution that efficiently enables MoE LLM inference. MoNDE reduces the volume of MoE parameter move…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted to DAC 2024

  4. arXiv:2404.01752  [pdf, other]

    cs.RO cs.AI cs.MA

    Safe Interval RRT* for Scalable Multi-Robot Path Planning in Continuous Space

    Authors: Joonyeol Sim, Joonkyung Kim, Changjoo Nam

    Abstract: In this paper, we consider the problem of Multi-Robot Path Planning (MRPP) in continuous space to find conflict-free paths. The difficulty of the problem arises from two primary factors. First, the involvement of multiple robots leads to combinatorial decision-making, which escalates the search space exponentially. Second, the continuous space presents potentially infinite states and actions. For…

    Submitted 2 April, 2024; originally announced April 2024.

  5. arXiv:2404.00626  [pdf, other]

    cs.CV

    Domain Generalizable Person Search Using Unreal Dataset

    Authors: Minyoung Oh, Duhyun Kim, Jae-Young Sim

    Abstract: Collecting and labeling real datasets to train the person search networks not only requires a lot of time and effort, but also accompanies privacy issues. The weakly-supervised and unsupervised domain adaptation methods have been proposed to alleviate the labeling burden for target datasets, however, their generalization capability is limited. We introduce a novel person search method based on the…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: AAAI2024 accepted

  6. arXiv:2403.10022  [pdf, other]

    cs.CV

    Lifelong Person Re-Identification with Backward-Compatibility

    Authors: Minyoung Oh, Jae-Young Sim

    Abstract: Lifelong person re-identification (LReID) assumes a practical scenario where the model is sequentially trained on continuously incoming datasets while alleviating the catastrophic forgetting in the old datasets. However, not only the training datasets but also the gallery images are incrementally accumulated, that requires a huge amount of computational complexity and storage space to extract the…

    Submitted 17 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: 17 pages, 5 figures, 7 tables

  7. arXiv:2401.11840  [pdf, other]

    cs.LG cs.AI

    Learning to Approximate Adaptive Kernel Convolution on Graphs

    Authors: Jaeyoon Sim, Sooyeon Jeon, InJun Choi, Guorong Wu, Won Hwa Kim

    Abstract: Various Graph Neural Networks (GNNs) have been successful in analyzing data in non-Euclidean spaces, however, they have limitations such as oversmoothing, i.e., information becomes excessively averaged as the number of hidden layers increases. The issue stems from the intrinsic formulation of conventional graph convolution where the nodal features are aggregated from a direct neighborhood per laye…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 15 pages, Accepted to AAAI 2024

  8. arXiv:2308.10443  [pdf, other]

    cs.AI cs.CL cs.CY

    Using Large Language Models for Cybersecurity Capture-The-Flag Challenges and Certification Questions

    Authors: Wesley Tann, Yuancheng Liu, Jun Heng Sim, Choon Meng Seah, Ee-Chien Chang

    Abstract: The assessment of cybersecurity Capture-The-Flag (CTF) exercises involves participants finding text strings or "flags" by exploiting system vulnerabilities. Large Language Models (LLMs) are natural-language models trained on vast amounts of words to understand and generate text; they can perform well on many CTF challenges. Such LLMs are freely available to students. In the context of CTF exerci…

    Submitted 20 August, 2023; originally announced August 2023.

  9. arXiv:2307.11133  [pdf, other]

    q-bio.NC cs.AI cs.LG

    Contrastive Graph Pooling for Explainable Classification of Brain Networks

    Authors: Jiaxing Xu, Qingtian Bian, Xinhang Li, Aihu Zhang, Yiping Ke, Miao Qiao, Wei Zhang, Wei Khang Jeremy Sim, Balázs Gulyás

    Abstract: Functional magnetic resonance imaging (fMRI) is a commonly used technique to measure neural activation. Its application has been particularly important in identifying underlying neurodegenerative conditions such as Parkinson's, Alzheimer's, and Autism. Recent analysis of fMRI data models the brain as a graph and extracts features by graph neural networks (GNNs). However, the unique characteristics…

    Submitted 12 April, 2024; v1 submitted 7 July, 2023; originally announced July 2023.

  10. arXiv:2305.13108  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Debiased Automatic Speech Recognition for Dysarthric Speech via Sample Reweighting with Sample Affinity Test

    Authors: Eungbeom Kim, Yunkee Chae, Jaeheon Sim, Kyogu Lee

    Abstract: Automatic speech recognition systems based on deep learning are mainly trained under empirical risk minimization (ERM). Since ERM utilizes the averaged performance on the data samples regardless of a group such as healthy or dysarthric speakers, ASR systems are unaware of the performance disparities across the groups. This results in biased ASR systems whose performance differences among groups ar…

    Submitted 27 June, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted by Interspeech 2023

  11. arXiv:2210.17143  [pdf, other]

    cs.SD cs.CL eess.AS

    Exploring Train and Test-Time Augmentations for Audio-Language Learning

    Authors: Eungbeom Kim, Jinhee Kim, Yoori Oh, Kyungsu Kim, Minju Park, Jaeheon Sim, Jinwoo Lee, Kyogu Lee

    Abstract: In this paper, we aim to unveil the impact of data augmentation in audio-language multi-modal learning, which has not been explored despite its importance. We explore various augmentation methods at not only train-time but also test-time and find out that proper data augmentation can lead to substantial improvements. Specifically, applying our proposed audio-language paired augmentation PairMix, w…

    Submitted 23 May, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

    Comments: 5 pages, 4 figures

  12. arXiv:2206.07896  [pdf, other]

    cs.DC cs.AR

    CuPBoP: CUDA for Parallelized and Broad-range Processors

    Authors: Ruobing Han, Jun Chen, Bhanu Garg, Jeffrey Young, Jaewoong Sim, Hyesoon Kim

    Abstract: CUDA is one of the most popular choices for GPU programming, but it can only be executed on NVIDIA GPUs. Executing CUDA on non-NVIDIA devices not only benefits the hardware community, but also allows data-parallel computation in heterogeneous systems. To make CUDA programs portable, some researchers have proposed using source-to-source translators to translate CUDA to portable programming language…

    Submitted 15 June, 2022; originally announced June 2022.

  13. Reinforcement Learning for Vision-based Object Manipulation with Non-parametric Policy and Action Primitives

    Authors: Dongwon Son, Myungsin Kim, Jaecheol Sim, Wonsik Shin

    Abstract: The object manipulation is a crucial ability for a service robot, but it is hard to solve with reinforcement learning due to some reasons such as sample efficiency. In this paper, to tackle this object manipulation, we propose a novel framework, AP-NPQL (Non-Parametric Q Learning with Action Primitives), that can efficiently solve the object manipulation with visual input and sparse reward, by uti…

    Submitted 12 June, 2022; originally announced June 2022.

    Journal ref: IROS 2021

  14. arXiv:2206.01326  [pdf, other]

    cs.CV cs.CY cs.LG

    Improving Fairness in Large-Scale Object Recognition by CrowdSourced Demographic Information

    Authors: Zu Kim, André Araujo, Bingyi Cao, Cam Askew, Jack Sim, Mike Green, N'Mah Fodiatu Yilla, Tobias Weyand

    Abstract: There has been increasing awareness of ethical issues in machine learning, and fairness has become an important research topic. Most fairness efforts in computer vision have been focused on human sensing applications and preventing discrimination by people's physical attributes such as race, skin color or age by increasing visual representation for particular demographic groups. We argue that ML f…

    Submitted 2 June, 2022; originally announced June 2022.

  15. arXiv:2112.10034  [pdf, other]

    cs.DC cs.AR

    COX: CUDA on X86 by Exposing Warp-Level Functions to CPUs

    Authors: Ruobing Han, Jaewon Lee, Jaewoong Sim, Hyesoon Kim

    Abstract: As CUDA programs become the de facto program among data parallel applications such as high-performance computing or machine learning applications, running CUDA on other platforms has been a compelling option. Although several efforts have attempted to support CUDA on other than NVIDIA GPU devices, due to extra steps in the translation, the support is always behind a few years from supporting CUDA'…

    Submitted 18 December, 2021; originally announced December 2021.

  16. arXiv:2110.01341  [pdf, other]

    cs.CV cs.AI cs.LG

    Context-Aware Unsupervised Clustering for Person Search

    Authors: Byeong-Ju Han, Kuhyeun Ko, Jae-Young Sim

    Abstract: The existing person search methods use the annotated labels of person identities to train deep networks in a supervised manner that requires a huge amount of time and effort for human labeling. In this paper, we first introduce a novel framework of person search that is able to train the network in the absence of the person identity labels, and propose efficient unsupervised clustering methods to…

    Submitted 4 October, 2021; originally announced October 2021.

  17. arXiv:2109.00673  [pdf, other]

    cs.PL

    Supporting CUDA for an extended RISC-V GPU architecture

    Authors: Ruobing Han, Blaise Tine, Jaewon Lee, Jaewoong Sim, Hyesoon Kim

    Abstract: With the rapid development of scientific computation, more and more researchers and developers are committed to implementing various workloads/operations on different devices. Among all these devices, NVIDIA GPU is the most popular choice due to its comprehensive documentation and excellent development tools. As a result, there are abundant resources for hand-writing high-performance CUDA codes. H…

    Submitted 1 September, 2021; originally announced September 2021.

  18. arXiv:2108.08874  [pdf, other]

    cs.CV

    Towards A Fairer Landmark Recognition Dataset

    Authors: Zu Kim, André Araujo, Bingyi Cao, Cam Askew, Jack Sim, Mike Green, N'Mah Fodiatu Yilla, Tobias Weyand

    Abstract: We introduce a new landmark recognition dataset, which is created with a focus on fair worldwide representation. While previous work proposes to collect as many images as possible from web repositories, we instead argue that such approaches can lead to biased data. To create a more comprehensive and equitable dataset, we start by defining the fair relevance of a landmark to the world population. T…

    Submitted 6 June, 2022; v1 submitted 19 August, 2021; originally announced August 2021.

    Comments: Please cite the full detailed version of the paper instead: Improving Fairness in Large-Scale Object Recognition by CrowdSourced Demographic Information arXiv:2206.01326

  19. arXiv:2103.03375  [pdf, other]

    cs.CV cs.LG

    Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food

    Authors: Quin Thames, Arjun Karpur, Wade Norris, Fangting Xia, Liviu Panait, Tobias Weyand, Jack Sim

    Abstract: Understanding the nutritional content of food from visual data is a challenging computer vision problem, with the potential to have a positive and widespread impact on public health. Studies in this area are limited to existing datasets in the field that lack sufficient diversity or labels required for training models with nutritional understanding capability. We introduce Nutrition5k, a novel dat…

    Submitted 22 June, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

    Comments: 8 pages, 3 of appendices. CVPR 2021

  20. arXiv:2004.01804  [pdf, other]

    cs.CV

    Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval

    Authors: Tobias Weyand, Andre Araujo, Bingyi Cao, Jack Sim

    Abstract: While image retrieval and instance recognition techniques are progressing rapidly, there is a need for challenging datasets to accurately measure their performance -- while posing novel challenges that are relevant for practical applications. We introduce the Google Landmarks Dataset v2 (GLDv2), a new benchmark for large-scale, fine-grained instance recognition and image retrieval in the domain of…

    Submitted 2 November, 2020; v1 submitted 3 April, 2020; originally announced April 2020.

    Comments: CVPR20 camera-ready (oral) + appendices

  21. arXiv:2001.05027  [pdf, other]

    cs.CV

    Unifying Deep Local and Global Features for Image Search

    Authors: Bingyi Cao, Andre Araujo, Jack Sim

    Abstract: Image retrieval is the problem of searching an image database for items that are similar to a query image. To address this task, two main types of image representations have been studied: global and local image features. In this work, our key contribution is to unify global and local features into a single deep model, enabling accurate retrieval with efficient feature extraction. We refer to the n…

    Submitted 15 September, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: ECCV'20 paper

  22. arXiv:1902.04303  [pdf, other]

    stat.AP cs.CR q-bio.GN

    Achieving GWAS with Homomorphic Encryption

    Authors: Jun Jie Sim, Fook Mun Chan, Shibin Chen, Benjamin Hong Meng Tan, Khin Mi Mi Aung

    Abstract: One way of investigating how genes affect human traits would be with a genome-wide association study (GWAS). Genetic markers, known as single-nucleotide polymorphism (SNP), are used in GWAS. This raises privacy and security concerns as these genetic markers can be used to identify individuals uniquely. This problem is further exacerbated by a large number of SNPs needed, which produce reliable res…

    Submitted 1 August, 2019; v1 submitted 12 February, 2019; originally announced February 2019.

  23. arXiv:1812.01584  [pdf, other]

    cs.CV

    Detect-to-Retrieve: Efficient Regional Aggregation for Image Search

    Authors: Marvin Teichmann, Andre Araujo, Menglong Zhu, Jack Sim

    Abstract: Retrieving object instances among cluttered scenes efficiently requires compact yet comprehensive regional image representations. Intuitively, object semantics can help build the index that focuses on the most relevant regions. However, due to the lack of bounding-box datasets for objects of interest among retrieval benchmarks, most recent work on regional representations has focused on either uni…

    Submitted 13 May, 2019; v1 submitted 4 December, 2018; originally announced December 2018.

    Comments: CVPR 2019. Code and dataset available: https://github.com/tensorflow/models/tree/master/research/delf

  24. arXiv:1811.00778  [pdf, other]

    cs.CR cs.LG

    Towards the AlexNet Moment for Homomorphic Encryption: HCNN, the First Homomorphic CNN on Encrypted Data with GPUs

    Authors: Ahmad Al Badawi, Jin Chao, Jie Lin, Chan Fook Mun, Jun Jie Sim, Benjamin Hong Meng Tan, Xiao Nan, Khin Mi Mi Aung, Vijay Ramaseshan Chandrasekhar

    Abstract: Deep Learning as a Service (DLaaS) stands as a promising solution for cloud-based inference applications. In this setting, the cloud has a pre-learned model whereas the user has samples on which she wants to run the model. The biggest concern with DLaaS is user privacy if the input samples are sensitive data. We provide here an efficient privacy-preserving system by employing high-end technologies…

    Submitted 18 August, 2020; v1 submitted 2 November, 2018; originally announced November 2018.

  25. arXiv:1808.02130  [pdf, other]

    cs.CV

    CPlaNet: Enhancing Image Geolocalization by Combinatorial Partitioning of Maps

    Authors: Paul Hongsuck Seo, Tobias Weyand, Jack Sim, Bohyung Han

    Abstract: Image geolocalization is the task of identifying the location depicted in a photo based only on its visual information. This task is inherently challenging since many photos have only few, possibly ambiguous cues to their geolocation. Recent work has cast this task as a classification problem by partitioning the earth into a set of discrete cells that correspond to geographic regions. The granular…

    Submitted 6 August, 2018; originally announced August 2018.

    Comments: ECCV 2018 accepted paper

  26. arXiv:1612.06321  [pdf, other]

    cs.CV

    Large-Scale Image Retrieval with Attentive Deep Local Features

    Authors: Hyeonwoo Noh, Andre Araujo, Jack Sim, Tobias Weyand, Bohyung Han

    Abstract: We propose an attentive local feature descriptor suitable for large-scale image retrieval, referred to as DELF (DEep Local Feature). The new feature is based on convolutional neural networks, which are trained only with image-level annotations on a landmark image dataset. To identify semantically useful local features for image retrieval, we also propose an attention mechanism for keypoint selecti…

    Submitted 2 February, 2018; v1 submitted 19 December, 2016; originally announced December 2016.

    Comments: ICCV 2017. Code and dataset available: https://github.com/tensorflow/models/tree/master/research/delf

  27. arXiv:1506.05203  [pdf, other]

    cs.DS

    Fast Multiple Order-Preserving Matching Algorithms

    Authors: Myoungji Han, Munseong Kang, Sukhyeun Cho, Geonmo Gu, Jeong Seop Sim, Kunsoo Park

    Abstract: Given a text $T$ and a pattern $P$, the order-preserving matching problem is to find all substrings in $T$ which have the same relative orders as $P$. Order-preserving matching has been an active research area since it was introduced by Kubica et al. \cite{kubica2013linear} and Kim et al. \cite{kim2014order}. In this paper we present two algorithms for the multiple order-preserving matching proble…

    Submitted 17 June, 2015; originally announced June 2015.

    Comments: 15 pages, 8 figures, submitted to IWOCA 2015