Showing 1–50 of 142 results for author: Chae, H

Searching in archive cs.

  1. arXiv:2406.14703  [pdf, other]

    cs.CL cs.AI

    Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics

    Authors: Seungbeen Lee, Seungwon Lim, Seungju Han, Giyeong Oh, Hyungjoo Chae, Jiwan Chung, Minju Kim, Beong-woo Kwak, Yeonsoo Lee, Dongha Lee, Jinyoung Yeo, Youngjae Yu

    Abstract: The idea of personality in descriptive psychology, traditionally defined through observable behavior, has now been extended to Large Language Models (LLMs) to better understand their behavior. This raises a question: do LLMs exhibit distinct and consistent personality traits, similar to humans? Existing self-assessment personality tests, while applicable, lack the necessary validity and reliabilit…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Preprint; Under review

  2. arXiv:2406.05761  [pdf, other]

    cs.CL

    The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

    Authors: Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Cho, Sheikh Shafayat, Jinheon Baek, Sue Hyun Park, Hyeonbin Hwang, Jinkyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang, et al. (7 additional authors not shown)

    Abstract: As language models (LMs) become capable of handling a wide range of tasks, their evaluation is becoming as challenging as their development. Most generation benchmarks currently assess LMs using abstract evaluation criteria like helpfulness and harmlessness, which often lack the flexibility and granularity of human assessment. Additionally, these benchmarks tend to focus disproportionately on spec…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Work in Progress

  3. arXiv:2404.18428  [pdf, other]

    cs.DB

    Geospatial Big Data: Survey and Challenges

    Authors: Jiayang Wu, Wensheng Gan, Han-Chieh Chao, Philip S. Yu

    Abstract: In recent years, geospatial big data (GBD) has obtained attention across various disciplines, categorized into big earth observation data and big human behavior data. Identifying geospatial patterns from GBD has been a vital research focus in the fields of urban management and environmental sustainability. This paper reviews the evolution of GBD mining and its integration with advanced artificial…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: IEEE JSTARS. 14 pages, 5 figures

  4. arXiv:2404.02575  [pdf, other]

    cs.CL

    Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models

    Authors: Hyungjoo Chae, Yeonghyeon Kim, Seungone Kim, Kai Tzu-iunn Ong, Beong-woo Kwak, Moohyeon Kim, Seonghwan Kim, Taeyoon Kwon, Jiwan Chung, Youngjae Yu, Jinyoung Yeo

    Abstract: Algorithmic reasoning refers to the ability to understand the complex patterns behind the problem and decompose them into a sequence of reasoning steps towards the solution. Such nature of algorithmic reasoning makes it a challenge for large language models (LLMs), even though they have demonstrated promising performance in other reasoning tasks. Within this context, some recent studies use progra…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 38 pages, 4 figures

  5. arXiv:2404.02135  [pdf]

    cs.CV eess.IV

    Enhancing Ship Classification in Optical Satellite Imagery: Integrating Convolutional Block Attention Module with ResNet for Improved Performance

    Authors: Ryan Donghan Kwon, Gangjoo Robin Nam, Jisoo Tak, Junseob Shin, Hyerin Cha, Yeom Hyeok, Seung Won Lee

    Abstract: This study presents an advanced Convolutional Neural Network (CNN) architecture for ship classification from optical satellite imagery, significantly enhancing performance through the integration of the Convolutional Block Attention Module (CBAM) and additional architectural innovations. Building upon the foundational ResNet50 model, we first incorporated a standard CBAM to direct the model's focu…

    Submitted 8 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  6. arXiv:2403.02966  [pdf, other]

    cs.CL cs.AI cs.LG

    Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering

    Authors: Sungho Ko, Hyunjin Cho, Hyungjoo Chae, Jinyoung Yeo, Dongha Lee

    Abstract: Recent studies have investigated utilizing Knowledge Graphs (KGs) to enhance Question Answering (QA) performance of Large Language Models (LLMs), yet structured KG verbalization remains challenging. Existing methods, such as triple-form or free-form textual conversion of triple-form facts, encounter several issues. These include reduced evidence density due to duplicated entities or relationships,…

    Submitted 19 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  7. arXiv:2402.19479  [pdf, other]

    cs.CV

    Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

    Authors: Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, Byung Eun Jeon, Yuwei Fang, Hsin-Ying Lee, Jian Ren, Ming-Hsuan Yang, Sergey Tulyakov

    Abstract: The quality of the data and annotation upper-bounds the quality of a downstream model. While there exist large text corpora and image-text pairs, high-quality video-text data is much harder to collect. First of all, manual labeling is more time-consuming, as it requires an annotator to watch an entire video. Second, videos have a temporal dimension, consisting of several scenes stacked together, a…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: CVPR 2024. Project Page: https://snap-research.github.io/Panda-70M

  8. arXiv:2402.18374  [pdf, other]

    cs.CL

    VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models

    Authors: Seoyeon Kim, Kwangwook Seo, Hyungjoo Chae, Jinyoung Yeo, Dongha Lee

    Abstract: Recent approaches in domain-specific named entity recognition (NER), such as biomedical NER, have shown remarkable advances. However, they still lack faithfulness, producing erroneous predictions. We assume that knowledge of entities can be useful in verifying the correctness of the predictions. Despite the usefulness of knowledge, resolving such errors with knowledge is nontrivial, since the k…

    Submitted 8 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted to ACL 2024

  9. arXiv:2402.10636  [pdf, other]

    cs.CV

    PEGASUS: Personalized Generative 3D Avatars with Composable Attributes

    Authors: Hyunsoo Cha, Byungjun Kim, Hanbyul Joo

    Abstract: We present PEGASUS, a method for constructing a personalized generative 3D face avatar from monocular video sources. Our generative 3D avatar enables disentangled controls to selectively alter the facial attributes (e.g., hair or nose) while preserving the identity. Our approach consists of two stages: synthetic database generation and constructing a personalized generative avatar. We generate a s…

    Submitted 2 April, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted at CVPR 2024, Project Page: https://snuvclab.github.io/pegasus/

  10. arXiv:2402.00137  [pdf, other]

    cs.LG cs.CV

    Multimodal Neurodegenerative Disease Subtyping Explained by ChatGPT

    Authors: Diego Machado Reyes, Hanqing Chao, Juergen Hahn, Li Shen, Pingkun Yan

    Abstract: Alzheimer's disease (AD) is the most prevalent neurodegenerative disease; yet its currently available treatments are limited to stopping disease progression. Moreover, effectiveness of these treatments is not guaranteed due to the heterogeneity of the disease. Therefore, it is essential to be able to identify the disease subtypes at a very early stage. Current data-driven approaches are able to cl…

    Submitted 31 January, 2024; originally announced February 2024.

  11. arXiv:2401.17594  [pdf, other]

    cs.IT

    5G NR Positioning Enhancements in 3GPP Release-18

    Authors: Hyun-Su Cha, Gilsoo Lee, Amitava Ghosh, Matthew Baker, Sean Kelley, Juergen Hofmann

    Abstract: New radio (NR) positioning in the Third Generation Partnership Project (3GPP) Release 18 (Rel-18) enables 5G-advanced networks to achieve ultra-high accuracy positioning without dependence on global navigation satellite systems (GNSS) with key enablers such as the carrier phase positioning technique, standardized for the first time in a cellular communications standard and setting a new baseline f…

    Submitted 30 January, 2024; originally announced January 2024.

  12. arXiv:2311.07215  [pdf, other]

    cs.CL cs.SE

    Coffee: Boost Your Code LLMs by Fixing Bugs with Feedback

    Authors: Seungjun Moon, Hyungjoo Chae, Yongho Song, Taeyoon Kwon, Dongjin Kang, Kai Tzu-iunn Ong, Seung-won Hwang, Jinyoung Yeo

    Abstract: Code editing is an essential step towards reliable program synthesis to automatically correct critical errors generated from code LLMs. Recent studies have demonstrated that closed-source LLMs (i.e., ChatGPT and GPT-4) are capable of generating corrective feedback to edit erroneous inputs. However, it remains challenging for open-source code LLMs to generate feedback for code editing, since these…

    Submitted 23 February, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Work in progress

  13. arXiv:2310.09343  [pdf, other]

    cs.CL cs.AI

    Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents

    Authors: Hyungjoo Chae, Yongho Song, Kai Tzu-iunn Ong, Taeyoon Kwon, Minjin Kim, Youngjae Yu, Dongha Lee, Dongyeop Kang, Jinyoung Yeo

    Abstract: Human-like chatbots necessitate the use of commonsense reasoning in order to effectively comprehend and respond to implicit information present within conversations. Achieving such coherence and informativeness in responses, however, is a non-trivial task. Even for large language models (LLMs), the task of identifying and aggregating key evidence within a single hop presents a substantial challeng…

    Submitted 22 October, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: 25 pages, 8 figures, Accepted to EMNLP 2023

  14. arXiv:2309.12314  [pdf, other]

    cs.CV

    TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance

    Authors: Kan Wu, Houwen Peng, Zhenghong Zhou, Bin Xiao, Mengchen Liu, Lu Yuan, Hong Xuan, Michael Valenzuela, Xi, Chen, Xinggang Wang, Hongyang Chao, Han Hu

    Abstract: In this paper, we propose a novel cross-modal distillation method, called TinyCLIP, for large-scale language-image pre-trained models. The method introduces two core techniques: affinity mimicking and weight inheritance. Affinity mimicking explores the interaction between modalities during distillation, enabling student models to mimic teachers' behavior of learning cross-modal feature alignment i…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: Accepted By ICCV 2023

  15. arXiv:2309.01207  [pdf, other]

    eess.IV cs.CV cs.LG

    Spectral Adversarial MixUp for Few-Shot Unsupervised Domain Adaptation

    Authors: Jiajin Zhang, Hanqing Chao, Amit Dhurandhar, Pin-Yu Chen, Ali Tajer, Yangyang Xu, Pingkun Yan

    Abstract: Domain shift is a common problem in clinical applications, where the training images (source domain) and the test images (target domain) are under different distributions. Unsupervised Domain Adaptation (UDA) techniques have been proposed to adapt models trained in the source domain to the target domain. However, those methods require a large number of images from the target domain for model train…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: Accepted by MICCAI 2023

  16. arXiv:2306.12978  [pdf, other]

    cs.IT eess.SP

    Rate-Splitting Multiple Access for 6G Networks: Ten Promising Scenarios and Applications

    Authors: Jeonghun Park, Byungju Lee, Jinseok Choi, Hoon Lee, Namyoon Lee, Seok-Hwan Park, Kyoung-Jae Lee, Junil Choi, Sung Ho Chae, Sang-Woon Jeon, Kyung Sup Kwak, Bruno Clerckx, Wonjae Shin

    Abstract: In the upcoming 6G era, multiple access (MA) will play an essential role in achieving high throughput performances required in a wide range of wireless applications. Since MA and interference management are closely related issues, the conventional MA techniques are limited in that they cannot provide near-optimal performance in universal interference regimes. Recently, rate-splitting multiple acce…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 17 pages, 6 figures, submitted to IEEE Network Magazine

  17. arXiv:2306.11731  [pdf, other]

    cs.CV

    Learning Profitable NFT Image Diffusions via Multiple Visual-Policy Guided Reinforcement Learning

    Authors: Huiguo He, Tianfu Wang, Huan Yang, Jianlong Fu, Nicholas Jing Yuan, Jian Yin, Hongyang Chao, Qi Zhang

    Abstract: We study the task of generating profitable Non-Fungible Token (NFT) images from user-input texts. Recent advances in diffusion models have shown great potential for image generation. However, existing works can fall short in generating visually-pleasing and highly-profitable NFT images, mainly due to the lack of 1) plentiful and fine-grained visual attribute prompts for an NFT image, and 2) effect…

    Submitted 17 August, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Accepted by ACM-MM 2023

  18. arXiv:2303.11763  [pdf, other]

    cs.IT eess.SP

    Reconfigurable Intelligent Surface Aided Hybrid Beamforming: Optimal Placement and Beamforming Design

    Authors: Najam Us Saqib, Shumei Hou, Sung Ho Chae, Sang-Woon Jeon

    Abstract: We consider reconfigurable intelligent surface (RIS) aided sixth-generation (6G) terahertz (THz) communications for indoor environment in which a base station (BS) wishes to send independent messages to its serving users with the help of multiple RISs. For indoor environment, various obstacles such as pillars, walls, and other objects can result in no line-of-sight signal path between the BS and a…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: This manuscript contains 18 pages and 9 figures

  19. arXiv:2303.09597  [pdf, other]

    cs.RO cs.AI

    Residual Physics Learning and System Identification for Sim-to-real Transfer of Policies on Buoyancy Assisted Legged Robots

    Authors: Nitish Sontakke, Hosik Chae, Sangjoon Lee, Tianle Huang, Dennis W. Hong, Sehoon Ha

    Abstract: The light and soft characteristics of Buoyancy Assisted Lightweight Legged Unit (BALLU) robots have a great potential to provide intrinsically safe interactions in environments involving humans, unlike many heavy and rigid robots. However, their unique and sensitive dynamics impose challenges to obtaining robust control policies in the real world. In this work, we demonstrate robust sim-to-real tr…

    Submitted 16 March, 2023; originally announced March 2023.

  20. arXiv:2303.04508  [pdf, other]

    cs.CV

    InFusionSurf: Refining Neural RGB-D Surface Reconstruction Using Per-Frame Intrinsic Refinement and TSDF Fusion Prior Learning

    Authors: Seunghwan Lee, Gwanmo Park, Hyewon Son, Jiwon Ryu, Han Joo Chae

    Abstract: We introduce InFusionSurf, a novel approach to enhance the fidelity of neural radiance field (NeRF) frameworks for 3D surface reconstruction using RGB-D video frames. Building upon previous methods that have employed feature encoding to improve optimization speed, we further improve the reconstruction quality with minimal impact on optimization time by refining depth information. Our per-frame int…

    Submitted 3 September, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

  21. arXiv:2303.03628  [pdf, other]

    cs.CL cs.LG

    CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification

    Authors: Seungone Kim, Se June Joo, Yul Jang, Hyungjoo Chae, Jinyoung Yeo

    Abstract: Chain-of-thought (CoT) prompting enables large language models (LLMs) to solve complex reasoning tasks by generating an explanation before the final prediction. Despite its promising ability, a critical downside of CoT prompting is that the performance is greatly affected by the factuality of the generated explanation. To improve the correctness of the explanations, fine-tuning language models wi…

    Submitted 6 March, 2023; originally announced March 2023.

    Comments: Accepted at EACL 2023 Demo

  22. arXiv:2302.12623  [pdf, other]

    cs.AI cs.CL

    TUTORING: Instruction-Grounded Conversational Agent for Language Learners

    Authors: Hyungjoo Chae, Minjin Kim, Chaehyeong Kim, Wonseok Jeong, Hyejoong Kim, Junmyung Lee, Jinyoung Yeo

    Abstract: In this paper, we propose Tutoring bot, a generative chatbot trained on a large scale of tutor-student conversations for English-language learning. To mimic a human tutor's behavior in language education, the tutor bot leverages diverse educational instructions and grounds to each instruction as additional input context for the tutor response generation. As a single instruction generally involves…

    Submitted 24 February, 2023; originally announced February 2023.

  23. arXiv:2212.07462  [pdf, other]

    cs.LG quant-ph

    Harmonic (Quantum) Neural Networks

    Authors: Atiyo Ghosh, Antonio A. Gentile, Mario Dagrada, Chul Lee, Seong-Hyok Kim, Hyukgeun Cha, Yunjun Choi, Brad Kim, Jeong-Il Kye, Vincent E. Elfving

    Abstract: Harmonic functions are abundant in nature, appearing in limiting cases of Maxwell's, Navier-Stokes equations, the heat and the wave equation. Consequently, there are many applications of harmonic functions from industrial process optimisation to robotic path planning and the calculation of first exit times of random walks. Despite their ubiquity and relevance, there have been few attempts to incor…

    Submitted 13 August, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: 12 pages (main), 7 pages (supplementary), 7 figures

    Journal ref: PMLR 202:11340-11359, 2023

  24. arXiv:2212.03099  [pdf, other]

    cs.CV cs.CL cs.MM

    Semantic-Conditional Diffusion Networks for Image Captioning

    Authors: Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei

    Abstract: Recent advances on text-to-image generation have witnessed the rise of diffusion models which act as powerful generative models. Nevertheless, it is not trivial to exploit such latent variable models to capture the dependency among discrete words and meanwhile pursue complex visual-language alignment in image captioning. In this paper, we break the deeply rooted conventions in learning Transformer…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: Source code is available at https://github.com/YehLi/xmodaler/tree/master/configs/image_caption/scdnet

  25. arXiv:2212.00850  [pdf, other]

    cs.CV cs.AI

    When Neural Networks Fail to Generalize? A Model Sensitivity Perspective

    Authors: Jiajin Zhang, Hanqing Chao, Amit Dhurandhar, Pin-Yu Chen, Ali Tajer, Yangyang Xu, Pingkun Yan

    Abstract: Domain generalization (DG) aims to train a model to perform well in unseen domains under different distributions. This paper considers a more realistic yet more challenging scenario, namely Single Domain Generalization (Single-DG), where only a single source domain is available for training. To tackle this challenge, we first try to understand when neural networks fail to generalize. We empirically…

    Submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted by AAAI 2023

  26. arXiv:2211.15588  [pdf, other]

    cs.DB

    Internet of Behaviors: A Survey

    Authors: Jiayi Sun, Wensheng Gan, Han-Chieh Chao, Philip S. Yu, Weiping Ding

    Abstract: The Internet of Behavior is a research theme that aims to analyze human behavior data on the Internet from the perspective of behavioral psychology, obtain insights about human behavior, and better understand the intention behind the behavior. In this way, the Internet of Behavior can predict human behavioral trends in the future and even change human behavior, which can provide more convenience f…

    Submitted 28 November, 2022; originally announced November 2022.

    Comments: Preprint. 9 figures, 1 table

  27. arXiv:2211.14951  [pdf, other]

    cs.CY cs.DB

    Metaverse in Education: Vision, Opportunities, and Challenges

    Authors: Hong Lin, Shicheng Wan, Wensheng Gan, Jiahui Chen, Han-Chieh Chao

    Abstract: Traditional education has been updated with the development of information technology in human history. Within big data and cyber-physical systems, the Metaverse has generated strong interest in various applications (e.g., entertainment, business, and cultural travel) over the last decade. As a novel social work idea, the Metaverse consists of many kinds of technologies, e.g., big data, interactio…

    Submitted 27 November, 2022; originally announced November 2022.

    Comments: IEEE BigData 2022. 10 pages, 5 figures, 3 tables

  28. arXiv:2210.11640  [pdf, other]

    cs.SI

    Not All Asians are the Same: A Disaggregated Approach to Identifying Anti-Asian Racism in Social Media

    Authors: Fan Wu, Sanyam Lakhanpal, Qian Li, Kookjin Lee, Doowon Kim, Heewon Chae, Hazel K. Kwon

    Abstract: Recent policy initiatives have acknowledged the importance of disaggregating data pertaining to diverse Asian ethnic communities to gain a more comprehensive understanding of their current status and to improve their overall well-being. However, research on anti-Asian racism has thus far fallen short of properly incorporating data disaggregation practices. Our study addresses this gap by collectin…

    Submitted 12 February, 2024; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted at TheWebConf 2024 (formerly WWW)

  29. arXiv:2210.07990  [pdf, other]

    cs.DB cs.CR

    Metaverse: Survey, Applications, Security, and Opportunities

    Authors: Jiayi Sun, Wensheng Gan, Han-Chieh Chao, Philip S. Yu

    Abstract: As a fusion of various emerging digital technologies, the Metaverse aims to build a virtual shared digital space. It is closely related to extended reality, digital twin, blockchain, and other technologies. Its goal is to build a digital space based on the real world, form a virtual economic system, and expand the space of human activities, which injects new vitality into the social, economic, and…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: Preprint. 5 figures, 4 tables

  30. arXiv:2209.12807  [pdf, other]

    cs.LG cs.CV

    Out-of-Distribution Detection with Hilbert-Schmidt Independence Optimization

    Authors: Jingyang Lin, Yu Wang, Qi Cai, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei

    Abstract: Outlier detection tasks have been playing a critical role in AI safety. There has been a great challenge to deal with this task. Observations show that deep neural network classifiers usually tend to incorrectly classify out-of-distribution (OOD) inputs into in-distribution classes with high confidence. Existing works attempt to solve the problem by explicitly imposing uncertainty on classifiers w…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: Source code is available at https://github.com/jylins/hood

  31. arXiv:2209.07764  [pdf, other]

    cs.RO

    DS-K3DOM: 3-D Dynamic Occupancy Mapping with Kernel Inference and Dempster-Shafer Evidential Theory

    Authors: Juyeop Han, Youngjae Min, Hyeok-Joo Chae, Byeong-Min Jeong, Han-Lim Choi

    Abstract: Occupancy mapping has been widely utilized to represent the surroundings for autonomous robots to perform tasks such as navigation and manipulation. While occupancy mapping in 2-D environments has been well-studied, there have been few approaches suitable for 3-D dynamic occupancy mapping which is essential for aerial robots. This paper presents a novel 3-D dynamic occupancy mapping algorithm call…

    Submitted 24 February, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

    Comments: 7 pages, 3 figures, Accepted to ICRA 2023

    MSC Class: J.2

  32. arXiv:2209.00945  [pdf, other]

    cs.LG

    IMG2IMU: Translating Knowledge from Large-Scale Images to IMU Sensing Applications

    Authors: Hyungjun Yoon, Hyeongheon Cha, Hoang C. Nguyen, Taesik Gong, Sung-Ju Lee

    Abstract: Pre-training representations acquired via self-supervised learning could achieve high accuracy on even tasks with small training data. Unlike in vision and natural language processing domains, pre-training for IMU-based applications is challenging, as there are few public datasets with sufficient size and diversity to learn generalizable representations. To overcome this problem, we propose IMG2IM…

    Submitted 29 February, 2024; v1 submitted 2 September, 2022; originally announced September 2022.

    Comments: 12 pages

    MSC Class: 68T20

  33. arXiv:2209.00930  [pdf, other]

    cs.CL cs.AI cs.LG

    Mind the Gap! Injecting Commonsense Knowledge for Abstractive Dialogue Summarization

    Authors: Seungone Kim, Se June Joo, Hyungjoo Chae, Chaehyeong Kim, Seung-won Hwang, Jinyoung Yeo

    Abstract: In this paper, we propose to leverage the unique characteristics of dialogues sharing commonsense knowledge across participants, to resolve the difficulties in summarizing them. We present SICK, a framework that uses commonsense inferences as additional context. Compared to previous work that solely relies on the input dialogue, SICK uses an external knowledge model to generate a rich set of commo…

    Submitted 2 September, 2022; originally announced September 2022.

    Comments: Accepted at COLING 2022

  34. arXiv:2207.06633  [pdf, other]

    eess.SP cs.NI eess.SY

    Toward cm-Level Accuracy: Carrier Phase Positioning for IIoT in 5G-Advanced NR Networks

    Authors: Abdurrahman Fouda, Ryan Keating, Hyun-Su Cha

    Abstract: High-precision positioning accuracy is among the key features of the future fifth-generation (5G-advanced) cellular networks to enable a wide variety of commercial, critical, and consumer use cases. 5G new radio (NR) systems have relied on (1) cellular temporal/angular-based positioning methods to provide the indoor environments with a moderate positioning accuracy that is well below the positioni…

    Submitted 13 July, 2022; originally announced July 2022.

    Comments: in Proc. IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (IEEE PIMRC), Sep. 2022

  35. arXiv:2207.05231  [pdf, other]

    eess.IV cs.CV

    Regression Metric Loss: Learning a Semantic Representation Space for Medical Images

    Authors: Hanqing Chao, Jiajin Zhang, Pingkun Yan

    Abstract: Regression plays an essential role in many medical imaging applications for estimating various clinical risk or measurement scores. While training strategies and loss functions have been studied for the deep neural networks in medical image classification tasks, options for regression tasks are very limited. One of the key challenges is that the high-dimensional feature representation learned by e…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: Accepted by MICCAI2022

  36. arXiv:2204.11068  [pdf, ps, other]

    cs.IT

    Integer Forcing Interference Management for the MIMO Interference Channel

    Authors: Sung Ho Chae, Sang-Woon Jeon

    Abstract: A new interference management scheme based on integer forcing (IF) receivers is studied for the two-user multiple-input and multiple-output (MIMO) interference channel. The proposed scheme employs a message splitting method that divides each data stream into common and private sub-streams, in which the private stream is recovered by the dedicated receiver only while the common stream is required t…

    Submitted 23 April, 2022; originally announced April 2022.

    Comments: Submitted to IEEE Trans. Wireless Commun. (in revision), 31 pages, 6 figures

  37. arXiv:2203.03814  [pdf, other]

    eess.IV cs.CV cs.HC cs.LG

    Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot Ulcers

    Authors: Han Joo Chae, Seunghwan Lee, Hyewon Son, Seungyeob Han, Taebin Lim

    Abstract: We introduce AiD Regen, a novel system that generates 3D wound models combining 2D semantic segmentation with 3D reconstruction so that they can be printed via 3D bio-printers during the surgery to treat diabetic foot ulcers (DFUs). AiD Regen seamlessly binds the full pipeline, which includes RGB-D image capturing, semantic segmentation, boundary-guided point-cloud processing, 3D model reconstruct…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: Accepted to CVPR 2022

  38. arXiv:2112.07515  [pdf, other]

    cs.CV cs.AI cs.CL cs.MM

    CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising

    Authors: Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei

    Abstract: BERT-type structure has led to the revolution of vision-language pre-training and the achievement of state-of-the-art results on numerous vision-language downstream tasks. Existing solutions dominantly capitalize on the multi-modal inputs with mask tokens to trigger mask-based proxy pre-training tasks (e.g., masked language modeling and masked object/frame prediction). In this work, we argue that…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: ACM Multimedia 2021

  39. arXiv:2112.07513  [pdf, other]

    cs.CV cs.AI cs.MM

    CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning

    Authors: Jingyang Lin, Yingwei Pan, Rongfeng Lai, Xuehang Yang, Hongyang Chao, Ting Yao

    Abstract: Localizing text instances in natural scenes is regarded as a fundamental challenge in computer vision. Nevertheless, owing to the extremely varied aspect ratios and scales of text instances in real scenes, most conventional text detectors suffer from the sub-text problem that only localizes the fragments of text instance (i.e., sub-texts). In this work, we quantitatively analyze the sub-text probl…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: ICME 2021 (Oral); Code is publicly available at: https://github.com/jylins/CORE-Text

  40. arXiv:2111.14725  [pdf, other]

    cs.CV

    Searching the Search Space of Vision Transformer

    Authors: Minghao Chen, Kan Wu, Bolin Ni, Houwen Peng, Bei Liu, Jianlong Fu, Hongyang Chao, Haibin Ling

    Abstract: Vision Transformer has shown great visual representation power in substantial vision tasks such as recognition and detection, and thus been attracting fast-growing efforts on manually designing more effective architectures. In this paper, we propose to use neural architecture search to automate this process, by searching not only the architecture but also the search space. The central idea is to g…

    Submitted 29 November, 2021; originally announced November 2021.

    Comments: Accepted to NIPS 2021

  41. arXiv:2111.03481  [pdf, other]

    cs.CV

    Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers

    Authors: Yanhong Zeng, Huan Yang, Hongyang Chao, Jianbo Wang, Jianlong Fu

    Abstract: We present a new perspective of achieving image synthesis by viewing this task as a visual token generation problem. Different from existing paradigms that directly synthesize a full image from a single input (e.g., a latent code), the new formulation enables a flexible local manipulation for different image regions, which makes it possible to learn content-aware and fine-grained style control for…

    Submitted 18 December, 2021; v1 submitted 5 November, 2021; originally announced November 2021.

    Comments: NeurIPS 2021

  42. Reference-based Defect Detection Network

    Authors: Zhaoyang Zeng, Bei Liu, Jianlong Fu, Hongyang Chao

    Abstract: The defect detection task can be regarded as a realistic scenario of object detection in the computer vision field and it is widely used in the industrial field. Directly applying vanilla object detector to defect detection task can achieve promising results, while there still exists challenging issues that have not been solved. The first issue is the texture shift which means a trained defect det…

    Submitted 10 August, 2021; originally announced August 2021.

    Journal ref: IEEE Transactions on Image Processing, vol. 30, pp. 6637-6647, 2021

  43. arXiv:2108.02696  [pdf, other]

    cs.CV cs.AI cs.LG

    A Low Rank Promoting Prior for Unsupervised Contrastive Learning

    Authors: Yu Wang, Jingyang Lin, Qi Cai, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei

    Abstract: Unsupervised learning is just at a tipping point where it could really take off. Among these approaches, contrastive learning has seen tremendous progress and led to state-of-the-art performance. In this paper, we construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning, referred to as LORAC. In contrast t…

    Submitted 5 August, 2021; originally announced August 2021.

  44. arXiv:2107.14222  [pdf, other]

    cs.CV

    Rethinking and Improving Relative Position Encoding for Vision Transformer

    Authors: Kan Wu, Houwen Peng, Minghao Chen, Jianlong Fu, Hongyang Chao

    Abstract: Relative position encoding (RPE) is important for transformer to capture sequence ordering of input tokens. General efficacy has been proven in natural language processing. However, in computer vision, its efficacy is not well studied and even remains controversial, e.g., whether relative position encoding can work equally well as absolute position? In order to clarify this, we first review existi…

    Submitted 29 July, 2021; originally announced July 2021.

    Comments: Accepted by ICCV 2021

  45. arXiv:2107.04548  [pdf, other]

    cs.CV

    Cross-modal Attention for MRI and Ultrasound Volume Registration

    Authors: Xinrui Song, Hengtao Guo, Xuanang Xu, Hanqing Chao, Sheng Xu, Baris Turkbey, Bradford J. Wood, Ge Wang, Pingkun Yan

    Abstract: Prostate cancer biopsy benefits from accurate fusion of transrectal ultrasound (TRUS) and magnetic resonance (MR) images. In the past few years, convolutional neural networks (CNNs) have been proved powerful in extracting image features crucial for image registration. However, challenging applications and recent advances in computer vision suggest that CNNs are quite limited in their ability to unde…

    Submitted 11 July, 2021; v1 submitted 9 July, 2021; originally announced July 2021.

    Comments: This paper has been accepted by MICCAI 2021

  46. arXiv:2106.14413  [pdf, other]

    cs.LG cs.CV

    Co$^2$L: Contrastive Continual Learning

    Authors: Hyuntak Cha, Jaeho Lee, Jinwoo Shin

    Abstract: Recent breakthroughs in self-supervised learning show that such algorithms learn visual representations that can be transferred better to unseen tasks than joint-training methods relying on task-specific supervision. In this paper, we found that the similar holds in the continual learning context: contrastively learned representations are more robust against the catastrophic forgetting than joint…

    Submitted 28 June, 2021; originally announced June 2021.

    Comments: 14 pages, 5 figures

  47. arXiv:2105.09937  [pdf, other]

    cs.CV cs.AI

    AnaXNet: Anatomy Aware Multi-label Finding Classification in Chest X-ray

    Authors: Nkechinyere N. Agu, Joy T. Wu, Hanqing Chao, Ismini Lourentzou, Arjun Sharma, Mehdi Moradi, Pingkun Yan, James Hendler

    Abstract: Radiologists usually observe anatomical regions of chest X-ray images as well as the overall image before making a decision. However, most existing deep learning models only look at the entire X-ray image for classification, failing to utilize important anatomical information. In this paper, we propose a novel multi-label chest X-ray classification model that accurately classifies the image findin…

    Submitted 20 May, 2021; originally announced May 2021.

    Comments: Accepted to MICCAI 2021

  48. 3D Human Body Reshaping with Anthropometric Modeling

    Authors: Yanhong Zeng, Jianlong Fu, Hongyang Chao

    Abstract: Reshaping accurate and realistic 3D human bodies from anthropometric parameters (e.g., height, chest size, etc.) poses a fundamental challenge for person identification, online shopping and virtual reality. Existing approaches for creating such 3D shapes often suffer from complex measurement by range cameras or high-end scanners, which either involve heavy expense cost or result in low quality. Ho…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: ICIMCS 2017(oral). The final publication is available at Springer via https://doi.org/10.1007/978-981-10-8530-7_10

    Journal ref: In International Conference on Internet Multimedia Computing and Service (pp. 96-107). Springer, Singapore (2017)

  49. arXiv:2104.01431  [pdf, other]

    cs.CV

    Aggregated Contextual Transformations for High-Resolution Image Inpainting

    Authors: Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo

    Abstract: State-of-the-art image inpainting approaches can suffer from generating distorted structures and blurry textures in high-resolution images (e.g., 512x512). The challenges mainly derive from (1) image content reasoning from distant contexts, and (2) fine-grained texture synthesis for a large missing region. To overcome these two challenges, we propose an enhanced GAN-based model, named Aggregated CO…

    Submitted 3 April, 2021; originally announced April 2021.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  50. arXiv:2103.16519  [pdf, other]

    cs.DB

    Explainable Fuzzy Utility Mining on Sequences

    Authors: Wensheng Gan, Zilin Du, Weiping Ding, Chunkai Zhang, Han-Chieh Chao

    Abstract: Fuzzy systems have good modeling capabilities in several data science scenarios, and can provide human-explainable intelligence models with explainability and interpretability. In contrast to transaction data, which have been extensively studied, sequence data are more common in real-life applications. To obtain a human-explainable data intelligence model for decision making, in this study, we inv…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: Preprint. 13 figures, 5 tables