Search | arXiv e-print repository

Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, **shan Pan, Jiangxin Dong, **hui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi **, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu , et al. (50 additional authors not shown)

Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod… ▽ More This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF codec, instead of JPEG. All the proposed methods improve PSNR fidelity over Lanczos interpolation, and process images under 10ms. Out of the 160 participants, 25 teams submitted their code and models. The solutions present novel designs tailored for memory-efficiency and runtime on edge devices. This survey describes the best solutions for real-time SR of compressed high-resolution images. △ Less

Submitted 25 April, 2024; originally announced April 2024.

Comments: CVPR 2024, AI for Streaming (AIS) Workshop

arXiv:2404.06808 [pdf, other]

Formation-Controlled Dimensionality Reduction

Authors: Taeuk Jeong, Yoon Mo Jung

Abstract: Dimensionality reduction represents the process of generating a low dimensional representation of high dimensional data. Motivated by the formation control of mobile agents, we propose a nonlinear dynamical system for dimensionality reduction. The system consists of two parts; the control of neighbor points, addressing local structures, and the control of remote points, accounting for global struc… ▽ More Dimensionality reduction represents the process of generating a low dimensional representation of high dimensional data. Motivated by the formation control of mobile agents, we propose a nonlinear dynamical system for dimensionality reduction. The system consists of two parts; the control of neighbor points, addressing local structures, and the control of remote points, accounting for global structures. We also include a brief mathematical observation of the model and its numerical procedure. Numerical experiments are performed on both synthetic and real datasets and comparisons with existing models demonstrate the soundness and effectiveness of the proposed model. △ Less

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2403.10041 [pdf, other]

Towards Embedding Dynamic Personas in Interactive Robots: Masquerading Animated Social Kinematics (MASK)

Authors: Jeongeun Park, Taemoon Jeong, Hyeonseong Kim, Taehyun Byun, Seungyoon Shin, Keunjun Choi, Jaewoon Kwon, Taeyoon Lee, Matthew Pan, Sungjoon Choi

Abstract: This paper presents the design and development of an innovative interactive robotic system to enhance audience engagement using character-like personas. Built upon the foundations of persona-driven dialog agents, this work extends the agent application to the physical realm, employing robots to provide a more immersive and interactive experience. The proposed system, named the Masquerading Animate… ▽ More This paper presents the design and development of an innovative interactive robotic system to enhance audience engagement using character-like personas. Built upon the foundations of persona-driven dialog agents, this work extends the agent application to the physical realm, employing robots to provide a more immersive and interactive experience. The proposed system, named the Masquerading Animated Social Kinematics (MASK), leverages an anthropomorphic robot which interacts with guests using non-verbal interactions, including facial expressions and gestures. A behavior generation system based upon a finite-state machine structure effectively conditions robotic behavior to convey distinct personas. The MASK framework integrates a perception engine, a behavior selection engine, and a comprehensive action library to enable real-time, dynamic interactions with minimal human intervention in behavior design. Throughout the user subject studies, we examined whether the users could recognize the intended character in film-character-based persona conditions. We conclude by discussing the role of personas in interactive agents and the factors to consider for creating an engaging user experience. △ Less

Submitted 15 March, 2024; originally announced March 2024.

Comments: 4 pages, 3 figures

arXiv:2005.13837 [pdf, other]

Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs

Authors: Dong Bok Lee, Seanie Lee, Woo Tae Jeong, Donghwan Kim, Sung Ju Hwang

Abstract: One of the most crucial challenges in question answering (QA) is the scarcity of labeled data, since it is costly to obtain question-answer (QA) pairs for a target text domain with human annotation. An alternative approach to tackle the problem is to use automatically generated QA pairs from either the problem context or from large amount of unstructured texts (e.g. Wikipedia). In this work, we pr… ▽ More One of the most crucial challenges in question answering (QA) is the scarcity of labeled data, since it is costly to obtain question-answer (QA) pairs for a target text domain with human annotation. An alternative approach to tackle the problem is to use automatically generated QA pairs from either the problem context or from large amount of unstructured texts (e.g. Wikipedia). In this work, we propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts, while maximizing the mutual information between generated QA pairs to ensure their consistency. We validate our Information Maximizing Hierarchical Conditional Variational AutoEncoder (Info-HCVAE) on several benchmark datasets by evaluating the performance of the QA model (BERT-base) using only the generated QA pairs (QA-based evaluation) or by using both the generated and human-labeled pairs (semi-supervised learning) for training, against state-of-the-art baseline models. The results show that our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training. △ Less

Submitted 14 June, 2020; v1 submitted 28 May, 2020; originally announced May 2020.

Comments: ACL 2020

arXiv:1603.07692 [pdf]

Predictive Analytics Using Smartphone Sensors for Depressive Episodes

Authors: Taeheon Jeong, Diego Klabjan, Justin Starren

Abstract: The behaviors of patients with depression are usually difficult to predict because the patients demonstrate the symptoms of a depressive episode without a warning at unexpected times. The goal of this research is to build algorithms that detect signals of such unusual moments so that doctors can be proactive in approaching already diagnosed patients before they fall in depression. Each patient is… ▽ More The behaviors of patients with depression are usually difficult to predict because the patients demonstrate the symptoms of a depressive episode without a warning at unexpected times. The goal of this research is to build algorithms that detect signals of such unusual moments so that doctors can be proactive in approaching already diagnosed patients before they fall in depression. Each patient is equipped with a smartphone with the capability to track its sensors. We first find the home location of a patient, which is then augmented with other sensor data to identify sleep patterns and select communication patterns. The algorithms require two to three weeks of training data to build standard patterns, which are considered normal behaviors; and then, the methods identify any anomalies in day-to-day data readings of sensors. Four smartphone sensors, including the accelerometer, the gyroscope, the location probe and the communication log probe are used for anomaly detection in slee** and communication patterns. △ Less

Submitted 24 March, 2016; originally announced March 2016.

Comments: HIAI 2016, Expanding the Boundaries of Health Informatics using AI, Phoenix, AZ

Showing 1–5 of 5 results for author: Jeong, T