Search | arXiv e-print repository

HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models

Authors: Hayk Manukyan, Andranik Sargsyan, Barsegh Atanyan, Zhangyang Wang, Shant Navasardyan, Humphrey Shi

Abstract: Recent progress in text-guided image inpainting, based on the unprecedented success of text-to-image diffusion models, has led to exceptionally realistic and visually plausible results. However, there is still significant potential for improvement in current text-to-image inpainting models, particularly in better aligning the inpainted area with user prompts and performing high-resolution inpainti… ▽ More Recent progress in text-guided image inpainting, based on the unprecedented success of text-to-image diffusion models, has led to exceptionally realistic and visually plausible results. However, there is still significant potential for improvement in current text-to-image inpainting models, particularly in better aligning the inpainted area with user prompts and performing high-resolution inpainting. Therefore, we introduce HD-Painter, a training free approach that accurately follows prompts and coherently scales to high resolution image inpainting. To this end, we design the Prompt-Aware Introverted Attention (PAIntA) layer enhancing self-attention scores by prompt information resulting in better text aligned generations. To further improve the prompt coherence we introduce the Reweighting Attention Score Guidance (RASG) mechanism seamlessly integrating a post-hoc sampling strategy into the general form of DDIM to prevent out-of-distribution latent shifts. Moreover, HD-Painter allows extension to larger scales by introducing a specialized super-resolution technique customized for inpainting, enabling the completion of missing regions in images of up to 2K resolution. Our experiments demonstrate that HD-Painter surpasses existing state-of-the-art approaches quantitatively and qualitatively across multiple metrics and a user study. Code is publicly available at: https://github.com/Picsart-AI-Research/HD-Painter △ Less

Submitted 18 March, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

arXiv:2302.09654 [pdf, other]

The Half-Life of a Tweet

Authors: Juergen Pfeffer, Daniel Matter, Anahit Sargsyan

Abstract: Twitter has started to share an impression_count variable as part of the available public metrics for every Tweet collected with Twitter's APIs. With the information about how often a particular Tweet has been shown to Twitter users at the time of data collection, we can learn important insights about the dissemination process of a Tweet by measuring its impression count repeatedly over time. With… ▽ More Twitter has started to share an impression_count variable as part of the available public metrics for every Tweet collected with Twitter's APIs. With the information about how often a particular Tweet has been shown to Twitter users at the time of data collection, we can learn important insights about the dissemination process of a Tweet by measuring its impression count repeatedly over time. With our preliminary analysis, we can show that on average the peak of impressions per second is 72 seconds after a Tweet was sent and that after 24 hours, no relevant number of impressions can be observed for ~95% of all Tweets. Finally, we estimate that the median half-life of a Tweet, i.e. the time it takes before half of all impressions are created, is about 80 minutes. △ Less

Submitted 11 April, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

arXiv:2211.03700 [pdf, other]

Image Completion with Heterogeneously Filtered Spectral Hints

Authors: Xingqian Xu, Shant Navasardyan, Vahram Tadevosyan, Andranik Sargsyan, Yadong Mu, Humphrey Shi

Abstract: Image completion with large-scale free-form missing regions is one of the most challenging tasks for the computer vision community. While researchers pursue better solutions, drawbacks such as pattern unawareness, blurry textures, and structure distortion remain noticeable, and thus leave space for improvement. To overcome these challenges, we propose a new StyleGAN-based image completion network,… ▽ More Image completion with large-scale free-form missing regions is one of the most challenging tasks for the computer vision community. While researchers pursue better solutions, drawbacks such as pattern unawareness, blurry textures, and structure distortion remain noticeable, and thus leave space for improvement. To overcome these challenges, we propose a new StyleGAN-based image completion network, Spectral Hint GAN (SH-GAN), inside which a carefully designed spectral processing module, Spectral Hint Unit, is introduced. We also propose two novel 2D spectral processing strategies, Heterogeneous Filtering and Gaussian Split that well-fit modern deep learning models and may further be extended to other tasks. From our inclusive experiments, we demonstrate that our model can reach FID scores of 3.4134 and 7.0277 on the benchmark datasets FFHQ and Places2, and therefore outperforms prior works and reaches a new state-of-the-art. We also prove the effectiveness of our design via ablation studies, from which one may notice that the aforementioned challenges, i.e. pattern unawareness, blurry textures, and structure distortion, can be noticeably resolved. Our code will be open-sourced at: https://github.com/SHI-Labs/SH-GAN. △ Less

Submitted 7 November, 2022; originally announced November 2022.

Comments: wacv23

arXiv:1806.02615 [pdf, ps, other]

doi 10.1007/978-3-030-64583-0_24

Explainable AI as a Social Microscope: A Case Study on Academic Performance

Authors: Anahit Sargsyan, Areg Karapetyan, Wei Lee Woon, Aamena Alshamsi

Abstract: Academic performance is perceived as a product of complex interactions between students' overall experience, personal characteristics and upbringing. Data science techniques, most commonly involving regression analysis and related approaches, serve as a viable means to explore this interplay. However, these tend to extract factors with wide-ranging impact, while overlooking variations specific to… ▽ More Academic performance is perceived as a product of complex interactions between students' overall experience, personal characteristics and upbringing. Data science techniques, most commonly involving regression analysis and related approaches, serve as a viable means to explore this interplay. However, these tend to extract factors with wide-ranging impact, while overlooking variations specific to individual students. Focusing on each student's peculiarities is generally impossible with thousands or even hundreds of subjects, yet data mining methods might prove effective in devising more targeted approaches. For instance, subjects with shared characteristics can be assigned to clusters, which can then be examined separately with machine learning algorithms, thereby providing a more nuanced view of the factors affecting individuals in a particular group. In this context, we introduce a data science workflow allowing for fine-grained analysis of academic performance correlates that captures the subtle differences in students' sensitivities to these factors. Leveraging the Local Interpretable Model-Agnostic Explanations (LIME) algorithm from the toolbox of Explainable Artificial Intelligence (XAI) techniques, the proposed pipeline yields groups of students having similar academic attainment indicators, rather than similar features (e.g. familial background) as typically practiced in prior studies. As a proof-of-concept case study, a rich longitudinal dataset is selected to evaluate the effectiveness of the proposed approach versus a standard regression model. △ Less

Submitted 4 June, 2020; v1 submitted 7 June, 2018; originally announced June 2018.

Showing 1–4 of 4 results for author: Sargsyan, A