-
Comprehensive Comparisons of Uniform Quantization in Deep Image Compression
Authors:
Koki Tsubota,
Kiyoharu Aizawa
Abstract:
In deep image compression, uniform quantization is applied to latent representations obtained by using an auto-encoder architecture for reducing bits and entropy coding. Quantization is a problem encountered in the end-to-end training of deep image compression. Quantization's gradient is zero, and it cannot backpropagate meaningful gradients. Many methods have been proposed to address the approximations of quantization to obtain gradients. However, there have not been equitable comparisons among them. In this study, we comprehensively compare the existing approximations of uniform quantization. Furthermore, we evaluate possible combinations of quantizers for the decoder and the entropy model, as the approximated quantizers can be different for them. We conduct experiments using three network architectures on two test datasets. The experimental results reveal that the best approximated quantization differs by the network architectures, and the best approximations of the three are different from the original ones used for the architectures. We also show that the combination of quantizers that uses universal quantization for the entropy model and differentiable soft quantization for the decoder is a comparatively good choice for different architectures and datasets.
Submitted 1 March, 2023;
originally announced March 2023.
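Illustration (not from the paper): the abstract above notes that the gradient of uniform quantization is zero. The minimal sketch below checks this numerically with a finite difference and compares it with a commonly used additive-uniform-noise surrogate. The function names, the step size of 1, and the choice of surrogate are assumptions made for illustration; they are not necessarily the approximations the paper evaluates.

```python
import random


def hard_quantize(y):
    """Uniform quantization with step size 1 (assumed): round to nearest integer.

    Python's round() resolves exact ties to the nearest even integer.
    """
    return float(round(y))


def noisy_quantize(y, u):
    """Additive-noise surrogate: y + u, where u is a sample from U(-0.5, 0.5)."""
    return y + u


def finite_difference(f, y, eps=1e-4):
    """Central finite-difference estimate of df/dy at y."""
    return (f(y + eps) - f(y - eps)) / (2 * eps)


y = 0.3
u = random.Random(0).uniform(-0.5, 0.5)  # one fixed noise sample

# Rounding is piecewise constant, so its slope is zero away from the jumps.
grad_hard = finite_difference(hard_quantize, y)

# With the noise sample held fixed, the surrogate is y + const, so its slope is one.
grad_noisy = finite_difference(lambda v: noisy_quantize(v, u), y)

print("hard rounding slope:", grad_hard)
print("noise surrogate slope:", grad_noisy)
```

The zero slope of the hard quantizer is what blocks gradient flow during end-to-end training; the surrogate keeps a usable slope while staying within half a step of the input.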
-
Universal Deep Image Compression via Content-Adaptive Optimization with Adapters
Authors:
Koki Tsubota,
Hiroaki Akutsu,
Kiyoharu Aizawa
Abstract:
Deep image compression performs better than conventional codecs, such as JPEG, on natural images. However, deep image compression is learning-based and encounters a problem: the compression performance deteriorates significantly for out-of-domain images. In this study, we highlight this problem and address a novel task: universal deep image compression. This task aims to compress images belonging to arbitrary domains, such as natural images, line drawings, and comics. To address this problem, we propose a content-adaptive optimization framework; this framework uses a pre-trained compression model and adapts the model to a target image during compression. Adapters are inserted into the decoder of the model. For each input image, our framework optimizes the latent representation extracted by the encoder and the adapter parameters in terms of rate-distortion. The adapter parameters are additionally transmitted per image. For the experiments, a benchmark dataset containing uncompressed images of four domains (natural images, line drawings, comics, and vector arts) is constructed and the proposed universal deep compression is evaluated. Finally, the proposed model is compared with non-adaptive and existing adaptive compression models. The comparison reveals that the proposed model outperforms these. The code and dataset are publicly available at https://github.com/kktsubota/universal-dic.
Submitted 2 November, 2022;
originally announced November 2022.
-
Evaluating the Stability of Deep Image Quality Assessment With Respect to Image Scaling
Authors:
Koki Tsubota,
Hiroaki Akutsu,
Kiyoharu Aizawa
Abstract:
Image quality assessment (IQA) is a fundamental metric for image processing tasks (e.g., compression). With full-reference IQAs, traditional IQAs, such as PSNR and SSIM, have been used. Recently, IQAs based on deep neural networks (deep IQAs), such as LPIPS and DISTS, have also been used. It is known that image scaling is inconsistent among deep IQAs, as some perform down-scaling as pre-processing, whereas others instead use the original image size. In this paper, we show that the image scale is an influential factor that affects deep IQA performance. We comprehensively evaluate four deep IQAs on the same five datasets, and the experimental results show that image scale significantly influences IQA performance. We found that the most appropriate image scale is often neither the default nor the original size, and the choice differs depending on the methods and datasets used. We visualized the stability and found that PieAPP is the most stable among the four deep IQAs.
Submitted 20 July, 2022;
originally announced July 2022.
-
NTIRE 2021 Challenge on Perceptual Image Quality Assessment
Authors:
Jinjin Gu,
Haoming Cai,
Chao Dong,
Jimmy S. Ren,
Yu Qiao,
Shuhang Gu,
Radu Timofte,
Manri Cheon,
Sungjun Yoon,
Byungyeon Kang,
Junwoo Lee,
Qing Zhang,
Haiyang Guo,
Yi Bin,
Yuqing Hou,
Hengliang Luo,
Jingyu Guo,
Zirui Wang,
Hai Wang,
Wenming Yang,
Qingyan Bai,
Shuwei Shi,
Weihao Xia,
Mingdeng Cao,
Jiahao Wang
, et al. (25 additional authors not shown)
Abstract:
This paper reports on the NTIRE 2021 challenge on perceptual image quality assessment (IQA), held in conjunction with the New Trends in Image Restoration and Enhancement (NTIRE) workshop at CVPR 2021. As a new type of image processing technology, perceptual image processing algorithms based on Generative Adversarial Networks (GAN) have produced images with more realistic textures. These output images have completely different characteristics from traditional distortions and thus pose a new challenge for IQA methods in evaluating their visual quality. In comparison with previous IQA challenges, the training and testing datasets in this challenge include the outputs of perceptual image processing algorithms and the corresponding subjective scores. Thus, they can be used to develop and evaluate IQA methods on GAN-based distortions. The challenge has 270 registered participants in total. In the final testing stage, 13 participating teams submitted their models and fact sheets. Almost all of them achieved much better results than existing IQA methods, and the winning method demonstrates state-of-the-art performance.
Submitted 28 June, 2021; v1 submitted 7 May, 2021;
originally announced May 2021.
-
Few-Shot Font Generation with Deep Metric Learning
Authors:
Haruka Aoki,
Koki Tsubota,
Hikaru Ikuta,
Kiyoharu Aizawa
Abstract:
Designing fonts for languages with a large number of characters, such as Japanese and Chinese, is an extremely labor-intensive and time-consuming task. In this study, we addressed the problem of automatically generating Japanese typographic fonts from only a few font samples, where the synthesized glyphs are expected to have coherent characteristics, such as skeletons, contours, and serifs. Existing methods often fail to generate fine glyph images when the number of style reference glyphs is extremely limited. Herein, we proposed a simple but powerful framework for extracting better style features. This framework introduces deep metric learning to style encoders. We performed experiments using black-and-white and shape-distinctive font datasets and demonstrated the effectiveness of the proposed framework.
Submitted 4 November, 2020;
originally announced November 2020.
-
Building a Manga Dataset "Manga109" with Annotations for Multimedia Applications
Authors:
Kiyoharu Aizawa,
Azuma Fujimoto,
Atsushi Otsubo,
Toru Ogawa,
Yusuke Matsui,
Koki Tsubota,
Hikaru Ikuta
Abstract:
Manga, or comics, which are a type of multimodal artwork, have been left behind in the recent trend of deep learning applications because of the lack of a proper dataset. Hence, we built Manga109, a dataset consisting of a variety of 109 Japanese comic books (94 authors and 21,142 pages) and made it publicly available by obtaining author permissions for academic use. We carefully annotated the frames, speech texts, character faces, and character bodies; the total number of annotations exceeds 500k. This dataset provides numerous manga images and annotations, which will be beneficial for use in machine learning algorithms and their evaluation. In addition to academic use, we obtained further permission for a subset of the dataset for industrial use. In this article, we describe the details of the dataset and present a few examples of multimedia processing applications (detection, retrieval, and generation) that apply existing deep learning methods and are made possible by the dataset.
Submitted 12 May, 2020; v1 submitted 9 May, 2020;
originally announced May 2020.
-
Electronic Structures of CeM2Al10 (M = Fe, Ru, and Os) Studied by Soft X-ray Resonant and High-Resolution Photoemission Spectroscopies
Authors:
Toshihiko Ishiga,
Takanori Wakita,
Rikiya Yoshida,
Hiroyuki Okazaki,
Koji Tsubota,
Masanori Sunagawa,
Kanta Uenaka,
Kozo Okada,
Hiroshi Kumigashira,
Masaharu Oshima,
Keisuke Yutani,
Yuji Muro,
Toshiro Takabatake,
Yuji Muraoka,
Takayoshi Yokoya
Abstract:
We have performed a photoemission spectroscopy (PES) study of CeM2Al10 (M = Fe, Ru, and Os) to directly observe the electronic structure involved in the unusual magnetic ordering. Soft X-ray resonant (SXR) PES provides spectroscopic evidence of the hybridization between conduction and Ce 4f electrons (c-f hybridization) and the order of the hybridization strength (Ru < Os < Fe). High-resolution (HR) PES of CeRu2Al10 and CeOs2Al10, as compared with that of CeFe2Al10, identifies two structures that can be ascribed to structures induced by the c-f hybridization and the antiferromagnetic ordering, respectively. Although the c-f hybridization-induced structure is a depletion of the spectral intensity (pseudogap) around the Fermi level (EF) with an energy scale of 20-30 meV, the structure related to the antiferromagnetic ordering is observed as a shoulder at approximately 10-11 meV within the pseudogap. The energies of the shoulder structures of CeRu2Al10 and CeOs2Al10 are approximately half of the optical gap (20 meV), indicating that EF is located at the midpoint of the gap.
Submitted 24 July, 2014;
originally announced July 2014.
-
ASTRA: ASTrometry and phase-Referencing Astronomy on the Keck interferometer
Authors:
J. Woillez,
R. Akeson,
M. Colavita,
J. Eisner,
A. Ghez,
J. Graham,
L. Hillenbrand,
R. Millan-Gabet,
J. Monnier,
J. -U. Pott,
S. Ragland,
P. Wizinowich,
E. Appleby,
B. Berkey,
A. Cooper,
C. Felizardo,
J. Herstein,
M. Hrynevych,
O. Martin,
D. Medeiros,
D. Morrison,
T. Panteleeva,
B. Smith,
K. Summers,
K. Tsubota
, et al. (2 additional authors not shown)
Abstract:
ASTRA (ASTrometric and phase-Referencing Astronomy) is an upgrade to the existing Keck Interferometer which aims at providing new self-phase referencing (high spectral resolution observation of YSOs), dual-field phase referencing (sensitive AGN observations), and astrometric (known exoplanetary systems characterization and galactic center general relativity in strong field regime) capabilities. With the first high spectral resolution mode now offered to the community, this contribution focuses on the progress of the dual field and astrometric modes.
Submitted 15 August, 2012;
originally announced August 2012.
-
First faint dual-field phase-referenced observations on the Keck interferometer
Authors:
Julien Woillez,
Peter Wizinowich,
Rachel Akeson,
Mark Colavita,
Josh Eisner,
Rafael Millan-Gabet,
John Monnier,
Jorg-Uwe Pott,
Sam Ragland,
Eric Appleby,
Andrew Cooper,
Claude Felizardo,
Jennifer Herstein,
Olivier Martin,
Drew Medeiros,
Douglas Morrison,
Tatyana Panteleeva,
Brett Smith,
Kellee Summers,
Kevin Tsubota,
Colette Tyau,
Ed Wetherell
Abstract:
Ground-based long baseline interferometers have long been limited in sensitivity by the short integration periods imposed by atmospheric turbulence. The first observation fainter than this limit was performed on January 22, 2011, when the Keck Interferometer observed a K=11.5 target, about one magnitude fainter than its K=10.3 limit. This observation was made possible by the Dual Field Phase Referencing instrument of the ASTRA project: by simultaneously measuring the real-time effects of the atmosphere on a nearby bright guide star and correcting for them on the faint target, integration times longer than the turbulence time scale become possible. As a prelude to this demonstration, we first present the implementation of Dual Field Phase Referencing on the interferometer. We then detail its on-sky performance, focusing on the accuracy of the turbulence correction and on the resulting fringe contrast stability. We conclude with a presentation of early results obtained with Laser Guide Star AO and the interferometer.
Submitted 20 July, 2012;
originally announced July 2012.
-
Probing local density inhomogeneities in the circumstellar disk of a Be star using the new spectro-astrometry mode at the Keck interferometer
Authors:
J. -U. Pott,
J. Woillez,
S. Ragland,
P. L. Wizinowich,
J. A. Eisner,
J. D. Monnier,
R. L. Akeson,
A. M. Ghez,
J. R. Graham,
L. A. Hillenbrand,
R. Millan-Gabet,
E. Appleby,
B. Berkey,
M. M. Colavita,
A. Cooper,
C. Felizardo,
J. Herstein,
M. Hrynevych,
D. Medeiros,
D. Morrison,
T. Panteleeva,
B. Smith,
K. Summers,
K. Tsubota,
C. Tyau
, et al. (1 additional author not shown)
Abstract:
We report on the successful science verification phase of a new observing mode at the Keck interferometer, which provides a line-spread function width and sampling of 150km/s at K'-band, at a current limiting magnitude of K'~7mag with spatial resolution of lam/2B ~2.7mas and a measured differential phase stability of unprecedented precision (3mrad at K=5mag, which represents 3uas on sky or a centroiding precision of 10^-3). The scientific potential of this mode is demonstrated by the presented observations of the circumstellar disk of the evolved Be-star 48Lib. In addition to indirect methods such as multi-wavelength spectroscopy and polarimetry, the spectro-interferometric astrometry described here provides a new tool to directly constrain the radial density structure in the disk. We resolve for the first time several Pfund emission lines, in addition to BrGam, in a single interferometric spectrum, and with adequate spatial and spectral resolution and precision to analyze the radial disk structure in 48Lib. The data suggest that the continuum and Pf-emission originate in significantly more compact regions, inside of the BrGam emission zone. Thus, spectro-interferometric astrometry opens the opportunity to directly connect the different observed line profiles of BrGam and Pfund in the total and correlated flux to different disk radii. The gravitational potential of a rotationally flattened Be star is expected to induce a one-armed density perturbation in the circumstellar disk. Such a slowly rotating disk oscillation has been used to explain the well-known periodic V/R spectral profile variability in these stars, as well as the observed V/R cycle phase shifts between different disk emission lines. The differential line properties and linear constraints set by our data lend support to the existence of a radius-dependent disk density perturbation.
Submitted 10 August, 2010;
originally announced August 2010.
-
Astrometry with the Keck-Interferometer: the ASTRA project and its science
Authors:
Jorg-Uwe Pott,
Julien Woillez,
Rachel L. Akeson,
Ben Berkey,
Mark M. Colavita,
Andrew Cooper,
Josh A. Eisner,
Andrea M. Ghez,
James R. Graham,
Lynne Hillenbrand,
Michael Hrynewych,
Drew Medeiros,
Rafael Millan-Gabet,
John Monnier,
Douglas Morrison,
Tatyana Panteleeva,
Eliot Quataert,
Bill Randolph,
Brett Smith,
Kellee Summers,
Kevin Tsubota,
Colette Tyau,
Nevin Weinberg,
Ed Wetherell,
Peter L. Wizinowich
Abstract:
The sensitivity and astrometry upgrade ASTRA of the Keck Interferometer is introduced. After a brief overview of the underlying interferometric principles, the technology and concepts of the upgrade are presented. The interferometric dual-field technology of ASTRA will provide the KI with the means to observe two objects simultaneously, and measure the distance between them with a precision eventually better than 100 uas. This astrometric functionality of ASTRA will add a unique observing tool to fields of astrophysical research as diverse as exo-planetary kinematics, binary astrometry, and the investigation of stars accelerated by the massive black hole in the center of the Milky Way as discussed in this contribution.
Submitted 17 November, 2008; v1 submitted 13 November, 2008;
originally announced November 2008.
-
Milliarcsecond N-Band Observations of the Nova RS Ophiuchi: First Science with the Keck Interferometer Nuller
Authors:
R. K. Barry,
W. C. Danchi,
W. A. Traub,
J. L. Sokoloski,
J. P. Wisniewski,
E. Serabyn,
M. J. Kuchner,
R. Akeson,
E. Appleby,
J. Bell,
A. Booth,
H. Brandenburg,
M. Colavita,
S. Crawford,
M. Creech-Eakman,
W. Dahl,
C. Felizardo,
J. Garcia,
J. Gathright,
M. A. Greenhouse,
J. Herstein,
E. Hovland,
M. Hrynevych,
C. Koresko,
R. Ligon
, et al. (16 additional authors not shown)
Abstract:
We report observations of the nova RS Ophiuchi (RS Oph) using the Keck Interferometer Nuller (KIN), approximately 3.8 days following the most recent outburst that occurred on 2006 February 12. These observations represent the first scientific results from the KIN, which operates in N-band from 8 to 12.5 microns in a nulling mode. By fitting the unique KIN data, we have obtained an angular size of the mid-infrared continuum of 6.2, 4.0, or 5.4 mas for a disk profile, Gaussian profile (FWHM), and shell profile, respectively. The data show evidence of enhanced neutral atomic hydrogen emission and atomic metals including silicon located in the inner spatial regime near the white dwarf (WD) relative to the outer regime. There are also nebular emission lines and evidence of hot silicate dust in the outer spatial region, centered at ~17 AU from the WD, that are not found in the inner regime. Our evidence suggests that these features have been excited by the nova flash in the outer spatial regime before the blast wave reached these regions. These identifications support a model in which the dust appears to be present between outbursts and is not created during the outburst event. We further discuss the present results in terms of a unifying model of the system that includes an increase in density in the plane of the orbit of the two stars created by a spiral shock wave caused by the motion of the stars through the cool wind of the red giant star. These data show the power and potential of the nulling technique, which has been developed for the detection of Earth-like planets around nearby stars for the Terrestrial Planet Finder and Darwin missions.
Submitted 27 January, 2008;
originally announced January 2008.