Showing 1–22 of 22 results for author: Cricri, F

  1. arXiv:2406.12367  [pdf, other]

    cs.CV cs.LG cs.MM

    Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines

    Authors: Honglei Zhang, Jukka I. Ahonen, Nam Le, Ruiying Yang, Francesco Cricri

    Abstract: This paper investigates the efficacy of jointly optimizing content-specific post-processing filters to adapt a human oriented video/image codec into a codec suitable for machine vision tasks. By observing that artifacts produced by video/image codecs are content-dependent, we propose a novel training strategy based on competitive learning principles. This strategy assigns training samples to filte…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted to be presented at ICIP 2024

  2. arXiv:2401.10761  [pdf, other]

    eess.IV cs.CV

    NN-VVC: Versatile Video Coding boosted by self-supervisedly learned image coding for machines

    Authors: Jukka I. Ahonen, Nam Le, Honglei Zhang, Antti Hallapuro, Francesco Cricri, Hamed Rezazadegan Tavakoli, Miska M. Hannuksela, Esa Rahtu

    Abstract: The recent progress in artificial intelligence has led to an ever-increasing usage of images and videos by machine analysis algorithms, mainly neural networks. Nonetheless, compression, storage and transmission of media have traditionally been designed considering human beings as the viewers of the content. Recent research on image and video coding for machine analysis has progressed mainly in two…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: ISM 2023 Best paper award winner version

  3. Bridging the gap between image coding for machines and humans

    Authors: Nam Le, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed Rezazadegan Tavakoli, Emre Aksu, Miska M. Hannuksela, Esa Rahtu

    Abstract: Image coding for machines (ICM) aims at reducing the bitrate required to represent an image while minimizing the drop in machine vision analysis accuracy. In many use cases, such as surveillance, it is also important that the visual quality is not drastically deteriorated by the compression process. Recent works on using neural network (NN) based ICM codecs have shown significant coding gains agai…

    Submitted 19 January, 2024; originally announced January 2024.

    Journal ref: IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 2022, pp. 3411-3415

  4. arXiv:2210.04112  [pdf, other]

    cs.CV cs.LG cs.MM eess.IV

    Leveraging progressive model and overfitting for efficient learned image compression

    Authors: Honglei Zhang, Francesco Cricri, Hamed Rezazadegan Tavakoli, Emre Aksu, Miska M. Hannuksela

    Abstract: Deep learning is overwhelmingly dominant in the field of computer vision and image/video processing for the last decade. However, for image and video compression, it lags behind the traditional techniques based on discrete cosine transform (DCT) and linear filters. Built on top of an autoencoder architecture, learned image compression (LIC) systems have drawn enormous attention in recent years. Ne…

    Submitted 8 October, 2022; originally announced October 2022.

  5. arXiv:2112.08767  [pdf, other]

    eess.IV cs.CV cs.LG

    Adaptation and Attention for Neural Video Coding

    Authors: Nannan Zou, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed R. Tavakoli, Jani Lainema, Emre Aksu, Miska Hannuksela, Esa Rahtu

    Abstract: Neural image coding represents now the state-of-the-art image compression approach. However, a lot of work is still to be done in the video domain. In this work, we propose an end-to-end learned video codec that introduces several architectural novelties as well as training novelties, revolving around the concepts of adaptation and attention. Our codec is organized as an intra-frame codec paired w…

    Submitted 16 December, 2021; originally announced December 2021.

  6. arXiv:2108.10551  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    Lossless Image Compression Using a Multi-Scale Progressive Statistical Model

    Authors: Honglei Zhang, Francesco Cricri, Hamed R. Tavakoli, Nannan Zou, Emre Aksu, Miska M. Hannuksela

    Abstract: Lossless image compression is an important technique for image storage and transmission when information loss is not allowed. With the fast development of deep learning techniques, deep neural networks have been used in this field to achieve a higher compression rate. Methods based on pixel-wise autoregressive statistical models have shown good performance. However, the sequential processing way p…

    Submitted 24 August, 2021; originally announced August 2021.

    Comments: Accepted ACCV 2020

  7. Image coding for machines: an end-to-end learned approach

    Authors: Nam Le, Honglei Zhang, Francesco Cricri, Ramin Ghaznavi-Youvalari, Esa Rahtu

    Abstract: Over recent years, deep learning-based computer vision systems have been applied to images at an ever-increasing pace, oftentimes representing the only type of consumption for those images. Given the dramatic explosion in the number of images generated per day, a question arises: how much better would an image codec targeting machine-consumption perform against state-of-the-art codecs targeting hu…

    Submitted 30 August, 2021; v1 submitted 23 August, 2021; originally announced August 2021.

    Comments: Fixed a couple of mistakes since the version accepted in IEEE ICASSP2021

    Journal ref: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2021), 2021, pp. 1590-1594

  8. Learned Image Coding for Machines: A Content-Adaptive Approach

    Authors: Nam Le, Honglei Zhang, Francesco Cricri, Ramin Ghaznavi-Youvalari, Hamed Rezazadegan Tavakoli, Esa Rahtu

    Abstract: Today, according to the Cisco Annual Internet Report (2018-2023), the fastest-growing category of Internet traffic is machine-to-machine communication. In particular, machine-to-machine communication of images and videos represents a new challenge and opens up new perspectives in the context of data compression. One possible solution approach consists of adapting current human-targeted image and v…

    Submitted 13 October, 2021; v1 submitted 23 August, 2021; originally announced August 2021.

    Comments: Fig 4 correction

    Journal ref: 2021 IEEE International Conference on Multimedia and Expo (ICME), 2021, pp. 1-6

  9. arXiv:2007.16054  [pdf, other]

    eess.IV cs.CV cs.LG cs.MM stat.ML

    Learning to Learn to Compress

    Authors: Nannan Zou, Honglei Zhang, Francesco Cricri, Hamed R. Tavakoli, Jani Lainema, Miska Hannuksela, Emre Aksu, Esa Rahtu

    Abstract: In this paper we present an end-to-end meta-learned system for image compression. Traditional machine learning based approaches to image compression train one or more neural network for generalization performance. However, at inference time, the encoder or the latent tensor output by the encoder can be optimized for each test image. This optimization can be regarded as a form of adaptation or bene…

    Submitted 1 May, 2021; v1 submitted 31 July, 2020; originally announced July 2020.

  10. arXiv:2007.14267  [pdf, other]

    eess.IV cs.CV cs.LG cs.MM

    Efficient Adaptation of Neural Network Filter for Video Compression

    Authors: Yat-Hong Lam, Alireza Zare, Francesco Cricri, Jani Lainema, Miska Hannuksela

    Abstract: We present an efficient finetuning methodology for neural-network filters which are applied as a postprocessing artifact-removal step in video coding pipelines. The fine-tuning is performed at encoder side to adapt the neural network to the specific content that is being encoded. In order to maximize the PSNR gain and minimize the bitrate overhead, we propose to finetune only the convolutional lay…

    Submitted 13 August, 2020; v1 submitted 28 July, 2020; originally announced July 2020.

    Comments: Accepted in ACM Multimedia 2020

  11. arXiv:2004.09226  [pdf, other]

    eess.IV cs.CV cs.LG

    End-to-End Learning for Video Frame Compression with Self-Attention

    Authors: Nannan Zou, Honglei Zhang, Francesco Cricri, Hamed R. Tavakoli, Jani Lainema, Emre Aksu, Miska Hannuksela, Esa Rahtu

    Abstract: One of the core components of conventional (i.e., non-learned) video codecs consists of predicting a frame from a previously-decoded frame, by leveraging temporal correlations. In this paper, we propose an end-to-end learned system for compressing video frames. Instead of relying on pixel-space motion (as with optical flow), our system learns deep embeddings of frames and encodes their difference…

    Submitted 20 April, 2020; originally announced April 2020.

  12. arXiv:1905.10371  [pdf, other]

    eess.IV cs.LG stat.ML

    A Compression Objective and a Cycle Loss for Neural Image Compression

    Authors: Caglar Aytekin, Francesco Cricri, Antti Hallapuro, Jani Lainema, Emre Aksu, Miska Hannuksela

    Abstract: In this manuscript we propose two objective terms for neural image compression: a compression objective and a cycle loss. These terms are applied on the encoder output of an autoencoder and are used in combination with reconstruction losses. The compression objective encourages sparsity and low entropy in the activations. The cycle loss term represents the distortion between encoder outputs comput…

    Submitted 24 May, 2019; originally announced May 2019.

    Comments: Accepted in Challenge and Workshop on Learned Image Compression (CLIC) as a part of CVPR 2019

  13. arXiv:1905.04079  [pdf, other]

    cs.LG cs.MM stat.ML

    Compressing Weight-updates for Image Artifacts Removal Neural Networks

    Authors: Yat Hong Lam, Alireza Zare, Caglar Aytekin, Francesco Cricri, Jani Lainema, Emre Aksu, Miska Hannuksela

    Abstract: In this paper, we present a novel approach for fine-tuning a decoder-side neural network in the context of image compression, such that the weight-updates are better compressible. At encoder side, we fine-tune a pre-trained artifact removal network on target data by using a compression objective applied on the weight-update. In particular, the compression objective encourages weight-updates which…

    Submitted 14 June, 2019; v1 submitted 10 May, 2019; originally announced May 2019.

    Comments: Submission for CHALLENGE ON LEARNED IMAGE COMPRESSION (CLIC) 2019 (updated on 14 June 2019)

  14. arXiv:1905.01044  [pdf, other]

    cs.LG stat.ML

    Compressibility Loss for Neural Network Weights

    Authors: Caglar Aytekin, Francesco Cricri, Emre Aksu

    Abstract: In this paper we apply a compressibility loss that enables learning highly compressible neural network weights. The loss was previously proposed as a measure of negated sparsity of a signal, yet in this paper we show that minimizing this loss also enforces the non-zero parts of the signal to have very low entropy, thus making the entire signal more compressible. For an optimization problem where t…

    Submitted 3 May, 2019; originally announced May 2019.

  15. arXiv:1805.10887  [pdf, ps, other]

    cs.LG stat.ML

    Block-optimized Variable Bit Rate Neural Image Compression

    Authors: Caglar Aytekin, Xingyang Ni, Francesco Cricri, Jani Lainema, Emre Aksu, Miska Hannuksela

    Abstract: In this work, we propose an end-to-end block-based auto-encoder system for image compression. We introduce novel contributions to neural-network based image compression, mainly in achieving binarization simulation, variable bit rates with multiple networks, entropy-friendly representations, inference-stage code optimization and performance-improving normalization layers in the auto-encoder. We eva…

    Submitted 28 May, 2018; originally announced May 2018.

    Comments: Accepted, Workshop and Challenge on Learned Image Compression (CLIC), CVPR 2018

  16. arXiv:1805.08009  [pdf, other]

    cs.CV

    Object Detection in Equirectangular Panorama

    Authors: Wenyan Yang, Yanlin Qian, Francesco Cricri, Lixin Fan, Joni-Kristian Kamarainen

    Abstract: We introduced a high-resolution equirectangular panorama (360-degree, virtual reality) dataset for object detection and propose a multi-projection variant of YOLO detector. The main challenge with equirectangular panorama image are i) the lack of annotated training data, ii) high-resolution imagery and iii) severe geometric distortions of objects near the panorama projection poles. In this work, w…

    Submitted 21 May, 2018; originally announced May 2018.

    Comments: 6 pages

  17. arXiv:1802.09227  [pdf, other]

    cs.CV

    Depth Masked Discriminative Correlation Filter

    Authors: Uğur Kart, Joni-Kristian Kämäräinen, Jiří Matas, Lixin Fan, Francesco Cricri

    Abstract: Depth information provides a strong cue for occlusion detection and handling, but has been largely omitted in generic object tracking until recently due to lack of suitable benchmark datasets and applications. In this work, we propose a Depth Masked Discriminative Correlation Filter (DM-DCF) which adopts novel depth segmentation based occlusion detection that stops correlation filter updating and…

    Submitted 10 October, 2018; v1 submitted 26 February, 2018; originally announced February 2018.

    Comments: 6 pages, accepted to ICPR 2018. ©2018 IEEE

  18. arXiv:1802.02783  [pdf, other]

    cs.CV

    Saliency-Enhanced Robust Visual Tracking

    Authors: Caglar Aytekin, Francesco Cricri, Emre Aksu

    Abstract: Discrete correlation filter (DCF) based trackers have shown considerable success in visual object tracking. These trackers often make use of low to mid level features such as histogram of gradients (HoG) and mid-layer activations from convolution neural networks (CNNs). We argue that including semantically higher level information to the tracked features may provide further robustness to challengi…

    Submitted 8 February, 2018; originally announced February 2018.

    Comments: Submitted to ICIP 2018

  19. arXiv:1802.00187  [pdf, other]

    cs.LG

    Clustering and Unsupervised Anomaly Detection with L2 Normalized Deep Auto-Encoder Representations

    Authors: Caglar Aytekin, Xingyang Ni, Francesco Cricri, Emre Aksu

    Abstract: Clustering is essential to many tasks in pattern recognition and computer vision. With the advent of deep learning, there is an increasing interest in learning deep unsupervised representations for clustering analysis. Many works on this domain rely on variants of auto-encoders and use the encoder outputs as representations/features for clustering. In this paper, we show that an l2 normalization c…

    Submitted 1 February, 2018; originally announced February 2018.

    Comments: Submitted to IJCNN 2018

  20. arXiv:1801.07889  [pdf, other]

    cs.LG stat.ML

    A Theoretical Investigation of Graph Degree as an Unsupervised Normality Measure

    Authors: Caglar Aytekin, Francesco Cricri, Lixin Fan, Emre Aksu

    Abstract: For a graph representation of a dataset, a straightforward normality measure for a sample can be its graph degree. Considering a weighted graph, degree of a sample is the sum of the corresponding row's values in a similarity matrix. The measure is intuitive given the abnormal samples are usually rare and they are dissimilar to the rest of the data. In order to have an in-depth theoretical understa…

    Submitted 5 February, 2018; v1 submitted 24 January, 2018; originally announced January 2018.

    Comments: Submitted to IJCAI 2018

  21. arXiv:1712.09558  [pdf, other]

    cs.CV

    Memory-Efficient Deep Salient Object Segmentation Networks on Gridized Superpixels

    Authors: Caglar Aytekin, Xingyang Ni, Francesco Cricri, Lixin Fan, Emre Aksu

    Abstract: Computer vision algorithms with pixel-wise labeling tasks, such as semantic segmentation and salient object detection, have gone through a significant accuracy increase with the incorporation of deep learning. Deep segmentation methods slightly modify and fine-tune pre-trained networks that have hundreds of millions of parameters. In this work, we question the need to have such memory demanding ne…

    Submitted 22 May, 2018; v1 submitted 27 December, 2017; originally announced December 2017.

    Comments: 6 pages, submitted to MMSP 2018

  22. arXiv:1612.01756  [pdf, other]

    cs.LG cs.CV stat.ML

    Video Ladder Networks

    Authors: Francesco Cricri, Xingyang Ni, Mikko Honkala, Emre Aksu, Moncef Gabbouj

    Abstract: We present the Video Ladder Network (VLN) for efficiently generating future video frames. VLN is a neural encoder-decoder model augmented at all layers by both recurrent and feedforward lateral connections. At each layer, these connections form a lateral recurrent residual block, where the feedforward connection represents a skip connection and the recurrent connection represents the residual. Tha…

    Submitted 30 December, 2016; v1 submitted 6 December, 2016; originally announced December 2016.

    Comments: This version extends the paper accepted at the NIPS 2016 workshop on ML for Spatiotemporal Forecasting, with more details and more experimental results