PrAViC: Probabilistic Adaptation Framework for Real-Time Video Classification

Authors: Magdalena Trędowicz, Łukasz Struski, Marcin Mazur, Szymon Janusz, Arkadiusz Lewicki, Jacek Tabor

Abstract: Video processing is generally divided into two main categories: processing of the entire video, which typically yields optimal classification outcomes, and real-time processing, where the objective is to make a decision as promptly as possible. The latter is often driven by the need to rapidly identify potentially critical or dangerous situations. These could include machine failure, traffic accidents, heart problems, or dangerous behavior. Although the models dedicated to the processing of entire videos are typically well-defined and clearly presented in the literature, this is not the case for online processing, where a plethora of hand-devised methods exist. To address this, we present PrAViC, a novel, unified, and theoretically-based adaptation framework for dealing with the online classification problem for video data. The initial phase of our study is to establish a robust mathematical foundation for the theory of classification of sequential data, with the potential to make a decision at an early stage. This allows us to construct a natural function that encourages the model to return an outcome much faster. The subsequent phase is to demonstrate a straightforward and readily implementable method for adapting offline models to online and recurrent operations. Finally, by comparing the proposed approach to the non-online state-of-the-art baseline, it is demonstrated that the use of PrAViC encourages the network to make earlier classification decisions without compromising accuracy.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2405.18163 [pdf, other]

NegGS: Negative Gaussian Splatting

Authors: Artur Kasymov, Bartosz Czekaj, Marcin Mazur, Jacek Tabor, Przemysław Spurek

Abstract: One of the key advantages of 3D rendering is its ability to simulate intricate scenes accurately. One of the most widely used methods for this purpose is Gaussian Splatting, a novel approach that is known for its rapid training and inference capabilities. In essence, Gaussian Splatting involves incorporating data about the 3D objects of interest into a series of Gaussian distributions, each of which can then be depicted in 3D in a manner analogous to traditional meshes. It is regrettable that the use of Gaussians in Gaussian Splatting is currently somewhat restrictive due to their perceived linear nature. In practice, 3D objects are often composed of complex curves and highly nonlinear structures. This issue can to some extent be alleviated by employing a multitude of Gaussian components to reflect the complex, nonlinear structures accurately. However, this approach results in a considerable increase in time complexity. This paper introduces the concept of negative Gaussians, which are interpreted as items with negative colors. The rationale behind this approach is based on the density distribution created by dividing the probability density functions (PDFs) of two Gaussians, which we refer to as Diff-Gaussian. Such a distribution can be used to approximate structures such as donut and moon-shaped datasets. Experimental findings indicate that the application of these techniques enhances the modeling of high-frequency elements with rapid color transitions. Additionally, it improves the representation of shadows. To the best of our knowledge, this is the first paper to extend the simple ellipsoid shapes of Gaussian Splatting to more complex nonlinear structures.

Submitted 30 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

arXiv:2402.01524 [pdf, other]

HyperPlanes: Hypernetwork Approach to Rapid NeRF Adaptation

Authors: Paweł Batorski, Dawid Malarz, Marcin Przewięźlikowski, Marcin Mazur, Sławomir Tadeja, Przemysław Spurek

Abstract: Neural radiance fields (NeRFs) are a widely accepted standard for synthesizing new 3D object views from a small number of base images. However, NeRFs have limited generalization properties, which means that we need to use significant computational resources to train individual architectures for each item we want to represent. To address this issue, we propose a few-shot learning approach based on the hypernetwork paradigm that does not require gradient optimization during inference. The hypernetwork gathers information from the training data and generates an update for universal weights. As a result, we have developed an efficient method for generating a high-quality 3D object representation from a small number of images in a single step. This has been confirmed by direct comparison with the state-of-the-art solutions and a comprehensive ablation study.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2303.15221 [pdf, other]

Digital Twin of a Network and Operating Environment Using Augmented Reality

Authors: Haoshuo Chen, Xiaonan Xu, Jesse E. Simsarian, Mijail Szczerban, Rob Harby, Roland Ryf, Mikael Mazur, Lauren Dallachiesa, Nicolas K. Fontaine, John Cloonan, Jim Sandoz, David T. Neilson

Abstract: We demonstrate the digital twin of a network, network elements, and operating environment using machine learning. We achieve network card failure localization and remote collaboration over 86 km of fiber using augmented reality.

Submitted 23 March, 2023; originally announced March 2023.

arXiv:2206.09453 [pdf, other]

Bounding Evidence and Estimating Log-Likelihood in VAE

Authors: Łukasz Struski, Marcin Mazur, Paweł Batorski, Przemysław Spurek, Jacek Tabor

Abstract: Many crucial problems in deep learning and statistics are caused by a variational gap, i.e., a difference between evidence and evidence lower bound (ELBO). As a consequence, in the classical VAE model, we obtain only the lower bound on the log-likelihood since ELBO is used as a cost function, and therefore we cannot compare log-likelihood between models. In this paper, we present a general and effective upper bound of the variational gap, which allows us to efficiently estimate the true evidence. We provide an extensive theoretical study of the proposed approach. Moreover, we show that by applying our estimation, we can easily obtain lower and upper bounds for the log-likelihood of VAE models.

Submitted 19 June, 2022; originally announced June 2022.

arXiv:2111.07928 [pdf, other]

Target Layer Regularization for Continual Learning Using Cramer-Wold Generator

Authors: Marcin Mazur, Łukasz Pustelnik, Szymon Knop, Patryk Pagacz, Przemysław Spurek

Abstract: We propose an effective regularization strategy (CW-TaLaR) for solving continual learning problems. It uses a penalizing term expressed by the Cramer-Wold distance between two probability distributions defined on a target layer of an underlying neural network that is shared by all tasks, and the simple architecture of the Cramer-Wold generator for modeling output data representation. Our strategy preserves target layer distribution while learning a new task but does not require remembering previous tasks' datasets. We perform experiments involving several common supervised frameworks, which prove the competitiveness of the CW-TaLaR method in comparison to a few existing state-of-the-art continual learning models.

Submitted 15 November, 2021; originally announced November 2021.

Comments: The paper is under consideration at Computer Vision and Image Understanding

arXiv:2110.05770 [pdf, other]

HyperCube: Implicit Field Representations of Voxelized 3D Models

Authors: Magdalena Proszewska, Marcin Mazur, Tomasz Trzciński, Przemysław Spurek

Abstract: Recently introduced implicit field representations offer an effective way of generating 3D object shapes. They leverage an implicit decoder trained to take a 3D point coordinate concatenated with a shape encoding and to output a value which indicates whether the point is outside the shape or not. Although this approach enables efficient rendering of visually plausible objects, it has two significant limitations. First, it is based on a single neural network dedicated to all objects from a training set, which results in a cumbersome training procedure and its application in real life. More importantly, the implicit decoder takes only points sampled within voxels (and not the entire voxels), which yields problems at the classification boundaries and results in empty spaces within the rendered mesh. To solve the above limitations, we introduce a new HyperCube architecture based on an interval arithmetic network that enables direct processing of 3D voxels, trained using a hypernetwork paradigm to enforce model convergence. Instead of processing individual 3D samples from within a voxel, our approach allows inputting the entire voxel (3D cube) represented with its convex hull coordinates, while the target network constructed by a hypernet assigns it to an inside or outside category. As a result, our HyperCube model outperforms the competing approaches both in terms of training and inference efficiency, as well as the final mesh quality.

Submitted 12 October, 2021; originally announced October 2021.

arXiv:2102.05984 [pdf, other]

Modeling 3D Surface Manifolds with a Locally Conditioned Atlas

Authors: Przemysław Spurek, Sebastian Winczowski, Maciej Zięba, Tomasz Trzciński, Kacper Kania, Marcin Mazur

Abstract: Recently proposed 3D object reconstruction methods represent a mesh with an atlas - a set of planar patches approximating the surface. However, their application in a real-world scenario is limited since the surfaces of reconstructed objects contain discontinuities, which degrades the quality of the final mesh. This is mainly caused by independent processing of individual patches, and in this work, we postulate to mitigate this limitation by preserving local consistency around patch vertices. To that end, we introduce a Locally Conditioned Atlas (LoCondA), a framework for representing a 3D object hierarchically in a generative model. Firstly, the model maps a point cloud of an object into a sphere. Secondly, by leveraging a spherical prior, we enforce the mapping to be locally consistent on the sphere and on the target object. This way, we can sample a mesh quad on that sphere and project it back onto the object's manifold. With LoCondA, we can produce topologically diverse objects while maintaining quads to be stitched together. We show that the proposed approach provides structurally coherent reconstructions while producing meshes of quality comparable to the competitors.

Submitted 5 April, 2024; v1 submitted 11 February, 2021; originally announced February 2021.

arXiv:2102.05973 [pdf, other]

HyperPocket: Generative Point Cloud Completion

Authors: Przemysław Spurek, Artur Kasymov, Marcin Mazur, Diana Janik, Sławomir Tadeja, Łukasz Struski, Jacek Tabor, Tomasz Trzciński

Abstract: Scanning real-life scenes with modern registration devices typically gives incomplete point cloud representations, mostly due to the limitations of the scanning process and 3D occlusions. Therefore, completing such partial representations remains a fundamental challenge of many computer vision applications. Most of the existing approaches aim to solve this problem by learning to reconstruct individual 3D objects in a synthetic setup of an uncluttered environment, which is far from a real-life scenario. In this work, we reformulate the problem of point cloud completion into an object hallucination task. Thus, we introduce a novel autoencoder-based architecture called HyperPocket that disentangles latent representations and, as a result, enables the generation of multiple variants of the completed 3D point clouds. We split point cloud processing into two disjoint data streams and leverage a hypernetwork paradigm to fill the spaces, dubbed pockets, that are left by the missing object parts. As a result, the generated point clouds are not only smooth but also plausible and geometrically consistent with the scene. Our method offers competitive performance to the other state-of-the-art models, and it enables a plethora of novel applications.

Submitted 11 February, 2021; originally announced February 2021.

arXiv:2009.07327 [pdf, other]

Generative models with kernel distance in data space

Authors: Szymon Knop, Marcin Mazur, Przemysław Spurek, Jacek Tabor, Igor Podolak

Abstract: Generative models dealing with modeling a joint data distribution are generally either autoencoder or GAN based. Both have their pros and cons, generating blurry images or being unstable in training or prone to mode collapse phenomenon, respectively. The objective of this paper is to construct a model situated between the above architectures, one that does not inherit their main weaknesses. The proposed LCW generator (Latent Cramer-Wold generator) resembles a classical GAN in transforming Gaussian noise into data space. What is of utmost importance, instead of a discriminator, the LCW generator uses kernel distance. No adversarial training is utilized, hence the name generator. It is trained in two phases. First, an autoencoder based architecture, using kernel measures, is built to model a manifold of data. We propose a Latent Trick mapping a Gaussian to latent in order to get the final model. This results in very competitive FID values.

Submitted 15 September, 2020; originally announced September 2020.

arXiv:2008.11370 [pdf, other]

Gravilon: Applications of a New Gradient Descent Method to Machine Learning

Authors: Chad Kelterborn, Marcin Mazur, Bogdan V. Petrenko

Abstract: Gradient descent algorithms have been used in countless applications since the inception of Newton's method. The explosion in the number of applications of neural networks has re-energized efforts in recent years to improve the standard gradient descent method in both efficiency and accuracy. These methods modify the effect of the gradient in updating the values of the parameters. These modifications often incorporate hyperparameters: additional variables whose values must be specified at the outset of the program. We provide, below, a novel gradient descent algorithm, called Gravilon, that uses the geometry of the hypersurface to modify the length of the step in the direction of the gradient. Using neural networks, we provide promising experimental results comparing the accuracy and efficiency of the Gravilon method against commonly used gradient descent algorithms on MNIST digit classification.

Submitted 28 October, 2020; v1 submitted 26 August, 2020; originally announced August 2020.

Comments: 16 pages, 5 figures

arXiv:2002.00560 [pdf]

doi 10.1364/OFC.2020.Th1I.5

On the Performance under Hard and Soft Bitwise Mismatched-Decoding

Authors: Tsuyoshi Yoshida, Mikael Mazur, Jochen Schröder, Magnus Karlsson, Erik Agrell

Abstract: We investigated a suitable auxiliary channel setting and the gap between Q-factors with hard and soft demapping. The system margin definition should be reconsidered for systems employing complex coded modulation with soft forward error correction.

Submitted 3 February, 2020; originally announced February 2020.

Comments: 3 pages, 4 figures

Journal ref: Proc. Optical Fiber Communication Conference (OFC), San Diego, CA, Mar. 2020

arXiv:1911.08641 [pdf, other]

doi 10.1049/cp.2019.0958

Performance Monitoring for Live Systems with Soft FEC and Multilevel Modulation

Authors: Tsuyoshi Yoshida, Mikael Mazur, Jochen Schröder, Magnus Karlsson, Erik Agrell

Abstract: Performance monitoring is an essential function for margin measurements in live systems. Historically, system budgets have been described by the Q-factor converted from the bit error rate (BER) under binary modulation and direct detection. The introduction of hard-decision forward error correction (FEC) did not change this. In recent years technologies have changed significantly to comprise coherent detection, multilevel modulation and soft FEC. In such advanced systems, different metrics such as (normalized) generalized mutual information (GMI/NGMI) and asymmetric information (ASI) are regarded as being more reliable. On the other hand, Q budgets are still useful because pre-FEC BER monitoring is established in industry for live system monitoring. The pre-FEC BER is easily estimated from available information of the number of flipped bits in the FEC decoding, which does not require knowledge of the transmitted bits that are unknown in live systems. Therefore, the use of metrics like GMI/NGMI/ASI for performance monitoring has not been possible in live systems. However, in this work we propose a blind soft-performance estimation method. Based on a histogram of log-likelihood-values without the knowledge of the transmitted bits, we show how the ASI can be estimated. We examined the proposed method experimentally for 16 and 64-ary quadrature amplitude modulation (QAM) and probabilistically shaped 16, 64, and 256-QAM in recirculating loop experiments. We see a relative error of 3.6%, which corresponds to around 0.5 dB signal-to-noise ratio difference for binary modulation, in the regime where the ASI is larger than the assumed FEC threshold. For this proposed method, the digital signal processing circuitry requires only a minimal additional function of storing the L-value histograms before the soft-decision FEC decoder.

Submitted 17 February, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

Comments: 9 pages, 9 figures

Journal ref: Proc. European Conference on Optical Communication (ECOC), Dublin, Ireland, Sept. 2019

arXiv:1901.10417 [pdf, other]

Sliced generative models

Authors: Szymon Knop, Marcin Mazur, Jacek Tabor, Igor Podolak, Przemysław Spurek

Abstract: In this paper we discuss a class of AutoEncoder based generative models based on a one-dimensional sliced approach. The idea is based on the reduction of the discrimination between samples to the one-dimensional case. Our experiments show that methods can be divided into two groups. The first consists of methods which are a modification of standard normality tests, while the second is based on classical distances between samples. It turns out that both groups are correct generative models, but the second one gives a slightly faster decrease rate of Fréchet Inception Distance (FID).

Submitted 29 January, 2019; originally announced January 2019.

Comments: 11 pages, 4 figures, conference

arXiv:1805.09235 [pdf, other]

Cramer-Wold AutoEncoder

Authors: Szymon Knop, Jacek Tabor, Przemysław Spurek, Igor Podolak, Marcin Mazur, Stanisław Jastrzębski

Abstract: We propose a new generative model, Cramer-Wold Autoencoder (CWAE). Following WAE, we directly encourage normality of the latent space. Our paper also uses the recent idea from the Sliced WAE (SWAE) model, which uses one-dimensional projections as a method of verifying closeness of two distributions. The crucial new ingredient is the introduction of a new (Cramer-Wold) metric in the space of densities, which replaces the Wasserstein metric used in SWAE. We show that the Cramer-Wold metric between Gaussian mixtures is given by a simple analytic formula, which results in the removal of sampling necessary to estimate the cost function in WAE and SWAE models. As a consequence, while drastically simplifying the optimization procedure, CWAE produces samples of a matching perceptual quality to other SOTA models.

Submitted 2 July, 2019; v1 submitted 23 May, 2018; originally announced May 2018.

Journal ref: Journal of Machine Learning Research, 21, 164, 1-28 2020

arXiv:1312.4113 [pdf, other]

Lessons Learned from Development of a Software Tool to Support Academic Advising

Authors: Nicholas Mattei, Thomas Dodson, Joshua T. Guerin, Judy Goldsmith, Joan M. Mazur

Abstract: We detail some lessons learned while designing and testing a decision-theoretic advising support tool for undergraduates at a large state university. Between 2009 and 2011 we conducted two surveys of over 500 students in multiple majors and colleges. These surveys asked students detailed questions about their preferences concerning course selection, advising, and career paths. We present data from this study which may be helpful for faculty and staff who advise undergraduate students. We find that advising support software tools can augment the student-advisor relationship, particularly in terms of course planning, but cannot and should not replace in-person advising.

Submitted 29 May, 2014; v1 submitted 15 December, 2013; originally announced December 2013.

Comments: 5 Figures, revised version including more figures and cross-referencing

ACM Class: K.3.1

Showing 1–16 of 16 results for author: Mazur, M