Search | arXiv e-print repository

doi 10.1007/s11554-024-01491-z

SelfReDepth: Self-Supervised Real-Time Depth Restoration for Consumer-Grade Sensors

Authors: Alexandre Duarte, Francisco Fernandes, João M. Pereira, Catarina Moreira, Jacinto C. Nascimento, Joaquim Jorge

Abstract: Depth maps produced by consumer-grade sensors suffer from inaccurate measurements and missing data from either system or scene-specific sources. Data-driven denoising algorithms can mitigate such problems. However, they require vast amounts of ground truth depth data. Recent research has tackled this limitation using self-supervised learning techniques, but it requires multiple RGB-D sensors. Moreover, most existing approaches focus on denoising single isolated depth maps or specific subjects of interest, highlighting a need for methods to effectively denoise depth maps in real-time dynamic environments. This paper extends state-of-the-art approaches for depth-denoising commodity depth devices, proposing SelfReDepth, a self-supervised deep learning technique for depth restoration, via denoising and hole-filling by inpainting full-depth maps captured with RGB-D sensors. The algorithm targets depth data in video streams, utilizing multiple sequential depth frames coupled with color data to achieve high-quality depth videos with temporal coherence. Finally, SelfReDepth is designed to be compatible with various RGB-D sensors and usable in real-time scenarios as a pre-processing step before applying other depth-dependent algorithms. Our results demonstrate our approach's real-time performance on real-world datasets. They show that it outperforms state-of-the-art denoising and restoration performance at over 30fps on Commercial Depth Cameras, with potential benefits for augmented and mixed-reality applications.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 13pp, 5 figures, 1 table

Journal ref: Journal of Real-Time Image Processing 2024

arXiv:2406.00370 [pdf, other]

doi 10.1007/978-3-319-22698-9_43

Eery Space: Facilitating Virtual Meetings Through Remote Proxemics

Authors: Maurício Sousa, Daniel Mendes, Alfredo Ferreira, João Madeiras Pereira, Joaquim Jorge

Abstract: Virtual meetings have become increasingly common with modern video-conference and collaborative software. While they allow obvious savings in time and resources, current technologies add unproductive layers of protocol to the flow of communication between participants, rendering the interactions far from seamless. In this work we introduce Remote Proxemics, an extension of proxemics aimed at bringing the syntax of co-located proximal interactions to virtual meetings. We propose Eery Space, a shared virtual locus that results from merging multiple remote areas, where meeting participants are located side-by-side as if they shared the same physical location. Eery Space promotes collaborative content creation and seamless mediation of communication channels based on virtual proximity. Results from user evaluation suggest that our approach is effective at enhancing mutual awareness between participants and sufficient to initiate proximal exchanges regardless of their geolocation, while promoting smooth interactions between local and remote people alike. These results hold even in the absence of visual avatars and other social devices such as eye contact, which are largely the focus of previous approaches.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 19 pages, 7 figures

Journal ref: INTERACT 2015. Lecture Notes in Computer Science(), vol 9298. Springer, Cham

arXiv:2405.02672 [pdf, other]

Effects of Realism and Representation on Self-Embodied Avatars in Immersive Virtual Environments

Authors: Rafael Kuffner dos Anjos, João Madeiras Pereira

Abstract: Virtual Reality (VR) has recently gained traction with many new and ever more affordable devices being released. The increase in popularity of this paradigm of interaction has given birth to new applications and has attracted casual consumers to experience VR. Providing a self-embodied representation (avatar) of users' full bodies inside shared virtual spaces can improve the VR experience and make it more engaging to both new and experienced users. This is especially important in fully immersive systems, where the equipment completely occludes the real world, making self-awareness problematic. Indeed, the feeling of presence of the user is highly influenced by their virtual representations, even though small flaws could lead to uncanny valley side-effects. Following previous research, we would like to assess whether using a third-person perspective could also benefit the VR experience, via an improved spatial awareness of the user's virtual surroundings. In this paper we investigate realism and perspective of self-embodied representation in VR setups in natural tasks, such as walking and avoiding obstacles. We compare both First and Third-Person perspectives with three different levels of realism in avatar representation. These range from a stylized abstract avatar, to a "realistic" mesh-based humanoid representation and a point-cloud rendering. The latter uses data captured via depth-sensors and mapped into a virtual self inside the Virtual Environment. We present a thorough evaluation and comparison of these different representations, describing a series of guidelines for self-embodied VR applications. The effects of the uncanny valley are also discussed in the context of navigation and reflex-based tasks.

Submitted 4 May, 2024; originally announced May 2024.

arXiv:2404.01931 [pdf, other]

Fluid Implicit Particle Simulation for CPU and GPU

Authors: Pedro Centeno, João Madeiras Pereira

Abstract: One of the current challenges in physically-based simulations, and, more specifically, fluid simulations, is to produce visually appealing results at interactive rates, capable of being used in multiple forms of media. In recent times, a lot of effort has been made in this regard with the use of multi-core architectures, as many of the computations involved in the algorithms for these simulations are very well suited for these architectures. Although there is a considerable body of work regarding acceleration techniques in this field, there is still room to further explore and analyze some of them. To investigate this problem, we surveyed the topic of fluid simulations and some of the recent contributions towards this field. Additionally, we implemented two versions of a fluid simulation algorithm, one on the CPU and the other on the GPU using NVIDIA's CUDA framework, with the intent of gaining a better understanding of the effort needed to move these simulations to a multi-core architecture and the performance gains that we get with it.

Submitted 16 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

arXiv:2403.10647 [pdf, other]

Building An Efficient Grid On GPU

Authors: Vasco Costa, João M. Pereira, Joaquim Jorge

Abstract: Grid space partitioning is a technique to speed up queries to graphics databases. We present a parallel grid construction algorithm which can efficiently construct a structured grid on GPU hardware. Our approach is substantially faster than existing uniform grid construction algorithms, especially on non-homogeneous scenes. Indeed, it can populate a grid in real-time (at rates over 25 Hz), for architectural scenes with 10 million triangles.

Submitted 15 March, 2024; originally announced March 2024.
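
Note: As a rough illustration of the grid space partitioning idea mentioned in the abstract above, here is a minimal sequential sketch. It is not the authors' parallel GPU algorithm. It bins points (rather than triangles) into the cells of a uniform grid so that a spatial query only has to inspect nearby cells. The names `build_uniform_grid`, `bounds_min`, and `cell_size` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def build_uniform_grid(points, bounds_min, cell_size):
    """Bin 3D points into uniform grid cells (sequential CPU sketch).

    Returns a dict mapping an integer cell coordinate (i, j, k) to the
    list of indices of the points that fall inside that cell.
    """
    grid = defaultdict(list)
    for idx, point in enumerate(points):
        # Cell coordinate = floor((coordinate - grid origin) / cell size)
        cell = tuple(
            int((coord - lo) // cell_size)
            for coord, lo in zip(point, bounds_min)
        )
        grid[cell].append(idx)
    return grid


# Hypothetical input: three points in a grid with unit-sized cells.
points = [(0.1, 0.1, 0.1), (0.9, 0.2, 0.3), (1.5, 0.0, 2.2)]
grid = build_uniform_grid(points, bounds_min=(0.0, 0.0, 0.0), cell_size=1.0)
```

In this example the first two points share cell (0, 0, 0) and the third lands in cell (1, 0, 2). A GPU construction would have to replace the sequential loop and the per-cell Python lists with parallel-friendly data structures, which this sketch does not attempt.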

arXiv:2312.06538 [pdf]

Ray-Tracing With a Coherent Ray-Space Hierarchy

Authors: Nuno Reis, Vasco Costa, João M. Pereira

Abstract: We present an algorithm for creating an n-level ray-space hierarchy (RSH) of coherent rays that runs on the GPU. Our algorithm uses rasterization to process the primary rays, then uses those results as the inputs for an RSH that processes the secondary rays. The RSH algorithm generates bundles of rays, hashes them according to their attributes, and sorts them. Thus we generate a ray list with adjacent coherent rays. To improve the rendering performance of the RSH over a more classical approach, the scene's geometry is additionally partitioned into a set of bounding spheres, intersected with the RSH, to further decrease the number of false ray bundle-primitive intersection tests. We show that our technique notably reduces the number of ray-primitive intersection tests required to render an image. In particular, it performs up to 50% better in this metric than other algorithms in this class.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2312.05179 [pdf]

Video-Based Rendering Techniques: A Survey

Authors: Rafael Kuffner dos Anjos, João Madeiras Pereira, José Antonio Gaspar

Abstract: Three-dimensional reconstruction of events recorded on images has been a common challenge between computer vision and computer graphics for a long time. Estimating the real position of objects and surfaces using vision as an input is no trivial task and has been approached in several different ways. Although huge progress has been made so far, there are several open issues to which an answer is needed. The use of videos as an input for a rendering process (video-based rendering, VBR) has only recently begun to be explored, and it has added many other challenges and also solutions to the classical image-based rendering (IBR) issue. This article presents the state of the art on video-based rendering and image-based techniques that can be applied in this scenario, evaluating the open issues yet to be solved and indicating where future work should be focused.

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2312.00408 [pdf, other]

Beyond the Screen: Reshaping the Workplace with Virtual and Augmented Reality

Authors: Nuno Verdelho Trindade, Alfredo Ferreira, João Madeiras Pereira

Abstract: Although extended reality technologies have enjoyed an explosion in popularity in recent years, few applications are effectively used outside the entertainment or academic contexts. This work consists of a literature review regarding the effective integration of such technologies in the workplace. It aims to provide an updated view of how they are being used in that context. First, we examine existing research concerning virtual, augmented, and mixed-reality applications. We also analyze which have made their way to the workflows of companies and institutions. Furthermore, we circumscribe the aspects of extended reality technologies that determined this applicability.

Submitted 1 December, 2023; originally announced December 2023.

arXiv:2311.14593 [pdf]

Visualizing Plasma Physics Simulations in Immersive Environments

Authors: Nuno Verdelho Trindade, Oscar Amaro, David Bras, Daniel Goncalves, João Madeiras Pereira, Alfredo Ferreira

Abstract: Plasma physics simulations create complex datasets for which researchers need state-of-the-art visualization tools to gain insights. These datasets are 3D in nature but are commonly depicted and analyzed using 2D idioms displayed on 2D screens. These offer limited understandability in a domain where spatial awareness is key. Virtual reality (VR) can be used as an alternative to conventional means for analyzing such datasets. VR has been known to improve depth and spatial relationship perception, which are fundamental for obtaining insights into 3D plasma morphology. Likewise, VR can potentially increase user engagement by offering more immersive and enjoyable experiences. This study presents PlasmaVR, a proof-of-concept VR tool for visualizing datasets resulting from plasma physics simulations. It enables immersive multidimensional data visualization of particles, scalar, and vector fields and uses a more natural interface. The study includes user evaluation with domain experts where PlasmaVR was employed to assess the possible benefits of immersive environments in plasma physics visualization. The experimental group comprised five plasma physics researchers who were asked to perform tasks designed to represent their typical analysis workflow. To assess the suitability of the prototype for the different types of tasks, a set of objective metrics, such as completion time and number of errors, were measured. The prototype's usability was also evaluated using a standard System Usability Survey questionnaire.

Submitted 24 November, 2023; originally announced November 2023.

arXiv:2311.09327 [pdf]

Survey of Rigid Body Simulation with Extended Position Based Dynamics

Authors: Miguel Luis Nunes Seabra, Daniel Simões Lopes, João Madeiras Pereira

Abstract: Interactive real-time rigid body simulation is a crucial tool in any modern game engine or 3D authoring tool. The quest for fast, robust and accurate simulations is ever evolving. PBRBD (Position Based Rigid Body Dynamics), a recent expansion of PBD (Position Based Dynamics), is a novel approach for this issue. This work aims at providing a comprehensible state-of-the-art comparison between Position Based methods and other real-time simulation methods used for rigid body dynamics, using a custom 3D physics engine for benchmarking and comparing PBRBD, and some variants, with state-of-the-art simulators commonly used in the gaming industry, PhysX and Havok. The results show that PBRBD can produce simulations that are accurate and stable, excelling at maintaining stable energy levels and allowing for a variety of constraints, but that it is limited in its handling of stable stacks of rigid bodies.

Submitted 15 November, 2023; originally announced November 2023.

arXiv:2311.09211 [pdf]

Digitally reproducing the artistic style of XVI century artist Antonio Campelo in Alegoria Prudencia

Authors: Joao Fradinho Oliveira, Joao Madeiras Pereira

Abstract: In this work, the artistic style of the sixteenth century Portuguese artist António Campelo in Alegoria à Prudência is analyzed in order to create a computational tool that allows one to transform any 3D digital sculpture model into an image that resembles the modeled style. From this analysis the problem is divided into two parts: detection and drawing of contour lines and the shading of the scene. Several techniques from Non Photorealistic Rendering (NPR) and from Photorealistic Rendering that can resolve the problem are presented and, based on this study, a possible solution is presented. Each modeled rendering component is then analyzed using image based methods against the proposed artistic style and parameters are adjusted for a closer match. In the final stage a group of people was asked to answer a questionnaire where the similarity between the renderings of different objects and the original style was classified according to their personal opinion. One of our findings is that although the source 3D objects cannot be readily found for a direct comparison, nor can the paper medium with centuries old damage be the same, the comparison of sub-parts of both images of the same topology was still possible, validating our method and discarding other styles from the comparison.

Submitted 24 November, 2023; v1 submitted 15 November, 2023; originally announced November 2023.

Comments: Wrong references corrected

arXiv:2203.02399 [pdf, other]

doi 10.1145/3672553

Benchmarking Instance-Centric Counterfactual Algorithms for XAI: From White Box to Black Box

Authors: Catarina Moreira, Yu-Liang Chou, Chihcheng Hsieh, Chun Ouyang, Joaquim Jorge, João Madeiras Pereira

Abstract: This study investigates the impact of machine learning models on the generation of counterfactual explanations by conducting a benchmark evaluation over three different types of models: a decision tree (fully transparent, interpretable, white-box model), a random forest (semi-interpretable, grey-box model), and a neural network (fully opaque, black-box model). We tested the counterfactual generation process using four algorithms (DiCE, WatcherCF, prototype, and GrowingSpheresCF) in the literature in 25 different datasets. Our findings indicate that: (1) Different machine learning models have little impact on the generation of counterfactual explanations; (2) Counterfactual algorithms based uniquely on proximity loss functions are not actionable and will not provide meaningful explanations; (3) One cannot have meaningful evaluation results without guaranteeing plausibility in the counterfactual generation. Algorithms that do not consider plausibility in their internal mechanisms will lead to biased and unreliable conclusions if evaluated with the current state-of-the-art metrics; (4) A counterfactual inspection analysis is strongly recommended to ensure a robust examination of counterfactual explanations and the potential identification of biases.

Submitted 11 June, 2024; v1 submitted 4 March, 2022; originally announced March 2022.

Journal ref: ACM Computing Surveys, 2024/6/3

arXiv:2203.02044 [pdf, other]

Design requirements to improve laparoscopy via XR

Authors: Ezequiel R. Zorzal, Maurício Sousa, Pedro Belchior, João Madeiras Pereira, Nuno Figueiredo, Joaquim Jorge

Abstract: Laparoscopic surgery has the advantage of avoiding large open incisions and thereby decreasing blood loss, pain, and discomfort to patients. However, it is hampered by restricted workspace, ambiguous communication, and surgeon fatigue caused by non-ergonomic head positioning. We aimed to identify critical problems and suggest design requirements and solutions. We used user and task analysis methods to learn about practices performed in an operating room by observing surgeons in their working environment to understand how they performed tasks and achieved their intended goals. Drawing on observations and analysis from recorded laparoscopic surgeries, we have identified several constraints and design requirements to propose potential solutions to address the issues. Surgeons operate in a dimly lit environment, surrounded by monitors, and communicate through verbal commands and pointing gestures. Therefore, performing user and task analysis allowed us to better understand the existing problems in laparoscopy while identifying several communication constraints and design requirements, which a solution has to follow to address those problems. Our contributions include identifying design requirements for laparoscopy surgery through a user and task analysis. These requirements point to design solutions that improve surgeons' comfort and make the surgical procedure less laborious.

Submitted 3 March, 2022; originally announced March 2022.

Comments: 5 pages, 7 figures, workshop paper

ACM Class: I.3; J.3

arXiv:2203.01643 [pdf, other]

Improving X-ray Diagnostics through Eye-Tracking and XR

Authors: Catarina Moreira, Isabel Blanco Nobre, Sandra Costa Sousa, João Madeiras Pereira, Joaquim Jorge

Abstract: There is a growing need to assist radiologists in performing X-ray readings and diagnoses fast, comfortably, and effectively. As radiologists strive to maximize productivity, it is essential to consider the impact of reading rooms in interpreting complex examinations and ensure that higher volume and reporting speeds do not compromise patient outcomes. Virtual Reality (VR) is a disruptive technology for clinical practice in assessing X-ray images. We argue that conjugating eye-tracking with VR devices and Machine Learning may overcome obstacles posed by inadequate ergonomic postures and poor room conditions that often cause erroneous diagnostics when professionals examine digital images.

Submitted 3 March, 2022; originally announced March 2022.

Journal ref: 1st International Workshop on XR for Healthcare and Wellbeing, 2022

arXiv:2202.06930 [pdf, other]

Tensor Moments of Gaussian Mixture Models: Theory and Applications

Authors: João M. Pereira, Joe Kileel, Tamara G. Kolda

Abstract: Gaussian mixture models (GMMs) are fundamental tools in statistical and data sciences. We study the moments of multivariate Gaussians and GMMs. The $d$-th moment of an $n$-dimensional random variable is a symmetric $d$-way tensor of size $n^d$, so working with moments naively is assumed to be prohibitively expensive for $d>2$ and larger values of $n$. In this work, we develop theory and numerical methods for implicit computations with moment tensors of GMMs, reducing the computational and storage costs to $\mathcal{O}(n^2)$ and $\mathcal{O}(n^3)$, respectively, for general covariance matrices, and to $\mathcal{O}(n)$ and $\mathcal{O}(n)$, respectively, for diagonal ones. We derive concise analytic expressions for the moments in terms of symmetrized tensor products, relying on the correspondence between symmetric tensors and homogeneous polynomials, and combinatorial identities involving Bell polynomials. The primary application of this theory is to estimating GMM parameters (means and covariances) from a set of observations, when formulated as a moment-matching optimization problem. If there is a known and common covariance matrix, we also show it is possible to debias the data observations, in which case the problem of estimating the unknown means reduces to symmetric CP tensor decomposition. Numerical results validate and illustrate the numerical efficiency of our approaches. This work potentially opens the door to the competitiveness of the method of moments as compared to expectation maximization methods for parameter estimation of GMMs.

Submitted 21 March, 2022; v1 submitted 14 February, 2022; originally announced February 2022.
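
Note: To make concrete why the abstract above calls naive moment computations expensive, here is a small counting sketch. It is an illustration only and not part of the paper's method. It compares the $n^d$ entries of a full $d$-way tensor with the $\binom{n+d-1}{d}$ distinct entries that a symmetric tensor of the same shape can have. The function name `moment_tensor_sizes` is hypothetical.

```python
from math import comb


def moment_tensor_sizes(n, d):
    """Return (full, distinct) entry counts for a symmetric d-way tensor.

    full     = n**d entries when the tensor is stored naively
    distinct = C(n + d - 1, d) entries that are not forced equal by symmetry
    """
    full = n ** d
    distinct = comb(n + d - 1, d)
    return full, distinct


# Hypothetical sizes, chosen only to show how quickly the counts grow.
for n, d in [(3, 2), (10, 4), (100, 4)]:
    full, distinct = moment_tensor_sizes(n, d)
    print(f"n={n}, d={d}: full={full}, distinct={distinct}")
```

Even after exploiting symmetry, the number of distinct entries still grows like $n^d / d!$ for fixed $d$, which is why methods that never form the tensor explicitly are attractive.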

arXiv:2110.15821 [pdf, other]

Landscape analysis of an improved power method for tensor decomposition

Authors: Joe Kileel, Timo Klock, João M. Pereira

Abstract: In this work, we consider the optimization formulation for symmetric tensor decomposition recently introduced in the Subspace Power Method (SPM) of Kileel and Pereira. Unlike popular alternative functionals for tensor decomposition, the SPM objective function has the desirable properties that its maximal value is known in advance, and its global optima are exactly the rank-1 components of the tensor when the input is sufficiently low-rank. We analyze the non-convex optimization landscape associated with the SPM objective. Our analysis accounts for working with noisy tensors. We derive quantitative bounds such that any second-order critical point with SPM objective value exceeding the bound must equal a tensor component in the noiseless case, and must approximate a tensor component in the noisy case. For decomposing tensors of size $D^{\times m}$, we obtain a near-global guarantee up to rank $\widetilde{o}(D^{\lfloor m/2 \rfloor})$ under a random tensor model, and a global guarantee up to rank $\mathcal{O}(D)$ assuming deterministic frame conditions. This implies that SPM with suitable initialization is a provable, efficient, robust algorithm for low-rank symmetric tensor decomposition. We conclude with numerics that show a practical preferability for using the SPM functional over a more established counterpart.

Submitted 29 October, 2021; originally announced October 2021.

Comments: 45 pages, 4 figures, Matlab code included as ancillary files, to appear in NeurIPS 2021

arXiv:2102.09042 [pdf, other]

Modeling Extremes with d-max-decreasing Neural Networks

Authors: Ali Hasan, Khalil Elkhalil, Yuting Ng, Joao M. Pereira, Sina Farsiu, Jose H. Blanchet, Vahid Tarokh

Abstract: We propose a novel neural network architecture that enables non-parametric calibration and generation of multivariate extreme value distributions (MEVs). MEVs arise from Extreme Value Theory (EVT) as the necessary class of models when extrapolating a distributional fit over large spatial and temporal scales based on data observed in intermediate scales. In turn, EVT dictates that $d$-max-decreasing, a stronger form of convexity, is an essential shape constraint in the characterization of MEVs. As far as we know, our proposed architecture provides the first class of non-parametric estimators for MEVs that preserve these essential shape constraints. We show that our architecture approximates the dependence structure encoded by MEVs at parametric rate. Moreover, we present a new method for sampling high-dimensional MEVs using a generative model. We demonstrate our methodology on a wide range of experimental settings, ranging from environmental sciences to financial mathematics and verify that the structural properties of MEVs are retained compared to existing methods.

Submitted 1 March, 2022; v1 submitted 17 February, 2021; originally announced February 2021.

arXiv:2007.06075 [pdf, other]

Identifying Latent Stochastic Differential Equations

Authors: Ali Hasan, João M. Pereira, Sina Farsiu, Vahid Tarokh

Abstract: We present a method for learning latent stochastic differential equations (SDEs) from high-dimensional time series data. Given a high-dimensional time series generated from a lower dimensional latent unknown Itô process, the proposed method learns the mapping from ambient to latent space, and the underlying SDE coefficients, through a self-supervised learning approach. Using the framework of variational autoencoders, we consider a conditional generative model for the data based on the Euler-Maruyama approximation of SDE solutions. Furthermore, we use recent results on identifiability of latent variable models to show that the proposed model can recover not only the underlying SDE coefficients, but also the original latent variables, up to an isometry, in the limit of infinite data. We validate the method through several simulated video processing tasks, where the underlying SDE is known, and through real world datasets.

Submitted 26 November, 2021; v1 submitted 12 July, 2020; originally announced July 2020.

Comments: 20 pages, 8 figures, to be published in IEEE Transactions on Signal Processing

arXiv:2001.00564 [pdf, other]

Robust Marine Buoy Placement for Ship Detection Using Dropout K-Means

Authors: Yuting Ng, João M. Pereira, Denis Garagic, Vahid Tarokh

Abstract: Marine buoys aid in the battle against Illegal, Unreported and Unregulated (IUU) fishing by detecting fishing vessels in their vicinity. Marine buoys, however, may be disrupted by natural causes and buoy vandalism. In this paper, we formulate marine buoy placement as a clustering problem, and propose dropout k-means and dropout k-median to improve placement robustness to buoy disruption. We simulated the passage of ships in the Gabonese waters near West Africa using historical Automatic Identification System (AIS) data, then compared the ship detection probability of dropout k-means to classic k-means and dropout k-median to classic k-median. With 5 buoys, the buoy arrangements computed by classic k-means, dropout k-means, classic k-median and dropout k-median have ship detection probabilities of 38%, 45%, 48% and 52%, respectively.

Submitted 20 February, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

Comments: ICASSP 2020

arXiv:1910.10262 [pdf, other]

Learning Partial Differential Equations from Data Using Neural Networks

Authors: Ali Hasan, João M. Pereira, Robert Ravier, Sina Farsiu, Vahid Tarokh

Abstract: We develop a framework for estimating unknown partial differential equations from noisy data, using a deep learning approach. Given noisy samples of a solution to an unknown PDE, our method interpolates the samples using a neural network, and extracts the PDE by equating derivatives of the neural network approximation. Our method applies to PDEs which are linear combinations of user-defined dictionary functions, and generalizes previous methods that only consider parabolic PDEs. We introduce a regularization scheme that prevents the function approximation from overfitting the data and forces it to be a solution of the underlying PDE. We validate the model on simulated data generated by the known PDEs and added Gaussian noise, and we study our method under different levels of noise. We also compare the error of our method with a Cramer-Rao lower bound for an ordinary differential equation. Our results indicate that our method outperforms other methods in estimating PDEs, especially in the low signal-to-noise regime.

Submitted 22 October, 2019; originally announced October 2019.

arXiv:1801.04366 [pdf, other]

Estimation in the group action channel

Authors: Emmanuel Abbe, João M. Pereira, Amit Singer

Abstract: We analyze the problem of estimating a signal from multiple measurements on a $\mbox{group action channel}$ that linearly transforms a signal by a random group action followed by a fixed projection and additive Gaussian noise. This channel is motivated by applications such as multi-reference alignment and cryo-electron microscopy. We focus on the large noise regime prevalent in these applications. We give a lower bound on the mean square error (MSE) of any asymptotically unbiased estimator of the signal's orbit in terms of the signal's moment tensors, which implies that the MSE is bounded away from 0 when $N/\sigma^{2d}$ is bounded from above, where $N$ is the number of observations, $\sigma$ is the noise standard deviation, and $d$ is the so-called $\mbox{moment order cutoff}$. In contrast, the maximum likelihood estimator is shown to be consistent if $N/\sigma^{2d}$ diverges.

Submitted 12 January, 2018; originally announced January 2018.

Comments: 5 pages, conference

MSC Class: 94A15; 62B10

Showing 1–21 of 21 results for author: Pereira, J M