Search | arXiv e-print repository

Decentralized Peer Review in Open Science: A Mechanism Proposal

Abstract: Peer review is a laborious, yet essential, part of academic publishing with crucial impact on the scientific endeavor. The current lack of incentives and transparency harms the credibility of this process. Researchers are neither rewarded for superior nor penalized for bad reviews. Additionally, confidential reports cause a loss of insights and make the review process vulnerable to scientific misconduct. We propose a community-owned and -governed system that 1) remunerates reviewers for their efforts, 2) publishes the (anonymized) reports for scrutiny by the community, 3) tracks reputation of reviewers and 4) provides digital certificates. Automated by transparent smart-contract blockchain technology, the system aims to increase quality and speed of peer review while lowering the chance and impact of erroneous judgements.

Submitted 28 April, 2024; originally announced April 2024.

Comments: 14 pages, 1 figure

ACM Class: K.3.m; K.4.2; K.4.3

arXiv:2401.14868 [pdf, other]

Particle-MALA and Particle-mGRAD: Gradient-based MCMC methods for high-dimensional state-space models

Authors: Adrien Corenflos, Axel Finke

Abstract: State-of-the-art methods for Bayesian inference in state-space models are (a) conditional sequential Monte Carlo (CSMC) algorithms; (b) sophisticated 'classical' MCMC algorithms like MALA, or mGRAD from Titsias and Papaspiliopoulos (2018, arXiv:1610.09641v3 [stat.ML]). The former propose $N$ particles at each time step to exploit the model's 'decorrelation-over-time' property and thus scale favourably with the time horizon, $T$, but break down if the dimension of the latent states, $D$, is large. The latter leverage gradient-/prior-informed local proposals to scale favourably with $D$ but exhibit sub-optimal scalability with $T$ due to a lack of model-structure exploitation. We introduce methods which combine the strengths of both approaches. The first, Particle-MALA, spreads $N$ particles locally around the current state using gradient information, thus extending MALA to $T > 1$ time steps and $N > 1$ proposals. The second, Particle-mGRAD, additionally incorporates (conditionally) Gaussian prior dynamics into the proposal, thus extending the mGRAD algorithm to $T > 1$ time steps and $N > 1$ proposals. We prove that Particle-mGRAD interpolates between CSMC and Particle-MALA, resolving the 'tuning problem' of choosing between CSMC (superior for highly informative prior dynamics) and Particle-MALA (superior for weakly informative prior dynamics). We similarly extend other 'classical' MCMC approaches like auxiliary MALA, aGRAD, and preconditioned Crank-Nicolson-Langevin (PCNL) to $T > 1$ time steps and $N > 1$ proposals. In experiments, for both highly and weakly informative prior dynamics, our methods substantially improve upon both CSMC and sophisticated 'classical' MCMC approaches.

Submitted 26 January, 2024; originally announced January 2024.

Comments: 29 pages + 31 pages appendix. 6 figures and tables (+ 7 in appendix). Code available at https://github.com/AdrienCorenflos/particle_mala/

arXiv:2308.08378 [pdf, other]

Advancing continual lifelong learning in neural information retrieval: definition, dataset, framework, and empirical evaluation

Authors: Jingrui Hou, Georgina Cosma, Axel Finke

Abstract: Continual learning refers to the capability of a machine learning model to learn and adapt to new information, without compromising its performance on previously learned tasks. Although several studies have investigated continual learning methods for information retrieval tasks, a well-defined task formulation is still lacking, and it is unclear how typical learning strategies perform in this context. To address this challenge, a systematic task formulation of continual neural information retrieval is presented, along with a multiple-topic dataset that simulates continuous information retrieval. A comprehensive continual neural information retrieval framework consisting of typical retrieval models and continual learning strategies is then proposed. Empirical evaluations illustrate that the proposed framework can successfully prevent catastrophic forgetting in neural information retrieval and enhance performance on previously learned tasks. The results indicate that embedding-based retrieval models experience a decline in their continual learning performance as the topic shift distance and dataset volume of new tasks increase. In contrast, pretraining-based models do not show any such correlation. Adopting suitable learning strategies can mitigate the effects of topic shift and data augmentation.

Submitted 19 June, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

Comments: Submitted to Information Sciences

arXiv:2307.14244 [pdf, other]

Neural-based Cross-modal Search and Retrieval of Artwork

Authors: Yan Gong, Georgina Cosma, Axel Finke

Abstract: Creating an intelligent search and retrieval system for artwork images, particularly paintings, is crucial for documenting cultural heritage, fostering wider public engagement, and advancing artistic analysis and interpretation. Visual-Semantic Embedding (VSE) networks are deep learning models used for information retrieval, which learn joint representations of textual and visual data, enabling 1) cross-modal search and retrieval tasks, such as image-to-text and text-to-image retrieval; and 2) relation-focused retrieval to capture entity relationships and provide more contextually relevant search results. Although VSE networks have played a significant role in cross-modal information retrieval, their application to painting datasets, such as ArtUK, remains unexplored. This paper introduces BoonArt, a VSE-based cross-modal search engine that allows users to search for images using textual queries, and to obtain textual descriptions along with the corresponding images when using image queries. The performance of BoonArt was evaluated using the ArtUK dataset. Experimental evaluations revealed that BoonArt achieved 97% Recall@10 for image-to-text retrieval, and 97.4% Recall@10 for text-to-image retrieval. By bridging the gap between textual and visual modalities, BoonArt provides a much-improved search performance compared to traditional search engines, such as the one provided by the ArtUK website. BoonArt can be utilised to work with other artwork datasets.

Submitted 26 July, 2023; originally announced July 2023.

arXiv:2307.11643 [pdf, other]

Morphological Image Analysis and Feature Extraction for Reasoning with AI-based Defect Detection and Classification Models

Authors: Jiajun Zhang, Georgina Cosma, Sarah Bugby, Axel Finke, Jason Watkins

Abstract: As the use of artificial intelligence (AI) models becomes more prevalent in industries such as engineering and manufacturing, it is essential that these models provide transparent reasoning behind their predictions. This paper proposes the AI-Reasoner, which extracts the morphological characteristics of defects (DefChars) from images and utilises decision trees to reason with the DefChar values. Thereafter, the AI-Reasoner exports visualisations (i.e. charts) and textual explanations to provide insights into outputs made by mask-based defect detection and classification models. It also provides effective mitigation strategies to enhance data pre-processing and overall model performance. The AI-Reasoner was tested on explaining the outputs of an IE Mask R-CNN model using a set of 366 images containing defects. The results demonstrated its effectiveness in explaining the IE Mask R-CNN model's predictions. Overall, the proposed AI-Reasoner provides a solution for improving the performance of AI models in industrial applications that require defect analysis.

Submitted 10 October, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

Comments: 8 pages, 3 figures, 5 tables; accepted in 2023 IEEE symposium series on computational intelligence (SSCI)

arXiv:2307.06871 [pdf, other]

Identifying Early Help Referrals For Local Authorities With Machine Learning And Bias Analysis

Authors: Eufrásio de A. Lima Neto, Jonathan Bailiss, Axel Finke, Jo Miller, Georgina Cosma

Abstract: Local authorities in England, such as Leicestershire County Council (LCC), provide Early Help services that can be offered at any point in a young person's life when they experience difficulties that cannot be supported by universal services alone, such as schools. This paper investigates the utilisation of machine learning (ML) to assist experts in identifying families that may need to be referred for Early Help assessment and support. LCC provided an anonymised dataset comprising 14360 records of young people under the age of 18. The dataset was pre-processed, machine learning models were built, and experiments were conducted to validate and test the performance of the models. Bias mitigation techniques were applied to improve the fairness of these models. During testing, while the models demonstrated the capability to identify young people requiring intervention or early help, they also produced a significant number of false positives, especially when constructed with imbalanced data, incorrectly identifying individuals who most likely did not need an Early Help referral. This paper empirically explores the suitability of data-driven ML models for identifying young people who may require Early Help services and discusses their appropriateness and limitations for this task.

Submitted 13 July, 2023; originally announced July 2023.

arXiv:2302.06350 [pdf, other]

VITR: Augmenting Vision Transformers with Relation-Focused Learning for Cross-Modal Information Retrieval

Authors: Yan Gong, Georgina Cosma, Axel Finke

Abstract: The relations expressed in user queries are vital for cross-modal information retrieval. Relation-focused cross-modal retrieval aims to retrieve information that corresponds to these relations, enabling effective retrieval across different modalities. Pre-trained networks, such as Contrastive Language-Image Pre-training (CLIP), have gained significant attention and acclaim for their exceptional performance in various cross-modal learning tasks. However, the Vision Transformer (ViT) used in these networks is limited in its ability to focus on image region relations. Specifically, ViT is trained to match images with relevant descriptions at the global level, without considering the alignment between image regions and descriptions. This paper introduces VITR, a novel network that enhances ViT by extracting and reasoning about image region relations based on a local encoder. VITR is comprised of two key components. Firstly, it extends the capabilities of ViT-based cross-modal networks by enabling them to extract and reason with region relations present in images. Secondly, VITR incorporates a fusion module that combines the reasoned results with global knowledge to predict similarity scores between images and descriptions. The proposed VITR network was evaluated through experiments on the tasks of relation-focused cross-modal information retrieval. The results derived from the analysis of the RefCOCOg, CLEVR, and Flickr30K datasets demonstrated that the proposed VITR network consistently outperforms state-of-the-art networks in image-to-text and text-to-image retrieval.

Submitted 27 July, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

arXiv:2211.07311 [pdf, other]

A Bayesian framework for genome-wide inference of DNA methylation levels

Authors: Marcel Hirt, Axel Finke, Alexandros Beskos, Petros Dellaportas, Stephan Beck, Ismail Moghul, Simone Ecker

Abstract: DNA methylation is an important epigenetic mark that has been studied extensively for its regulatory role in biological processes and diseases. WGBS allows for genome-wide measurements of DNA methylation up to single-base resolutions, yet poses challenges in identifying significantly different methylation patterns across distinct biological conditions. We propose a novel methylome change-point model which describes the joint dynamics of methylation regimes of a case and a control group and benefits from taking into account the information of neighbouring methylation sites among all available samples. We also devise particle filtering and smoothing algorithms to perform efficient inference of the latent methylation patterns. We illustrate that our approach can detect and test for very flexible differential methylation signatures with high power while controlling Type-I error measures.

Submitted 14 November, 2022; originally announced November 2022.

arXiv:2203.09238 [pdf, other]

Cosmology and modified gravity with dark sirens from GWTC-3

Authors: Michele Mancarella, Andreas Finke, Stefano Foffa, Edwin Genoud-Prachex, Francesco Iacovelli, Michele Maggiore

Abstract: We present the latest measurements of the Hubble parameter and of the parameter $Ξ_0$ describing modified gravitational wave propagation, obtained from the third gravitational wave transient catalog, GWTC-3, using the correlation with galaxy catalogs and information from the source-frame mass distribution of binary black holes. The latter leads to the tightest bound on $Ξ_0$ so far, i.e. $Ξ_0 = 1.2^{+0.7}_{-0.7}$ with a flat prior on $Ξ_0$, and $Ξ_0 = 1.0^{+0.4}_{-0.8}$ with a prior uniform in $\log Ξ_0$ (Max posterior and $68\%$ HDI). The measurement of $H_0$ is dominated by the single bright siren GW170817, resulting in $H_0=67^{+9}_{-6} \, \rm km \, s^{-1} \, Mpc^{-1}$ when combined with the galaxy catalog.

Submitted 17 March, 2022; originally announced March 2022.

Comments: Contribution to the Gravitation session of the 56th Rencontres de Moriond 2022

arXiv:2203.09237 [pdf, other]

Modified gravitational wave propagation: information from strongly lensed binaries and the BNS mass function

Authors: Francesco Iacovelli, Andreas Finke, Stefano Foffa, Michele Maggiore, Michele Mancarella

Abstract: Modified gravitational wave propagation is a smoking gun of modifications of gravity at cosmological scales, and can be the most promising observable for testing such theories. The observation of gravitational waves (GW) in recent years has allowed us to start probing this effect, and here we briefly review two promising ways of testing it. We will show that, already with the current network of detectors, it is possible to reach an interesting accuracy in the estimation of the $Ξ_0$ parameter (that characterizes modified gravitational wave propagation, with $Ξ_{0, {\rm GR}} = 1$) and with next generation facilities, such as the Einstein Telescope, we can get a sub-percent measurement.

Submitted 17 March, 2022; originally announced March 2022.

Comments: 4 pages, 2 figures, contribution to the 2022 Gravitation session of the 56th Rencontres de Moriond

arXiv:2108.10277 [pdf, other]

Conditional sequential Monte Carlo in high dimensions

Authors: Axel Finke, Alexandre H. Thiery

Abstract: The iterated conditional sequential Monte Carlo (i-CSMC) algorithm from Andrieu, Doucet and Holenstein (2010) is an MCMC approach for efficiently sampling from the joint posterior distribution of the $T$ latent states in challenging time-series models, e.g. in non-linear or non-Gaussian state-space models. It is also the main ingredient in particle Gibbs samplers which infer unknown model parameters alongside the latent states. In this work, we first prove that the i-CSMC algorithm suffers from a curse of dimension in the dimension of the states, $D$: it breaks down unless the number of samples ("particles"), $N$, proposed by the algorithm grows exponentially with $D$. Then, we present a novel "local" version of the algorithm which proposes particles using Gaussian random-walk moves that are suitably scaled with $D$. We prove that this iterated random-walk conditional sequential Monte Carlo (i-RW-CSMC) algorithm avoids the curse of dimension: for arbitrary $N$, its acceptance rates and expected squared jumping distance converge to non-trivial limits as $D \to \infty$. If $T = N = 1$, our proposed algorithm reduces to a Metropolis--Hastings or Barker's algorithm with Gaussian random-walk moves and we recover the well known scaling limits for such algorithms.

Submitted 23 August, 2021; originally announced August 2021.

Comments: 47 pages, 5 figures

arXiv:2108.04065 [pdf, other]

doi 10.1016/j.dark.2022.100994

Modified gravitational wave propagation and the binary neutron star mass function

Authors: Andreas Finke, Stefano Foffa, Francesco Iacovelli, Michele Maggiore, Michele Mancarella

Abstract: Modified gravitational wave (GW) propagation is a generic phenomenon in modified gravity. It affects the reconstruction of the redshift of coalescing binaries from the luminosity distance measured by GW detectors, and therefore the reconstruction of the actual masses of the component compact stars from the observed (`detector-frame') masses. We show that, thanks to the narrowness of the mass distribution of binary neutron stars, this effect can provide a clear signature of modified gravity, particularly for the redshifts explored by third generation GW detectors such as Einstein Telescope and Cosmic Explorer.

Submitted 20 February, 2022; v1 submitted 9 August, 2021; originally announced August 2021.

Comments: v2: expanded version, to appear in Physics of the Dark Universe

arXiv:2107.05046 [pdf, other]

doi 10.1103/PhysRevD.104.084057

Probing modified gravitational wave propagation with strongly lensed coalescing binaries

Authors: Andreas Finke, Stefano Foffa, Francesco Iacovelli, Michele Maggiore, Michele Mancarella

Abstract: It has been recently shown that quadruply lensed gravitational-wave (GW) events due to coalescing binaries can be localized to one or just a few galaxies, even in the absence of an electromagnetic counterpart. We discuss how this can be used to extract information on modified GW propagation, which is a crucial signature of modifications of gravity at cosmological scales. We show that, using quadruply lensed systems, it is possible to constrain the parameter $Ξ_0$ that characterizes modified GW propagation, without the need of imposing a prior on $H_0$. A LIGO/Virgo/Kagra network at target sensitivity might already get a significant measurement of $Ξ_0$, while a third generation GW detector such as the Einstein Telescope could reach a very interesting accuracy.

Submitted 11 July, 2021; originally announced July 2021.

Comments: 11 pages, 5 figures

arXiv:2101.12660 [pdf, other]

doi 10.1088/1475-7516/2021/08/026

Cosmology with LIGO/Virgo dark sirens: Hubble parameter and modified gravitational wave propagation

Authors: Andreas Finke, Stefano Foffa, Francesco Iacovelli, Michele Maggiore, Michele Mancarella

Abstract: We present a detailed study of the methodology for correlating `dark sirens' (compact binary coalescences without electromagnetic counterpart) with galaxy catalogs. We propose several improvements on the current state of the art, and we apply them to the GWTC-2 catalog of LIGO/Virgo gravitational wave (GW) detections, and the GLADE galaxy catalog, performing a detailed study of several sources of systematic errors that, with the expected increase in statistics, will eventually become the dominant limitation. We provide a measurement of $H_0$ from dark sirens alone, finding as the best result $H_0=67.3^{+27.6}_{-17.9}\,\,{\rm km}\, {\rm s}^{-1}\, {\rm Mpc}^{-1}$ ($68\%$ c.l.) which is, currently, the most stringent constraint obtained using only dark sirens. Combining dark sirens with the counterpart for GW170817 we find $H_0= 72.2^{+13.9}_{-7.5} \,{\rm km}\, {\rm s}^{-1}\, {\rm Mpc}^{-1}$. We also study modified GW propagation, which is a smoking gun of dark energy and modifications of gravity at cosmological scales, and we show that current observations of dark sirens already start to provide interesting limits. From dark sirens alone, our best result for the parameter $Ξ_0$ that measures deviations from GR (with $Ξ_0=1$ in GR) is $Ξ_0=2.1^{+3.2}_{-1.2}$. We finally discuss limits on modified GW propagation under the tentative identification of the flare ZTF19abanrhr as the electromagnetic counterpart of the binary black hole coalescence GW190521, in which case our most stringent result is $Ξ_0=1.8^{+0.9}_{-0.6}$. We release the code $\tt{DarkSirensStat}$, which is publicly available under an open source license at \url{https://github.com/CosmoStatGW/DarkSirensStat}.

Submitted 16 July, 2021; v1 submitted 29 January, 2021; originally announced January 2021.

Comments: v2: several significant technical improvements, results changed. v3: minor changes, Fig 9 added. The version to appear in JCAP

arXiv:2001.07619 [pdf, other]

doi 10.1088/1475-7516/2020/04/010

Gravity in the infrared and effective nonlocal models

Authors: Enis Belgacem, Yves Dirian, Andreas Finke, Stefano Foffa, Michele Maggiore

Abstract: We provide a systematic and updated discussion of a research line carried out by our group over the last few years, in which gravity is modified at cosmological distances by the introduction of nonlocal terms, assumed to emerge at an effective level from the infrared behavior of the quantum theory. The requirement of producing a viable cosmology turns out to be very stringent and basically selects a unique model, in which the nonlocal term describes an effective mass for the conformal mode. We discuss how such a specific structure could emerge from a fundamental local theory of gravity, and we perform a detailed comparison of this model with the most recent cosmological datasets, confirming that it fits current data at the same level as $Λ$CDM. Most notably, the model has striking predictions in the sector of tensor perturbations, leading to a very large effect in the propagation of gravitational waves (GWs) over cosmological distances. At the redshifts relevant for the next generation of GW detectors such as Einstein Telescope, Cosmic Explorer and LISA, this leads to deviations from GR that could be as large as $80\%$, and could be verified with the detection of just a single coalescing binary with electromagnetic counterpart. This would also have potentially important consequences for the search of the counterpart since, for a given luminosity distance to the source, as inferred through the GW signal, the actual source redshift could be significantly different from that predicted by $Λ$CDM. At the redshifts relevant for advanced LIGO/Virgo/Kagra the effect is smaller, but still potentially observable over a few years of runs at target sensitivity.

Submitted 21 January, 2020; originally announced January 2020.

Comments: 84 pages, 22 figures

arXiv:1907.10477 [pdf, other]

On importance-weighted autoencoders

Authors: Axel Finke, Alexandre H. Thiery

Abstract: The importance weighted autoencoder (IWAE) (Burda et al., 2016) is a popular variational-inference method which achieves a tighter evidence bound (and hence a lower bias) than standard variational autoencoders by optimising a multi-sample objective, i.e. an objective that is expressible as an integral over $K > 1$ Monte Carlo samples. Unfortunately, IWAE crucially relies on the availability of reparametrisations and even if these exist, the multi-sample objective leads to inference-network gradients which break down as $K$ is increased (Rainforth et al., 2018). This breakdown can only be circumvented by removing high-variance score-function terms, either by heuristically ignoring them (which yields the 'sticking-the-landing' IWAE (IWAE-STL) gradient from Roeder et al. (2017)) or through an identity from Tucker et al. (2019) (which yields the 'doubly-reparametrised' IWAE (IWAE-DREG) gradient). In this work, we argue that directly optimising the proposal distribution in importance sampling as in the reweighted wake-sleep (RWS) algorithm from Bornschein & Bengio (2015) is preferable to optimising IWAE-type multi-sample objectives. To formalise this argument, we introduce an adaptive-importance sampling framework termed adaptive importance sampling for learning (AISLE) which slightly generalises the RWS algorithm. We then show that AISLE admits IWAE-STL and IWAE-DREG (i.e. the IWAE-gradients which avoid breakdown) as special cases.

Submitted 19 September, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

arXiv:1907.02047 [pdf, other]

doi 10.1088/1475-7516/2019/11/022

Nonlocal gravity and gravitational-wave observations

Authors: Enis Belgacem, Yves Dirian, Andreas Finke, Stefano Foffa, Michele Maggiore

Abstract: We discuss a modified gravity model which fits cosmological observations at a level statistically indistinguishable from $Λ$CDM and at the same time predicts very large deviations from General Relativity (GR) in the propagation of gravitational waves (GWs) across cosmological distances. The model is a variant of the RT nonlocal model proposed and developed by our group, with initial conditions set during inflation, and predicts a GW luminosity distance that, at the redshifts accessible to LISA or to a third-generation GW detector such as the Einstein Telescope (ET), can differ from that in GR by as much as $60\%$. An effect of this size could be detected with just a single standard siren with counterpart by LISA or ET. At the redshifts accessible to a LIGO/Virgo/Kagra network at target sensitivity the effect is smaller but still potentially detectable. Indeed, for the recently announced LIGO/Virgo NS-BH candidate S190814bv, the RT model predicts that, given the measured GW luminosity distance, the actual luminosity distance, and the redshift of an electromagnetic counterpart, would be smaller by as much as $7\%$ with respect to the value inferred from $Λ$CDM.

Submitted 18 August, 2019; v1 submitted 3 July, 2019; originally announced July 2019.

Comments: v2: added discussion of the effect of modified GW propagation on the recent LIGO/Virgo NS-BH candidate event S190814bv

arXiv:1902.09769 [pdf, other]

doi 10.1093/mnras/stz3145

The perturbed FLRW metric on all scales: Newtonian limit and top-hat collapse

Authors: Andreas Finke

Abstract: The applicability of a linearized perturbed FLRW metric to the late, lumpy universe has been subject to debate. We consider in an elementary way the Newtonian limit of the Einstein equations with this ansatz for the case of structure formation in late-time cosmology, on small and large scales, and argue that linearizing the Einstein tensor produces only a small error down to arbitrarily small, decoupled scales (e.g. Solar system scales). On subhorizon patches, the metric scale factor becomes a coordinate choice equivalent to choosing the spatial curvature, and not a sign that the FLRW metric cannot perturbatively accommodate very different local physical expansion rates of matter; we distinguish these concepts, and show that they merge on large scales for the Newtonian limit to be globally valid. Furthermore, on subhorizon scales, a perturbed FLRW metric ansatz does not already imply assumptions on isotropy, and effects beyond an FLRW background, including those potentially caused by non-linearities of general relativity (GR), may be encoded into non-trivial boundary conditions. The corresponding cosmologies have already been developed in a Newtonian setting by Heckmann and Schücking and none of these boundary conditions can explain the accelerated expansion of the universe.
Our analysis of the field equations is confirmed on the level of solutions by an example of pedagogical value, comparing a collapsing top-hat overdensity (embedded into a cosmological background) treated in such perturbative manner to the corresponding exact solution of GR, where we find good agreement even in the regimes of strong density contrast.

Submitted 17 December, 2019; v1 submitted 26 February, 2019; originally announced February 2019.

Comments: couple of useful clarifications; agrees with published version

Journal ref: Monthly Notices of the Royal Astronomical Society, Volume 491, Issue 2, January 2020

arXiv:1812.11181 [pdf, other]

doi 10.1088/1475-7516/2019/02/035

Testing nonlocal gravity with Lunar Laser Ranging

Authors: Enis Belgacem, Andreas Finke, Antonia Frassino, Michele Maggiore

Abstract: We study the impact of the limit on $|\dot{G}|/G$ from Lunar Laser Ranging on "nonlocal gravity", i.e. on models of the quantum effective action of gravity that include nonlocal terms relevant in the infrared, such as the "RR" and "RT" models proposed by our group, and the Deser-Woodard (DW) model. We elaborate on the analysis of Barreira et al. [1] and we confirm their findings that (under plausible assumptions such as the absence of strong backreaction from non-linear structures), the RR model is ruled out. We also show that the mechanism of "perfect screening for free" suggested for the DW model actually does not work and the DW model is also ruled out. In contrast, the RT model passes all phenomenological consistency tests and is still a viable candidate.

Submitted 28 December, 2018; originally announced December 2018.

Comments: 46 pages, 4 figures

arXiv:1807.01057 [pdf, other]

doi 10.1017/apr.2020.9

Limit theorems for sequential MCMC methods

Authors: Axel Finke, Arnaud Doucet, Adam M. Johansen

Abstract: Sequential Monte Carlo (SMC) methods, also known as particle filters, constitute a class of algorithms used to approximate expectations with respect to a sequence of probability distributions as well as the normalising constants of those distributions. Sequential MCMC methods are an alternative class of techniques addressing similar problems in which particles are sampled according to an MCMC kernel rather than conditionally independently at each time step. These methods were introduced over twenty years ago by Berzuini et al. (1997). Recently, there has been a renewed interest in such algorithms as they demonstrate an empirical performance superior to that of SMC methods in some applications. We establish a strong law of large numbers and a central limit theorem for sequential MCMC methods and provide conditions under which errors can be controlled uniformly in time. In the context of state-space models, we provide conditions under which sequential MCMC methods can indeed outperform standard SMC methods in terms of asymptotic variance of the corresponding Monte Carlo estimators.

Submitted 25 July, 2018; v1 submitted 3 July, 2018; originally announced July 2018.

Journal ref: Adv. Appl. Probab. 52 (2020) 377-403

arXiv:1708.04221 [pdf, other]

Efficient sequential Monte Carlo algorithms for integrated population models

Authors: Axel Finke, Ruth King, Alexandros Beskos, Petros Dellaportas

Abstract: State-space models are commonly used to describe different forms of ecological data. We consider the case of count data with observation errors. For such data the system process is typically multi-dimensional, consisting of coupled Markov processes, where each component corresponds to a different characterisation of the population, such as age group, gender or breeding status. The associated system process equations describe the biological mechanisms under which the system evolves over time. However, there is often limited information in the count data alone to sensibly estimate demographic parameters of interest, so these are often combined with additional ecological observations leading to an integrated data analysis. Unfortunately, fitting these models to the data can be challenging, especially if the state-space model for the count data is non-linear or non-Gaussian. We propose an efficient particle Markov chain Monte Carlo algorithm to estimate the demographic parameters without the need for resorting to linear or Gaussian approximations. In particular, we exploit the integrated model structure to enhance the efficiency of the algorithm. We then incorporate the algorithm into a sequential Monte Carlo sampler in order to perform model comparison with regard to the dependence structure of the demographic parameters. Finally, we demonstrate the applicability and computational efficiency of our algorithms on two real datasets.

Submitted 14 August, 2017; originally announced August 2017.

Comments: includes supplementary materials

arXiv:1610.08962 [pdf, other]

On embedded hidden Markov models and particle Markov chain Monte Carlo methods

Authors: Axel Finke, Arnaud Doucet, Adam M. Johansen

Abstract: The embedded hidden Markov model (EHMM) sampling method is a Markov chain Monte Carlo (MCMC) technique for state inference in non-linear non-Gaussian state-space models which was proposed in Neal (2003); Neal et al. (2004) and extended in Shestopaloff and Neal (2016). An extension to Bayesian parameter inference was presented in Shestopaloff and Neal (2013). An alternative class of MCMC schemes addressing similar inference problems is provided by particle MCMC (PMCMC) methods (Andrieu et al. 2009; 2010). All these methods rely on the introduction of artificial extended target distributions for multiple state sequences which, by construction, are such that one randomly indexed sequence is distributed according to the posterior of interest. By adapting the Metropolis-Hastings algorithms developed in the framework of PMCMC methods to the EHMM framework, we obtain novel particle filter (PF)-type algorithms for state inference and novel MCMC schemes for parameter and state inference. In addition, we show that most of these algorithms can be viewed as particular cases of a general PF and PMCMC framework. We compare the empirical performance of the various algorithms on low- to high-dimensional state-space models.
We demonstrate that a properly tuned conditional PF with "local" MCMC moves proposed in Shestopaloff and Neal (2016) can outperform the standard conditional PF significantly when applied to high-dimensional state-space models while the novel PF-type algorithm could prove to be an interesting alternative to standard PFs for likelihood estimation in some lower-dimensional scenarios.

Submitted 27 October, 2016; originally announced October 2016.

Comments: 23 pages, 7 figures

arXiv:1606.08650 [pdf, other]

doi 10.1109/TSP.2017.2733504

Approximate Smoothing and Parameter Estimation in High-Dimensional State-Space Models

Authors: Axel Finke, Sumeetpal S. Singh

Abstract: We present approximate algorithms for performing smoothing in a class of high-dimensional state-space models via sequential Monte Carlo methods ("particle filters"). In high dimensions, a prohibitively large number of Monte Carlo samples ("particles") -- growing exponentially in the dimension of the state space -- is usually required to obtain a useful smoother. Using blocking strategies as in Rebeschini and Van Handel (2015) (and earlier pioneering work on blocking), we exploit the spatial ergodicity properties of the model to circumvent this curse of dimensionality. We thus obtain approximate smoothers that can be computed recursively in time and in parallel in space. First, we show that the bias of our blocked smoother is bounded uniformly in the time horizon and in the model dimension. We then approximate the blocked smoother with particles and derive the asymptotic variance of idealised versions of our blocked particle smoother to show that variance is no longer adversely affected by the dimension of the model. Finally, we employ our method to successfully perform maximum-likelihood estimation via stochastic gradient-ascent and stochastic expectation--maximisation algorithms in a 100-dimensional state-space model.

Submitted 20 September, 2017; v1 submitted 28 June, 2016; originally announced June 2016.

Comments: Includes supplementary materials

Journal ref: IEEE Transactions on Signal Processing, 65(22), 5982-5994, 2017

arXiv:1605.09065 [pdf, ps, other]

doi 10.1103/PhysRevB.96.165134

Bulk Fermi-surface of the Weyl type-II semi-metallic candidate MoTe2

Authors: D. Rhodes, R. Schönemann, N. Aryal, Q. Zhou, Q. R. Zhang, E. Kampert, Y. -C. Chiu, Y. Lai, Y. Shimura, G. T. McCandless, J. Y. Chan, D. W. Paley, J. Lee, A. D. Finke, J. P. C. Ruff, S. Das, E. Manousakis, L. Balicas

Abstract: The electronic structures of WTe$_2$ and orthorhombic $γ$-MoTe$_2$ are claimed to contain pairs of Weyl type-II points. A series of ARPES experiments claim a broad agreement with these predictions. We synthesized single-crystals of MoTe$_2$ through a Te flux method to validate these predictions through measurements of its bulk Fermi surface (FS) via quantum oscillatory phenomena. We find that the superconducting transition temperature of $γ$-MoTe$_2$ depends on disorder as quantified by the ratio between the room- and low-temperature resistivities, suggesting the possibility of an unconventional superconducting pairing symmetry. Similarly to WTe$_2$, the magnetoresistivity of $γ$-MoTe$_2$ does not saturate at high magnetic fields and can easily surpass $10^{6}$ %. Remarkably, the analysis of the de Haas-van Alphen (dHvA) signal superimposed onto the magnetic torque indicates that the geometry of its FS is markedly distinct from the calculated one. The dHvA signal also reveals that the FS is affected by the Zeeman effect, precluding the extraction of the Berry phase. A direct comparison between the previous ARPES studies and density-functional-theory (DFT) calculations reveals a disagreement in the position of the valence bands relative to the Fermi level $\varepsilon_F$.
Here, we show that a shift of the DFT valence bands relative to $\varepsilon_F$, in order to match the ARPES observations, and of the DFT electron bands to explain some of the observed dHvA frequencies, leads to a good agreement between the calculations and the angular dependence of the FS cross-sectional areas observed experimentally. However, this relative displacement between electron- and hole-bands eliminates their crossings and, therefore, the Weyl type-II points predicted for $γ$-MoTe$_2$.

Submitted 29 September, 2017; v1 submitted 29 May, 2016; originally announced May 2016.

Comments: 13 pages, 7 figures, supplementary file not included (in press)

Journal ref: Phys. Rev. B 96, 165134 (2017)

arXiv:1601.06766 [pdf, other]

doi 10.1088/1367-2630/18/11/113017

On the observation of nonclassical excitations in Bose-Einstein condensates

Authors: Andreas Finke, Piyush Jain, Silke Weinfurtner

Abstract: In the recent experimental and theoretical literature, well-established nonclassicality criteria from the field of quantum optics have been directly applied to the case of excitations in matter-waves. Among these are violations of Cauchy-Schwarz inequalities, Glauber-Sudarshan P-nonclassicality, sub-Poissonian number-difference squeezing (also known as the two-mode variance) and the criterion of nonseparability. We review the strong connection of these criteria and their meaning in quantum optics, and point out differences in the interpretation between light and matter waves. We then calculate observables for a homogeneous Bose-Einstein condensate undergoing an arbitrary modulation in the interaction parameter at finite initial temperature, within both the quantum theory as well as a classical reference. We conclude that to date in experiments relevant for analogue gravity, nonclassical effects have not conclusively been observed and conjecture that additional, noncommuting, observables have to be measured to this end.

Submitted 25 January, 2016; originally announced January 2016.

Comments: 11 pages, 1 figure

Journal ref: New Journal of Physics, Volume 18, November 2016

Showing 1–25 of 25 results for author: Finke, A