-
Domain Knowledge Aids in Signal Disaggregation; the Example of the Cumulative Water Heater
Authors:
Alexander Belikov,
Guillaume Matheron,
Johan Sassi
Abstract:
In this article we present an unsupervised low-frequency method for detecting and disaggregating the power used by Cumulative Water Heaters (CWH) in residential homes. Our model circumvents the inherent difficulty of unsupervised signal disaggregation by using both the shape of a power spike and its time of occurrence to identify the contribution of CWH reliably. Indeed, many CWHs in France are configured to turn on automatically during off-peak hours only, and we are able to use this domain knowledge to aid peak identification despite the low sampling frequency. To test our model, we equipped a home with sensors to record the ground-truth consumption of a water heater. We then applied the model to a larger dataset of energy consumption of Hello Watt users consisting of one month of consumption data for 5k homes at 30-minute resolution. In this dataset we successfully identified CWHs in the majority of cases where consumers declared using them. The remaining cases are likely due to misconfiguration of CWHs, since triggering them during off-peak hours requires specific wiring in the electrical panel of the house. Our model, despite its simplicity, offers promising applications: detection of misconfigured CWHs on off-peak contracts and of slow performance degradation.
Submitted 22 March, 2022;
originally announced March 2022.
-
Bayesian model of electrical heating disaggregation
Authors:
François Culière,
Laetitia Leduc,
Alexander Belikov
Abstract:
Adoption of smart meters is a major milestone on the path of the European transition to smart energy. The residential sector in France represents $\approx$35\% of electricity consumption, with $\approx$40\% (INSEE) of households using electrical heating. The number of deployed Linky smart meters is expected to reach 35M in 2021. In this manuscript we present an analysis of 676 households with an observation period of at least 6 months, for which we have metadata such as the year of construction and the type of heating. We propose a Bayesian model of the electrical consumption conditioned on temperature that allows the heating component to be disaggregated from the electrical load curve in an unsupervised manner. In essence the model is a mixture of piece-wise linear models, characterised by a temperature threshold, below which we allow a mixture of two modes to represent the latent state home/away.
Submitted 11 November, 2020;
originally announced November 2020.
-
Detecting signal from science: The structure of research communities and prior knowledge improves prediction of genetic regulatory experiments
Authors:
Alexander V. Belikov,
Andrey Rzhetsky,
James Evans
Abstract:
The explosive growth of scientists, scientific journals, articles and findings in recent years exponentially increases the difficulty scientists face in navigating prior knowledge. This challenge is exacerbated by uncertainty about the reproducibility of published findings. The availability of massive digital archives, machine reading and extraction tools on the one hand, and automated high-throughput experiments on the other, allow us to evaluate these challenges at scale and identify novel opportunities for accelerating scientific advance. Here we demonstrate a Bayesian calculus that enables the positive prediction of robust, replicable scientific claims with findings automatically extracted from published literature on gene interactions. We matched these findings, filtered by science, with unfiltered gene interactions measured by the massive LINCS L1000 high-throughput experiment to identify and counteract sources of bias. Our calculus is built on easily extracted publication meta-data regarding the position of a scientific claim within the web of prior knowledge, and its breadth of support across institutions, authors and communities, revealing that scientifically focused but socially and institutionally independent research activity is most likely to replicate. These findings recommend policies that go against the common practice of channeling biomedical research funding into centralized research consortia and institutes rather than dispersing it more broadly. Our results demonstrate that robust scientific findings hinge upon a delicate balance of shared focus and independence, and that this complex pattern can be computationally exploited to decode bias and predict the replicability of published findings. These insights provide guidance for scientists navigating the research literature and for science funders seeking to improve it.
Submitted 23 August, 2020;
originally announced August 2020.
-
Distillation of neural network models for detection and description of key points of images
Authors:
A. V. Yashchenko,
A. V. Belikov,
M. V. Peterson,
A. S. Potapov
Abstract:
Image matching and classification methods, as well as simultaneous localization and mapping, are widely used on embedded and mobile devices. Their most resource-intensive part is the detection and description of the key points of the images. While the classical methods of detecting and describing key points can be executed in real time on mobile devices, such use is difficult for the modern neural network methods that give the best quality. Thus, it is important to increase the speed of neural network models for the detection and description of key points. The subject of research is distillation as one of the methods for reducing neural network models. The aim of the study is to obtain a more compact model for the detection and description of key points, as well as a description of the procedure for obtaining this model. A method for the distillation of neural networks for the task of detecting and describing key points was tested. The objective function and training parameters that provide the best results in the framework of the study are proposed. A new data set has been introduced for testing key point detection methods, along with a new quality indicator for the detected key points and their corresponding local features. As a result of training in the described way, the new model, with the same number of parameters, showed greater accuracy in matching key points than the original model. A new model with a significantly smaller number of parameters shows point-matching accuracy close to that of the original model.
Submitted 18 May, 2020;
originally announced June 2020.
-
GoodPoint: unsupervised learning of keypoint detection and description
Authors:
Anatoly Belikov,
Alexey Potapov
Abstract:
This paper introduces a new algorithm for unsupervised learning of keypoint detectors and descriptors, which demonstrates fast convergence and good performance across different datasets. The training procedure uses homographic transformation of images. The proposed model learns to detect points and generate descriptors on pairs of transformed images, which are easy for it to distinguish and repeatedly detect. The trained model follows the SuperPoint architecture for ease of comparison, and demonstrates similar performance on natural images from the HPatches dataset and better performance on retina images from the Fundus Image Registration Dataset, which contain a low number of corner-like features. For HPatches and other datasets, coverage was also computed to provide a better estimate of model quality.
Submitted 1 June, 2020;
originally announced June 2020.
-
Differentiable Probabilistic Logic Networks
Authors:
Alexey Potapov,
Anatoly Belikov,
Vitaly Bogdanov,
Alexander Scherbatiy
Abstract:
Probabilistic logic reasoning is a central component of such cognitive architectures as OpenCog. However, as an integrative architecture, OpenCog facilitates cognitive synergy via hybridization of different inference methods. In this paper, we introduce a differentiable version of Probabilistic Logic Networks, whose rules operate over tensor truth values in such a way that a chain of reasoning steps constructs a computation graph over tensors that accepts truth values of premises from the knowledge base as input and produces truth values of conclusions as output. This allows both the truth values of premises and the formulas for rules (specified in a form with trainable weights) to be learned by backpropagation, combining subsymbolic optimization and symbolic reasoning.
Submitted 10 July, 2019;
originally announced July 2019.
-
Theano: A Python framework for fast computation of mathematical expressions
Authors:
The Theano Development Team,
Rami Al-Rfou,
Guillaume Alain,
Amjad Almahairi,
Christof Angermueller,
Dzmitry Bahdanau,
Nicolas Ballas,
Frédéric Bastien,
Justin Bayer,
Anatoly Belikov,
Alexander Belopolsky,
Yoshua Bengio,
Arnaud Bergeron,
James Bergstra,
Valentin Bisson,
Josh Bleecher Snyder,
Nicolas Bouchard,
Nicolas Boulanger-Lewandowski,
Xavier Bouthillier,
Alexandre de Brébisson,
Olivier Breuleux,
Pierre-Luc Carrier,
Kyunghyun Cho,
Jan Chorowski,
Paul Christiano
, et al. (88 additional authors not shown)
Abstract:
Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models.
The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.
Submitted 9 May, 2016;
originally announced May 2016.
-
Astro-WISE processing of wide-field images and other data
Authors:
Hugo Buddelmeijer,
O. Rees Williams,
John P. McFarland,
Andrey Belikov
Abstract:
Astro-WISE is the Astronomical Wide-field Imaging System for Europe. It is a scientific information system which consists of hardware and software federated over about a dozen institutes throughout Europe. It has been developed to exploit the ever increasing avalanche of data produced by astronomical surveys and data intensive scientific experiments in general.
The demo explains the architecture of the Astro-WISE information system and shows the use of Astro-WISE interfaces. Wide-field astronomical images are processed from the raw image to the final catalog according to the user's request. The demo is based on the standard Astro-WISE guided tour, which can be accessed from the Astro-WISE website.
The typical Astro-WISE data processing chain is shown, which can be used for data handling for a variety of different instruments, currently 14, including OmegaCAM, MegaCam, WFI, WFC, ACS/HST, etc.
Submitted 29 November, 2011;
originally announced November 2011.
-
Astro-WISE Information System
Authors:
Willem-Jan Vriend,
Edwin A. Valentijn,
Andrey Belikov,
Gijs A. Verdoes Kleijn
Abstract:
Astro-WISE is a scientific information system for the data processing of optical images. In this paper we review the main features of Astro-WISE and describe the current status of the system.
Submitted 28 November, 2011;
originally announced November 2011.
-
Information Systems Playground - The Target Infrastructure, Scaling Astro-WISE into the Petabyte range
Authors:
A. N. Belikov,
F. Dijkstra,
J. A. Gankema,
J. B. A. N. van Hoof,
R. Koopman
Abstract:
The Target infrastructure has been specially built as a storage and compute infrastructure for the information systems derived from Astro-WISE. This infrastructure will be used by several applications that collaborate in the area of information systems within the Target project. It currently consists of 10 PB of storage and thousands of computational cores. The infrastructure has been constructed based on the requirements of the applications. The storage is controlled by IBM's General Parallel File System (GPFS). This file system provides the required flexibility by combining storage hardware with different characteristics into a single file system. It is also very scalable, which allows the system to be extended in the future while old hardware is replaced with new technology.
Submitted 5 October, 2011;
originally announced October 2011.