-
Decentralized Multi-Party Multi-Network AI for Global Deployment of 6G Wireless Systems
Authors:
Merim Dzaferagic,
Marco Ruffini,
Nina Slamnik-Krijestorac,
Joao F. Santos,
Johann Marquez-Barja,
Christos Tranoris,
Spyros Denazis,
Thomas Kyriakakis,
Panagiotis Karafotis,
Luiz DaSilva,
Shashi Raj Pandey,
Junya Shiraishi,
Petar Popovski,
Soren Kejser Jensen,
Christian Thomsen,
Torben Bach Pedersen,
Holger Claussen,
**feng Du,
Gil Zussman,
Tingjun Chen,
Yiran Chen,
Seshu Tirupathi,
Ivan Seskar,
Daniel Kilper
Abstract:
Multiple visions of 6G networks elicit Artificial Intelligence (AI) as a central, native element. When 6G systems are deployed at a large scale, end-to-end AI-based solutions will necessarily have to encompass both the radio and the fiber-optical domain. This paper introduces the Decentralized Multi-Party, Multi-Network AI (DMMAI) framework for integrating AI into 6G networks deployed at scale. DMMAI harmonizes AI-driven controls across diverse network platforms and thus facilitates networks that autonomously configure, monitor, and repair themselves. This is particularly crucial at the network edge, where advanced applications meet heightened functionality and security demands. The radio/optical integration is vital due to the current compartmentalization of AI research within these domains, which lacks a comprehensive understanding of their interaction. Our approach explores multi-network orchestration and AI control integration, filling a critical gap in standardized frameworks for AI-driven coordination in 6G networks. The DMMAI framework is a step towards a global standard for AI in 6G, aiming to establish reference use cases, data and model management methods, and benchmarking platforms for future AI/ML solutions.
Submitted 15 April, 2024;
originally announced July 2024.
-
Time evolution as an optimization problem: The hydrogen atom in strong laser fields in a basis of time-dependent Gaussian wave packets
Authors:
Simon Elias Schrader,
Håkon Emil Kristiansen,
Thomas Bondo Pedersen,
Simen Kvaal
Abstract:
Recent advances in attosecond science have made it increasingly important to develop stable, reliable and accurate algorithms and methods to model the time evolution of atoms and molecules in intense laser fields. A key process in attosecond science is high-harmonic generation, which is challenging to model with fixed Gaussian basis sets, as it produces high-energy electrons, with a resulting rapidly varying and highly oscillatory wave function that extends over dozens of ångström. Recently, Rothe's method, where time evolution is rephrased as an optimization problem, has been applied to the one-dimensional Schrödinger equation. Here, we apply Rothe's method to the hydrogen wave function and demonstrate that complex-valued Gaussian wave packets with time-dependent width, center, and momentum parameters are able to reproduce spectra obtained from essentially exact grid calculations for high-harmonic generation with only 50-181 Gaussians for field strengths up to $5\times 10^{14}$W/cm$^2$. This paves the way for the inclusion of continuum contributions into real-time, time-dependent electronic-structure theory with Gaussian basis sets for strong fields, and eventually accurate simulations of the time evolution of molecules without the Born-Oppenheimer approximation.
Submitted 11 April, 2024;
originally announced April 2024.
-
Gaussians for Electronic and Rovibrational Quantum Dynamics
Authors:
Aleksander P. Woźniak,
Ludwik Adamowicz,
Thomas Bondo Pedersen,
Simen Kvaal
Abstract:
The assumptions underpinning the adiabatic Born-Oppenheimer (BO) approximation are broken for molecules interacting with attosecond laser pulses, which generate complicated coupled electronic-nuclear wavepackets that generally will have components of electronic and dissociation continua as well as bound-state contributions. The conceptually most straightforward way to overcome this challenge is to treat the electronic and nuclear degrees of freedom on equal quantum-mechanical footing by not invoking the BO approximation at all. Explicitly correlated Gaussian (ECG) basis functions have proved successful for non-BO calculations of stationary molecular states and energies, reproducing rovibrational absorption spectra with very high accuracy. In this paper, we present a proof-of-principle study of the ability of fully flexible ECGs (FFECGs) to capture the intricate electronic and rovibrational dynamics generated by short, high-intensity laser pulses. By fitting linear combinations of FFECGs to accurate wave function histories obtained on a large real-space grid for a regularized 2D model of the hydrogen atom and for the 2D Morse potential we demonstrate that FFECGs provide a very compact description of laser-driven electronic and rovibrational dynamics.
Submitted 12 April, 2024; v1 submitted 22 January, 2024;
originally announced January 2024.
-
Domain Adaptation for Time series Transformers using One-step fine-tuning
Authors:
Subina Khanal,
Seshu Tirupathi,
Giulio Zizzo,
Ambrish Rawat,
Torben Bach Pedersen
Abstract:
The recent breakthrough of Transformers in deep learning has drawn significant attention of the time series community due to their ability to capture long-range dependencies. However, like other deep learning models, Transformers face limitations in time series prediction, including insufficient temporal understanding, generalization challenges, and data shift issues for the domains with limited data. Additionally, addressing the issue of catastrophic forgetting, where models forget previously learned information when exposed to new data, is another critical aspect that requires attention in enhancing the robustness of Transformers for time series tasks. To address these limitations, in this paper, we pre-train the time series Transformer model on a source domain with sufficient data and fine-tune it on the target domain with limited data. We introduce the \emph{One-step fine-tuning} approach, adding some percentage of source domain data to the target domains, providing the model with diverse time series instances. We then fine-tune the pre-trained model using a gradual unfreezing technique. This helps enhance the model's performance in time series prediction for domains with limited data. Extensive experimental results on two real-world datasets show that our approach improves over the state-of-the-art baselines by 4.35% and 11.54% for indoor temperature and wind power prediction, respectively.
Submitted 12 January, 2024;
originally announced January 2024.
-
Creating and Querying Data Cubes in Python using pyCube
Authors:
Sigmundur Vang,
Christian Thomsen,
Torben Bach Pedersen
Abstract:
Data cubes are used for analyzing large data sets usually contained in data warehouses. The most popular data cube tools use graphical user interfaces (GUI) to do the data analysis. Traditionally this was fine since data analysts were not expected to be technical people. However, in the subsequent decades, the data landscape changed dramatically, requiring companies to employ large teams of highly technical data scientists in order to manage and use the ever-increasing amount of data. These data scientists generally use tools like Python, interactive notebooks, pandas, etc. while modern data cube tools are still GUI-based. This paper proposes a Python-based data cube tool called pyCube. pyCube is able to semi-automatically create data cubes for data stored in an RDBMS and manages the data cube metadata. pyCube's programmatic interface enables data scientists to query data cubes by specifying the expected metadata of the result. pyCube is experimentally evaluated on the Star Schema Benchmark (SSB). The results show that pyCube vastly outperforms different implementations of SSB queries in pandas in both runtime and memory while being easier to read and write.
Submitted 28 January, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
Magnetic Optical Rotation from Real-Time Simulations in Finite Magnetic Fields
Authors:
Benedicte Sverdrup Ofstad,
Meilani Wibowo-Teale,
Håkon Emil Kristiansen,
Einar Aurbakken,
Marios Petros Kitsaras,
Øyvind Sigmundson Schøyen,
Eirill Hauge,
Simen Kvaal,
Stella Stopkowicz,
Andrew M. Wibowo-Teale,
Thomas Bondo Pedersen
Abstract:
We present a numerical approach to magnetic optical rotation based on real-time time-dependent electronic-structure theory. Not relying on perturbation expansions in the magnetic-field strength, the formulation allows us to test the range of validity of the linear relation between the rotation angle per unit path length and the magnetic-field strength that was established empirically by Verdet 160 years ago. Results obtained from time-dependent coupled-cluster and time-dependent current density-functional theory are presented for the closed-shell molecules H2, HF, and CO in magnetic fields up to 55 kT at standard temperature and pressure conditions. We find that Verdet's linearity remains valid up to roughly 10-20 kT, above which significant deviations from linearity are observed. Among the three current density-functional approximations tested in this work, the current-dependent Tao-Perdew-Staroverov-Scuseria hybrid functional performs the best in comparison with time-dependent coupled-cluster singles and doubles results for the magnetic optical rotation.
Submitted 11 August, 2023;
originally announced August 2023.
-
Transient spectroscopy from time-dependent electronic-structure theory without multipole expansions
Authors:
Einar Aurbakken,
Benedicte Sverdrup Ofstad,
Håkon Emil Kristiansen,
Øyvind Sigmundson Schøyen,
Simen Kvaal,
Lasse Kragh Sørensen,
Roland Lindh,
Thomas Bondo Pedersen
Abstract:
Based on the work done by an electromagnetic field on an atomic or molecular electronic system, a general gauge invariant formulation of transient absorption spectroscopy is presented within the semi-classical approximation. Avoiding multipole expansions, a computationally viable expression for the spectral response function is derived from the minimal-coupling Hamiltonian of an electronic system interacting with one or more laser pulses described by a source-free, enveloped electromagnetic vector potential. With a fixed-basis expansion of the electronic wave function, the computational cost of simulations of laser-driven electron dynamics beyond the dipole approximation is the same as simulations adopting the dipole approximation. We illustrate the theory by time-dependent configuration interaction and coupled-cluster simulations of core-level absorption and circular dichroism spectra.
Submitted 5 July, 2023;
originally announced July 2023.
-
Cost-Efficient High-Resolution Linear Absorption Spectra Through Extrapolating the Dipole Moment from Real-Time Time-Dependent Electronic-Structure Theory
Authors:
Eirill Hauge,
Håkon Emil Kristiansen,
Lukas Konecny,
Marius Kadek,
Michal Repisky,
Thomas Bondo Pedersen
Abstract:
We present a novel function fitting method for approximating the propagation of the time-dependent electric dipole moment from real-time electronic structure calculations. Real-time calculations of the electronic absorption spectrum require discrete Fourier transforms of the electric dipole moment. The spectral resolution is determined by the total propagation time, i.e. the trajectory length of the dipole moment, causing a high computational cost. Our developed method uses function fitting on shorter trajectories of the dipole moment, achieving arbitrary spectral resolution through extrapolation. Numerical testing shows that the fitting method can reproduce high-resolution spectra using short dipole trajectories. The method converges with as little as 100 a.u. dipole trajectories for some systems, though the difficulty converging increases with the spectral density. We also introduce an error estimate of the fit, reliably assessing its convergence and hence the quality of the approximated spectrum.
Submitted 31 October, 2023; v1 submitted 4 July, 2023;
originally announced July 2023.
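As an illustrative note on the abstract above: the angular-frequency spacing of a discrete Fourier transform is fixed by the total signal length T as Δω = 2π/T, which is why high spectral resolution normally demands long dipole trajectories. The sketch below checks this relation with NumPy. The function name `dft_resolution` and the sample values (trajectory lengths of 100 and 1000 a.u., time step 0.1 a.u.) are assumptions chosen for illustration and are not taken from the paper; the paper's fitting and extrapolation method is not reproduced here.

```python
import numpy as np

def dft_resolution(total_time, dt):
    """Angular-frequency spacing of the DFT grid for a signal of length total_time."""
    n_steps = int(round(total_time / dt))  # number of samples in the trajectory
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_steps, d=dt)
    return omega[1] - omega[0]  # equals 2*pi / (n_steps * dt)

# Hypothetical trajectory lengths in atomic units of time.
short_run = dft_resolution(100.0, 0.1)
long_run = dft_resolution(1000.0, 0.1)
print(short_run, long_run)
```

A tenfold longer trajectory gives a tenfold finer frequency grid, so any method that reaches the same resolution from the shorter trajectory saves roughly that factor in propagation time.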
-
Efficient Generalized Temporal Pattern Mining in Big Time Series Using Mutual Information
Authors:
Van Long Ho,
Nguyen Ho,
Torben Bach Pedersen,
Panagiotis Papapetrou
Abstract:
Big time series are increasingly available from an ever wider range of IoT-enabled sensors deployed in various environments. Significant insights can be gained by mining temporal patterns from these time series. Temporal pattern mining (TPM) extends traditional pattern mining by adding event time intervals into extracted patterns, making them more expressive at the expense of increased time and space complexities. Besides frequent temporal patterns (FTPs), which occur frequently in the entire dataset, another useful type of temporal pattern is the so-called rare temporal pattern (RTP), which appears rarely but with high confidence. Mining rare temporal patterns yields additional challenges. For FTP mining, the temporal information and complex relations between events already create an exponential search space. For RTP mining, the support measure is set very low, leading to a further combinatorial explosion and potentially producing too many uninteresting patterns. Thus, there is a need for a generalized approach which can mine both frequent and rare temporal patterns. This paper presents our Generalized Temporal Pattern Mining from Time Series (GTPMfTS) approach with the following specific contributions: (1) The end-to-end GTPMfTS process taking time series as input and producing frequent/rare temporal patterns as output. (2) The efficient Generalized Temporal Pattern Mining (GTPM) algorithm, which mines frequent and rare temporal patterns using efficient data structures for fast retrieval of events and patterns during the mining process, and employs effective pruning techniques for significantly faster mining. (3) An approximate version of GTPM that uses mutual information, a measure of data correlation, to prune unpromising time series from the search space.
Submitted 19 June, 2023;
originally announced June 2023.
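To make the idea of mutual-information-based pruning in the abstract above concrete, here is a minimal sketch. It is not the GTPM algorithm: it assumes the time series have already been converted to equal-length sequences of discrete symbols, uses plain (unnormalized) mutual information in nats, and keeps only pairs of series whose mutual information reaches a threshold. The names `mutual_information` and `prune_pairs`, the threshold value, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log

def mutual_information(x, y):
    """Mutual information (in nats) of two equal-length discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))  # counts of co-occurring symbol pairs
    px, py = Counter(x), Counter(y)  # marginal symbol counts
    # p(a,b) * log(p(a,b) / (p(a) p(b))), written in terms of raw counts
    return sum((c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in joint.items())

def prune_pairs(series, threshold):
    """Keep only pairs of series whose mutual information is at least threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(series), 2)
        if mutual_information(series[a], series[b]) >= threshold
    ]

# Toy symbolic series: "a" and "b" are identical, "c" is independent of both.
toy = {"a": [0, 1, 0, 1], "b": [0, 1, 0, 1], "c": [0, 0, 1, 1]}
kept = prune_pairs(toy, threshold=0.1)
print(kept)
```

Pairs that fail the threshold are dropped before any pattern mining starts, which is the sense in which weakly correlated series are treated as unpromising in this sketch.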
-
Goal-Oriented Scheduling in Sensor Networks with Application Timing Awareness
Authors:
Josefine Holm,
Federico Chiariotti,
Anders E. Kalør,
Beatriz Soret,
Torben Bach Pedersen,
Petar Popovski
Abstract:
Taking inspiration from linguistics, the communications theoretical community has recently shown significant interest in pragmatic, or goal-oriented, communication. In this paper, we tackle the problem of pragmatic communication with multiple clients with different, and potentially conflicting, objectives. We capture the goal-oriented aspect through the metric of Value of Information (VoI), which considers the estimation of the remote process as well as the timing constraints. However, the most common definition of VoI is simply the Mean Square Error (MSE) of the whole system state, regardless of the relevance for a specific client. Our work aims to overcome this limitation by including different summary statistics, i.e., value functions of the state, for separate clients, and a diversified query process on the client side, expressed through the fact that different applications may request different functions of the process state at different times. A query-aware Deep Reinforcement Learning (DRL) solution based on statically defined VoI can outperform naive approaches by 15-20%.
Submitted 6 June, 2023;
originally announced June 2023.
-
Adiabatic extraction of nonlinear optical properties from real-time time-dependent electronic-structure theory
Authors:
Benedicte Sverdrup Ofstad,
Håkon Emil Kristiansen,
Einar Aurbakken,
Øyvind Sigmundson Schøyen,
Simen Kvaal,
Thomas Bondo Pedersen
Abstract:
Real-time simulations of laser-driven electron dynamics contain information about molecular optical properties through all orders in response theory. These properties can be extracted by assuming convergence of the power series expansion of induced electric and magnetic multipole moments. However, the accuracy relative to analytical results from response theory quickly deteriorates for higher-order responses due to the presence of high-frequency oscillations in the induced multipole moment in the time domain. This problem has been ascribed to missing higher-order corrections. We here demonstrate that the deviations are caused by nonadiabatic effects arising from the finite-time ramping from zero to full strength of the external laser field. Three different approaches, two using a ramped wave and one using a pulsed wave, for extracting electrical properties from real-time time-dependent electronic-structure simulations are investigated. The standard linear ramp is compared to a quadratic ramp, which is found to yield highly accurate results for polarizabilities, and first and second hyperpolarizabilities, at roughly half the computational cost. Results for the third hyperpolarizability are presented along with a simple, computable measure of reliability.
Submitted 6 February, 2023;
originally announced February 2023.
-
The $S$-diagnostic -- an a posteriori error assessment for single-reference coupled-cluster methods
Authors:
Fabian M. Faulstich,
Håkon E. Kristiansen,
Mihaly A. Csirik,
Simen Kvaal,
Thomas Bondo Pedersen,
Andre Laestadius
Abstract:
We propose a novel a posteriori error assessment for the single-reference coupled-cluster (SRCC) method called the $S$-diagnostic. We provide a derivation of the $S$-diagnostic that is rooted in the mathematical analysis of different SRCC variants. We numerically scrutinized the $S$-diagnostic, testing its performance for (1) geometry optimizations, (2) electronic correlation simulations of systems with varying numerical difficulty, and (3) the square-planar copper complexes [CuCl$_4$]$^{2-}$, [Cu(NH$_3$)$_4$]$^{2+}$, and [Cu(H$_2$O)$_4$]$^{2+}$. Throughout the numerical investigations, the $S$-diagnostic is compared to other SRCC diagnostic procedures, that is, the $T_1$, $D_1$, and $D_2$ diagnostics as well as different indices of multi-determinantal and multi-reference character in coupled-cluster theory. Our numerical investigations show that the $S$-diagnostic outperforms the $T_1$, $D_1$, and $D_2$ diagnostics and is comparable to the indices of multi-determinantal and multi-reference character in coupled-cluster theory in their individual fields of applicability. The experiments investigating the performance of the $S$-diagnostic for geometry optimizations using SRCC reveal that the $S$-diagnostic correlates well with different error measures at a high level of statistical relevance. The experiments investigating the performance of the $S$-diagnostic for electronic correlation simulations show that the $S$-diagnostic correctly predicts strong multi-reference regimes. The $S$-diagnostic moreover correctly detects the successful SRCC computations for [CuCl$_4$]$^{2-}$, [Cu(NH$_3$)$_4$]$^{2+}$, and [Cu(H$_2$O)$_4$]$^{2+}$, which have been known to be misdiagnosed by $T_1$ and $D_1$ diagnostics in the past. This shows that the $S$-diagnostic is a promising candidate for an a posteriori diagnostic for SRCC calculations.
Submitted 26 January, 2023;
originally announced January 2023.
-
A Comparative Study on Unsupervised Anomaly Detection for Time Series: Experiments and Analysis
Authors:
Yan Zhao,
Liwei Deng,
Xuanhao Chen,
Chenjuan Guo,
Bin Yang,
Tung Kieu,
Feiteng Huang,
Torben Bach Pedersen,
Kai Zheng,
Christian S. Jensen
Abstract:
The continued digitization of societal processes translates into a proliferation of time series data that cover applications such as fraud detection, intrusion detection, and energy management, where anomaly detection is often essential to enable reliability and safety. Many recent studies target anomaly detection for time series data. Indeed, the area of time series anomaly detection is characterized by diverse data, methods, and evaluation strategies, and comparisons in existing studies consider only part of this diversity, which makes it difficult to select the best method for a particular problem setting. To address this shortcoming, we introduce taxonomies for data, methods, and evaluation strategies, provide a comprehensive overview of unsupervised time series anomaly detection using the taxonomies, and systematically evaluate and compare state-of-the-art traditional as well as deep learning techniques. In the empirical study using nine publicly available datasets, we apply the most commonly used performance evaluation metrics to typical methods under a fair implementation standard. Based on the structuring offered by the taxonomies, we report on empirical studies and provide guidelines, in the form of comparative tables, for choosing the methods most suitable for particular application settings. Finally, we propose research directions for this dynamic field.
Submitted 10 September, 2022;
originally announced September 2022.
-
No need for a grid: Adaptive fully-flexible gaussians for the time-dependent Schrödinger equation
Authors:
Simen Kvaal,
Caroline Lasser,
Thomas Bondo Pedersen,
Ludwik Adamowicz
Abstract:
Linear combinations of complex gaussian functions, where the linear and nonlinear parameters are allowed to vary, are shown to provide an extremely flexible and effective approach for solving the time-dependent Schrödinger equation in one spatial dimension. The use of flexible basis sets has proven notoriously hard within the systematics of the Dirac--Frenkel variational principle. In this work we present an alternative time-propagation scheme that de-emphasizes optimal parameter evolution but directly targets residual minimization via Rothe's method, also called the method of vertical time layers. We test the scheme using a simple model system mimicking an atom subjected to an extreme laser pulse. Such a pulse produces complicated ionization dynamics of the system. The scheme is shown to perform very well on this model and notably does not rely on a computational grid. Only a handful of gaussian functions are needed to achieve an accuracy on par with a high-resolution, grid-based solver. This paves the way for accurate and affordable solution of the time-dependent Schrödinger equation for atoms and molecules within and beyond the Born--Oppenheimer approximation.
Submitted 7 March, 2023; v1 submitted 1 July, 2022;
originally announced July 2022.
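For orientation, the residual-minimization idea behind Rothe's method in the abstract above can be written out as follows. This is a sketch under stated assumptions: it takes a Crank-Nicolson (implicit midpoint) time discretization with step $\Delta t$ in atomic units, a Hermitian Hamiltonian $H(t)$, and a wave function $\psi(\alpha)$ parametrized by the linear and nonlinear gaussian parameters $\alpha$. The abstract does not fix these choices, so the symbols $A_n$, $\alpha$, and $r_n$ are illustrative.

```latex
\begin{align}
  A_n \psi_{n+1} &= A_n^{\dagger} \psi_n,
  & A_n &= I + \frac{i\,\Delta t}{2}\, H\!\left(t_n + \tfrac{\Delta t}{2}\right), \\
  \alpha_{n+1} &= \arg\min_{\alpha}\, r_n(\alpha),
  & r_n(\alpha) &= \left\lVert A_n \psi(\alpha) - A_n^{\dagger} \psi_n \right\rVert^{2}.
\end{align}
```

The first line is the time-discrete equation on the full Hilbert space; the second replaces the exact solve by an optimization over the gaussian parameters at each time step, so no differential equations for the parameters themselves are needed.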
-
Mining Seasonal Temporal Patterns in Time Series
Authors:
Van Long Ho,
Nguyen Ho,
Torben Bach Pedersen
Abstract:
Very large time series are increasingly available from an ever wider range of IoT-enabled sensors, from which significant insights can be obtained through mining temporal patterns from them. A useful type of pattern found in many real-world applications exhibits periodic occurrences, and is thus called a seasonal temporal pattern (STP). Compared to regular patterns, mining seasonal temporal patterns is more challenging since traditional measures such as support and confidence do not capture the seasonality characteristics. Further, the anti-monotonicity property does not hold for STPs, thus resulting in an exponential search space. This paper presents our Frequent Seasonal Temporal Pattern Mining from Time Series (FreqSTPfTS) solution providing: (1) The first solution for seasonal temporal pattern mining (STPM) from time series that can mine STPs at different data granularities. (2) The STPM algorithm that uses efficient data structures and two pruning techniques to reduce the search space and speed up the mining process. (3) An approximate version of STPM that uses mutual information, a measure of data correlation, to prune unpromising time series from the search space. (4) An extensive experimental evaluation showing that STPM outperforms the baseline in runtime and memory consumption, and can scale to big datasets. The approximate STPM is up to an order of magnitude faster and less memory consuming than the baseline, while maintaining high accuracy.
Submitted 9 January, 2023; v1 submitted 28 June, 2022;
originally announced June 2022.
-
Laser-induced dynamic alignment of the HD molecule without the Born-Oppenheimer approximation
Authors:
Ludwik Adamowicz,
Simen Kvaal,
Caroline Lasser,
Thomas Bondo Pedersen
Abstract:
Laser-induced molecular alignment is well understood within the framework of the Born-Oppenheimer (BO) approximation. Without the BO approximation, however, the concept of molecular structure is lost, making alignment hard to define precisely. In this work, we demonstrate the emergence of alignment from the first-ever non-BO quantum dynamics simulations, using the HD molecule exposed to ultrashort laser pulses as a few-body test case. We extract the degree of alignment from the non-BO wave function by means of an operator expressed in terms of pseudo-proton coordinates that mimics the BO-based definition of alignment. The only essential approximation, in addition to the semiclassical electric-dipole approximation for the matter-field interaction, is the choice of time-independent explicitly correlated Gaussian basis functions. We use a variational, electric-field-dependent basis-set construction procedure, which allows us to keep the basis-set dimension low whilst capturing the main effects of electric polarization on the nuclear and electronic degrees of freedom. The basis-set construction procedure is validated by comparing with virtually exact grid-based simulations for two one-dimensional model systems: laser-driven electron dynamics in a soft attractive Coulomb potential and nuclear rovibrational dynamics in a Morse potential.
Submitted 15 September, 2022; v1 submitted 30 May, 2022;
originally announced May 2022.
-
A Unified Approach for Multi-Scale Synchronous Correlation Search in Big Time Series -- Full Version
Authors:
Nguyen Ho,
Van Long Ho,
Torben Bach Pedersen,
Mai Vu,
Christophe A. N. Biscio
Abstract:
The wide deployment of IoT sensors has enabled the collection of very big time series across different domains, from which advanced analytics can be performed to find unknown relationships, most importantly the correlations between them. However, current approaches for correlation search on time series are limited to only a single temporal scale and simple types of relations, and cannot handle noise effectively. This paper presents the integrated SYnchronous COrrelation Search (iSYCOS) framework to find multi-scale correlations in big time series. Specifically, iSYCOS integrates top-down and bottom-up approaches into a single auto-configured framework capable of efficiently extracting complex window-based correlations from big time series using mutual information (MI). Moreover, iSYCOS includes a novel MI-based theory to identify noise in the data, and is used to perform pruning to improve iSYCOS performance. Besides, we design a distributed version of iSYCOS that can scale out in a Spark cluster to handle big time series. Our extensive experimental evaluation on synthetic and real-world datasets shows that iSYCOS can auto-configure on a given dataset to find complex multi-scale correlations. The pruning and optimisations can improve iSYCOS performance up to an order of magnitude, and the distributed iSYCOS can scale out linearly on a computing cluster.
Submitted 19 April, 2022;
originally announced April 2022.
-
Finding Representative Sampling Subsets in Sensor Graphs using Time Series Similarities
Authors:
Roshni Chakraborty,
Josefine Holm,
Torben Bach Pedersen,
Petar Popovski
Abstract:
With the increasing use of IoT-enabled sensors, it is important to have effective methods for querying the sensors. For example, in a dense network of battery-driven temperature sensors, it is often possible to query (sample) just a subset of the sensors at any given time, since the values of the non-sampled sensors can be estimated from the sampled values. If we can divide the set of sensors into disjoint so-called representative sampling subsets that each represent the other sensors sufficiently well, we can alternate the sampling between the sampling subsets and thus, increase battery life significantly. In this paper, we formulate the problem of finding representative sampling subsets as a graph problem on a so-called sensor graph with the sensors as nodes. Our proposed solution, SubGraphSample, consists of two phases. In Phase-I, we create edges in the sensor graph based on the similarities between the time series of sensor values, analyzing six different techniques based on proven time series similarity metrics. In Phase-II, we propose two new techniques and extend four existing ones to find the maximal number of representative sampling subsets. Finally, we propose AutoSubGraphSample which auto-selects the best technique for Phase-I and Phase-II for a given dataset. Our extensive experimental evaluation shows that our approach can yield significant battery life improvements within realistic error bounds.
Submitted 18 February, 2022; v1 submitted 17 February, 2022;
originally announced February 2022.
-
Linear and nonlinear optical properties from TDOMP2 theory
Authors:
Håkon Emil Kristiansen,
Benedicte Sverdrup Ofstad,
Eirill Hauge,
Einar Aurbakken,
Øyvind Sigmundson Schøyen,
Simen Kvaal,
Thomas Bondo Pedersen
Abstract:
In this work we present a derivation of the real-time time-dependent orbital-optimized Møller-Plesset (TDOMP2) theory and its biorthogonal companion, time-dependent non-orthogonal OMP2 (TDNOMP2) theory, starting from the time-dependent bivariational principle and a parametrization based on the exponential orbital-rotation operator formulation commonly used in time-independent molecular electronic-structure theory. We apply the TDOMP2 method to extract absorption spectra and frequency-dependent polarizabilities and first hyperpolarizabilities from real-time simulations, comparing the results with those obtained from conventional time-dependent coupled-cluster singles and doubles (TDCCSD) simulations and from its second-order approximation TDCC2. We also compare with results from CCSD and CC2 linear and quadratic response theory. Our results indicate that while TDOMP2 absorption spectra are of the same quality as TDCC2 spectra, frequency-dependent polarizabilities and hyperpolarizabilities from TDOMP2 simulations are significantly closer to TDCCSD results than those from TDCC2 simulations.
Submitted 21 April, 2022; v1 submitted 27 December, 2021;
originally announced December 2021.
-
Practical continuous-variable quantum key distribution with composable security
Authors:
Nitin Jain,
Hou-Man Chin,
Hossein Mani,
Cosmo Lupo,
Dino Solar Nikolic,
Arne Kordts,
Stefano Pirandola,
Thomas Brochmann Pedersen,
Matthias Kolb,
Bernhard Ömer,
Christoph Pacher,
Tobias Gehring,
Ulrik L. Andersen
Abstract:
A quantum key distribution (QKD) system must fulfill the requirement of universal composability to ensure that any cryptographic application (using the QKD system) is also secure. Furthermore, the theoretical proof responsible for security analysis and key generation should cater to the number $N$ of the distributed quantum states being finite in practice. Continuous-variable (CV) QKD based on coherent states, despite being a suitable candidate for integration in the telecom infrastructure, has so far been unable to demonstrate composability as existing proofs require a rather large $N$ for successful key generation. Here we report the first Gaussian-modulated coherent state CVQKD system that is able to overcome these challenges and can generate composable keys secure against collective attacks with $N \lesssim 3.5\times10^8$ coherent states. With this advance, possible due to novel improvements to the security proof and a fast, yet low-noise and highly stable system operation, CVQKD implementations take a significant step towards their discrete-variable counterparts in practicality, performance, and security.
Submitted 18 October, 2021;
originally announced October 2021.
-
Evolutionary Clustering of Streaming Trajectories
Authors:
Tianyi Li,
Lu Chen,
Christian S. Jensen,
Torben Bach Pedersen,
Jilin Hu
Abstract:
The widespread deployment of smartphones and location-enabled, networked in-vehicle devices renders it increasingly feasible to collect streaming trajectory data of moving objects. The continuous clustering of such data can enable a variety of real-time services, such as identifying representative paths or common moving trends among objects in real-time. However, little attention has so far been given to the quality of clusters -- for example, it is beneficial to smooth short-term fluctuations in clusters to achieve robustness to exceptional data.
We propose the notion of evolutionary clustering of streaming trajectories, abbreviated ECO, that enhances streaming-trajectory clustering quality by means of temporal smoothing that prevents abrupt changes in clusters across successive timestamps. Employing the notions of snapshot and historical trajectory costs, we formalize ECO and then formulate ECO as an optimization problem and prove that ECO can be performed approximately in linear time, thus eliminating the iterative processes employed in previous studies. Further, we propose a minimal-group structure and a seed point shifting strategy to facilitate temporal smoothing. Finally, we present all algorithms underlying ECO along with a set of optimization techniques. Extensive experiments with two real-life datasets offer insight into ECO and show that it outperforms state-of-the-art solutions in terms of both clustering quality and efficiency.
Submitted 23 September, 2021;
originally announced September 2021.
-
Query Age of Information: Freshness in Pull-Based Communication
Authors:
Federico Chiariotti,
Josefine Holm,
Anders E. Kalør,
Beatriz Soret,
Søren K. Jensen,
Torben B. Pedersen,
Petar Popovski
Abstract:
Age of Information (AoI) has become an important concept in communications, as it allows system designers to measure the freshness of the information available to remote monitoring or control processes. However, its definition tacitly assumes that new information is used at any time, which is not always the case: the instants at which information is collected and used are dependent on a certain query process. We propose a model that accounts for the discrete time nature of many monitoring processes, considering a pull-based communication model in which the freshness of information is only important when the receiver generates a query: if the monitoring process is not using the value, the age of the last update is irrelevant. We then define the Age of Information at Query (QAoI), a more general metric that fits the pull-based scenario, and show how its optimization can lead to very different choices from traditional push-based AoI optimization when using a Packet Erasure Channel (PEC) and with limited link availability. Our results show that QAoI-aware optimization can significantly reduce the average and worst-case perceived age for both periodic and stochastic queries.
Submitted 12 January, 2022; v1 submitted 14 May, 2021;
originally announced May 2021.
-
The Forgotten Document-Oriented Database Management Systems: An Overview and Benchmark of Native XML DODBMSes in Comparison with JSON DODBMSes
Authors:
Ciprian-Octavian Truică,
Elena-Simona Apostol,
Jérôme Darmont,
Torben Bach Pedersen
Abstract:
In the current context of Big Data, a multitude of new NoSQL solutions for storing, managing, and extracting information and patterns from semi-structured data have been proposed and implemented. These solutions were developed to relieve the issue of rigid data structures present in relational databases, by introducing semi-structured and flexible schema design. As current data generated by different sources and devices, especially from IoT sensors and actuators, use either XML or JSON format, depending on the application, database technologies that store and query semi-structured data in XML format are needed. Thus, Native XML Databases, which were initially designed to manipulate XML data using standardized querying languages, i.e., XQuery and XPath, were rebranded as NoSQL Document-Oriented Database Systems. Currently, the majority of these solutions have been replaced with the more modern JSON-based Database Management Systems. However, we believe that XML-based solutions can still deliver performance in executing complex queries on heterogeneous collections. Unfortunately, research currently lacks a clear comparison of the scalability and performance of database technologies that store and query documents in XML versus the more modern JSON format. Moreover, to the best of our knowledge, there are no Big Data-compliant benchmarks for such database technologies. In this paper, we present a comparison of selected Document-Oriented Database Systems that either use the XML format to encode documents, i.e., BaseX, eXist-db, and Sedna, or the JSON format, i.e., MongoDB, CouchDB, and Couchbase. To underline the performance differences, we also propose a benchmark that uses a heterogeneous complex schema on a large DBLP corpus.
Submitted 3 February, 2021;
originally announced February 2021.
-
Freshness on Demand: Optimizing Age of Information for the Query Process
Authors:
Josefine Holm,
Anders E. Kalør,
Federico Chiariotti,
Beatriz Soret,
Søren K. Jensen,
Torben B. Pedersen,
Petar Popovski
Abstract:
Age of Information (AoI) has become an important concept in communications, as it allows system designers to measure the freshness of the information available to remote monitoring or control processes. However, its definition tacitly assumes that new information is used at any time, which is not always the case: the instants at which information is collected and used are dependent on a certain query process. We propose a model that accounts for the discrete time nature of many monitoring processes, considering a pull-based communication model in which the freshness of information is only important when the receiver generates a query. We then define the Age of Information at Query (QAoI), a more general metric that fits the pull-based scenario, and show how its optimization can lead to very different choices from traditional push-based AoI optimization when using a Packet Erasure Channel (PEC).
Submitted 2 November, 2020;
originally announced November 2020.
-
On Efficient and Scalable Time-Continuous Spatial Crowdsourcing -- Full Version
Authors:
Ting Wang,
Xike Xie,
Xin Cao,
Torben Bach Pedersen,
Yang Wang,
Mingjun Xiao
Abstract:
The proliferation of advanced mobile terminals opened up a new crowdsourcing avenue, spatial crowdsourcing, to utilize the crowd potential to perform real-world tasks. In this work, we study a new type of spatial crowdsourcing, called time-continuous spatial crowdsourcing (TCSC in short). It supports broad applications for long-term continuous spatial data acquisition, ranging from environmental monitoring to traffic surveillance in citizen science and crowdsourcing projects. However, due to limited budgets and limited availability of workers in practice, the data collected is often incomplete, incurring a data deficiency problem. To tackle that, in this work, we first propose an entropy-based quality metric, which captures the joint effects of incompletion in data acquisition and the imprecision in data interpolation. Based on that, we investigate quality-aware task assignment methods for both single- and multi-task scenarios. We show the NP-hardness of the single-task case, and design polynomial-time algorithms with guaranteed approximation ratios. We study novel indexing and pruning techniques for further enhancing the performance in practice. Then, we extend the solution to multi-task scenarios and devise a parallel framework for speeding up the process of optimization. We conduct extensive experiments on both real and synthetic datasets to show the effectiveness of our proposals.
Submitted 29 October, 2020;
originally announced October 2020.
-
Efficient Temporal Pattern Mining in Big Time Series Using Mutual Information -- Full Version
Authors:
Van Long Ho,
Nguyen Ho,
Torben Bach Pedersen
Abstract:
Very large time series are increasingly available from an ever wider range of IoT-enabled sensors deployed in different environments. Significant insights can be gained by mining temporal patterns from these time series. Unlike traditional pattern mining, temporal pattern mining (TPM) adds event time intervals into extracted patterns, making them more expressive at the expense of increased mining time complexity. Existing TPM methods either cannot scale to large datasets, or work only on pre-processed temporal events rather than on time series. This paper presents our Frequent Temporal Pattern Mining from Time Series (FTPMfTS) approach which provides: (1) The end-to-end FTPMfTS process taking time series as input and producing frequent temporal patterns as output. (2) The efficient Hierarchical Temporal Pattern Graph Mining (HTPGM) algorithm that uses efficient data structures for fast support and confidence computation, and employs effective pruning techniques for significantly faster mining. (3) An approximate version of HTPGM that uses mutual information, a measure of data correlation known from information theory, to prune unpromising time series from the search space. (4) An extensive experimental evaluation showing that HTPGM outperforms the baselines in runtime and memory consumption, and can scale to big datasets. The approximate HTPGM is up to two orders of magnitude faster and less memory consuming than the baselines, while retaining high accuracy.
Submitted 17 November, 2021; v1 submitted 7 October, 2020;
originally announced October 2020.
-
Interpretation of Coupled-Cluster Many-Electron Dynamics in Terms of Stationary States
Authors:
Thomas Bondo Pedersen,
Håkon Emil Kristiansen,
Tilmann Bodenstein,
Simen Kvaal,
Øyvind Sigmundson Schøyen
Abstract:
We demonstrate theoretically and numerically that laser-driven many-electron dynamics, as described by bivariational time-dependent coupled-cluster theory, may be analyzed in terms of stationary-state populations. Projectors heuristically defined from linear response theory and equation-of-motion coupled-cluster theory are proposed for the calculation of stationary-state populations during interaction with laser pulses or other external forces, and conservation laws of the populations are discussed. Numerical tests of the proposed projectors, involving both linear and nonlinear optical processes for the He and Be atoms, and for the LiH, CH$^+$, and LiF molecules, show that the laser-driven evolution of the stationary-state populations at the coupled-cluster singles-and-doubles (CCSD) level is very close to that obtained by full configuration-interaction theory provided all stationary states actively participating in the dynamics are sufficiently well approximated. When double-excited states are important for the dynamics, the quality of the CCSD results deteriorates. Observing that populations computed from the linear-response projector may show spurious small-amplitude, high-frequency oscillations, the equation-of-motion projector emerges as the most promising approach to stationary-state populations.
Submitted 19 December, 2020; v1 submitted 21 September, 2020;
originally announced September 2020.
-
High-Level ETL for Semantic Data Warehouses -- Full Version
Authors:
Rudra Pratap Deb Nath,
Oscar Romero,
Torben Bach Pedersen,
Katja Hose
Abstract:
The popularity of the Semantic Web (SW) encourages organizations to organize and publish semantic data using the RDF model. This growth poses new requirements to Business Intelligence (BI) technologies to enable On-Line Analytical Processing (OLAP)-like analysis over semantic data. The incorporation of semantic data into a Data Warehouse (DW) is not supported by the traditional Extract-Transform-Load (ETL) tools because they do not consider semantic issues in the integration process. In this paper, we propose a layer-based integration process and a set of high-level RDF-based ETL constructs required to define, map, extract, process, transform, integrate, update, and load (multidimensional) semantic data. Different to other ETL tools, we automate the ETL data flows by creating metadata at the schema level. Therefore, it relieves ETL developers from the burden of manual mapping at the ETL operation level. We create a prototype, named Semantic ETL Construct (SETLCONSTRUCT), based on the innovative ETL constructs proposed here. To evaluate SETLCONSTRUCT, we create a multidimensional semantic DW by integrating a Danish Business dataset and an EU Subsidy dataset using it and compare it with the previous programmable framework SETLPROG in terms of productivity, development time and performance. The evaluation shows that 1) SETLCONSTRUCT uses 92% fewer Number of Typed Characters (NOTC) than SETLPROG, and SETLAUTO (the extension of SETLCONSTRUCT for generating ETL execution flow automatically) further reduces the Number of Used Concepts (NOUC) by another 25%; 2) using SETLCONSTRUCT, the development time is almost cut in half compared to SETLPROG, and is cut by another 27% using SETLAUTO; 3) SETLCONSTRUCT is scalable and has similar performance compared to SETLPROG.
Submitted 12 June, 2020;
originally announced June 2020.
-
Multidimensional Enrichment of Spatial RDF Data for SOLAP -- Full Version
Authors:
Nurefsan Gür,
Torben Bach Pedersen,
Katja Hose,
Mikael Midtgaard
Abstract:
Large volumes of spatial data and multidimensional data are being published on the Semantic Web, which has led to new opportunities for advanced analysis, such as Spatial Online Analytical Processing (SOLAP). The RDF Data Cube (QB) and QB4OLAP vocabularies have been widely used for annotating and publishing statistical and multidimensional RDF data. Although such statistical data sets might have spatial information, such as coordinates, the lack of spatial semantics and spatial multidimensional concepts in QB4OLAP and QB prevents users from employing SOLAP queries over spatial data using SPARQL. The QB4SOLAP vocabulary, on the other hand, fully supports annotating spatial and multidimensional data on the Semantic Web and enables users to query endpoints with SOLAP operators in SPARQL. To bridge the gap between QB/QB4OLAP and QB4SOLAP, we propose an RDF2SOLAP enrichment model that automatically annotates spatial multidimensional concepts with QB4SOLAP and in doing so enables SOLAP on existing QB and QB4OLAP data on the Semantic Web. Furthermore, we present and evaluate a wide range of enrichment algorithms and apply them on a non-trivial real-world use case involving governmental open data with complex geometry types.
Submitted 16 February, 2020;
originally announced February 2020.
-
Numerical stability of time-dependent coupled-cluster methods for many-electron dynamics in intense laser pulses
Authors:
Håkon Emil Kristiansen,
Øyvind Sigmundson Schøyen,
Simen Kvaal,
Thomas Bondo Pedersen
Abstract:
We investigate the numerical stability of time-dependent coupled-cluster theory for many-electron dynamics in intense laser pulses, comparing two coupled-cluster formulations with full configuration interaction theory. Our numerical experiments show that orbital-adaptive time-dependent coupled-cluster doubles (OATDCCD) theory offers significantly improved stability compared with the conventional Hartree-Fock-based time-dependent coupled-cluster singles-and-doubles (TDCCSD) formulation. The improved stability stems from greatly reduced oscillations in the doubles amplitudes, which, in turn, can be traced to the dynamic biorthonormal reference determinants of OATDCCD theory. As long as these are good approximations to the Brueckner determinant, OATDCCD theory is numerically stable. We propose the reference weight as a diagnostic quantity to identify situations where the TDCCSD and OATDCCD theories become unstable.
Submitted 24 February, 2020; v1 submitted 19 December, 2019;
originally announced December 2019.
-
Multi-Source Spatial Entity Linkage
Authors:
Suela Isaj,
Torben Bach Pedersen,
Esteban Zimányi
Abstract:
Besides the traditional cartographic data sources, spatial information can also be derived from location-based sources. However, even though different location-based sources refer to the same physical world, each one has only partial coverage of the spatial entities, describes them with different attributes, and sometimes provides contradicting information. Hence, we introduce the spatial entity linkage problem, which finds which pairs of spatial entities belong to the same physical spatial entity. Our proposed solution (QuadSky) starts with a time-efficient spatial blocking technique (QuadFlex), compares pairwise the spatial entities in the same block, ranks the pairs using Pareto optimality with the SkyRank algorithm, and finally, classifies the pairs with our novel SkyEx-* family of algorithms that yield 0.85 precision and 0.85 recall for a manually labeled dataset of 1,500 pairs and 0.87 precision and 0.6 recall for a semi-manually labeled dataset of 777,452 pairs. Moreover, we provide a theoretical guarantee and formalize the SkyEx-FES algorithm that explores only 27% of the skylines without any loss in F-measure. Furthermore, our fully unsupervised algorithm SkyEx-D approximates the optimal result with an F-measure loss of just 0.01. Finally, QuadSky provides the best trade-off between precision and recall, and the best F-measure compared to the existing baselines and clustering techniques, and approximates the results of supervised learning solutions.
Submitted 29 April, 2020; v1 submitted 20 November, 2019;
originally announced November 2019.
-
AMIC: An Adaptive Information Theoretic Method to Identify Multi-Scale Temporal Correlations in Big Time Series Data -- Accepted Version
Authors:
Nguyen Ho,
Huy Vo,
Mai Vu,
Torben Bach Pedersen
Abstract:
Recent developments in computing, sensing and crowd-sourced data have resulted in an explosion in the availability of quantitative information. The possibilities of analyzing this so-called Big Data to inform research and the decision-making process are virtually endless. In general, analyses have to be done across multiple data sets in order to bring out the most value of Big Data. A first important step is to identify temporal correlations between data sets. Given the characteristics of Big Data in terms of volume and velocity, techniques that identify correlations not only need to be fast and scalable, but also need to help users in ordering the correlations across temporal scales so that they can focus on important relationships. In this paper, we present AMIC (Adaptive Mutual Information-based Correlation), a method based on mutual information to identify correlations at multiple temporal scales in large time series. Discovered correlations are suggested to users in an order based on the strength of the relationships. Our method supports an adaptive streaming technique that minimizes duplicated computation and is implemented on top of Apache Spark for scalability. We also provide a comprehensive evaluation on the effectiveness and the scalability of AMIC using both synthetic and real-world data sets.
Submitted 7 July, 2019; v1 submitted 24 June, 2019;
originally announced June 2019.
-
Scalable Model-Based Management of Correlated Dimensional Time Series in ModelarDB+
Authors:
Søren Kejser Jensen,
Torben Bach Pedersen,
Christian Thomsen
Abstract:
To monitor critical infrastructure, high quality sensors sampled at a high frequency are increasingly used. However, as they produce huge amounts of data, only simple aggregates are stored. This removes outliers and fluctuations that could indicate problems. As a remedy, we present a model-based approach for managing time series with dimensions that exploits correlation in and among time series. Specifically, we propose compressing groups of correlated time series using an extensible set of model types within a user-defined error bound (possibly zero). We name this new category of model-based compression methods for time series Multi-Model Group Compression (MMGC). We present the first MMGC method GOLEMM and extend model types to compress time series groups. We propose primitives for users to effectively define groups for differently sized data sets, and based on these, an automated grouping method using only the time series dimensions. We propose algorithms for executing simple and multi-dimensional aggregate queries on models. Last, we implement our methods in the Time Series Management System (TSMS) ModelarDB (ModelarDB+). Our evaluation shows that compared to widely used formats, ModelarDB+ provides up to 13.7 times faster ingestion due to high compression, 113 times better compression due to the adaptivity of GOLEMM, 630 times faster aggregates by using models, and close to linear scalability. It is also extensible and supports online query processing.
Submitted 29 June, 2021; v1 submitted 25 March, 2019;
originally announced March 2019.
-
Seed-Driven Geo-Social Data Extraction -- Full Version
Authors:
Suela Isaj,
Torben Bach Pedersen
Abstract:
Geo-social data has been an attractive source for a variety of problems such as mining mobility patterns, link prediction, location recommendation, and influence maximization. However, new geo-social data is increasingly unavailable and suffers from several limitations. In this paper, we aim to remedy the problem of effective data extraction from geo-social data sources. We first identify and categorize the limitations of extracting geo-social data. In order to overcome the limitations, we propose a novel seed-driven approach that uses the points of one source as the seed to feed as queries for the others. We additionally handle differences between, and dynamics within, the sources by proposing three variants for optimizing search radius. Furthermore, we provide an optimization based on recursive clustering to minimize the number of requests and an adaptive procedure to learn the specific data distribution of each source. Our comprehensive experiments with six popular sources show that our seed-driven approach yields 14.3 times more data overall, while our request-optimized algorithm retrieves up to 95% of the data with less than 16% of the requests. Thus, our proposed seed-driven approach sets new standards for effective and efficient extraction of geo-social data.
Submitted 23 June, 2019; v1 submitted 20 January, 2019;
originally announced January 2019.
-
Ultra-fast real-time quantum random number generator with correlated measurement outcomes and rigorous security certification
Authors:
Tobias Gehring,
Cosmo Lupo,
Arne Kordts,
Dino Solar Nikolic,
Nitin Jain,
Tobias Rydberg,
Thomas B. Pedersen,
Stefano Pirandola,
Ulrik L. Andersen
Abstract:
Quantum random number generators (QRNGs) promise perfectly unpredictable random numbers. However, the security certification of the random numbers in the form of a stochastic model often introduces assumptions that are either hardly justified or indeed unnecessary. Two important examples are the restriction of an adversary to the classical regime as well as negligible correlations between consecutive measurement outcomes. Additionally, non-rigorous system characterization opens a security loophole. In this work we experimentally realize a QRNG that does not rely on the aforementioned assumptions and whose stochastic model is established by a rigorous, metrological approach. Based on quadrature measurements of vacuum fluctuations, we demonstrate a real-time random number generation rate of 8 GBit/s. Our security certification approach offers a number of practical benefits and will therefore find widespread applications in quantum random number generators. In particular, our generated random numbers are well suited for today's conventional and quantum cryptographic solutions.
Submitted 30 March, 2020; v1 submitted 13 December, 2018;
originally announced December 2018.
-
Symplectic integration and physical interpretation of time-dependent coupled-cluster theory
Authors:
Thomas Bondo Pedersen,
Simen Kvaal
Abstract:
The formulation of the time-dependent Schrödinger equation in terms of coupled-cluster theory is outlined, with emphasis on the bivariational framework and its classical Hamiltonian structure. An indefinite inner product is introduced, inducing physical interpretation of coupled-cluster states in the form of transition probabilities, autocorrelation functions, and explicitly real values for observables, solving interpretation issues which are present in time-dependent coupled-cluster theory and in ground-state calculations of molecular systems under the influence of external magnetic fields. The problem of the numerical integration of the equations of motion is considered, and a critical evaluation of the standard fourth-order Runge-Kutta scheme and the symplectic Gauss integrator of variable order is given, including several illustrative numerical experiments. While the Gauss integrator is stable even for laser pulses well above the perturbation limit, our experiments indicate that a system-dependent upper limit exists for the external field strengths. Above this limit, time-dependent coupled-cluster calculations become very challenging numerically, even in the full configuration interaction limit. The source of these numerical instabilities is shown to be rapid increases of the amplitudes as ultrashort high-intensity laser pulses pump the system out of the ground state into states that are virtually orthogonal to the static Hartree-Fock reference determinant.
Submitted 14 April, 2019; v1 submitted 11 December, 2018;
originally announced December 2018.
-
Adaptive User-Oriented Direct Load-Control of Residential Flexible Devices
Authors:
Davide Frazzetto,
Bijay Neupane,
Torben Bach Pedersen,
Thomas Dyhre Nielsen
Abstract:
Demand Response (DR) schemes are effective tools to maintain a dynamic balance in energy markets with higher integration of fluctuating renewable energy sources. DR schemes can be used to harness residential devices' flexibility and to utilize it to achieve social and financial objectives. However, existing DR schemes suffer from low user participation as they fail at taking into account the users' requirements. First, DR schemes are highly demanding for the users, as users need to provide direct information, e.g. via surveys, on their energy consumption preferences. Second, the user utility models based on these surveys are hard-coded and do not adapt over time. Third, the existing scheduling techniques require the users to input their energy requirements on a daily basis. As an alternative, this paper proposes a DR scheme for user-oriented direct load-control of residential appliances operations. Instead of relying on user surveys to evaluate the user utility, we propose an online data-driven approach for estimating user utility functions, purely based on available load consumption data, that adaptively models the users' preference over time. Our scheme is based on a day-ahead scheduling technique that transparently prescribes the users with optimal device operation schedules that take into account both financial benefits and user-perceived quality of service. To model day-ahead user energy demand and flexibility, we propose a probabilistic approach for generating flexibility models under uncertainty. Results on both real-world and simulated datasets show that our DR scheme can provide significant financial benefits while preserving the user-perceived quality of service.
Submitted 9 May, 2018;
originally announced May 2018.
-
Day-ahead Trading of Aggregated Energy Flexibility - Full Version
Authors:
Emmanouil Valsomatzis,
Torben Bach Pedersen,
Alberto Abello
Abstract:
Flexibility of small loads, in particular from Electric Vehicles (EVs), has recently attracted a lot of interest due to their possibility of participating in the energy market and the new commercial potentials. Different from existing work, the aggregation techniques proposed in this paper produce flexible aggregated loads from EVs taking into account technical market requirements. They can be further transformed into the so-called flexible orders and be traded in the day-ahead market by a Balance Responsible Party (BRP). As a result, the BRP can achieve at least 20% cost reduction on average in energy purchase compared to traditional charging based on 2017 real electricity prices from the Danish electricity market.
Submitted 24 May, 2018; v1 submitted 6 May, 2018;
originally announced May 2018.
-
Utilizing Device-level Demand Forecasting for Flexibility Markets - Full Version
Authors:
Bijay Neupane,
Torben Bach Pedersen,
Bo Thiesson
Abstract:
The uncertainty in the power supply due to fluctuating Renewable Energy Sources (RES) has severe (financial and other) implications for energy market players. In this paper, we present a device-level Demand Response (DR) scheme that captures the atomic (all available) flexibilities in energy demand and provides the largest possible solution space to generate demand/supply schedules that minimize market imbalances. We evaluate the effectiveness and feasibility of widely used forecasting models for device-level flexibility analysis. In a typical device-level flexibility forecast, a market player is more concerned with the utility that the demand flexibility brings to the market, rather than the intrinsic forecast accuracy. In this regard, we provide comprehensive predictive modeling and scheduling of demand flexibility from household appliances to demonstrate the (financial and otherwise) viability of introducing flexibility-based DR in the Danish/Nordic market. Further, we investigate the correlation between the potential utility and the accuracy of the demand forecast model. Furthermore, we perform a number of experiments to determine the data granularity that provides the best financial reward to market players for adopting the proposed DR scheme. A cost-benefit analysis of forecast results shows that even with somewhat low forecast accuracy, market players can achieve regulation cost savings of 54% of the theoretically optimal.
Submitted 2 May, 2018;
originally announced May 2018.
-
Time Series Management Systems: A Survey
Authors:
Søren Kejser Jensen,
Torben Bach Pedersen,
Christian Thomsen
Abstract:
The collection of time series data increases as more monitoring and automation are being deployed. These deployments range in scale from an Internet of Things (IoT) device located in a household to enormous distributed Cyber-Physical Systems (CPSs) producing large volumes of data at high velocity. To store and analyze these vast amounts of data, specialized Time Series Management Systems (TSMSs) have been developed to overcome the limitations of general purpose Database Management Systems (DBMSs) for time series management. In this paper, we present a thorough analysis and classification of TSMSs developed through academic or industrial research and documented through publications. Our classification is organized into categories based on the architectures observed during our analysis. In addition, we provide an overview of each system with a focus on the motivational use case that drove the development of the system, the functionality for storage and querying of time series a system implements, the components the system is composed of, and the capabilities of each system with regard to Stream Processing and Approximate Query Processing (AQP). Last, we provide a summary of research directions proposed by other researchers in the field and present our vision for a next generation TSMS.
Submitted 3 October, 2017;
originally announced October 2017.
-
High Performance Information Reconciliation for QKD with CASCADE
Authors:
Thomas Brochmann Pedersen,
Mustafa Toyran
Abstract:
It is widely accepted in the quantum cryptography community that interactive information reconciliation protocols, such as cascade, are inefficient due to the communication overhead. Instead, non-interactive information reconciliation protocols based on, e.g., LDPC codes or, more recently, polar codes have been proposed. In this work, we argue that interactive protocols should be taken into consideration in modern quantum key distribution systems. In particular, we demonstrate how to improve the performance of cascade by proper implementation and use. Our implementation of cascade reaches a throughput above 80 Mbps under realistic conditions. This is more than four times the throughput previously demonstrated in any information reconciliation protocol.
Submitted 30 July, 2013;
originally announced July 2013.