Search | arXiv e-print repository

Decentralized Multi-Party Multi-Network AI for Global Deployment of 6G Wireless Systems

Authors: Merim Dzaferagic, Marco Ruffini, Nina Slamnik-Krijestorac, Joao F. Santos, Johann Marquez-Barja, Christos Tranoris, Spyros Denazis, Thomas Kyriakakis, Panagiotis Karafotis, Luiz DaSilva, Shashi Raj Pandey, Junya Shiraishi, Petar Popovski, Soren Kejser Jensen, Christian Thomsen, Torben Bach Pedersen, Holger Claussen, **feng Du, Gil Zussman, Tingjun Chen, Yiran Chen, Seshu Tirupathi, Ivan Seskar, Daniel Kilper

Abstract: Multiple visions of 6G networks elicit Artificial Intelligence (AI) as a central, native element. When 6G systems are deployed at a large scale, end-to-end AI-based solutions will necessarily have to encompass both the radio and the fiber-optical domain. This paper introduces the Decentralized Multi-Party, Multi-Network AI (DMMAI) framework for integrating AI into 6G networks deployed at scale. DMMAI harmonizes AI-driven controls across diverse network platforms and thus facilitates networks that autonomously configure, monitor, and repair themselves. This is particularly crucial at the network edge, where advanced applications meet heightened functionality and security demands. The radio/optical integration is vital due to the current compartmentalization of AI research within these domains, which lacks a comprehensive understanding of their interaction. Our approach explores multi-network orchestration and AI control integration, filling a critical gap in standardized frameworks for AI-driven coordination in 6G networks. The DMMAI framework is a step towards a global standard for AI in 6G, aiming to establish reference use cases, data and model management methods, and benchmarking platforms for future AI/ML solutions.

Submitted 15 April, 2024; originally announced July 2024.

arXiv:2405.07560 [pdf]

Coding historical causes of death data with Large Language Models

Authors: Bjørn Pedersen, Maisha Islam, Doris Tove Kristoffersen, Lars Ailo Bongo, Eilidh Garrett, Alice Reid, Hilde Sommerseth

Abstract: This paper investigates the feasibility of using pre-trained generative Large Language Models (LLMs) to automate the assignment of ICD-10 codes to historical causes of death. Due to the complex narratives often found in historical causes of death, this task has traditionally been manually performed by coding experts. We evaluate the ability of GPT-3.5, GPT-4, and Llama 2 LLMs to accurately assign ICD-10 codes on the HiCaD dataset that contains causes of death recorded in the civil death register entries of 19,361 individuals from Ipswich, Kilmarnock, and the Isle of Skye from the UK between 1861-1901. Our findings show that GPT-3.5, GPT-4, and Llama 2 assign the correct code for 69%, 83%, and 40% of causes, respectively. However, we achieve a maximum accuracy of 89% by standard machine learning techniques. All LLMs performed better for causes of death that contained terms still in use today, compared to archaic terms. Also they perform better for short causes (1-2 words) compared to longer causes. LLMs therefore do not currently perform well enough for historical ICD-10 code assignment tasks. We suggest further fine-tuning or alternative frameworks to achieve adequate performance.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 18 pages, 1 figure in main text, 3 figures in appendix

arXiv:2404.07699 [pdf, other]

Time evolution as an optimization problem: The hydrogen atom in strong laser fields in a basis of time-dependent Gaussian wave packets

Authors: Simon Elias Schrader, Håkon Emil Kristiansen, Thomas Bondo Pedersen, Simen Kvaal

Abstract: Recent advances in attosecond science have made it increasingly important to develop stable, reliable and accurate algorithms and methods to model the time evolution of atoms and molecules in intense laser fields. A key process in attosecond science is high-harmonic generation, which is challenging to model with fixed Gaussian basis sets, as it produces high-energy electrons, with a resulting rapidly varying and highly oscillatory wave function that extends over dozens of ångström. Recently, Rothe's method, where time evolution is rephrased as an optimization problem, has been applied to the one-dimensional Schrödinger equation. Here, we apply Rothe's method to the hydrogen wave function and demonstrate that complex-valued Gaussian wave packets with time-dependent width, center, and momentum parameters are able to reproduce spectra obtained from essentially exact grid calculations for high-harmonic generation with only 50-181 Gaussians for field strengths up to $5\times 10^{14}$W/cm$^2$. This paves the way for the inclusion of continuum contributions into real-time, time-dependent electronic-structure theory with Gaussian basis sets for strong fields, and eventually accurate simulations of the time evolution of molecules without the Born-Oppenheimer approximation.

Submitted 11 April, 2024; originally announced April 2024.

Comments: 20 pages, 10 figures

arXiv:2401.11926 [pdf, other]

doi 10.1021/acs.jpca.4c00364

Gaussians for Electronic and Rovibrational Quantum Dynamics

Authors: Aleksander P. Woźniak, Ludwik Adamowicz, Thomas Bondo Pedersen, Simen Kvaal

Abstract: The assumptions underpinning the adiabatic Born-Oppenheimer (BO) approximation are broken for molecules interacting with attosecond laser pulses, which generate complicated coupled electronic-nuclear wavepackets that generally will have components of electronic and dissociation continua as well as bound-state contributions. The conceptually most straightforward way to overcome this challenge is to treat the electronic and nuclear degrees of freedom on equal quantum-mechanical footing by not invoking the BO approximation at all. Explicitly correlated Gaussian (ECG) basis functions have proved successful for non-BO calculations of stationary molecular states and energies, reproducing rovibrational absorption spectra with very high accuracy. In this paper, we present a proof-of-principle study of the ability of fully flexible ECGs (FFECGs) to capture the intricate electronic and rovibrational dynamics generated by short, high-intensity laser pulses. By fitting linear combinations of FFECGs to accurate wave function histories obtained on a large real-space grid for a regularized 2D model of the hydrogen atom and for the 2D Morse potential we demonstrate that FFECGs provide a very compact description of laser-driven electronic and rovibrational dynamics.

Submitted 12 April, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

arXiv:2401.06527 [pdf, other]

Fragmentation of Water Clusters Formed in Helium Nanodroplets by Charge Transfer and Penning Ionization

Authors: S. De, A. R. Abid, J. D. Asmussen, L. Ben Ltaief, K. Sishodia, A. Ulmer, H. B. Pedersen, S. R. Krishnan, M. Mudrich

Abstract: Helium nanodroplets ("HNDs") are widely used for forming tailor-made clusters and molecular complexes in a cold, transparent, and weakly-interacting matrix. Characterization of embedded species by mass spectrometry is often complicated by fragmentation and trapping of ions in the HNDs. Here, we systematically study fragment ion mass spectra of HND-aggregated water and oxygen clusters following their ionization by charge transfer ionization ("CTI") and Penning ionization ("PEI"). While the efficiency of PEI of embedded clusters is lower than for CTI by about a factor of 10, both the mean sizes of detected water clusters and the relative yields of unprotonated cluster ions are significantly larger, making PEI a "soft ionization" scheme. However, the tendency of ions to remain bound to HNDs leads to a reduced detection efficiency for large HNDs containing $>10^4$ helium atoms. These results are instrumental for determining optimal conditions for mass spectrometry and photoionization spectroscopy of molecular complexes and clusters aggregated in HNDs.

Submitted 12 January, 2024; originally announced January 2024.

arXiv:2401.06524 [pdf, ps, other]

Domain Adaptation for Time series Transformers using One-step fine-tuning

Authors: Subina Khanal, Seshu Tirupathi, Giulio Zizzo, Ambrish Rawat, Torben Bach Pedersen

Abstract: The recent breakthrough of Transformers in deep learning has drawn significant attention from the time series community due to their ability to capture long-range dependencies. However, like other deep learning models, Transformers face limitations in time series prediction, including insufficient temporal understanding, generalization challenges, and data shift issues for the domains with limited data. Additionally, addressing the issue of catastrophic forgetting, where models forget previously learned information when exposed to new data, is another critical aspect that requires attention in enhancing the robustness of Transformers for time series tasks. To address these limitations, in this paper, we pre-train the time series Transformer model on a source domain with sufficient data and fine-tune it on the target domain with limited data. We introduce the One-step fine-tuning approach, adding some percentage of source domain data to the target domains, providing the model with diverse time series instances. We then fine-tune the pre-trained model using a gradual unfreezing technique. This helps enhance the model's performance in time series prediction for domains with limited data. Extensive experimental results on two real-world datasets show that our approach improves over the state-of-the-art baselines by 4.35% and 11.54% for indoor temperature and wind power prediction, respectively.

Submitted 12 January, 2024; originally announced January 2024.

Comments: Accepted at the Fourth Workshop of Artificial Intelligence for Time Series Analysis (AI4TS): Theory, Algorithms, and Applications, AAAI 2024, Vancouver, Canada

arXiv:2312.11915 [pdf, other]

doi 10.1016/j.sna.2024.115450

Squeeze film absolute pressure sensors with sub-millipascal sensitivity

Authors: Mohsen Salimi, Robin V. Nielsen, Henrik B. Pedersen, Aurélien Dantan

Abstract: We report on the realization of ultrasensitive absolute pressure sensors based on silicon nitride membrane sandwiches. These sandwiches consist of a pair of highly-pretensioned, ultrathin (50 nm), large area (0.25 mm$^2$) films, suspended parallel to each other and forming an ultrashort (500 nm), open cavity. The compression of a gas in this cavity leads to a strong squeeze film force, resulting in an increase in the membrane mechanical resonance frequencies which is directly proportional to the absolute gas pressure. These sandwiches show a record high responsivity of >300 Hz/Pa in terms of squeeze film-induced frequency shift, which, combined with high quality factor mechanical resonances (Q>10^6), allows for bringing the sensitivity of absolute squeeze film pressure sensors down to the sub-millipascal level.

Submitted 12 March, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

Journal ref: Sensors and Actuators A: Physical 374, 115450 (2024)

arXiv:2312.08557 [pdf, other]

Creating and Querying Data Cubes in Python using pyCube

Authors: Sigmundur Vang, Christian Thomsen, Torben Bach Pedersen

Abstract: Data cubes are used for analyzing large data sets usually contained in data warehouses. The most popular data cube tools use graphical user interfaces (GUI) to do the data analysis. Traditionally this was fine since data analysts were not expected to be technical people. However, in the subsequent decades the data landscape changed dramatically requiring companies to employ large teams of highly technical data scientists in order to manage and use the ever increasing amount of data. These data scientists generally use tools like Python, interactive notebooks, pandas, etc. while modern data cube tools are still GUI based. This paper proposes a Python-based data cube tool called pyCube. pyCube is able to semi-automatically create data cubes for data stored in an RDBMS and manages the data cube metadata. pyCube's programmatic interface enables data scientists to query data cubes by specifying the expected metadata of the result. pyCube is experimentally evaluated on Star Schema Benchmark (SSB). The results show that pyCube vastly outperforms different implementations of SSB queries in pandas in both runtime and memory while being easier to read and write.

Submitted 28 January, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

Comments: Extended version of DOLAP2024 submission

arXiv:2310.15835 [pdf, other]

Observation of interatomic Coulombic decay induced by double excitation of helium in nanodroplets

Authors: B. Bastian, J. D. Asmussen, L. Ben Ltaief, H. B. Pedersen, K. Sishodia, S. De, S. R. Krishnan, C. Medina, N. Pal, R. Richter, N. Sisourat, M. Mudrich

Abstract: Interatomic Coulombic decay (ICD) plays a crucial role in weakly bound complexes exposed to intense or high-energy radiation. So far, neutral or ionic atoms or molecules have been prepared in singly excited electron or hole states which can transfer energy to neighboring centers and cause ionization and radiation damage. Here we demonstrate that a doubly excited atom, despite its extremely short lifetime, can decay by ICD, as evidenced by high-resolution photoelectron spectra of He nanodroplets excited to the 2s2p+ state. We find that ICD proceeds by relaxation into excited He$^*$He$^+$ atom-pair states, in agreement with calculations. The ability to induce ICD by resonant excitation far above the single-ionization threshold opens opportunities for controlling radiation damage to a high degree of element specificity and spectral selectivity.

Submitted 24 October, 2023; originally announced October 2023.

Comments: 6 pages, 4 figures, to be submitted to PRL

arXiv:2308.06003 [pdf, other]

Magnetic Optical Rotation from Real-Time Simulations in Finite Magnetic Fields

Authors: Benedicte Sverdrup Ofstad, Meilani Wibowo-Teale, Håkon Emil Kristiansen, Einar Aurbakken, Marios Petros Kitsaras, Øyvind Sigmundson Schøyen, Eirill Hauge, Simen Kvaal, Stella Stopkowicz, Andrew M. Wibowo-Teale, Thomas Bondo Pedersen

Abstract: We present a numerical approach to magnetic optical rotation based on real-time time-dependent electronic-structure theory. Not relying on perturbation expansions in the magnetic-field strength, the formulation allows us to test the range of validity of the linear relation between the rotation angle per unit path length and the magnetic-field strength that was established empirically by Verdet 160 years ago. Results obtained from time-dependent coupled-cluster and time-dependent current density-functional theory are presented for the closed-shell molecules H2, HF, and CO in magnetic fields up to 55 kT at standard temperature and pressure conditions. We find that Verdet's linearity remains valid up to roughly 10-20 kT, above which significant deviations from linearity are observed. Among the three current density-functional approximations tested in this work, the current-dependent Tao-Perdew-Staroverov-Scuseria hybrid functional performs the best in comparison with time-dependent coupled-cluster singles and doubles results for the magnetic optical rotation.

Submitted 11 August, 2023; originally announced August 2023.

arXiv:2307.05519 [pdf]

Physical Color Calibration of Digital Pathology Scanners for Robust Artificial Intelligence Assisted Cancer Diagnosis

Authors: Xiaoyi Ji, Richard Salmon, Nita Mulliqi, Umair Khan, Yinxi Wang, Anders Blilie, Henrik Olsson, Bodil Ginnerup Pedersen, Karina Dalsgaard Sørensen, Benedicte Parm Ulhøi, Svein R Kjosavik, Emilius AM Janssen, Mattias Rantalainen, Lars Egevad, Pekka Ruusuvuori, Martin Eklund, Kimmo Kartasalo

Abstract: The potential of artificial intelligence (AI) in digital pathology is limited by technical inconsistencies in the production of whole slide images (WSIs), leading to degraded AI performance and posing a challenge for widespread clinical application as fine-tuning algorithms for each new site is impractical. Changes in the imaging workflow can also lead to compromised diagnoses and patient safety risks. We evaluated whether physical color calibration of scanners can standardize WSI appearance and enable robust AI performance. We employed a color calibration slide in four different laboratories and evaluated its impact on the performance of an AI system for prostate cancer diagnosis on 1,161 WSIs. Color standardization resulted in consistently improved AI model calibration and significant improvements in Gleason grading performance. The study demonstrates that physical color calibration provides a potential solution to the variation introduced by different scanners, making AI-based cancer diagnostics more reliable and applicable in clinical settings.

Submitted 7 July, 2023; originally announced July 2023.

arXiv:2307.02519 [pdf, other]

Transient spectroscopy from time-dependent electronic-structure theory without multipole expansions

Authors: Einar Aurbakken, Benedicte Sverdrup Ofstad, Håkon Emil Kristiansen, Øyvind Sigmundson Schøyen, Simen Kvaal, Lasse Kragh Sørensen, Roland Lindh, Thomas Bondo Pedersen

Abstract: Based on the work done by an electromagnetic field on an atomic or molecular electronic system, a general gauge invariant formulation of transient absorption spectroscopy is presented within the semi-classical approximation. Avoiding multipole expansions, a computationally viable expression for the spectral response function is derived from the minimal-coupling Hamiltonian of an electronic system interacting with one or more laser pulses described by a source-free, enveloped electromagnetic vector potential. With a fixed-basis expansion of the electronic wave function, the computational cost of simulations of laser-driven electron dynamics beyond the dipole approximation is the same as simulations adopting the dipole approximation. We illustrate the theory by time-dependent configuration interaction and coupled-cluster simulations of core-level absorption and circular dichroism spectra.

Submitted 5 July, 2023; originally announced July 2023.

arXiv:2307.01511 [pdf, other]

doi 10.1021/acs.jctc.3c00727

Cost-Efficient High-Resolution Linear Absorption Spectra Through Extrapolating the Dipole Moment from Real-Time Time-Dependent Electronic-Structure Theory

Authors: Eirill Hauge, Håkon Emil Kristiansen, Lukas Konecny, Marius Kadek, Michal Repisky, Thomas Bondo Pedersen

Abstract: We present a novel function fitting method for approximating the propagation of the time-dependent electric dipole moment from real-time electronic structure calculations. Real-time calculations of the electronic absorption spectrum require discrete Fourier transforms of the electric dipole moment. The spectral resolution is determined by the total propagation time, i.e. the trajectory length of the dipole moment, causing a high computational cost. Our developed method uses function fitting on shorter trajectories of the dipole moment, achieving arbitrary spectral resolution through extrapolation. Numerical testing shows that the fitting method can reproduce high-resolution spectra using short dipole trajectories. The method converges with as little as 100 a.u. dipole trajectories for some systems, though the difficulty converging increases with the spectral density. We also introduce an error estimate of the fit, reliably assessing its convergence and hence the quality of the approximated spectrum.

Submitted 31 October, 2023; v1 submitted 4 July, 2023; originally announced July 2023.
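
The abstract above notes that the spectral resolution of a discrete Fourier transform is set by the total propagation time. As a rough illustration of why short trajectories are limiting without extrapolation, the sketch below evaluates the standard frequency spacing of an unpadded DFT, 2π/T, in atomic units (ħ = 1) and converts it to electronvolts. The function name is hypothetical and the calculation is generic Fourier arithmetic, not the authors' fitting method.

```python
import math

HARTREE_TO_EV = 27.211386  # 1 hartree expressed in electronvolts


def dft_resolution_hartree(total_time_au):
    """Energy spacing (hartree, hbar = 1) of an unpadded DFT over total_time_au."""
    if total_time_au <= 0:
        raise ValueError("total_time_au must be positive")
    # Adjacent DFT bins are separated by 2*pi / T in angular frequency.
    return 2.0 * math.pi / total_time_au


# A 100 a.u. trajectory, the shortest length quoted in the abstract.
resolution_ev = dft_resolution_hartree(100.0) * HARTREE_TO_EV
print(f"{resolution_ev:.2f} eV")
```

For a 100 a.u. trajectory this spacing is roughly 1.7 eV, which is coarse compared with typical electronic line separations; halving the spacing requires doubling the propagation time.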

arXiv:2306.16126 [pdf]

More efficient manual review of automatically transcribed tabular data

Authors: Bjørn-Richard Pedersen, Rigmor Katrine Johansen, Einar Holsbø, Hilde Sommerseth, Lars Ailo Bongo

Abstract: Machine learning methods have proven useful in transcribing historical data. However, results from even highly accurate methods require manual verification and correction. Such manual review can be time-consuming and expensive, therefore the objective of this paper was to make it more efficient. Previously, we used machine learning to transcribe 2.3 million handwritten occupation codes from the Norwegian 1950 census with high accuracy (97%). We manually reviewed the 90,000 (3%) codes with the lowest model confidence. We allocated those 90,000 codes to human reviewers, who used our annotation tool to review the codes. To assess reviewer agreement, some codes were assigned to multiple reviewers. We then analyzed the review results to understand the relationship between accuracy improvements and effort. Additionally, we interviewed the reviewers to improve the workflow. The reviewers corrected 62.8% of the labels and agreed with the model label in 31.9% of cases. About 0.2% of the images could not be assigned a label, while for 5.1% the reviewers were uncertain, or they assigned an invalid label. 9,000 images were independently reviewed by multiple reviewers, resulting in an agreement of 86.43% and disagreement of 8.96%. We learned that our automatic transcription is biased towards the most frequent codes, with a higher degree of misclassification for the lowest frequency codes. Our interview findings show that the reviewers did internal quality control and found our custom tool well-suited. So, only one reviewer is needed, but they should report uncertainty.

Submitted 28 June, 2023; originally announced June 2023.

Comments: 19 pages, 5 figures, 1 table
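
The review workflow described in the abstract above starts by picking the predictions with the lowest model confidence. A minimal sketch of that selection step follows; the tuple layout, the function name, and the default fraction are assumptions made for illustration, since the abstract does not describe how the selection was implemented.

```python
def select_for_review(predictions, fraction=0.03):
    """Return the lowest-confidence share of predictions for manual review.

    predictions: list of (image_id, predicted_code, confidence) tuples
                 (a hypothetical layout, not the authors' data format).
    fraction:    share of all predictions to send to reviewers.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    n_review = round(len(predictions) * fraction)
    # Sort ascending by confidence so the least certain predictions come first.
    ranked = sorted(predictions, key=lambda p: p[2])
    return ranked[:n_review]


# Toy data: 100 predictions whose confidence rises with the image index.
toy = [(f"img{i:03d}", "code", i / 100) for i in range(100)]
queue = select_for_review(toy, fraction=0.03)
print([image_id for image_id, _, _ in queue])
```

Sorting the whole list is the simplest approach; for millions of records a partial selection such as `heapq.nsmallest` would avoid the full sort.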

arXiv:2306.12056 [pdf, other]

doi 10.1039/D3CP02879H

Secondary ionization of pyrimidine nucleobases and their microhydrated derivatives in helium nanodroplets

Authors: Jakob D. Asmussen, Abdul R. Abid, Akgash Sundaralingam, Björn Bastian, Keshav Sishodia, Subhendu De, Ltaief Ben Ltaief, Sivarama R. Krishnan, Henrik B. Pedersen, Marcel Mudrich

Abstract: Radiation damage in biological systems by ionizing radiation is predominantly caused by secondary processes such as charge and energy transfer leading to the breaking of bonds in DNA. Here, we study the fragmentation of cytosine (Cyt) and thymine (Thy) molecules, clusters and microhydrated derivatives induced by direct and indirect ionization initiated by extreme-ultraviolet (XUV) irradiation. Photofragmentation mass spectra and photoelectron spectra of free Cyt and Thy molecules are compared with mass and electron spectra of Cyt/Thy clusters and microhydrated Cyt/Thy molecules formed by aggregation in superfluid helium (He) nanodroplets. Penning ionization after resonant excitation of the He droplets is generally found to cause less fragmentation compared to direct photoionization and charge-transfer ionization after photoionization of the He droplets. When Cyt/Thy molecules and oligomers are complexed with water molecules, their fragmentation is efficiently suppressed. However, a similar suppression of fragmentation is observed when homogeneous Cyt/Thy clusters are formed in He nanodroplets, indicating a general trend. Penning ionization electron spectra (PIES) of Cyt/Thy are broad and nearly featureless but PIES of their microhydrated derivatives point at a sequential ionization process ending in unfragmented microsolvated Cyt/Thy cations.

Submitted 21 June, 2023; originally announced June 2023.

Comments: 9 pages, 8 figures

arXiv:2306.10994 [pdf, other]

Efficient Generalized Temporal Pattern Mining in Big Time Series Using Mutual Information

Authors: Van Long Ho, Nguyen Ho, Torben Bach Pedersen, Panagiotis Papapetrou

Abstract: Big time series are increasingly available from an ever wider range of IoT-enabled sensors deployed in various environments. Significant insights can be gained by mining temporal patterns from these time series. Temporal pattern mining (TPM) extends traditional pattern mining by adding event time intervals into extracted patterns, making them more expressive at the expense of increased time and space complexities. Besides frequent temporal patterns (FTPs), which occur frequently in the entire dataset, another useful type of temporal patterns are so-called rare temporal patterns (RTPs), which appear rarely but with high confidence. Mining rare temporal patterns yields additional challenges. For FTP mining, the temporal information and complex relations between events already create an exponential search space. For RTP mining, the support measure is set very low, leading to a further combinatorial explosion and potentially producing too many uninteresting patterns. Thus, there is a need for a generalized approach which can mine both frequent and rare temporal patterns. This paper presents our Generalized Temporal Pattern Mining from Time Series (GTPMfTS) approach with the following specific contributions: (1) The end-to-end GTPMfTS process taking time series as input and producing frequent/rare temporal patterns as output. (2) The efficient Generalized Temporal Pattern Mining (GTPM) algorithm mines frequent and rare temporal patterns using efficient data structures for fast retrieval of events and patterns during the mining process, and employs effective pruning techniques for significantly faster mining. (3) An approximate version of GTPM that uses mutual information, a measure of data correlation, to prune unpromising time series from the search space.

Submitted 19 June, 2023; originally announced June 2023.

Comments: arXiv admin note: text overlap with arXiv:2010.03653

arXiv:2306.03750 [pdf, other]

doi 10.1109/TCOMM.2023.3282256

Goal-Oriented Scheduling in Sensor Networks with Application Timing Awareness

Authors: Josefine Holm, Federico Chiariotti, Anders E. Kalør, Beatriz Soret, Torben Bach Pedersen, Petar Popovski

Abstract: Taking inspiration from linguistics, the communication theory community has recently shown significant interest in pragmatic, or goal-oriented, communication. In this paper, we tackle the problem of pragmatic communication with multiple clients with different, and potentially conflicting, objectives. We capture the goal-oriented aspect through the metric of Value of Information (VoI), which considers the estimation of the remote process as well as the timing constraints. However, the most common definition of VoI is simply the Mean Square Error (MSE) of the whole system state, regardless of the relevance for a specific client. Our work aims to overcome this limitation by including different summary statistics, i.e., value functions of the state, for separate clients, and a diversified query process on the client side, expressed through the fact that different applications may request different functions of the process state at different times. A query-aware Deep Reinforcement Learning (DRL) solution based on statically defined VoI can outperform naive approaches by 15-20%.

Submitted 6 June, 2023; originally announced June 2023.

Journal ref: IEEE Transactions on Communications, 2023

arXiv:2305.19619 [pdf, other]

doi 10.1063/5.0160171

Dopant ionization and efficiency of ion and electron ejection from helium nanodroplets

Authors: Jakob D. Asmussen, Ltaief Ben Ltaief, Keshav Sishodia, Abdul R. Abid, Björn Bastian, Sivarama Krishnan, Henrik B. Pedersen, Marcel Mudrich

Abstract: Photoionization spectroscopy and mass spectrometry of doped helium (He) nanodroplets rely on the ability to efficiently detect ions and/or electrons. Using a commercial quadrupole mass spectrometer and a photoelectron-photoion coincidence (PEPICO) spectrometer, we systematically measure yields of ions and electrons created in pure and doped He nanodroplets in a wide size range and in two ionization regimes -- direct ionization and secondary ionization after resonant photoexcitation of the droplets. For two different types of dopants (oxygen molecules, O$_2$, and lithium atoms, Li), we infer the optimal droplet size to maximize the yield of ejected ions. When dopants are ionized by charge-transfer to photoionized He nanodroplets, the highest yield of O$_2$ and Li ions is detected for a mean size of $\sim5\times10^4$ He atoms per nanodroplet. When dopants are Penning ionized via photoexcitation of the He droplets, the highest yield of O$_2$ and Li ions is detected for $\sim10^3$ and $\sim10^5$ He atoms per droplet, respectively. At optimum droplet sizes, the detection efficiency of dopant ions in proportion to the number of primary photoabsorption events is up to 20\,\% for charge-transfer ionization of O$_2$ and 2\,\% for Li, whereas for Penning ionization it is 1\,\% for O$_2$ and 4\,\% for Li. Our results are instrumental in determining optimal conditions for mass spectrometric studies and photoionization spectroscopy of molecules and complexes isolated in He nanodroplets.

Submitted 31 May, 2023; originally announced May 2023.

arXiv:2305.05269 [pdf, other]

doi 10.1039/d3nr03295g

Electron energy loss and angular asymmetry induced by elastic scattering in helium droplets

Authors: Jakob D. Asmussen, Keshav Sishodia, Björn Bastian, Abdul R. Abid, Ltaief Ben Ltaief, Henrik B. Pedersen, Subhendu De, Christian Medina, Nitish Pal, Robert Richter, Thomas Fennel, Sivarama Krishnan, Marcel Mudrich

Abstract: Helium nanodroplets are ideal model systems to unravel the complex interaction of condensed matter with ionizing radiation. Here we study the effect of purely elastic electron scattering on angular and energy distributions of photoelectrons emitted from He nanodroplets of variable size ($10$-$10^9$ atoms per droplet). For large droplets, photoelectrons develop a pronounced anisotropy along the incident light beam due to a shadowing effect within the droplets. In contrast, the detected photoelectron spectra are only weakly perturbed. This opens up possibilities for photoelectron spectroscopy of dopants embedded in droplets provided they are smaller than the penetration depth of the light and the trapping range of emitted electrons.

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2302.02779 [pdf, other]

Adiabatic extraction of nonlinear optical properties from real-time time-dependent electronic-structure theory

Authors: Benedicte Sverdrup Ofstad, Håkon Emil Kristiansen, Einar Aurbakken, Øyvind Sigmundson Schøyen, Simen Kvaal, Thomas Bondo Pedersen

Abstract: Real-time simulations of laser-driven electron dynamics contain information about molecular optical properties through all orders in response theory. These properties can be extracted by assuming convergence of the power series expansion of induced electric and magnetic multipole moments. However, the accuracy relative to analytical results from response theory quickly deteriorates for higher-order responses due to the presence of high-frequency oscillations in the induced multipole moment in the time domain. This problem has been ascribed to missing higher-order corrections. We here demonstrate that the deviations are caused by nonadiabatic effects arising from the finite-time ramping from zero to full strength of the external laser field. Three different approaches, two using a ramped wave and one using a pulsed wave, for extracting electrical properties from real-time time-dependent electronic-structure simulations are investigated. The standard linear ramp is compared to a quadratic ramp, which is found to yield highly accurate results for polarizabilities, and first and second hyperpolarizabilities, at roughly half the computational cost. Results for the third hyperpolarizability are presented along with a simple, computable measure of reliability.

Submitted 6 February, 2023; originally announced February 2023.

arXiv:2301.11393 [pdf, other]

doi 10.1021/acs.jpca.3c01575

The $S$-diagnostic -- an a posteriori error assessment for single-reference coupled-cluster methods

Authors: Fabian M. Faulstich, Håkon E. Kristiansen, Mihaly A. Csirik, Simen Kvaal, Thomas Bondo Pedersen, Andre Laestadius

Abstract: We propose a novel a posteriori error assessment for the single-reference coupled-cluster (SRCC) method called the $S$-diagnostic. We provide a derivation of the $S$-diagnostic that is rooted in the mathematical analysis of different SRCC variants. We numerically scrutinized the $S$-diagnostic, testing its performance for (1) geometry optimizations, (2) electronic correlation simulations of systems with varying numerical difficulty, and (3) the square-planar copper complexes [CuCl$_4$]$^{2-}$, [Cu(NH$_3$)$_4$]$^{2+}$, and [Cu(H$_2$O)$_4$]$^{2+}$. Throughout the numerical investigations, the $S$-diagnostic is compared to other SRCC diagnostic procedures, that is, the $T_1$, $D_1$, and $D_2$ diagnostics as well as different indices of multi-determinantal and multi-reference character in coupled-cluster theory. Our numerical investigations show that the $S$-diagnostic outperforms the $T_1$, $D_1$, and $D_2$ diagnostics and is comparable to the indices of multi-determinantal and multi-reference character in coupled-cluster theory in their individual fields of applicability. The experiments investigating the performance of the $S$-diagnostic for geometry optimizations using SRCC reveal that the $S$-diagnostic correlates well with different error measures at a high level of statistical relevance. The experiments investigating the performance of the $S$-diagnostic for electronic correlation simulations show that the $S$-diagnostic correctly predicts strong multi-reference regimes. The $S$-diagnostic moreover correctly detects the successful SRCC computations for [CuCl$_4$]$^{2-}$, [Cu(NH$_3$)$_4$]$^{2+}$, and [Cu(H$_2$O)$_4$]$^{2+}$, which have been known to be misdiagnosed by $T_1$ and $D_1$ diagnostics in the past. This shows that the $S$-diagnostic is a promising candidate for an a posteriori diagnostic for SRCC calculations.

Submitted 26 January, 2023; originally announced January 2023.

arXiv:2209.04635 [pdf, other]

A Comparative Study on Unsupervised Anomaly Detection for Time Series: Experiments and Analysis

Authors: Yan Zhao, Liwei Deng, Xuanhao Chen, Chenjuan Guo, Bin Yang, Tung Kieu, Feiteng Huang, Torben Bach Pedersen, Kai Zheng, Christian S. Jensen

Abstract: The continued digitization of societal processes translates into a proliferation of time series data that cover applications such as fraud detection, intrusion detection, and energy management, where anomaly detection is often essential to enable reliability and safety. Many recent studies target anomaly detection for time series data. Indeed, the area of time series anomaly detection is characterized by diverse data, methods, and evaluation strategies, and comparisons in existing studies consider only part of this diversity, which makes it difficult to select the best method for a particular problem setting. To address this shortcoming, we introduce taxonomies for data, methods, and evaluation strategies, provide a comprehensive overview of unsupervised time series anomaly detection using the taxonomies, and systematically evaluate and compare state-of-the-art traditional as well as deep learning techniques. In the empirical study using nine publicly available datasets, we apply the most commonly used performance evaluation metrics to typical methods under a fair implementation standard. Based on the structuring offered by the taxonomies, we report on empirical studies and provide guidelines, in the form of comparative tables, for choosing the methods most suitable for particular application settings. Finally, we propose research directions for this dynamic field.

Submitted 10 September, 2022; originally announced September 2022.

arXiv:2207.00271 [pdf, other]

No need for a grid: Adaptive fully-flexible gaussians for the time-dependent Schrödinger equation

Authors: Simen Kvaal, Caroline Lasser, Thomas Bondo Pedersen, Ludwik Adamowicz

Abstract: Linear combinations of complex gaussian functions, where the linear and nonlinear parameters are allowed to vary, are shown to provide an extremely flexible and effective approach for solving the time-dependent Schrödinger equation in one spatial dimension. The use of flexible basis sets has proven notoriously hard within the systematics of the Dirac--Frenkel variational principle. In this work we present an alternative time-propagation scheme that de-emphasizes optimal parameter evolution but directly targets residual minimization via Rothe's method, also called the method of vertical time layers. We test the scheme using a simple model system mimicking an atom subjected to an extreme laser pulse. Such a pulse produces complicated ionization dynamics of the system. The scheme is shown to perform very well on this model and notably does not rely on a computational grid. Only a handful of gaussian functions are needed to achieve an accuracy on par with a high-resolution, grid-based solver. This paves the way for accurate and affordable solution of the time-dependent Schrödinger equation for atoms and molecules within and beyond the Born--Oppenheimer approximation.

Submitted 7 March, 2023; v1 submitted 1 July, 2022; originally announced July 2022.

Comments: 8 pages, 6 figures

arXiv:2206.14604 [pdf, other]

Mining Seasonal Temporal Patterns in Time Series

Authors: Van Long Ho, Nguyen Ho, Torben Bach Pedersen

Abstract: Very large time series are increasingly available from an ever wider range of IoT-enabled sensors, and significant insights can be obtained by mining temporal patterns from them. A useful type of pattern found in many real-world applications exhibits periodic occurrences, and is thus called a seasonal temporal pattern (STP). Compared to regular patterns, mining seasonal temporal patterns is more challenging since traditional measures such as support and confidence do not capture the seasonality characteristics. Further, the anti-monotonicity property does not hold for STPs, thus resulting in an exponential search space. This paper presents our Frequent Seasonal Temporal Pattern Mining from Time Series (FreqSTPfTS) solution providing: (1) The first solution for seasonal temporal pattern mining (STPM) from time series that can mine STPs at different data granularities. (2) The STPM algorithm that uses efficient data structures and two pruning techniques to reduce the search space and speed up the mining process. (3) An approximate version of STPM that uses mutual information, a measure of data correlation, to prune unpromising time series from the search space. (4) An extensive experimental evaluation showing that STPM outperforms the baseline in runtime and memory consumption, and can scale to big datasets. The approximate STPM is up to an order of magnitude faster and less memory consuming than the baseline, while maintaining high accuracy.

Submitted 9 January, 2023; v1 submitted 28 June, 2022; originally announced June 2022.

arXiv:2205.15229 [pdf, other]

doi 10.1063/5.0101352

Laser-induced dynamic alignment of the HD molecule without the Born-Oppenheimer approximation

Authors: Ludwik Adamowicz, Simen Kvaal, Caroline Lasser, Thomas Bondo Pedersen

Abstract: Laser-induced molecular alignment is well understood within the framework of the Born-Oppenheimer (BO) approximation. Without the BO approximation, however, the concept of molecular structure is lost, making alignment hard to define precisely. In this work, we demonstrate the emergence of alignment from the first-ever non-BO quantum dynamics simulations, using the HD molecule exposed to ultrashort laser pulses as a few-body test case. We extract the degree of alignment from the non-BO wave function by means of an operator expressed in terms of pseudo-proton coordinates that mimics the BO-based definition of alignment. The only essential approximation, in addition to the semiclassical electric-dipole approximation for the matter-field interaction, is the choice of time-independent explicitly correlated Gaussian basis functions. We use a variational, electric-field-dependent basis-set construction procedure, which allows us to keep the basis-set dimension low whilst capturing the main effects of electric polarization on the nuclear and electronic degrees of freedom. The basis-set construction procedure is validated by comparing with virtually exact grid-based simulations for two one-dimensional model systems: laser-driven electron dynamics in a soft attractive Coulomb potential and nuclear rovibrational dynamics in a Morse potential.

Submitted 15 September, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

Comments: 8 pages, 6 figures

arXiv:2204.09131 [pdf, other]

A Unified Approach for Multi-Scale Synchronous Correlation Search in Big Time Series -- Full Version

Authors: Nguyen Ho, Van Long Ho, Torben Bach Pedersen, Mai Vu, Christophe A. N. Biscio

Abstract: The wide deployment of IoT sensors has enabled the collection of very big time series across different domains, from which advanced analytics can be performed to find unknown relationships, most importantly the correlations between them. However, current approaches for correlation search on time series are limited to only a single temporal scale and simple types of relations, and cannot handle noise effectively. This paper presents the integrated SYnchronous COrrelation Search (iSYCOS) framework to find multi-scale correlations in big time series. Specifically, iSYCOS integrates top-down and bottom-up approaches into a single auto-configured framework capable of efficiently extracting complex window-based correlations from big time series using mutual information (MI). Moreover, iSYCOS includes a novel MI-based theory to identify noise in the data, which is used to perform pruning to improve iSYCOS performance. In addition, we design a distributed version of iSYCOS that can scale out in a Spark cluster to handle big time series. Our extensive experimental evaluation on synthetic and real-world datasets shows that iSYCOS can auto-configure on a given dataset to find complex multi-scale correlations. The pruning and optimisations can improve iSYCOS performance by up to an order of magnitude, and the distributed iSYCOS can scale out linearly on a computing cluster.

Submitted 19 April, 2022; originally announced April 2022.

Comments: 18 pages

arXiv:2204.04094 [pdf, other]

doi 10.1063/5.0094430

A new endstation for extreme-ultraviolet spectroscopy of free clusters and nanodroplets

Authors: Björn Bastian, Jakob D. Asmussen, Ltaief Ben Ltaief, Achim Czasch, Nykola C. Jones, Søren V. Hoffmann, Henrik B. Pedersen, Marcel Mudrich

Abstract: We present a new endstation for the AMOLine of the ASTRID2 synchrotron at Aarhus University, which combines a cluster and nanodroplet beam source with a velocity map imaging and time-of-flight spectrometer for coincidence imaging spectroscopy. Extreme-ultraviolet spectroscopy of free nanoparticles is a powerful tool for studying the photophysics and photochemistry of resonantly excited or ionized nanometer-sized condensed-phase systems. Here we demonstrate this capability by performing photoelectron-photoion coincidence (PEPICO) experiments with pure and doped superfluid helium nanodroplets. Different doping options and beam sources provide a versatile platform to generate various van der Waals clusters as well as He nanodroplets. We present a detailed characterization of the new setup and present examples of its use for measuring high-resolution yield spectra of charged particles, time-of-flight ion mass spectra, anion-cation coincidence spectra, multi-coincidence electron spectra and angular distributions. A particular focus of the research with this new endstation is on intermolecular charge and energy-transfer processes in heterogeneous nanosystems induced by valence-shell excitation and ionization.

Submitted 8 April, 2022; originally announced April 2022.

Comments: 28 pages, 17 figures, submitted to Review of Scientific Instruments

arXiv:2202.08504 [pdf, other]

Finding Representative Sampling Subsets in Sensor Graphs using Time Series Similarities

Authors: Roshni Chakraborty, Josefine Holm, Torben Bach Pedersen, Petar Popovski

Abstract: With the increasing use of IoT-enabled sensors, it is important to have effective methods for querying the sensors. For example, in a dense network of battery-driven temperature sensors, it is often possible to query (sample) just a subset of the sensors at any given time, since the values of the non-sampled sensors can be estimated from the sampled values. If we can divide the set of sensors into disjoint so-called representative sampling subsets that each represent the other sensors sufficiently well, we can alternate the sampling between the sampling subsets and thus increase battery life significantly. In this paper, we formulate the problem of finding representative sampling subsets as a graph problem on a so-called sensor graph with the sensors as nodes. Our proposed solution, SubGraphSample, consists of two phases. In Phase-I, we create edges in the sensor graph based on the similarities between the time series of sensor values, analyzing six different techniques based on proven time series similarity metrics. In Phase-II, we propose two new techniques and extend four existing ones to find the maximal number of representative sampling subsets. Finally, we propose AutoSubGraphSample, which auto-selects the best technique for Phase-I and Phase-II for a given dataset. Our extensive experimental evaluation shows that our approach can yield significant battery life improvements within realistic error bounds.

Submitted 18 February, 2022; v1 submitted 17 February, 2022; originally announced February 2022.

arXiv:2112.13611 [pdf, other]

doi 10.1021/acs.jctc.1c01309

Linear and nonlinear optical properties from TDOMP2 theory

Authors: Håkon Emil Kristiansen, Benedicte Sverdrup Ofstad, Eirill Hauge, Einar Aurbakken, Øyvind Sigmundson Schøyen, Simen Kvaal, Thomas Bondo Pedersen

Abstract: In this work we present a derivation of the real-time time-dependent orbital-optimized Møller-Plesset (TDOMP2) theory and its biorthogonal companion, time-dependent non-orthogonal OMP2 (TDNOMP2) theory, starting from the time-dependent bivariational principle and a parametrization based on the exponential orbital-rotation operator formulation commonly used in time-independent molecular electronic-structure theory. We apply the TDOMP2 method to extract absorption spectra and frequency-dependent polarizabilities and first hyperpolarizabilities from real-time simulations, comparing the results with those obtained from conventional time-dependent coupled-cluster singles and doubles (TDCCSD) simulations and from its second-order approximation TDCC2. We also compare with results from CCSD and CC2 linear and quadratic response theory. Our results indicate that while TDOMP2 absorption spectra are of the same quality as TDCC2 spectra, frequency-dependent polarizabilities and hyperpolarizabilities from TDOMP2 simulations are significantly closer to TDCCSD results than those from TDCC2 simulations.

Submitted 21 April, 2022; v1 submitted 27 December, 2021; originally announced December 2021.

arXiv:2111.09764 [pdf, other]

doi 10.1002/pssb.202100159

Compositional Studies of Metals with Complex Order by means of the Optical Floating-Zone Technique

Authors: Andreas Bauer, Georg Benka, Andreas Neubauer, Alexander Regnat, Alexander Engelhardt, Christoph Resch, Sabine Wurmehl, Christian G. F. Blum, Tim Adams, Alfonso Chacon, Rainer Jungwirth, Robert Georgii, Anatoliy Senyshyn, Björn Pedersen, Martin Meven, Christian Pfleiderer

Abstract: The availability of large high-quality single crystals is an important prerequisite for many studies in solid-state research. The optical floating-zone technique is an elegant method to grow such crystals, offering potential to prepare samples that may be hardly accessible with other techniques. As elaborated in this report, examples include single crystals with intentional compositional gradients, deliberate off-stoichiometry, or complex metallurgy. For the cubic chiral magnets Mn$_{1-x}$Fe$_{x}$Si and Fe$_{1-x}$Co$_{x}$Si, we prepared single crystals in which the composition was varied during growth from $x = 0 - 0.15$ and from $x = 0.1 - 0.3$, respectively. Such samples allowed us to efficiently study the evolution of the magnetic properties as a function of composition, as demonstrated by means of neutron scattering. For the archetypical chiral magnet MnSi and the itinerant antiferromagnet CrB$_{2}$, we grew single crystals with varying initial manganese (0.99 to 1.04) and boron (1.95 to 2.1) content. Measurements of the low-temperature properties addressed the correlation between magnetic transition temperature and sample quality. Furthermore, we prepared single crystals of the diborides ErB$_{2}$, MnB$_{2}$, and VB$_{2}$. In addition to high vapor pressures, these materials suffer from peritectic formation, potential decomposition, and high melting temperature, respectively.

Submitted 18 November, 2021; originally announced November 2021.

Comments: 16 pages, 13 figures

Journal ref: physica status solidi (b) 2100159 (2021)

arXiv:2110.09262 [pdf, other]

doi 10.1038/s41467-022-32161-y

Practical continuous-variable quantum key distribution with composable security

Authors: Nitin Jain, Hou-Man Chin, Hossein Mani, Cosmo Lupo, Dino Solar Nikolic, Arne Kordts, Stefano Pirandola, Thomas Brochmann Pedersen, Matthias Kolb, Bernhard Ömer, Christoph Pacher, Tobias Gehring, Ulrik L. Andersen

Abstract: A quantum key distribution (QKD) system must fulfill the requirement of universal composability to ensure that any cryptographic application (using the QKD system) is also secure. Furthermore, the theoretical proof responsible for security analysis and key generation should cater to the number $N$ of the distributed quantum states being finite in practice. Continuous-variable (CV) QKD based on coherent states, despite being a suitable candidate for integration in the telecom infrastructure, has so far been unable to demonstrate composability as existing proofs require a rather large $N$ for successful key generation. Here we report the first Gaussian-modulated coherent state CVQKD system that is able to overcome these challenges and can generate composable keys secure against collective attacks with $N \lesssim 3.5\times10^8$ coherent states. With this advance, possible due to novel improvements to the security proof and a fast, yet low-noise and highly stable system operation, CVQKD implementations take a significant step towards their discrete-variable counterparts in practicality, performance, and security.

Submitted 18 October, 2021; originally announced October 2021.

Comments: 8 pages (including 3 figures and references) + 12 pages supplement

arXiv:2109.11609 [pdf, ps, other]

Evolutionary Clustering of Streaming Trajectories

Authors: Tianyi Li, Lu Chen, Christian S. Jensen, Torben Bach Pedersen, Jilin Hu

Abstract: The widespread deployment of smartphones and location-enabled, networked in-vehicle devices renders it increasingly feasible to collect streaming trajectory data of moving objects. The continuous clustering of such data can enable a variety of real-time services, such as identifying representative paths or common moving trends among objects in real-time. However, little attention has so far been given to the quality of clusters -- for example, it is beneficial to smooth short-term fluctuations in clusters to achieve robustness to exceptional data. We propose the notion of evolutionary clustering of streaming trajectories, abbreviated ECO, that enhances streaming-trajectory clustering quality by means of temporal smoothing that prevents abrupt changes in clusters across successive timestamps. Employing the notions of snapshot and historical trajectory costs, we formalize ECO and then formulate ECO as an optimization problem and prove that ECO can be performed approximately in linear time, thus eliminating the iterative processes employed in previous studies. Further, we propose a minimal-group structure and a seed point shifting strategy to facilitate temporal smoothing. Finally, we present all algorithms underlying ECO along with a set of optimization techniques. Extensive experiments with two real-life datasets offer insight into ECO and show that it outperforms state-of-the-art solutions in terms of both clustering quality and efficiency.

Submitted 23 September, 2021; originally announced September 2021.

arXiv:2106.03996 [pdf]

doi 10.51964/hlcs11331

Lessons learned developing and using a machine learning model to automatically transcribe 2.3 million handwritten occupation codes

Authors: Bjørn-Richard Pedersen, Einar Holsbø, Trygve Andersen, Nikita Shvetsov, Johan Ravn, Hilde Leikny Sommerseth, Lars Ailo Bongo

Abstract: Machine learning approaches achieve high accuracy for text recognition and are therefore increasingly used for the transcription of handwritten historical sources. However, using machine learning in production requires a streamlined end-to-end pipeline that scales to the dataset size and a model that achieves high accuracy with few manual transcriptions. The correctness of the model results must also be verified. This paper describes our lessons learned developing, tuning and using the Occode end-to-end machine learning pipeline for transcribing 2.3 million handwritten occupation codes from the Norwegian 1950 population census. We achieve an accuracy of 97% for the automatically transcribed codes, and we send 3% of the codes for manual verification. We verify that the occupation code distribution found in our results matches the distribution found in our training data, which should be representative for the census as a whole. We believe our approach and lessons learned may be useful for other transcription projects that plan to use machine learning in production. The source code is available at: https://github.com/uit-hdl/rhd-codes

Submitted 1 December, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

arXiv:2105.06845 [pdf, other]

Query Age of Information: Freshness in Pull-Based Communication

Authors: Federico Chiariotti, Josefine Holm, Anders E. Kalør, Beatriz Soret, Søren K. Jensen, Torben B. Pedersen, Petar Popovski

Abstract: Age of Information (AoI) has become an important concept in communications, as it allows system designers to measure the freshness of the information available to remote monitoring or control processes. However, its definition tacitly assumes that new information is used at any time, which is not always the case: the instants at which information is collected and used are dependent on a certain query process. We propose a model that accounts for the discrete time nature of many monitoring processes, considering a pull-based communication model in which the freshness of information is only important when the receiver generates a query: if the monitoring process is not using the value, the age of the last update is irrelevant. We then define the Age of Information at Query (QAoI), a more general metric that fits the pull-based scenario, and show how its optimization can lead to very different choices from traditional push-based AoI optimization when using a Packet Erasure Channel (PEC) and with limited link availability. Our results show that QAoI-aware optimization can significantly reduce the average and worst-case perceived age for both periodic and stochastic queries.

Submitted 12 January, 2022; v1 submitted 14 May, 2021; originally announced May 2021.

Comments: Accepted for publication in IEEE Transactions on Communications (preprint version). Extended version of conference paper arXiv:2011.00917

arXiv:2102.02246 [pdf, other]

doi 10.1016/j.bdr.2021.100205

The Forgotten Document-Oriented Database Management Systems: An Overview and Benchmark of Native XML DODBMSes in Comparison with JSON DODBMSes

Authors: Ciprian-Octavian Truică, Elena-Simona Apostol, Jérôme Darmont, Torben Bach Pedersen

Abstract: In the current context of Big Data, a multitude of new NoSQL solutions for storing, managing, and extracting information and patterns from semi-structured data have been proposed and implemented. These solutions were developed to relieve the issue of rigid data structures present in relational databases, by introducing semi-structured and flexible schema design. As current data generated by different sources and devices, especially from IoT sensors and actuators, use either XML or JSON format, depending on the application, database technologies that store and query semi-structured data in XML format are needed. Thus, Native XML Databases, which were initially designed to manipulate XML data using standardized querying languages, i.e., XQuery and XPath, were rebranded as NoSQL Document-Oriented Databases Systems. Currently, the majority of these solutions have been replaced with the more modern JSON based Database Management Systems. However, we believe that XML-based solutions can still deliver performance in executing complex queries on heterogeneous collections. Unfortunately nowadays, research lacks a clear comparison of the scalability and performance for database technologies that store and query documents in XML versus the more modern JSON format. Moreover, to the best of our knowledge, there are no Big Data-compliant benchmarks for such database technologies.
In this paper, we present a comparison for selected Document-Oriented Database Systems that either use the XML format to encode documents, i.e., BaseX, eXist-db, and Sedna, or the JSON format, i.e., MongoDB, CouchDB, and Couchbase. To underline the performance differences we also propose a benchmark that uses a heterogeneous complex schema on a large DBLP corpus.

Submitted 3 February, 2021; originally announced February 2021.

Comments: 28 pages, 6 figures, 7 tables

ACM Class: H.2

Journal ref: Big Data Research, Vol. 25, July 2021

arXiv:2011.00917 [pdf, ps, other]

Freshness on Demand: Optimizing Age of Information for the Query Process

Authors: Josefine Holm, Anders E. Kalør, Federico Chiariotti, Beatriz Soret, Søren K. Jensen, Torben B. Pedersen, Petar Popovski

Abstract: Age of Information (AoI) has become an important concept in communications, as it allows system designers to measure the freshness of the information available to remote monitoring or control processes. However, its definition tacitly assumed that new information is used at any time, which is not always the case and the instants at which information is collected and used are dependent on a certain query process. We propose a model that accounts for the discrete time nature of many monitoring processes, considering a pull-based communication model in which the freshness of information is only important when the receiver generates a query. We then define the Age of Information at Query (QAoI), a more general metric that fits the pull-based scenario, and show how its optimization can lead to very different choices from traditional push-based AoI optimization when using a Packet Erasure Channel (PEC).

Submitted 2 November, 2020; originally announced November 2020.

Comments: Submitted for publication

arXiv:2010.15404 [pdf, other]

On Efficient and Scalable Time-Continuous Spatial Crowdsourcing -- Full Version

Authors: Ting Wang, Xike Xie, Xin Cao, Torben Bach Pedersen, Yang Wang, Mingjun Xiao

Abstract: The proliferation of advanced mobile terminals opened up a new crowdsourcing avenue, spatial crowdsourcing, to utilize the crowd potential to perform real-world tasks. In this work, we study a new type of spatial crowdsourcing, called time-continuous spatial crowdsourcing (TCSC in short). It supports broad applications for long-term continuous spatial data acquisition, ranging from environmental monitoring to traffic surveillance in citizen science and crowdsourcing projects. However, due to limited budgets and limited availability of workers in practice, the data collected is often incomplete, incurring a data deficiency problem. To tackle that, in this work, we first propose an entropy-based quality metric, which captures the joint effects of incompletion in data acquisition and the imprecision in data interpolation. Based on that, we investigate quality-aware task assignment methods for both single- and multi-task scenarios. We show the NP-hardness of the single-task case, and design polynomial-time algorithms with guaranteed approximation ratios. We study novel indexing and pruning techniques for further enhancing the performance in practice. Then, we extend the solution to multi-task scenarios and devise a parallel framework for speeding up the process of optimization. We conduct extensive experiments on both real and synthetic datasets to show the effectiveness of our proposals.

Submitted 29 October, 2020; originally announced October 2020.

arXiv:2010.03653 [pdf, other]

Efficient Temporal Pattern Mining in Big Time Series Using Mutual Information -- Full Version

Authors: Van Long Ho, Nguyen Ho, Torben Bach Pedersen

Abstract: Very large time series are increasingly available from an ever wider range of IoT-enabled sensors deployed in different environments. Significant insights can be gained by mining temporal patterns from these time series. Unlike traditional pattern mining, temporal pattern mining (TPM) adds event time intervals into extracted patterns, making them more expressive at the expense of increased mining time complexity. Existing TPM methods either cannot scale to large datasets, or work only on pre-processed temporal events rather than on time series. This paper presents our Frequent Temporal Pattern Mining from Time Series (FTPMfTS) approach which provides: (1) The end-to-end FTPMfTS process taking time series as input and producing frequent temporal patterns as output. (2) The efficient Hierarchical Temporal Pattern Graph Mining (HTPGM) algorithm that uses efficient data structures for fast support and confidence computation, and employs effective pruning techniques for significantly faster mining. (3) An approximate version of HTPGM that uses mutual information, a measure of data correlation known from information theory, to prune unpromising time series from the search space. (4) An extensive experimental evaluation showing that HTPGM outperforms the baselines in runtime and memory consumption, and can scale to big datasets. The approximate HTPGM is up to two orders of magnitude faster and less memory consuming than the baselines, while retaining high accuracy.

Submitted 17 November, 2021; v1 submitted 7 October, 2020; originally announced October 2020.

arXiv:2009.10169 [pdf, other]

doi 10.1021/acs.jctc.0c00977

Interpretation of Coupled-Cluster Many-Electron Dynamics in Terms of Stationary States

Authors: Thomas Bondo Pedersen, Håkon Emil Kristiansen, Tilmann Bodenstein, Simen Kvaal, Øyvind Sigmundson Schøyen

Abstract: We demonstrate theoretically and numerically that laser-driven many-electron dynamics, as described by bivariational time-dependent coupled-cluster theory, may be analyzed in terms of stationary-state populations. Projectors heuristically defined from linear response theory and equation-of-motion coupled-cluster theory are proposed for the calculation of stationary-state populations during interaction with laser pulses or other external forces, and conservation laws of the populations are discussed. Numerical tests of the proposed projectors, involving both linear and nonlinear optical processes for the He and Be atoms, and for the LiH, CH$^+$, and LiF molecules, show that the laser-driven evolution of the stationary-state populations at the coupled-cluster singles-and-doubles (CCSD) level is very close to that obtained by full configuration-interaction theory provided all stationary states actively participating in the dynamics are sufficiently well approximated. When double-excited states are important for the dynamics, the quality of the CCSD results deteriorates. Observing that populations computed from the linear-response projector may show spurious small-amplitude, high-frequency oscillations, the equation-of-motion projector emerges as the most promising approach to stationary-state populations.

Submitted 19 December, 2020; v1 submitted 21 September, 2020; originally announced September 2020.

Comments: 58 pages, 14 figures

arXiv:2006.07180 [pdf, other]

doi 10.3233/SW-210429

High-Level ETL for Semantic Data Warehouses -- Full Version

Authors: Rudra Pratap Deb Nath, Oscar Romero, Torben Bach Pedersen, Katja Hose

Abstract: The popularity of the Semantic Web (SW) encourages organizations to organize and publish semantic data using the RDF model. This growth poses new requirements to Business Intelligence (BI) technologies to enable On-Line Analytical Processing (OLAP)-like analysis over semantic data. The incorporation of semantic data into a Data Warehouse (DW) is not supported by the traditional Extract-Transform-Load (ETL) tools because they do not consider semantic issues in the integration process. In this paper, we propose a layer-based integration process and a set of high-level RDF-based ETL constructs required to define, map, extract, process, transform, integrate, update, and load (multidimensional) semantic data. Unlike other ETL tools, we automate the ETL data flows by creating metadata at the schema level. Therefore, it relieves ETL developers from the burden of manual mapping at the ETL operation level. We create a prototype, named Semantic ETL Construct (SETLCONSTRUCT), based on the innovative ETL constructs proposed here. To evaluate SETLCONSTRUCT, we create a multidimensional semantic DW by integrating a Danish Business dataset and an EU Subsidy dataset using it and compare it with the previous programmable framework SETLPROG in terms of productivity, development time and performance.
The evaluation shows that 1) SETLCONSTRUCT uses 92% fewer Number of Typed Characters (NOTC) than SETLPROG, and SETLAUTO (the extension of SETLCONSTRUCT for generating ETL execution flow automatically) further reduces the Number of Used Concepts (NOUC) by another 25%; 2) using SETLCONSTRUCT, the development time is almost cut in half compared to SETLPROG, and is cut by another 27% using SETLAUTO; 3) SETLCONSTRUCT is scalable and has similar performance compared to SETLPROG.

Submitted 12 June, 2020; originally announced June 2020.

Comments: 44 pages including reference, 13 figures and 4 tables. This paper is submitted to Semantic Web Journal and now it is under review

Journal ref: Semantic Web, vol. 13, no. 1, pp. 85-132, 2022

arXiv:2003.13833 [pdf]

The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe

Authors: Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajič, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz, José Manuel Gómez Pérez, Luc Meertens, Paul Lukowicz, Josef van Genabith, Andrea Lösch, Philipp Slusallek, Morten Irgens, Patrick Gatellier, Joachim Köhler, Laure Le Bars, Dimitra Anastasiou, Albina Auksoriūtė, Núria Bel, António Branco, Gerhard Budin, et al. (22 additional authors not shown)

Abstract: Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality. However, language barriers impacting business, cross-lingual and cross-cultural communication are still omnipresent. Language Technologies (LTs) are a powerful means to break down these barriers. While the last decade has seen various initiatives that created a multitude of approaches and technologies tailored to Europe's specific needs, there is still an immense level of fragmentation. At the same time, AI has become an increasingly important concept in the European Information and Communication Technology area. For a few years now, AI, including many opportunities, synergies but also misconceptions, has been overshadowing every other topic. We present an overview of the European LT landscape, describing funding programmes, activities, actions and challenges in the different countries with regard to LT, including the current state of play in industry and the LT market. We present a brief overview of the main LT-related activities on the EU level in the last ten years and develop strategic guidance with regard to four key dimensions.

Submitted 30 March, 2020; originally announced March 2020.

Comments: Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). To appear

arXiv:2002.06608 [pdf, other]

Multidimensional Enrichment of Spatial RDF Data for SOLAP -- Full Version

Authors: Nurefsan Gür, Torben Bach Pedersen, Katja Hose, Mikael Midtgaard

Abstract: Large volumes of spatial data and multidimensional data are being published on the Semantic Web, which has led to new opportunities for advanced analysis, such as Spatial Online Analytical Processing (SOLAP). The RDF Data Cube (QB) and QB4OLAP vocabularies have been widely used for annotating and publishing statistical and multidimensional RDF data. Although such statistical data sets might have spatial information, such as coordinates, the lack of spatial semantics and spatial multidimensional concepts in QB4OLAP and QB prevents users from employing SOLAP queries over spatial data using SPARQL. The QB4SOLAP vocabulary, on the other hand, fully supports annotating spatial and multidimensional data on the Semantic Web and enables users to query endpoints with SOLAP operators in SPARQL. To bridge the gap between QB/QB4OLAP and QB4SOLAP, we propose an RDF2SOLAP enrichment model that automatically annotates spatial multidimensional concepts with QB4SOLAP and in doing so enables SOLAP on existing QB and QB4OLAP data on the Semantic Web. Furthermore, we present and evaluate a wide range of enrichment algorithms and apply them on a non-trivial real-world use case involving governmental open data with complex geometry types.

Submitted 16 February, 2020; originally announced February 2020.

Comments: 33 pages, 8 figures, 7 tables, 10 listings, 7 algorithms, under review in Semantic Web Journal, available on http://www.semantic-web-journal.net/content/multidimensional-enrichment-spatial-rdf-data-solap

arXiv:2002.01683 [pdf]

doi 10.1103/PhysRevB.101.064425

Direct control of magnetic chirality in NdMn2O5 by external electric field

Authors: I. A. Zobkalo, A. N. Matveeva, A. Sazonov, S. N. Barilo, S. V. Shiryaev, B. Pedersen, V. Hutanu

Abstract: Detailed investigation of the incommensurate magnetic ordering in a single crystal of multiferroic NdMn2O5 has been performed using both non-polarized and polarized neutron diffraction techniques. Below TN = 30.5 K magnetic Bragg reflections corresponding to the non-chiral type magnetic structure with propagation vector k1 = (0.5 0 kz1) occur. Below about 27 K a new distorted magnetic modulation with a similar vector kz2 occurs, which is attributed to the magnetization of the Nd3+ ions by the Mn-sub-lattice. Strong temperature hysteresis in the occurrence of the incommensurate magnetic phases in NdMn2O5 was observed depending on the cooling or heating history of the sample. Below about 20 K the magnetic structure became of a chiral type. From spherical neutron polarimetry measurements, the resulting low-temperature magnetic structure kz3 was approximated by the general elliptic helix. The parameters of the magnetic helix, such as ellipticity and helical plane orientation with respect to the crystal structure, were determined. A reorientation of the helix occurs at an intermediate temperature between 4 K and 18 K. A difference between the population of right- and left-handed chiral domains of about 0.2 was observed in the as-grown crystal when cooling without an external electric field. The magnetic chiral ratio can be changed by the application of an external electric field of a few kV/cm, revealing strong magnetoelectric coupling. A linear dependence of the magnetic chirality on the applied electric field in NdMn2O5 was found.
The results are discussed within the frame of the antisymmetric super-exchange model for Dzyaloshinsky-Moriya interaction.

Submitted 5 February, 2020; originally announced February 2020.

Journal ref: Phys. Rev. B 101, 064425 (2020)

arXiv:2001.05812 [pdf, other]

doi 10.1103/PhysRevB.101.020507

Uniaxial $c$-axis pressure effects on underdoped BaFe$_2$(As$_{0.72}$P$_{0.28}$)$_2$ superconductor

Authors: Ding Hu, David W. Tam, Wenliang Zhang, Yuan Wei, Robert Georgii, Bjorn Pedersen, Alfonso Chacon Roldan, Pengcheng Dai

Abstract: The optimal superconductivity ($T_c \approx 30 $ K) in BaFe$_2$(As$_{1-x}$P$_x$)$_2$ can be reached when the coupled antiferromagnetic (AF) order ($T_N$) and orthorhombic lattice distortion ($T_s$) are suppressed to zero temperature with increasing P concentration or hydrostatic pressure. Here we use transport and neutron scattering to study the $c$-axis pressure effects on electronic phases in underdoped BaFe$_2$(As$_{0.72}$P$_{0.28}$)$_2$, which has $T_N = T_s\approx 40$ K and $T_c \approx $ 28 K at zero pressure. With increasing $c$-axis pressure, $T_N$ and $T_s$ are slightly enhanced around $P_c \sim 20 $ MPa. Upon further increasing pressure, AF order is gradually suppressed to zero, while $T_c$ is enhanced to 30 K. Our results reveal the importance of magnetoelastic couplings in BaFe$_2$(As$_{1-x}$P$_x$)$_2$, suggesting that the $c$-axis pressure can be used as a tuning parameter to manipulate the electronic phases in iron pnictides.

Submitted 16 January, 2020; originally announced January 2020.

Journal ref: Phys. Rev. B 101, 020507(R) (2020)

arXiv:1912.09217 [pdf, other]

doi 10.1063/1.5142276

Numerical stability of time-dependent coupled-cluster methods for many-electron dynamics in intense laser pulses

Authors: Håkon Emil Kristiansen, Øyvind Sigmundson Schøyen, Simen Kvaal, Thomas Bondo Pedersen

Abstract: We investigate the numerical stability of time-dependent coupled-cluster theory for many-electron dynamics in intense laser pulses, comparing two coupled-cluster formulations with full configuration interaction theory. Our numerical experiments show that orbital-adaptive time-dependent coupled-cluster doubles (OATDCCD) theory offers significantly improved stability compared with the conventional Hartree-Fock-based time-dependent coupled-cluster singles-and-doubles (TDCCSD) formulation. The improved stability stems from greatly reduced oscillations in the doubles amplitudes, which, in turn, can be traced to the dynamic biorthonormal reference determinants of OATDCCD theory. As long as these are good approximations to the Brueckner determinant, OATDCCD theory is numerically stable. We propose the reference weight as a diagnostic quantity to identify situations where the TDCCSD and OATDCCD theories become unstable.

Submitted 24 February, 2020; v1 submitted 19 December, 2019; originally announced December 2019.

Comments: 5 pages, 6 figures (supplemental material, 7 pages, 11 figures)

Journal ref: The Journal of Chemical Physics, 152, (2020), 071102

arXiv:1911.09016 [pdf, other]

doi 10.1109/TKDE.2020.2990491

Multi-Source Spatial Entity Linkage

Authors: Suela Isaj, Torben Bach Pedersen, Esteban Zimányi

Abstract: Besides the traditional cartographic data sources, spatial information can also be derived from location-based sources. However, even though different location-based sources refer to the same physical world, each one has only partial coverage of the spatial entities, describes them with different attributes, and sometimes provides contradicting information. Hence, we introduce the spatial entity linkage problem, which finds which pairs of spatial entities belong to the same physical spatial entity. Our proposed solution (QuadSky) starts with a time-efficient spatial blocking technique (QuadFlex), compares pairwise the spatial entities in the same block, ranks the pairs using Pareto optimality with the SkyRank algorithm, and finally, classifies the pairs with our novel SkyEx-* family of algorithms that yield 0.85 precision and 0.85 recall for a manually labeled dataset of 1,500 pairs and 0.87 precision and 0.6 recall for a semi-manually labeled dataset of 777,452 pairs. Moreover, we provide a theoretical guarantee and formalize the SkyEx-FES algorithm that explores only 27% of the skylines without any loss in F-measure. Furthermore, our fully unsupervised algorithm SkyEx-D approximates the optimal result with an F-measure loss of just 0.01. Finally, QuadSky provides the best trade-off between precision and recall, and the best F-measure compared to the existing baselines and clustering techniques, and approximates the results of supervised learning solutions.

Submitted 29 April, 2020; v1 submitted 20 November, 2019; originally announced November 2019.

arXiv:1906.09995 [pdf]

doi 10.1109/TBDATA.2019.2907987

AMIC: An Adaptive Information Theoretic Method to Identify Multi-Scale Temporal Correlations in Big Time Series Data -- Accepted Version

Authors: Nguyen Ho, Huy Vo, Mai Vu, Torben Bach Pedersen

Abstract: Recent developments in computing, sensing and crowd-sourced data have resulted in an explosion in the availability of quantitative information. The possibilities of analyzing this so-called Big Data to inform research and the decision-making process are virtually endless. In general, analyses have to be done across multiple data sets in order to bring out the most value of Big Data. A first important step is to identify temporal correlations between data sets. Given the characteristics of Big Data in terms of volume and velocity, techniques that identify correlations not only need to be fast and scalable, but also need to help users in ordering the correlations across temporal scales so that they can focus on important relationships. In this paper, we present AMIC (Adaptive Mutual Information-based Correlation), a method based on mutual information to identify correlations at multiple temporal scales in large time series. Discovered correlations are suggested to users in an order based on the strength of the relationships. Our method supports an adaptive streaming technique that minimizes duplicated computation and is implemented on top of Apache Spark for scalability. We also provide a comprehensive evaluation on the effectiveness and the scalability of AMIC using both synthetic and real-world data sets.

Submitted 7 July, 2019; v1 submitted 24 June, 2019; originally announced June 2019.

arXiv:1903.10269 [pdf, other]

doi 10.1109/ICDE51399.2021.00123

Scalable Model-Based Management of Correlated Dimensional Time Series in ModelarDB+

Authors: Søren Kejser Jensen, Torben Bach Pedersen, Christian Thomsen

Abstract: To monitor critical infrastructure, high-quality sensors sampled at a high frequency are increasingly used. However, as they produce huge amounts of data, only simple aggregates are stored. This removes outliers and fluctuations that could indicate problems. As a remedy, we present a model-based approach for managing time series with dimensions that exploits correlation in and among time series. Specifically, we propose compressing groups of correlated time series using an extensible set of model types within a user-defined error bound (possibly zero). We name this new category of model-based compression methods for time series Multi-Model Group Compression (MMGC). We present the first MMGC method GOLEMM and extend model types to compress time series groups. We propose primitives for users to effectively define groups for differently sized data sets, and based on these, an automated grouping method using only the time series dimensions. We propose algorithms for executing simple and multi-dimensional aggregate queries on models. Last, we implement our methods in the Time Series Management System (TSMS) ModelarDB (ModelarDB+). Our evaluation shows that compared to widely used formats, ModelarDB+ provides up to 13.7 times faster ingestion due to high compression, 113 times better compression due to the adaptivity of GOLEMM, 630 times faster aggregates by using models, and close to linear scalability. It is also extensible and supports online query processing.

Submitted 29 June, 2021; v1 submitted 25 March, 2019; originally announced March 2019.

Comments: 12 Pages, 28 Figures, and 1 Table

arXiv:1901.06712 [pdf, other]

Seed-Driven Geo-Social Data Extraction -- Full Version

Authors: Suela Isaj, Torben Bach Pedersen

Abstract: Geo-social data has been an attractive source for a variety of problems such as mining mobility patterns, link prediction, location recommendation, and influence maximization. However, new geo-social data is increasingly unavailable and suffers from several limitations. In this paper, we aim to remedy the problem of effective data extraction from geo-social data sources. We first identify and categorize the limitations of extracting geo-social data. In order to overcome the limitations, we propose a novel seed-driven approach that uses the points of one source as the seed to feed as queries for the others. We additionally handle differences between, and dynamics within, the sources by proposing three variants for optimizing search radius. Furthermore, we provide an optimization based on recursive clustering to minimize the number of requests and an adaptive procedure to learn the specific data distribution of each source. Our comprehensive experiments with six popular sources show that our seed-driven approach yields 14.3 times more data overall, while our request-optimized algorithm retrieves up to 95% of the data with less than 16% of the requests. Thus, our proposed seed-driven approach sets new standards for effective and efficient extraction of geo-social data.

Submitted 23 June, 2019; v1 submitted 20 January, 2019; originally announced January 2019.

arXiv:1812.05377 [pdf, other]

Ultra-fast real-time quantum random number generator with correlated measurement outcomes and rigorous security certification

Authors: Tobias Gehring, Cosmo Lupo, Arne Kordts, Dino Solar Nikolic, Nitin Jain, Tobias Rydberg, Thomas B. Pedersen, Stefano Pirandola, Ulrik L. Andersen

Abstract: Quantum random number generators (QRNGs) promise perfectly unpredictable random numbers. However, the security certification of the random numbers in the form of a stochastic model often introduces assumptions that are either hardly justified or indeed unnecessary. Two important examples are the restriction of an adversary to the classical regime as well as negligible correlations between consecutive measurement outcomes. Additionally, non-rigorous system characterization opens a security loophole. In this work we experimentally realize a QRNG that does not rely on the aforementioned assumptions and whose stochastic model is established by a rigorous -- metrological -- approach. Based on quadrature measurements of vacuum fluctuations, we demonstrate a real-time random number generation rate of 8 GBit/s. Our security certification approach offers a number of practical benefits and will therefore find widespread applications in quantum random number generators. In particular, our generated random numbers are well suited for today's conventional and quantum cryptographic solutions.

Submitted 30 March, 2020; v1 submitted 13 December, 2018; originally announced December 2018.

Showing 1–50 of 74 results for author: Pedersen, B