-
IncomeSCM: From tabular data set to time-series simulator and causal estimation benchmark
Authors:
Fredrik D. Johansson
Abstract:
Evaluating observational estimators of causal effects demands information that is rarely available: unconfounded interventions and outcomes from the population of interest, created either by randomization or adjustment. As a result, it is customary to fall back on simulators when creating benchmark tasks. Simulators offer great control but are often too simplistic to make challenging tasks, either because they are hand-designed and lack the nuances of real-world data, or because they are fit to observational data without structural constraints. In this work, we propose a general, repeatable strategy for turning observational data into sequential structural causal models and challenging estimation tasks by following two simple principles: 1) fitting real-world data where possible, and 2) creating complexity by composing simple, hand-designed mechanisms. We implement these ideas in a highly configurable software package and apply it to the well-known Adult income data set to construct the IncomeSCM simulator. From this, we devise multiple estimation tasks and sample data sets to compare established estimators of causal effects. The tasks present a suitable challenge, with effect estimates varying greatly in quality between methods, despite similar performance in the modeling of factual outcomes, highlighting the need for dedicated causal estimators and model selection criteria.
Submitted 31 May, 2024; v1 submitted 25 May, 2024;
originally announced May 2024.
-
Active Preference Learning for Ordering Items In- and Out-of-sample
Authors:
Herman Bergström,
Emil Carlsson,
Devdatt Dubhashi,
Fredrik D. Johansson
Abstract:
Learning an ordering of items based on noisy pairwise comparisons is useful when item-specific labels are difficult to assign, for example, when annotators have to make subjective assessments. Algorithms have been proposed for actively sampling comparisons of items to minimize the number of annotations necessary for learning an accurate ordering. However, many ignore shared structure between items, treating them as unrelated, limiting sample efficiency and precluding generalization to new items. In this work, we study active learning with pairwise preference feedback for ordering items with contextual attributes, both in- and out-of-sample. We give an upper bound on the expected ordering error incurred by active learning strategies under a logistic preference model, in terms of the aleatoric and epistemic uncertainty in comparisons, and propose two algorithms designed to greedily minimize this bound. We evaluate these algorithms in two realistic image ordering tasks, including one with comparisons made by human annotators, and demonstrate superior sample efficiency compared to non-contextual ranking approaches and active preference learning baselines.
Submitted 5 May, 2024;
originally announced May 2024.
-
Cometary ion drift energy and temperature at comet 67P/Churyumov-Gerasimenko
Authors:
Hayley N. Williamson,
Annie Johansson,
Romain Canu-Blot,
Gabriella Stenberg Wieser,
Hans Nilsson,
Fredrik L. Johansson,
Anja Moeslinger
Abstract:
The Ion Composition Analyzer (ICA) on the Rosetta spacecraft observed both the solar wind and the cometary ionosphere around comet 67P/Churyumov-Gerasimenko for nearly two years. However, observations of low energy cometary ions were affected by a highly negative spacecraft potential, and the ICA ion density estimates were often much lower than plasma densities found by other instruments. Since the low energy cometary ions are often the highest density population in the plasma environment, it is nonetheless desirable to understand their properties. To do so, we select ICA data with densities comparable to those of Rosetta's Langmuir Probe (LAP)/Mutual Impedance Probe throughout the mission. We then correct the cometary ion energy distribution of each energy-angle scan for spacecraft potential and fit a drifting Maxwell-Boltzmann distribution, which gives an estimate of the drift energy and temperature for 3521 scans. The resulting drift energy is generally between 11 and 18 eV and the temperature between 0.5 and 1 eV. The drift energy shows good agreement with published ion flow speeds from LAP during the same time period and is much higher than the cometary neutral speed. We see additional higher energy cometary ions in the spectra closest to perihelion, which can either be a second Maxwellian or a kappa distribution. The energy and temperature are negatively correlated with heliocentric distance, but the slope of the change is small. It cannot be quantitatively determined whether this trend is primarily due to heliocentric distance or spacecraft distance to the comet, which increased with decreasing heliocentric distance.
Submitted 7 March, 2024;
originally announced March 2024.
-
MINTY: Rule-based Models that Minimize the Need for Imputing Features with Missing Values
Authors:
Lena Stempfle,
Fredrik D. Johansson
Abstract:
Rule models are often preferred in prediction tasks with tabular inputs as they can be easily interpreted using natural language and provide predictive performance on par with more complex models. However, most rule models' predictions are undefined or ambiguous when some inputs are missing, forcing users to rely on statistical imputation models or heuristics like zero imputation, undermining the interpretability of the models. In this work, we propose fitting concise yet precise rule models that learn to avoid relying on features with missing values and, therefore, limit their reliance on imputation at test time. We develop MINTY, a method that learns rules in the form of disjunctions between variables that act as replacements for each other when one or more is missing. This results in a sparse linear rule model, regularized to have small dependence on features with missing values, that allows a trade-off between goodness of fit, interpretability, and robustness to missing values at test time. We demonstrate the value of MINTY in experiments using synthetic and real-world data sets and find its predictive performance comparable or favorable to baselines, with smaller reliance on features with missing values.
Submitted 23 November, 2023;
originally announced November 2023.
-
Pure Exploration in Bandits with Linear Constraints
Authors:
Emil Carlsson,
Debabrota Basu,
Fredrik D. Johansson,
Devdatt Dubhashi
Abstract:
We address the problem of identifying the optimal policy with a fixed confidence level in a multi-armed bandit setup, when the arms are subject to linear constraints. Unlike the standard best-arm identification problem which is well studied, the optimal policy in this case may not be deterministic and could mix between several arms. This changes the geometry of the problem which we characterize via an information-theoretic lower bound. We introduce two asymptotically optimal algorithms for this setting, one based on the Track-and-Stop method and the other based on a game-theoretic approach. Both these algorithms try to track an optimal allocation based on the lower bound and computed by a weighted projection onto the boundary of a normal cone. Finally, we provide empirical results that validate our bounds and visualize how constraints change the hardness of the problem.
Submitted 25 January, 2024; v1 submitted 22 June, 2023;
originally announced June 2023.
-
Simulating secondary electron and ion emission from the Cassini spacecraft in Saturn's ionosphere
Authors:
Zeqi Zhang,
Ravindra T. Desai,
Oleg Shebanits,
Fredrik L. Johansson,
Yohei Miyake,
Hideyuki Usui
Abstract:
The Cassini spacecraft's Grand Finale flybys through Saturn's ionosphere provided unprecedented insight into the composition and dynamics of the gas giant's upper atmosphere and a novel and complex spacecraft-plasma interaction. In this article, we further study Cassini's interaction with Saturn's ionosphere using three-dimensional Particle-in-Cell simulations. We focus on understanding how electrons and ions, emitted from spacecraft surfaces due to the high-velocity impact of atmospheric water molecules, could have affected the spacecraft potential and low-energy plasma measurements. The simulations show emitted electrons extend upstream along the magnetic field and, for sufficiently high emission rates, charge the spacecraft to positive potentials. The lack of accurate emission rates and characteristics, however, makes differentiation between the prominence of secondary electron emission and ionospheric charged dust populations, which induce similar charging effects, difficult for Cassini. These results provide further context for Cassini's final measurements and highlight the need for future laboratory studies to support high-velocity flyby missions through planetary and cometary ionospheres.
Submitted 23 May, 2023;
originally announced May 2023.
-
Unsupervised domain adaptation by learning using privileged information
Authors:
Adam Breitholtz,
Anton Matsson,
Fredrik D. Johansson
Abstract:
Successful unsupervised domain adaptation is guaranteed only under strong assumptions such as covariate shift and overlap between input domains. The latter is often violated in high-dimensional applications like image classification which, despite this limitation, continues to serve as inspiration and benchmark for algorithm development. In this work, we show that training-time access to side information in the form of auxiliary variables can help relax restrictions on input variables and increase the sample efficiency of learning at the cost of collecting a richer variable set. As this information is assumed available only during training, not in deployment, we call this problem unsupervised domain adaptation by learning using privileged information (DALUPI). To solve this problem, we propose a simple two-stage learning algorithm, inspired by our analysis of the expected error in the target domain, and a practical end-to-end variant for image classification. We propose three evaluation tasks based on classification of entities in photos and anomalies in medical images with different types of available privileged information (binary attributes and single or multiple regions of interest). We demonstrate across these tasks that using privileged information in learning can reduce errors in domain transfer compared to baselines, be robust to spurious correlations in the source domain, and increase sample efficiency.
Submitted 12 June, 2024; v1 submitted 16 March, 2023;
originally announced March 2023.
-
Practicality of generalization guarantees for unsupervised domain adaptation with neural networks
Authors:
Adam Breitholtz,
Fredrik D. Johansson
Abstract:
Understanding generalization is crucial to confidently engineer and deploy machine learning models, especially when deployment implies a shift in the data domain. For such domain adaptation problems, we seek generalization bounds which are tractably computable and tight. If these desiderata can be reached, the bounds can serve as guarantees for adequate performance in deployment. However, in applications where deep neural networks are the models of choice, deriving results which fulfill these remains an unresolved challenge; most existing bounds are either vacuous or have non-estimable terms, even in favorable conditions. In this work, we evaluate existing bounds from the literature with potential to satisfy our desiderata on domain adaptation image classification tasks, where deep neural networks are preferred. We find that all bounds are vacuous and that sample generalization terms account for much of the observed looseness, especially when these terms interact with measures of domain shift. To overcome this and arrive at the tightest possible results, we combine each bound with recent data-dependent PAC-Bayes analysis, greatly improving the guarantees. We find that, when domain overlap can be assumed, a simple importance weighting extension of previous work provides the tightest estimable bound. Finally, we study which terms dominate the bounds and identify possible directions for further improvement.
Submitted 15 March, 2023;
originally announced March 2023.
-
Integrating Earth Observation Data into Causal Inference: Challenges and Opportunities
Authors:
Connor T. Jerzak,
Fredrik Johansson,
Adel Daoud
Abstract:
Observational studies require adjustment for confounding factors that are correlated with both the treatment and outcome. In the setting where the observed variables are tabular quantities such as average income in a neighborhood, tools have been developed for addressing such confounding. However, in many parts of the developing world, features about local communities may be scarce. In this context, satellite imagery can play an important role, serving as a proxy for the confounding variables otherwise unobserved. In this paper, we study confounder adjustment in this non-tabular setting, where patterns or objects found in satellite images contribute to the confounder bias. Using the evaluation of anti-poverty aid programs in Africa as our running example, we formalize the challenge of performing causal adjustment with such unstructured data -- what conditions are sufficient to identify causal effects, how to perform estimation, and how to quantify the ways in which certain aspects of the unstructured image object are most predictive of the treatment decision. Via simulation, we also explore the sensitivity of satellite image-based observational inference to image resolution and to misspecification of the image-associated confounder. Finally, we apply these tools in estimating the effect of anti-poverty interventions in African communities from satellite imagery.
Submitted 30 January, 2023;
originally announced January 2023.
-
Off-Policy Evaluation with Out-of-Sample Guarantees
Authors:
Sofia Ek,
Dave Zachariah,
Fredrik D. Johansson,
Petre Stoica
Abstract:
We consider the problem of evaluating the performance of a decision policy using past observational data. The outcome of a policy is measured in terms of a loss (aka. disutility or negative reward) and the main problem is making valid inferences about its out-of-sample loss when the past data was observed under a different and possibly unknown policy. Using a sample-splitting method, we show that it is possible to draw such inferences with finite-sample coverage guarantees about the entire loss distribution, rather than just its mean. Importantly, the method takes into account model misspecifications of the past policy - including unmeasured confounding. The evaluation method can be used to certify the performance of a policy using observational data under a specified range of credible model assumptions.
Submitted 30 June, 2023; v1 submitted 20 January, 2023;
originally announced January 2023.
-
Efficient learning of nonlinear prediction models with time-series privileged information
Authors:
Bastian Jung,
Fredrik D Johansson
Abstract:
In domains where sample sizes are limited, efficient learning algorithms are critical. Learning using privileged information (LuPI) offers increased sample efficiency by allowing prediction models access to auxiliary information at training time which is unavailable when the models are used. In recent work, it was shown that for prediction in linear-Gaussian dynamical systems, a LuPI learner with access to intermediate time series data is never worse and often better in expectation than any unbiased classical learner. We provide new insights into this analysis and generalize it to nonlinear prediction tasks in latent dynamical systems, extending theoretical guarantees to the case where the map connecting latent variables and observations is known up to a linear transform. In addition, we propose algorithms based on random features and representation learning for the case when this map is unknown. A suite of empirical results confirm theoretical findings and show the potential of using privileged time-series information in nonlinear prediction.
Submitted 20 November, 2023; v1 submitted 15 September, 2022;
originally announced September 2022.
-
Computing elementary functions using multi-prime argument reduction
Authors:
Fredrik Johansson
Abstract:
We describe an algorithm for arbitrary-precision computation of the elementary functions (exp, log, sin, atan, etc.) which, after a cheap precomputation, gives roughly a factor-two speedup over previous state-of-the-art algorithms at precision from a few thousand bits up to millions of bits. Following an idea of Schönhage, we perform argument reduction using Diophantine combinations of logarithms of primes; our contribution is to use a large set of primes instead of a single pair, aided by a fast algorithm to solve the associated integer relation problem. We also list new, optimized Machin-like formulas for the necessary logarithm and arctangent precomputations.
Submitted 6 July, 2022;
originally announced July 2022.
-
Sharing pattern submodels for prediction with missing values
Authors:
Lena Stempfle,
Ashkan Panahi,
Fredrik D. Johansson
Abstract:
Missing values are unavoidable in many applications of machine learning and present challenges both during training and at test time. When variables are missing in recurring patterns, fitting separate pattern submodels has been proposed as a solution. However, fitting models independently does not make efficient use of all available data. Conversely, fitting a single shared model to the full data set relies on imputation which often leads to biased results when missingness depends on unobserved factors. We propose an alternative approach, called sharing pattern submodels, which i) makes predictions that are robust to missing values at test time, ii) maintains or improves the predictive power of pattern submodels, and iii) has a short description, enabling improved interpretability. Parameter sharing is enforced through sparsity-inducing regularization which we prove leads to consistent estimation. Finally, we give conditions for when a sharing model is optimal, even when both missingness and the target outcome depend on unobserved variables. Classification and regression experiments on synthetic and real-world data sets demonstrate that our models achieve a favorable tradeoff between pattern specialization and information sharing.
Submitted 24 November, 2023; v1 submitted 22 June, 2022;
originally announced June 2022.
-
Image-based Treatment Effect Heterogeneity
Authors:
Connor T. Jerzak,
Fredrik Johansson,
Adel Daoud
Abstract:
Randomized controlled trials (RCTs) are considered the gold standard for estimating the average treatment effect (ATE) of interventions. One use of RCTs is to study the causes of global poverty -- a subject explicitly cited in the 2019 Nobel Memorial Prize awarded to Duflo, Banerjee, and Kremer "for their experimental approach to alleviating global poverty." Because the ATE is a population summary, anti-poverty experiments often seek to unpack the effect variation around the ATE by conditioning (CATE) on tabular variables such as age and ethnicity that were measured during the RCT data collection. Although such variables are key to unpacking CATE, using only such variables may fail to capture historical, geographical, or neighborhood-specific contributors to effect variation, as tabular RCT data are often only observed near the time of the experiment. In global poverty research, when the location of the experiment units is approximately known, satellite imagery can provide a window into such factors important for understanding heterogeneity. However, there is no method that specifically enables applied researchers to analyze CATE from images. In this paper, using a deep probabilistic modeling framework, we develop such a method that estimates latent clusters of images by identifying images with similar treatment effects distributions. Our interpretable image CATE model also includes a sensitivity factor that quantifies the importance of image segments contributing to the effect cluster prediction. We compare the proposed methods against alternatives in simulation; also, we show how the model works in an actual RCT, estimating the effects of an anti-poverty intervention in northern Uganda and obtaining a posterior predictive distribution over effects for the rest of the country where no experimental data was collected. We make all models available in open-source software.
Submitted 25 May, 2023; v1 submitted 13 June, 2022;
originally announced June 2022.
-
Estimating Causal Effects Under Image Confounding Bias with an Application to Poverty in Africa
Authors:
Connor T. Jerzak,
Fredrik Johansson,
Adel Daoud
Abstract:
Observational studies of causal effects require adjustment for confounding factors. In the tabular setting, where these factors are well-defined, separate random variables, the effect of confounding is well understood. However, in public policy, ecology, and in medicine, decisions are often made in non-tabular settings, informed by patterns or objects detected in images (e.g., maps, satellite or tomography imagery). Using such imagery for causal inference presents an opportunity because objects in the image may be related to the treatment and outcome of interest. In these cases, we rely on the images to adjust for confounding but observed data do not directly label the existence of the important objects. Motivated by real-world applications, we formalize this challenge, how it can be handled, and what conditions are sufficient to identify and estimate causal effects. We analyze finite-sample performance using simulation experiments, estimating effects using a propensity adjustment algorithm that employs a machine learning model to estimate the image confounding. Our experiments also examine sensitivity to misspecification of the image pattern mechanism. Finally, we use our methodology to estimate the effects of policy interventions on poverty in African communities from satellite imagery.
Submitted 15 February, 2023; v1 submitted 13 June, 2022;
originally announced June 2022.
-
Implications from secondary emission from neutral impact on Cassini plasma and dust measurements
Authors:
Fredrik Leffe Johansson,
Erik Vigren,
Jack Hunter Waite,
Kelly Miller,
Anders Eriksson,
Niklas Edberg,
Joshua Dreyer
Abstract:
We investigate the role of secondary electron and ion emission from impact of gas molecules on the Cassini Langmuir Probe (RPWS-LP, or LP) measurements in the ionosphere of Saturn. We add a model of the emission currents, based on laboratory measurements and data from comet 1P/Halley, to the equations used to derive plasma parameters from LP bias voltage sweeps. Reanalysing several hundred sweeps from the Cassini Grand Finale orbits, we find reasonable explanations for three open conundrums from previous LP studies of the Saturn ionosphere. We find an explanation for the observed positive charging of the Cassini spacecraft, the possibly overestimated ionospheric electron temperatures, and the excess ion current reported. For the sweeps analysed in detail, we do not find (indirect or direct) evidence of dust having a significant charge-carrying role in Saturn's ionosphere. We also produce an estimate of H2O number density from the last six revolutions of Cassini through Saturn's ionosphere in higher detail than reported by the Ion and Neutral Mass Spectrometer (INMS). Our analysis reveals an ionosphere that is highly structured in latitude across all six final revolutions, with mixing ratios varying by two orders of magnitude in latitude and one order of magnitude between revolutions and altitude. The result is generally consistent with an empirical photochemistry model balancing the production of H+ ions with the H+ loss through charge transfer with e.g., H2O, CH4 and CO2, for which water vapour appears as the likeliest dominant source of the signal in terms of yield and concentration.
Submitted 22 August, 2022; v1 submitted 29 April, 2022;
originally announced May 2022.
-
Measuring poverty in India with machine learning and remote sensing
Authors:
Adel Daoud,
Felipe Jordan,
Makkunda Sharma,
Fredrik Johansson,
Devdatt Dubhashi,
Sourabh Paul,
Subhashis Banerjee
Abstract:
In this paper, we use deep learning to estimate living conditions in India. We use both census and surveys to train the models. Our procedure achieves comparable results to those found in the literature, but for a wide range of outcomes.
Submitted 27 October, 2022; v1 submitted 27 December, 2021;
originally announced February 2022.
-
Growth of Transition Metal Sulfides by Sulfuric Vapor Transport and Liquid Sulfur: Synthesis and Properties
Authors:
D. A. Chareev,
D. Phyual,
D. Karmakar,
A. Nekrasov,
F. O. L. Johansson,
T. Sarkar,
H. Rensmo,
Olle Eriksson,
Anna Delin,
A. N. Vasiliev,
Mahmoud Abdel-Hafiez
Abstract:
Transition metal dichalcogenides (TMDs) are an emergent class of low-dimensional materials with growing applications in the field of nanoelectronics. However, efficient methods for synthesizing large mono-crystals of these systems are still lacking. Here, we describe an efficient synthetic route for a large number of TMDs that were obtained in quartz ampoules by sulfuric vapor transport and liquid sulfur. Crystals of metal sulfides MgS, PdS, PtS2, ReS2, NbS2, TaS2, TaS3, MoS2, WS2, FeS2, CoS2, NiS2, Cr2S3, VS2, In2S3, Bi2S3, TiS2, ZrS3, HfS3, and pure Au were obtained in quartz ampoules by the chemical vapor transport technique with sulfur vapors as the transport agent. Unlike the sublimation technique, the metal enters the gas phase in the form of molecules, hence containing a greater amount of sulfur than the growing crystal. We have investigated the physical properties for a selection of these crystals and compared them to state-of-the-art findings reported in the literature. The acquired x-ray photoemission spectroscopy features demonstrate the overall high quality of single crystals grown in this work as exemplified by ReS2 and CoS2. This new approach to synthesize high-quality transition metal dichalcogenide single crystals can alleviate many material quality concerns and is suitable for emerging electronic devices.
Submitted 31 December, 2021;
originally announced December 2021.
-
Case-based off-policy policy evaluation using prototype learning
Authors:
Anton Matsson,
Fredrik D. Johansson
Abstract:
Importance sampling (IS) is often used to perform off-policy policy evaluation but is prone to several issues, especially when the behavior policy is unknown and must be estimated from data. Significant differences between the target and behavior policies can result in uncertain value estimates due to, for example, high variance and non-evaluated actions. If the behavior policy is estimated using black-box models, it can be hard to diagnose potential problems and to determine for which inputs the policies differ in their suggested actions and resulting values. To address this, we propose estimating the behavior policy for IS using prototype learning. We apply this approach in the evaluation of policies for sepsis treatment, demonstrating how the prototypes give a condensed summary of differences between the target and behavior policies while retaining an accuracy comparable to baseline estimators. We also describe estimated values in terms of the prototypes to better understand which parts of the target policies have the most impact on the estimates. Using a simulator, we study the bias resulting from restricting models to use prototypes.
Submitted 22 November, 2021;
originally announced November 2021.
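As a rough illustration of the importance sampling (IS) estimator that the abstract above builds on, the following minimal sketch reweights observed returns by the ratio of target to behavior action probabilities. It shows plain trajectory-wise IS only, not the prototype-based behavior policy estimation proposed in the paper. The function name `is_estimate` and the callables `target_prob` and `behavior_prob` are hypothetical names chosen for this example.

```python
def is_estimate(trajectories, target_prob, behavior_prob):
    """Ordinary trajectory-wise importance sampling estimate of a policy's value.

    trajectories: list of trajectories, each a list of (state, action, reward).
    target_prob, behavior_prob: hypothetical callables (state, action) -> probability.
    """
    total = 0.0
    for trajectory in trajectories:
        weight = 1.0   # product of per-step probability ratios
        ret = 0.0      # undiscounted return of the trajectory
        for state, action, reward in trajectory:
            weight *= target_prob(state, action) / behavior_prob(state, action)
            ret += reward
        total += weight * ret
    return total / len(trajectories)


# Toy one-step example: the behavior policy picks uniformly between two
# actions, the target policy always picks action 1.
data = [[(0, 1, 1.0)], [(0, 0, 0.0)]]
value = is_estimate(
    data,
    target_prob=lambda s, a: 1.0 if a == 1 else 0.0,
    behavior_prob=lambda s, a: 0.5,
)
```

When the two policies differ strongly, a few trajectories carry very large weights, which is the high-variance problem the abstract refers to.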
-
ADCB: An Alzheimer's disease benchmark for evaluating observational estimators of causal effects
Authors:
Newton Mwai Kinyanjui,
Fredrik D. Johansson
Abstract:
Simulators make unique benchmarks for causal effect estimation since they do not rely on unverifiable assumptions or the ability to intervene on real-world systems, but are often too simple to capture important aspects of real applications. We propose a simulator of Alzheimer's disease aimed at modeling intricacies of healthcare data while enabling benchmarking of causal effect and policy estimators. We fit the system to the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and ground hand-crafted components in results from comparative treatment trials and observational treatment patterns. The simulator includes parameters which alter the nature and difficulty of the causal inference tasks, such as latent variables, effect heterogeneity, length of observed history, behavior policy and sample size. We use the simulator to compare estimators of average and conditional treatment effects.
Submitted 12 November, 2021;
originally announced November 2021.
-
Influence of the Magnetic Sub-Lattices in the Double Perovskite Compound LaCaNiReO$_6$
Authors:
Konstantinos Papadopoulos,
Ola Kenji Forslund,
Elisabetta Nocerino,
Fredrik O. L. Johansson,
Gediminas Simutis,
Nami Matsubara,
Gerald Morris,
Bassam Hitti,
Donald Arseneau,
Jean-Christophe Orain,
Vladimir Pomjakushin,
Peter Svedlindh,
Daniel Andreica,
Lars Börjesson,
Jun Sugiyama,
Martin Månsson,
Yasmine Sassa
Abstract:
The magnetism of double perovskites is a complex phenomenon, determined from intra- or interatomic magnetic moment interactions, and strongly influenced by geometry. We take advantage of the complementary length and time scales of the muon spin rotation, relaxation and resonance ($μ^+$SR) microscopic technique and bulk AC/DC magnetic susceptibility measurements to study the magnetic phases of the LaCaNiReO$_6$ double perovskite. As a result we are able to discern and report a newly found dynamic phase transition and the formation of magnetic domains below and above the known magnetic transition of this compound at T$_N$ = 103 K. $μ^+$SR, serving as a local probe at crystallographic interstitial sites, reveals a transition from a metastable ferrimagnetic ordering below T = 103 K to a stable one below T = 30 K. The fast and slow collective dynamic states of this system are investigated. For 103 K < T < 230 K, two magnetic environments appear: a dense spin region and a static-dilute spin region. The paramagnetic state is obtained only for T > 270 K. An evolution of the interaction between Ni and Re magnetic sublattices in this geometrically frustrated fcc perovskite structure is revealed as a function of temperature and magnetic field, through the critical behaviour and thermal evolution of microscopic and macroscopic physical quantities.
Submitted 12 April, 2022; v1 submitted 10 November, 2021;
originally announced November 2021.
-
Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models
Authors:
Rickard K. A. Karlsson,
Martin Willbo,
Zeshan Hussain,
Rahul G. Krishnan,
David Sontag,
Fredrik D. Johansson
Abstract:
We study prediction of future outcomes with supervised models that use privileged information during learning. The privileged information comprises samples of time series observed between the baseline time of prediction and the future outcome; this information is only available at training time, which differs from traditional supervised learning. Our question is when using this privileged data leads to more sample-efficient learning of models that use only baseline data for predictions at test time. We give an algorithm for this setting and prove that when the time series are drawn from a non-stationary Gaussian-linear dynamical system of fixed horizon, learning with privileged information is more efficient than learning without it. On synthetic data, we test the limits of our algorithm and theory, both when our assumptions hold and when they are violated. On three diverse real-world datasets, we show that our approach is generally preferable to classical learning, particularly when data is scarce. Finally, we relate our estimator to a distillation approach both theoretically and empirically.
Submitted 5 May, 2022; v1 submitted 28 October, 2021;
originally announced October 2021.
-
Rapid computation of special values of Dirichlet $L$-functions
Authors:
Fredrik Johansson
Abstract:
We consider computing the Riemann zeta function $ζ(s)$ and Dirichlet $L$-functions $L(s,χ)$ to $p$-bit accuracy for large $p$. Using the approximate functional equation together with asymptotically fast computation of the incomplete gamma function, we observe that $p^{3/2+o(1)}$ bit complexity can be achieved if $s$ is an algebraic number of fixed degree and with algebraic height bounded by $O(p)$. This is an improvement over the $p^{2+o(1)}$ complexity of previously published algorithms and yields, among other things, $p^{3/2+o(1)}$ complexity algorithms for Stieltjes constants and $n^{3/2+o(1)}$ complexity algorithms for computing the $n$th Bernoulli number or the $n$th Euler number exactly.
Submitted 20 October, 2021;
originally announced October 2021.
-
Arbitrary-precision computation of the gamma function
Authors:
Fredrik Johansson
Abstract:
We discuss the best methods available for computing the gamma function $Γ(z)$ in arbitrary-precision arithmetic with rigorous error bounds. We address different cases: rational, algebraic, real or complex arguments; large or small arguments; low or high precision; with or without precomputation. The methods also cover the log-gamma function $\log Γ(z)$, the digamma function $ψ(z)$, and derivatives $Γ^{(n)}(z)$ and $ψ^{(n)}(z)$. Besides attempting to summarize the existing state of the art, we present some new formulas, estimates, bounds and algorithmic improvements and discuss implementation results.
Submitted 17 September, 2021;
originally announced September 2021.
-
Thompson Sampling for Bandits with Clustered Arms
Authors:
Emil Carlsson,
Devdatt Dubhashi,
Fredrik D. Johansson
Abstract:
We propose algorithms based on a multi-level Thompson sampling scheme, for the stochastic multi-armed bandit and its contextual variant with linear expected rewards, in the setting where arms are clustered. We show, both theoretically and empirically, how exploiting a given cluster structure can significantly improve the regret and computational cost compared to using standard Thompson sampling. In the case of the stochastic multi-armed bandit we give upper bounds on the expected cumulative regret showing how it depends on the quality of the clustering. Finally, we perform an empirical evaluation showing that our algorithms perform well compared to previously proposed algorithms for bandits with clustered arms.
Submitted 15 June, 2022; v1 submitted 6 September, 2021;
originally announced September 2021.
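One way to read the multi-level Thompson sampling idea in the abstract above is as a two-stage draw: first sample a plausible mean for each cluster and pick the best cluster, then do the same among the arms of that cluster. The sketch below follows that reading for Bernoulli rewards with Beta(1, 1) priors. The exact posterior bookkeeping is an assumption made for illustration and may differ from the algorithms in the paper, and the function name `clustered_thompson` is hypothetical.

```python
import random


def clustered_thompson(clusters, true_means, rounds, seed=0):
    """Two-level Thompson sampling sketch for Bernoulli bandits.

    clusters: list of lists of arm indices (a given partition of the arms).
    true_means: success probability of each arm, used only to simulate rewards.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    # Beta parameters [alpha, beta], starting from a uniform Beta(1, 1) prior.
    arm_ab = [[1, 1] for _ in true_means]
    cluster_ab = [[1, 1] for _ in clusters]
    pulls = [0] * len(true_means)
    for _ in range(rounds):
        # Level 1: sample one value per cluster and pick the best cluster.
        c = max(range(len(clusters)),
                key=lambda j: rng.betavariate(*cluster_ab[j]))
        # Level 2: sample one value per arm inside the chosen cluster.
        arm = max(clusters[c], key=lambda i: rng.betavariate(*arm_ab[i]))
        reward = 1 if rng.random() < true_means[arm] else 0
        # Update both the arm posterior and the posterior of its cluster.
        for ab in (arm_ab[arm], cluster_ab[c]):
            ab[0] += reward
            ab[1] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = clustered_thompson([[0, 1], [2, 3]], [0.9, 0.3, 0.2, 0.1], rounds=2000)
```

Each round draws only one sample per cluster plus one per arm in the selected cluster, rather than one per arm overall, which hints at the computational saving the abstract mentions.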
-
Plasma densities, flow and Solar EUV flux at comet 67P - A cross-calibration approach
Authors:
F. L. Johansson,
A. I. Eriksson,
E. Vigren,
L. Bucciantini,
P. Henri,
H. Nilsson,
S. Bergman,
N. J. T. Edberg,
G. Stenberg Wieser,
E. Odelstad
Abstract:
During its two-year mission at comet 67P, Rosetta nearly continuously monitored the inner coma plasma environment for gas production rates varying over three orders of magnitude, at distances to the nucleus from a few to a few hundred km. To achieve the best possible measurements, cross-calibration of the plasma instruments is needed. We construct two different physical models to cross-calibrate the electron density as measured by the Mutual Impedance Probe (MIP) to the ion current and spacecraft potential as measured by the Rosetta Langmuir Probe (LAP), the latter validated with the Ion Composition Analyser (ICA). We retrieve a continuous plasma density dataset for the entire cometary mission with a much improved dynamical range compared to any plasma instrument alone and, at times, improve the temporal resolution from 0.24-0.74 Hz to 57.8 Hz. The new density dataset is consistent with the existing MIP density dataset and covers long time periods where densities were too low to be measured by MIP. The physical model also yields, at 3 hour time resolution, ion flow speeds as well as a proxy for the solar EUV flux from the photoemission from the Langmuir Probes. We report on two independent mission-wide estimates of the ion flow speed which are consistent with the bulk H$_2$O$^+$ ion velocities as measured by ICA. We find the ion flow to consistently be much faster than the neutral gas over the entire mission, lending further evidence that the ions are collisionally decoupled from the neutrals in the coma. RPC measurements of ion speeds are therefore not consistent with the assumptions made in previously published plasma density models of the comet ionosphere at the start and end of the mission. Also, the measured EUV flux is perfectly consistent with independently derived values previously published by Johansson et al. (2017) and lends support for the conclusions drawn therein.
Submitted 29 June, 2021;
originally announced June 2021.
-
Learning Approximate and Exact Numeral Systems via Reinforcement Learning
Authors:
Emil Carlsson,
Devdatt Dubhashi,
Fredrik D. Johansson
Abstract:
Recent work (Xu et al., 2020) has suggested that numeral systems in different languages are shaped by a functional need for efficient communication in an information-theoretic sense. Here we take a learning-theoretic approach and show how efficient communication emerges via reinforcement learning. In our framework, two artificial agents play a Lewis signaling game where the goal is to convey a numeral concept. The agents gradually learn to communicate using reinforcement learning and the resulting numeral systems are shown to be efficient in the information-theoretic framework of Regier et al. (2015); Gibson et al. (2017). They are also shown to be similar to human numeral systems of the same type. Our results thus provide a mechanistic explanation via reinforcement learning of the recent results in Xu et al. (2020) and can potentially be generalized to other semantic domains.
Submitted 30 April, 2024; v1 submitted 28 May, 2021;
originally announced May 2021.
-
Multi-instrument analysis of far-ultraviolet aurora in the southern hemisphere of comet 67P/Churyumov-Gerasimenko
Authors:
P. Stephenson,
M. Galand,
P. D. Feldman,
A. Beth,
M. Rubin,
D. Bockelée-Morvan,
N. Biver,
Y. -C Cheng,
J. Parker,
J. Burch,
F. L. Johansson,
A. Eriksson
Abstract:
Aims. We aim to determine whether dissociative excitation of cometary neutrals by electron impact is the major source of far-ultraviolet (FUV) emissions at comet 67P/Churyumov-Gerasimenko in the southern hemisphere at large heliocentric distances, both during quiet conditions and impacts of corotating interaction regions observed in the summer of 2016.
Methods. We combined multiple datasets from the Rosetta mission through a multi-instrument analysis to complete the first forward modelling of FUV emissions in the southern hemisphere of comet 67P and compared modelled brightnesses to observations with the Alice FUV imaging spectrograph. We modelled the brightness of OI1356, OI1304, Lyman-$β$, CI1657, and CII1335 emissions, which are associated with the dissociation products of the four major neutral species in the coma: CO$_2$, H$_2$O, CO, and O$_2$. The suprathermal electron population was probed by RPC/IES and the neutral column density was constrained by several instruments: ROSINA, MIRO and VIRTIS.
Results. The modelled and observed brightnesses of the FUV emission lines agree closely when viewing nadir and dissociative excitation by electron impact is shown to be the dominant source of emissions away from perihelion. The CII1335 emissions are shown to be consistent with the volume mixing ratio of CO derived from ROSINA. When viewing the limb during the impacts of corotating interaction regions, the model reproduces brightnesses of OI1356 and CI1657 well, but resonance scattering in the extended coma may contribute significantly to the observed Lyman-$β$ and OI1304 emissions. The correlation between variations in the suprathermal electron flux and the observed FUV line brightnesses when viewing the comet's limb suggests electrons are accelerated on large scales and that they originate in the solar wind. This means that the FUV emissions are auroral in nature.
Submitted 28 January, 2021;
originally announced January 2021.
-
The CoESCA station at BESSY: Auger electron-Photoelectron coincidences from surfaces demonstrated for Ag MNN
Authors:
T. Leitner,
A. Born,
I. Bidermane,
R. Ovsyannikov,
F. O. L. Johansson,
Y. Sassa,
A. Föhlisch,
A. Lindblad,
F. O. Schumann,
S. Svensson,
N. Mårtensson
Abstract:
In this work, we present the CoESCA station for electron-electron coincidence spectroscopy from surfaces, built in a close collaboration between Uppsala University and Helmholtz-Zentrum Berlin at the BESSY II synchrotron facility in Berlin, Germany. We start with a detailed overview of previous work in the field of electron-electron coincidences, before we describe the CoESCA setup and its design parameters. The system is capable of recording shot-to-shot resolved 6D coincidence datasets, i.e. the kinetic energy and the two take-off angles for both coincident electrons. The mathematics behind extracting and analysing these multi-dimensional coincidence datasets is introduced, with a focus on coincidence statistics, resulting in fundamental limits of the signal-to-noise ratio and its implications for acquisition times and the size of the raw data stream. The functionality of the CoESCA station is demonstrated for the example of Auger electron - photoelectron coincidences from silver surfaces for photoelectrons from the Ag 3d core levels and their corresponding MNN Auger electrons. The Auger spectra originating from the different core levels, 3d$_{3/2}$ and 3d$_{5/2}$, could be separated and, further, the two-hole state energy distributions were determined for these Auger decay channels.
Submitted 17 December, 2020;
originally announced December 2020.
-
Computing isolated coefficients of the $j$-function
Authors:
Fredrik Johansson
Abstract:
We consider the problem of efficiently computing isolated coefficients $c_n$ in the Fourier series of the elliptic modular function $j(τ)$. We show that a hybrid numerical-modular method with complexity $n^{1+o(1)}$ is efficient in practice. As an application, we locate the first few values of $c_n$ that are prime, the first occurring at $n = 457871$.
Submitted 30 November, 2020;
originally announced November 2020.
-
On a fast and nearly division-free algorithm for the characteristic polynomial
Authors:
Fredrik Johansson
Abstract:
We review the Preparata-Sarwate algorithm, a simple $O(n^{3.5})$ method for computing the characteristic polynomial, determinant and adjugate of an $n \times n$ matrix using only ring operations together with exact divisions by small integers. The algorithm is a baby-step giant-step version of the more well-known Faddeev-Leverrier algorithm. We make a few comments about the algorithm and evaluate its performance empirically.
Submitted 25 November, 2020;
originally announced November 2020.
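For orientation, the classical Faddeev-Leverrier recurrence that the abstract above refers to can be sketched as follows. This is the plain version, which uses on the order of n matrix products, and not the baby-step giant-step Preparata-Sarwate variant reviewed in the paper. It computes the coefficients of det(xI - A) with ring operations and divisions by the small integers 1, ..., n. The function names and the use of `Fraction` for exact arithmetic are choices made for this example.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def charpoly_faddeev_leverrier(A):
    """Coefficients [c_0, ..., c_n] of det(x*I - A) = sum_k c_k x^k."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(0)] * (n + 1)
    coeffs[n] = Fraction(1)                      # the polynomial is monic
    M = [[Fraction(0)] * n for _ in range(n)]    # M_0 = 0
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        M = matmul(A, M)
        for i in range(n):
            M[i][i] += coeffs[n - k + 1]
        # c_{n-k} = -trace(A * M_k) / k, a division by the small integer k
        AM = matmul(A, M)
        coeffs[n - k] = -sum(AM[i][i] for i in range(n)) / k
    return coeffs


# For [[2, 1], [1, 3]] the characteristic polynomial is x^2 - 5x + 5.
coeffs = charpoly_faddeev_leverrier([[2, 1], [1, 3]])
```

The determinant can be read off as (-1)^n c_0, which equals 5 for the 2 x 2 example.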
-
Electronic Coupling between the Unoccupied States of the Organic and Inorganic Sub-Lattices of Methylammonium Lead Iodide a Hybrid Organic-Inorganic Perovskite Single Crystal
Authors:
Gabriel J. Man,
Cody M. Sterling,
Chinnathambi Kamal,
Konstantin A. Simonov,
Sebastian Svanström,
Joydev Acharya,
Fredrik O. L. Johansson,
Erika Giangrisostomi,
Ruslan Ovsyannikov,
Thomas Huthwelker,
Sergei M. Butorin,
Pabitra K. Nayak,
Michael Odelius,
Håkan Rensmo
Abstract:
Organic-inorganic halide perovskites have been intensively re-investigated due to their applications, yet the opto-electronic function of the organic cation remains unclear. Through organic-selective resonant Auger electron spectroscopy measurements on well-defined single crystal surfaces, we find evidence for electronic coupling in the unoccupied states between the organic and inorganic sub-lattices of the prototypical hybrid perovskite, which is contrary to the notion based on previous studies that the organic cation is electronically inert. The coupling is relevant for electron dynamics in the material and for understanding opto-electronic functionality.
Submitted 18 May, 2021; v1 submitted 3 November, 2020;
originally announced November 2020.
-
Calcium: computing in exact real and complex fields
Authors:
Fredrik Johansson
Abstract:
Calcium is a C library for real and complex numbers in a form suitable for exact algebraic and symbolic computation. Numbers are represented as elements of fields $\mathbb{Q}(a_1,\ldots,a_n)$ where the extension numbers $a_k$ may be algebraic or transcendental. The system combines efficient field operations with automatic discovery and certification of algebraic relations, resulting in a practical computational model of $\mathbb{R}$ and $\mathbb{C}$ in which equality is rigorously decidable for a large class of numbers.
Submitted 3 November, 2020;
originally announced November 2020.
-
A charging model for the Rosetta spacecraft
Authors:
F. L. Johansson,
A. I. Eriksson,
N. Gilet,
P. Henri,
G. Wattieaux,
M. G. G. T. Taylor,
C. Imhof,
F. Cipriani
Abstract:
Context. The electrostatic potential of a spacecraft, VS, is important for the capabilities of in situ plasma measurements. Rosetta has been found to be negatively charged during most of the comet mission and even more so in denser plasmas. Aims. Our goal is to investigate how the negative VS correlates with electron density and temperature and to understand the physics of the observed correlation. Methods. We applied full mission comparative statistics of VS, electron temperature, and electron density to establish VS dependence on cold and warm plasma density and electron temperature. We also used Spacecraft-Plasma Interaction System (SPIS) simulations and an analytical vacuum model to investigate if positively biased elements covering a fraction of the solar array surface can explain the observed correlations. Results. Here, the VS was found to depend more on electron density, particularly with regard to the cold part of the electrons, and less on electron temperature than was expected for the high flux of thermal (cometary) ionospheric electrons. This behaviour was reproduced by an analytical model which is consistent with numerical simulations. Conclusions. Rosetta is negatively driven mainly by positively biased elements on the borders of the front side of the solar panels as these can efficiently collect cold plasma electrons. Biased elements distributed elsewhere on the front side of the panels are less efficient at collecting electrons apart from locally produced electrons (photoelectrons). To avoid significant charging, future spacecraft may minimise the area of exposed bias conductors or use a positive ground power system.
Submitted 21 August, 2020;
originally announced August 2020.
-
Learning to search efficiently for causally near-optimal treatments
Authors:
Samuel Håkansson,
Viktor Lindblom,
Omer Gottesman,
Fredrik D. Johansson
Abstract:
Finding an effective medical treatment often requires a search by trial and error. Making this search more efficient by minimizing the number of unnecessary trials could lower both costs and patient suffering. We formalize this problem as learning a policy for finding a near-optimal treatment in a minimum number of trials using a causal inference framework. We give a model-based dynamic programming algorithm which learns from observational data while being robust to unmeasured confounding. To reduce time complexity, we suggest a greedy algorithm which bounds the near-optimality constraint. The methods are evaluated on synthetic and real-world healthcare data and compared to model-free reinforcement learning. We find that our methods compare favorably to the model-free baseline while offering a more transparent trade-off between search time and treatment efficacy.
Submitted 17 February, 2021; v1 submitted 2 July, 2020;
originally announced July 2020.
-
Gone in 23 Attoseconds: Charge Transfer in Resonantly Core Excited Black Phosphorous
Authors:
Fredrik O. L. Johansson,
Yasmine Sassa,
Tomas Edvinsson,
Andreas Lindblad
Abstract:
How fast are the processes we can measure? Attosecond physics addresses the limit of measurable time in science. Atomic X-ray excited states offer a way to study extremely fast dynamics with chemical specificity. In black phosphorus, an X-ray excited electron can relocate in 22.7 attoseconds. Using the lifetime of the P 1s core-hole as time-base, the radiationless decay spectrum can be used to study charge transfer processes on the time-scale of the atomic unit of time (24 attoseconds). We demonstrate that the technique can be extended to within a few percent of the core hole's lifetime, an order of magnitude smaller than previously thought.
Submitted 14 February, 2020;
originally announced March 2020.
-
FunGrim: a symbolic library for special functions
Authors:
Fredrik Johansson
Abstract:
We present the Mathematical Functions Grimoire (FunGrim), a website and database of formulas and theorems for special functions. We also discuss the symbolic computation library used as the backend and main development tool for FunGrim, and the Grim formula language used in these projects to represent mathematical content semantically.
Submitted 13 March, 2020;
originally announced March 2020.
-
Kagome silicene: a novel exotic form of two-dimensional epitaxial silicon
Authors:
Y. Sassa,
F. O. L. Johansson,
A. Lindblad,
M. G. Yazdi,
K. Simonov,
J. Weissenrieder,
M. Muntwiler,
F. Iyikanat,
H. Sahin,
T. Angot,
E. Salomon,
G. Le Lay
Abstract:
Since the discovery of graphene, intensive efforts have been made in search of novel two-dimensional (2D) materials. Decreasing the materials' dimensionality to their ultimate thinness is a promising route to unveil new physical phenomena, and potentially improve the performance of devices. Among recent 2D materials, the group IV analogs of graphene have attracted much attention for their unexpected and tunable physical properties. Depending on the growth conditions and substrates, several structures of silicene, germanene, and stanene can be formed. Here, we report the synthesis of a Kagome lattice of silicene on aluminum (111) substrates. We provide evidence of such an exotic 2D Si allotrope through scanning tunneling microscopy (STM) observations, high-resolution core-level (CL) and angle-resolved photoelectron spectroscopy (ARPES) measurements, along with Density Functional Theory calculations.
Submitted 30 January, 2020;
originally announced January 2020.
-
Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects
Authors:
Fredrik D. Johansson,
Uri Shalit,
Nathan Kallus,
David Sontag
Abstract:
Practitioners in diverse fields such as healthcare, economics and education are eager to apply machine learning to improve decision making. The cost and impracticality of performing experiments and a recent monumental increase in electronic record keeping have brought attention to the problem of evaluating decisions based on non-experimental observational data. This is the setting of this work. In particular, we study estimation of individual-level causal effects, such as a single patient's response to alternative medication, from recorded contexts, decisions and outcomes. We give generalization bounds on the error in estimated effects based on distance measures between groups receiving different treatments, allowing for sample re-weighting. We provide conditions under which our bound is tight and show how it relates to results for unsupervised domain adaptation. Led by our theoretical results, we devise representation learning algorithms that minimize our bound, by regularizing the representation's induced treatment group distance, and encourage sharing of information between treatment groups. We extend these algorithms to simultaneously learn a weighted representation to further reduce treatment group distances. Finally, an experimental evaluation on real and synthetic data shows the value of our proposed representation architecture and regularization scheme.
Submitted 31 July, 2023; v1 submitted 21 January, 2020;
originally announced January 2020.
-
Estimation of Bounds on Potential Outcomes For Decision Making
Authors:
Maggie Makar,
Fredrik D. Johansson,
John Guttag,
David Sontag
Abstract:
Estimation of individual treatment effects is commonly used as the basis for contextual decision making in fields such as healthcare, education, and economics. However, it is often sufficient for the decision maker to have estimates of upper and lower bounds on the potential outcomes of decision alternatives to assess risks and benefits. We show that, in such cases, we can improve sample efficiency by estimating simple functions that bound these outcomes instead of estimating their conditional expectations, which may be complex and hard to estimate. Our analysis highlights a trade-off between the complexity of the learning task and the confidence with which the learned bounds hold. Guided by these findings, we develop an algorithm for learning upper and lower bounds on potential outcomes which optimize an objective function defined by the decision maker, subject to the probability that bounds are violated being small. Using a clinical dataset and a well-known causality benchmark, we demonstrate that our algorithm outperforms baselines, providing tighter, more reliable bounds.
Submitted 12 August, 2020; v1 submitted 10 October, 2019;
originally announced October 2019.
-
The evolution of the electron number density in the coma of comet 67P at the location of Rosetta from 2015 November through 2016 March
Authors:
Erik Vigren,
Niklas J. T. Edberg,
Anders I. Eriksson,
Marina Galand,
Pierre Henri,
Fredrik L. Johansson,
Elias Odelstad,
Martin Rubin,
Xavier Vallieres
Abstract:
A comet ionospheric model assuming the plasma to move radially outward with the same bulk speed as the neutral gas and not being subject to severe reduction through dissociative recombination has previously been tested in a series of case studies associated with the Rosetta mission at comet 67P/Churyumov-Gerasimenko. It has been found that at low activity and within several tens of km from the nucleus such models (which originally were developed for such conditions) generally work well in reproducing observed electron number densities, in particular when plasma production through both photoionization and electron-impact ionization is taken into account. Near perihelion, case studies have, on the contrary, shown that applying similar assumptions overestimates the observed electron number densities at the location of Rosetta. Here we compare ROSINA/COPS driven model results with RPC/MIP derived electron number densities for an extended time period (2015 November through 2016 March) during the post-perihelion phase with southern summer/spring. We observe a gradual transition from a state when the model grossly overestimates (by more than a factor of 10) the observations to being in reasonable agreement during 2016 March.
Submitted 2 September, 2019;
originally announced September 2019.
-
Characterization of Overlap in Observational Studies
Authors:
Michael Oberst,
Fredrik D. Johansson,
Dennis Wei,
Tian Gao,
Gabriel Brat,
David Sontag,
Kush R. Varshney
Abstract:
Overlap between treatment groups is required for non-parametric estimation of causal effects. If a subgroup of subjects always receives the same intervention, we cannot estimate the effect of intervention changes on that subgroup without further assumptions. When overlap does not hold globally, characterizing local regions of overlap can inform the relevance of causal conclusions for new subjects, and can help guide additional data collection. To have impact, these descriptions must be interpretable for downstream users who are not machine learning experts, such as policy makers. We formalize overlap estimation as a problem of finding minimum volume sets subject to coverage constraints and reduce this problem to binary classification with Boolean rule classifiers. We then generalize this method to estimate overlap in off-policy policy evaluation. In several real-world applications, we demonstrate that these rules have comparable accuracy to black-box estimators and provide intuitive and informative explanations that can inform policy making.
Submitted 3 June, 2020; v1 submitted 9 July, 2019;
originally announced July 2019.
-
A Survey on Graph Kernels
Authors:
Nils M. Kriege,
Fredrik D. Johansson,
Christopher Morris
Abstract:
Graph kernels have become an established and widely-used technique for solving classification tasks on graphs. This survey gives a comprehensive overview of techniques for kernel-based graph classification developed in the past 15 years. We describe and categorize graph kernels based on properties inherent to their design, such as the nature of their extracted graph features, their method of computation and their applicability to problems in practice. In an extensive experimental evaluation, we study the classification accuracy of a large suite of graph kernels on established benchmarks as well as new datasets. We compare the performance of popular kernels with several baseline methods and study the effect of applying a Gaussian RBF kernel to the metric induced by a graph kernel. In doing so, we find that simple baselines become competitive after this transformation on some datasets. Moreover, we study the extent to which existing graph kernels agree in their predictions (and prediction errors) and obtain a data-driven categorization of kernels as a result. Finally, based on our experimental results, we derive a practitioner's guide to kernel-based graph classification.
Submitted 4 February, 2020; v1 submitted 28 March, 2019;
originally announced March 2019.
-
Support and Invertibility in Domain-Invariant Representations
Authors:
Fredrik D. Johansson,
David Sontag,
Rajesh Ranganath
Abstract:
Learning domain-invariant representations has become a popular approach to unsupervised domain adaptation and is often justified by invoking a particular suite of theoretical results. We argue that there are two significant flaws in such arguments. First, the results in question hold only for a fixed representation and do not account for information lost in non-invertible transformations. Second, domain invariance is often a far too strict requirement and does not always lead to consistent estimation, even under strong and favorable assumptions. In this work, we give generalization bounds for unsupervised domain adaptation that hold for any representation function by acknowledging the cost of non-invertibility. In addition, we show that penalizing distance between densities is often wasteful and propose a bound based on measuring the extent to which the support of the source domain covers the target domain. We perform experiments on well-known benchmarks that illustrate the shortcomings of current standard practice.
Submitted 3 July, 2019; v1 submitted 8 March, 2019;
originally announced March 2019.
-
Solar flares observed by Rosetta at comet 67P
Authors:
N. J. T. Edberg,
F. L. Johansson,
A. I. Eriksson,
D. J. Andrews,
R. Hajra,
P. Henri,
C. Simon Wedlund,
M. Alho,
E. Thiemann
Abstract:
Context. The Rosetta spacecraft made continuous measurements of the coma of comet 67P/Churyumov-Gerasimenko (67P) for more than two years. The plasma in the coma appeared very dynamic, and many factors control its variability. Aims. We wish to identify the effects of solar flares on the comet plasma and also their effect on the measurements by the Langmuir Probe Instrument (LAP). Methods. To identify the effects of flares, we proceeded from an existing flare catalog of Earth-directed solar flares, from which a new list was created that only included Rosetta-directed flares. We also used measurements of flares at Mars when at similar longitudes as Rosetta. The flare irradiance spectral model (FISM v.1) and its Mars equivalent (FISM-M) produce an extreme-ultraviolet (EUV) irradiance (10-120 nm) of the flares at 1 min resolution. LAP data and density measurements obtained with the Mutual Impedance Probe (MIP) from the time of arrival of the flares at Rosetta were examined to determine the flare effects. Results. From the vantage point of Earth, 1504 flares directed toward Rosetta occurred during the mission. In only 24 of these, that is, 1.6%, was the increase in EUV irradiance large enough to cause an observable effect in LAP data. Twenty-four Mars-directed flares were also observed in Rosetta data. The effect of the flares was to increase the photoelectron current by typically 1-5 nA. We find little evidence that the solar flares increase the plasma density, at least not above the background variability. Conclusions. Solar flares have a small effect on the photoelectron current of the LAP instrument, and they are not significant in comparison to other factors that control the plasma density in the coma. The photoelectron current can only be used for flare detection during periods of calm plasma conditions.
Submitted 28 February, 2019;
originally announced February 2019.
-
Faster arbitrary-precision dot product and matrix multiplication
Authors:
Fredrik Johansson
Abstract:
We present algorithms for real and complex dot product and matrix multiplication in arbitrary-precision floating-point and ball arithmetic. A low-overhead dot product is implemented on the level of GMP limb arrays; it is about twice as fast as previous code in MPFR and Arb at precision up to several hundred bits. Up to 128 bits, it is 3-4 times as fast, costing 20-30 cycles per term for floating-point evaluation and 40-50 cycles per term for balls. We handle large matrix multiplications even more efficiently via blocks of scaled integer matrices. The new methods are implemented in Arb and significantly speed up polynomial operations and linear algebra.
Submitted 10 May, 2019; v1 submitted 14 January, 2019;
originally announced January 2019.
-
Machine Learning Analysis of Heterogeneity in the Effect of Student Mindset Interventions
Authors:
Fredrik D. Johansson
Abstract:
We study heterogeneity in the effect of a mindset intervention on student-level performance through an observational dataset from the National Study of Learning Mindsets (NSLM). Our analysis uses machine learning (ML) to address the following associated problems: assessing treatment group overlap and covariate balance, imputing conditional average treatment effects, and interpreting imputed effects. By comparing several different model families we illustrate the flexibility of both off-the-shelf and purpose-built estimators. We find that the mindset intervention has a positive average effect of 0.26, 95%-CI [0.22, 0.30], and that heterogeneity in the range of [0.1, 0.4] is moderated by school-level achievement level, poverty concentration, urbanicity, and student prior expectations.
Submitted 14 November, 2018;
originally announced November 2018.
-
Solar wind interaction with comet 67P: impacts of corotating interaction regions
Authors:
Niklas J. T. Edberg,
A. I. Eriksson,
E. Odelstad,
E. Vigren,
D. J. Andrews,
F. Johansson,
J. L. Burch,
C. M. Carr,
E. Cupido,
K. -H. Glassmeier,
R. Goldstein,
J. S. Halekas,
P. Henri,
J. -P. Lebreton,
K. Mandt,
P. Mokashi,
Z. Nemeth,
H. Nilsson,
R. Ramstad,
I. Richter,
G. Stenberg Wieser
Abstract:
We present observations from the Rosetta Plasma Consortium of the effects of stormy solar wind on comet 67P/Churyumov-Gerasimenko. Four corotating interaction regions (CIRs), where the first event has possibly merged with a coronal mass ejection (CME), are traced from Earth via Mars (using Mars Express and MAVEN) and to comet 67P from October to December 2014. When the comet is 3.1-2.7 AU from the Sun and the neutral outgassing rate is $\sim10^{25}-10^{26}$ s$^{-1}$, the CIRs significantly influence the cometary plasma environment at altitudes down to 10-30 km. The ionospheric low-energy ($\sim$5 eV) plasma density increases significantly in all events, by a factor $>2$ in events 1-2 but less in events 3-4. The spacecraft potential drops below -20 V upon impact when the flux of electrons increases. The increased density is likely caused by compression of the plasma environment, increased particle impact ionisation, and possibly charge exchange processes and acceleration of mass loaded plasma back to the comet ionosphere. During all events, the fluxes of suprathermal ($\sim$10-100 eV) electrons increase significantly, suggesting that the heating mechanism of these electrons is coupled to the solar wind energy input. At impact the magnetic field strength in the coma increases by a factor of ~2-5 as more interplanetary magnetic field piles up around the comet. During two CIR impact events, we observe possible plasma boundaries forming, or moving past Rosetta, as the strong solar wind compresses the cometary plasma environment. We also discuss the possibility of seeing some signatures of the ionospheric response to tail disconnection events.
Submitted 14 September, 2018;
originally announced September 2018.
-
CME impact on comet 67P/Churyumov-Gerasimenko
Authors:
Niklas J. T. Edberg,
M. Alho,
M. André,
D. J. Andrews,
E. Behar,
J. L. Burch,
C. M. Carr,
E. Cupido,
I. A. D. Engelhardt,
A. I. Eriksson,
K. -H. Glassmeier,
C. Goetz,
R. Goldstein,
P. Henri,
F. L. Johansson,
C. Koenders,
K. Mandt,
H. Nilsson,
E. Odelstad,
I. Richter,
C. Simon Wedlund,
G. Stenberg Wieser,
K. Szego,
E. Vigren,
M. Volwerk
Abstract:
We present Rosetta observations from comet 67P/Churyumov-Gerasimenko during the impact of a coronal mass ejection (CME). The CME impacted on 5-6 Oct 2015, when Rosetta was about 800 km from the comet nucleus, and 1.4 AU from the Sun. Upon impact, the plasma environment is compressed to the level that solar wind ions, not seen a few days earlier when at 1500 km, now reach Rosetta. In response to the compression, the flux of suprathermal electrons increases by a factor of 5-10 and the background magnetic field strength increases by a factor of $\sim$2.5. The plasma density increases by a factor of 10 and reaches 600 cm$^{-3}$, due to increased particle impact ionisation, charge exchange and the adiabatic compression of the plasma environment. We also observe unprecedentedly large magnetic field spikes at 800 km, reaching above 200 nT, which are interpreted as magnetic flux ropes. We suggest that these could possibly be formed by magnetic reconnection processes in the coma as the magnetic field across the CME changes polarity, or as a consequence of strong shears causing Kelvin-Helmholtz instabilities in the plasma flow. Due to the limited orbit of Rosetta, we are not able to observe if a tail disconnection occurs during the CME impact, which could be expected based on previous remote observations of other CME-comet interactions.
Submitted 13 September, 2018;
originally announced September 2018.
-
Ion velocity and electron temperature inside and around the diamagnetic cavity of comet 67P
Authors:
Elias Odelstad,
Anders I. Eriksson,
Fredrik L. Johansson,
Erik Vigren,
Pierre Henri,
Nicolas Gilet,
Kevin L. Heritier,
Xavier Vallières,
Martin Rubin,
Mats André
Abstract:
A major point of interest in cometary plasma physics has been the diamagnetic cavity, an unmagnetized region in the innermost part of the coma. Here, we combine Langmuir and Mutual Impedance Probe measurements to investigate ion velocities and electron temperatures in the diamagnetic cavity of comet 67P, probed by the Rosetta spacecraft. We find ion velocities generally in the range 2-4 km/s, significantly above the expected neutral velocity $\lesssim$1 km/s, showing that the ions are (partially) decoupled from the neutrals, indicating that ion-neutral drag was not responsible for balancing the outside magnetic pressure. Observations of clear wake effects on one of the Langmuir probes showed that the ion flow was close to radial and supersonic, at least w.r.t. the perpendicular temperature, inside the cavity and possibly in the surrounding region as well. We observed spacecraft potentials $\lesssim$-5 V throughout the cavity, showing that a population of warm ($\sim$5 eV) electrons was present throughout the parts of the cavity reached by Rosetta. Also, a population of cold ($\lesssim0.1$ eV) electrons was consistently observed throughout the cavity, but less consistently in the surrounding region, suggesting that while Rosetta never entered a region of collisionally coupled electrons, such a region was possibly not far away during the cavity crossings.
Submitted 10 August, 2018;
originally announced August 2018.