-
How Much Freedom Does An Effectiveness Metric Really Have?
Authors:
Alistair Moffat,
Joel Mackenzie
Abstract:
It is tempting to assume that because effectiveness metrics have free choice to assign scores to search engine result pages (SERPs) there must thus be a similar degree of freedom as to the relative order that SERP pairs can be put into. In fact that second freedom is, to a considerable degree, illusory. That's because if one SERP in a pair has been given a certain score by a metric, fundamental ordering constraints in many cases then dictate that the score for the second SERP must be either not less than, or not greater than, the score assigned to the first SERP. We refer to these fixed relationships as innate pairwise SERP orderings. Our first goal in this work is to describe and defend those pairwise SERP relationship constraints, and tabulate their relative occurrence via both exhaustive and empirical experimentation.
We then consider how to employ such innate pairwise relationships in IR experiments, leading to a proposal for a new measurement paradigm. Specifically, we argue that tables of results in which many different metrics are listed for champion versus challenger system comparisons should be avoided; and that instead a single metric be argued for in principled terms, with any relationships identified by that metric then reinforced via an assessment of the innate relationship as to whether other metrics - indeed, all other metrics - are likely to yield the same system-vs-system outcome.
Submitted 22 January, 2024; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Architecture Optimization Dramatically Improves Reverse Bias Stability in Perovskite Solar Cells: A Role of Polymer Hole Transport Layers
Authors:
Fangyuan Jiang,
Yangwei Shi,
Tanka R. Rana,
Daniel Morales,
Isaac Gould,
Declan P. McCarthy,
Joel Smith,
Grey Christoforo,
Hannah Contreras,
Stephen Barlow,
Aditya D. Mohite,
Henry Snaith,
Seth R. Marder,
J. Devin MacKenzie,
Michael D. McGehee,
David S. Ginger
Abstract:
We report that device architecture engineering has a substantial impact on the reverse bias instability that has been reported as a critical issue in commercializing perovskite solar cells. We demonstrate breakdown voltages exceeding -15 V in typical pin structured perovskite solar cells via two steps: i) using polymer hole transporting materials; ii) using a more electrochemically stable gold electrode. While device degradation can be exacerbated by higher reverse bias and prolonged exposure, our as-fabricated perovskite solar cells completely recover their performance even after stressing at -7 V for 9 hours both in the dark and under partial illumination. Following these observations, we systematically discuss and compare the reverse bias driven degradation pathways in perovskite solar cells with different device architectures. Our model highlights the role of electrochemical reaction rates and species in dictating the reverse bias stability of perovskite solar cells.
Submitted 15 August, 2023;
originally announced August 2023.
-
Exploring the Representation Power of SPLADE Models
Authors:
Joel Mackenzie,
Shengyao Zhuang,
Guido Zuccon
Abstract:
The SPLADE (SParse Lexical AnD Expansion) model is a highly effective approach to learned sparse retrieval, where documents are represented by term impact scores derived from large language models. During training, SPLADE applies regularization to ensure postings lists are kept sparse -- with the aim of mimicking the properties of natural term distributions -- allowing efficient and effective lexical matching and ranking. However, we hypothesize that SPLADE may encode additional signals into common postings lists to further improve effectiveness. To explore this idea, we perform a number of empirical analyses where we re-train SPLADE with different, controlled vocabularies and measure how effective it is at ranking passages. Our findings suggest that SPLADE can effectively encode useful ranking signals in documents even when the vocabulary is constrained to terms that are not traditionally useful for ranking, such as stopwords or even random words.
Submitted 29 June, 2023;
originally announced June 2023.
-
Efficient Immediate-Access Dynamic Indexing
Authors:
Alistair Moffat,
Joel Mackenzie
Abstract:
In a dynamic retrieval system, documents must be ingested as they arrive, and be immediately findable by queries. Our purpose in this paper is to describe an index structure and processing regime that accommodates that requirement for immediate access, seeking to make the ingestion process as streamlined as possible, while at the same time seeking to make the growing index as small as possible, and seeking to make term-based querying via the index as efficient as possible. We describe a new compression operation and a novel approach to extensible lists which together facilitate that triple goal. In particular, the structure we describe provides incremental document-level indexing using as little as two bytes per posting and only a small amount more for word-level indexing; provides fast document insertion; supports immediate and continuous queryability; provides support for fast conjunctive queries and similarity score-based ranked queries; and facilitates fast conversion of the dynamic index to a "normal" static compressed inverted index structure. Measurement of our new mechanism confirms that in-memory dynamic document-level indexes for collections into the gigabyte range can be constructed at a rate of two gigabytes/minute using a typical server architecture, that multi-term conjunctive Boolean queries can be resolved in just a few milliseconds each on average even while new documents are being concurrently ingested, and that the net memory space required for all of the required data structures amounts to an average of as little as two bytes per stored posting, less than half the space required by the best previous mechanism.
Submitted 10 January, 2023; v1 submitted 11 November, 2022;
originally announced November 2022.
-
Post-processing of coronary and myocardial spatial data
Authors:
Jay Aodh Mackenzie,
Megan Jeanne Miller,
Nicholas Hill,
Mette Olufsen
Abstract:
Numerical simulations of real-world phenomena are implemented with at least two parts: the computational scheme and the computational domain. In the context of hemodynamics, the computational domain of a simulation represents the blood vessel network through which blood flows. Such blood vessel networks can contain millions of individual vessels that are joined together in series and parallel to form the network. It is computationally unfeasible to explicitly simulate blood flow in all blood vessels. Here, from imaged data of a single porcine left coronary arterial tree, we develop a data-pipeline to obtain computational domains for hemodynamic simulations from a graph representing the coronary vascular tree. Further, we develop a method to ascertain which subregions of the left ventricle are most likely to be perfused via a given artery, using a comparison with the American Heart Association division of the left ventricle as a sense check.
Submitted 15 April, 2024; v1 submitted 29 July, 2022;
originally announced July 2022.
-
Fully turbulent flows of viscoplastic fluids in a rectangular duct
Authors:
Rodrigo S. Mitishita,
Jordan A. MacKenzie,
Gwynn J. Elfring,
Ian A. Frigaard
Abstract:
Turbulent flows of viscoplastic fluids at high Reynolds numbers have been investigated recently with direct numerical simulations (DNS) but experimental results have been limited. For this reason, we carry out an experimental study of fully turbulent flows of a yield stress fluid in a rectangular aspect ratio channel with a high-resolution laser Doppler velocimetry (LDA) setup. We employ aqueous Carbopol solutions, often considered to be a simple yield stress fluid. We formulate different concentrations to address the effect of the rheology of the fluid on the turbulence statistics at an approximately constant Reynolds number. We also perform experiments with a single Carbopol formulation at different Reynolds numbers to study its effect. The flow analysis is performed via rheology measurements, turbulence statistics and power spectral densities of velocity fluctuations. The addition of Carbopol to the flow increases turbulence anisotropy, with an enhancement of streamwise velocity fluctuations and a decrease in wall normal velocity fluctuations in comparison to water at the same mean velocity. This change is reflected in the power spectral densities of streamwise velocity fluctuations, where we observe a large increase in energy of large scale turbulent structures. Conversely, the energy of smaller scales is decreased in comparison to water, where the energy drops with a steeper scale than the Newtonian power law of $k_x^{-5/3}$. As we increase the Reynolds number with a Carbopol solution, the streamwise Reynolds stresses approach Newtonian values in the core, which suggests diminishing effects of shear-thinning. The power spectral densities reveal that the energy content at larger scales decreases slightly with the Reynolds number. However, the shear thinning effects do not disappear even as the Reynolds number approaches 50000.
Submitted 18 July, 2022;
originally announced July 2022.
-
Faster Learned Sparse Retrieval with Guided Traversal
Authors:
Antonio Mallia,
Joel Mackenzie,
Torsten Suel,
Nicola Tonellotto
Abstract:
Neural information retrieval architectures based on transformers such as BERT are able to significantly improve system effectiveness over traditional sparse models such as BM25. Though highly effective, these neural approaches are very expensive to run, making them difficult to deploy under strict latency constraints. To address this limitation, recent studies have proposed new families of learned sparse models that try to match the effectiveness of learned dense models, while leveraging the traditional inverted index data structure for efficiency. Current learned sparse models learn the weights of terms in documents and, sometimes, queries; however, they exploit different vocabulary structures, document expansion techniques, and query expansion strategies, which can make them slower than traditional sparse models such as BM25. In this work, we propose a novel indexing and query processing technique that exploits a traditional sparse model's "guidance" to efficiently traverse the index, allowing the more effective learned model to execute fewer scoring operations. Our experiments show that our guided processing heuristic is able to boost the efficiency of the underlying learned sparse model by a factor of four without any measurable loss of effectiveness.
Submitted 24 April, 2022;
originally announced April 2022.
-
A Sensitivity Analysis of the MSMARCO Passage Collection
Authors:
Joel Mackenzie,
Matthias Petri,
Alistair Moffat
Abstract:
The recent MSMARCO passage retrieval collection has allowed researchers to develop highly tuned retrieval systems. One aspect of this data set that makes it distinctive compared to traditional corpora is that most of the topics only have a single answer passage marked relevant. Here we carry out a "what if" sensitivity study, asking whether a set of systems would still have the same relative performance if more passages per topic were deemed to be "relevant", exploring several mechanisms for identifying sets of passages to be so categorized. Our results show that, in general, while run scores can vary markedly if additional plausible passages are presumed to be relevant, the derived system ordering is relatively insensitive to additional relevance, providing support for the methodology that was used at the time the MSMARCO passage collection was created.
Submitted 10 January, 2022; v1 submitted 6 December, 2021;
originally announced December 2021.
-
Wacky Weights in Learned Sparse Representations and the Revenge of Score-at-a-Time Query Evaluation
Authors:
Joel Mackenzie,
Andrew Trotman,
Jimmy Lin
Abstract:
Recent advances in retrieval models based on learned sparse representations generated by transformers have led us to, once again, consider score-at-a-time query evaluation techniques for the top-k retrieval problem. Previous studies comparing document-at-a-time and score-at-a-time approaches have consistently found that the former approach yields lower mean query latency, although the latter approach has more predictable query latency. In our experiments with four different retrieval models that exploit representational learning with bags of words, we find that transformers generate "wacky weights" that appear to greatly reduce the opportunities for skipping and early exiting optimizations that lie at the core of standard document-at-a-time techniques. As a result, score-at-a-time approaches appear to be more competitive in terms of query evaluation latency than in previous studies. We find that, if an effectiveness loss of up to three percent can be tolerated, a score-at-a-time approach can yield substantial gains in mean query latency while at the same time dramatically reducing tail latency.
Submitted 27 October, 2021; v1 submitted 21 October, 2021;
originally announced October 2021.
-
Weakly Supervised Training of Monocular 3D Object Detectors Using Wide Baseline Multi-view Traffic Camera Data
Authors:
Matthew Howe,
Ian Reid,
Jamie Mackenzie
Abstract:
Accurate 7DoF prediction of vehicles at an intersection is an important task for assessing potential conflicts between road users. In principle, this could be achieved by a single camera system that is capable of detecting the pose of each vehicle but this would require a large, accurately labelled dataset from which to train the detector. Although large vehicle pose datasets exist (ostensibly developed for autonomous vehicles), we find training on these datasets inadequate. These datasets contain images from a ground level viewpoint, whereas an ideal view for intersection observation would be elevated higher above the road surface. We develop an alternative approach using a weakly supervised method of fine tuning 3D object detectors for traffic observation cameras; showing in the process that large existing autonomous vehicle datasets can be leveraged for pre-training. To fine-tune the monocular 3D object detector, our method utilises multiple 2D detections from overlapping, wide-baseline views and a loss that encodes the subjacent geometric consistency. Our method achieves vehicle 7DoF pose prediction accuracy on our dataset comparable to the top performing monocular 3D object detectors on autonomous vehicle datasets. We present our training methodology, multi-view reprojection loss, and dataset.
Submitted 21 October, 2021;
originally announced October 2021.
-
Anytime Ranking on Document-Ordered Indexes
Authors:
Joel Mackenzie,
Matthias Petri,
Alistair Moffat
Abstract:
Inverted indexes continue to be a mainstay of text search engines, allowing efficient querying of large document collections. While there are a number of possible organizations, document-ordered indexes are the most common, since they are amenable to various query types, support index updates, and allow for efficient dynamic pruning operations. One disadvantage with document-ordered indexes is that high-scoring documents can be distributed across the document identifier space, meaning that index traversal algorithms that terminate early might put search effectiveness at risk. The alternative is impact-ordered indexes, which primarily support top-k disjunctions, but also allow for anytime query processing, where the search can be terminated at any time, with search quality improving as processing latency increases. Anytime query processing can be used to effectively reduce high-percentile tail latency which is essential for operational scenarios in which a service level agreement (SLA) imposes response time requirements. In this work, we show how document-ordered indexes can be organized such that they can be queried in an anytime fashion, enabling strict latency control with effective early termination. Our experiments show that processing document-ordered topical segments selected by a simple score estimator outperforms existing anytime algorithms, and allows query runtimes to be accurately limited in order to comply with SLA requirements.
Submitted 10 June, 2021; v1 submitted 18 April, 2021;
originally announced April 2021.
-
A Summary of the First Workshop on Language Technology for Language Documentation and Revitalization
Authors:
Graham Neubig,
Shruti Rijhwani,
Alexis Palmer,
Jordan MacKenzie,
Hilaria Cruz,
Xinjian Li,
Matthew Lee,
Aditi Chaudhary,
Luke Gessler,
Steven Abney,
Shirley Anugrah Hayati,
Antonios Anastasopoulos,
Olga Zamaraeva,
Emily Prud'hommeaux,
Jennette Child,
Sara Child,
Rebecca Knowles,
Sarah Moeller,
Jeffrey Micher,
Yiyuan Li,
Sydney Zink,
Mengzhou Xia,
Roshan S Sharma,
Patrick Littell
Abstract:
Despite recent advances in natural language processing and other language technology, the application of such technology to language documentation and conservation has been limited. In August 2019, a workshop was held at Carnegie Mellon University in Pittsburgh to attempt to bring together language community members, documentary linguists, and technologists to discuss how to bridge this gap and create prototypes of novel and practical language revitalization technologies. This paper reports the results of this workshop, including issues discussed, and various conceived and implemented technologies for nine languages: Arapaho, Cayuga, Inuktitut, Irish Gaelic, Kidaw'ida, Kwak'wala, Ojibwe, San Juan Quiahije Chatino, and Seneca.
Submitted 27 April, 2020;
originally announced April 2020.
-
Supporting Interoperability Between Open-Source Search Engines with the Common Index File Format
Authors:
Jimmy Lin,
Joel Mackenzie,
Chris Kamphuis,
Craig Macdonald,
Antonio Mallia,
Michał Siedlaczek,
Andrew Trotman,
Arjen de Vries
Abstract:
There exists a natural tension between encouraging a diverse ecosystem of open-source search engines and supporting fair, replicable comparisons across those systems. To balance these two goals, we examine two approaches to providing interoperability between the inverted indexes of several systems. The first takes advantage of internal abstractions around index structures and building wrappers that allow one system to directly read the indexes of another. The second involves sharing indexes across systems via a data exchange specification that we have developed, called the Common Index File Format (CIFF). We demonstrate the first approach with the Java systems Anserini and Terrier, and the second approach with Anserini, JASSv2, OldDog, PISA, and Terrier. Together, these systems provide a wide range of implementations and features, with different research goals. Overall, we recommend CIFF as a low-effort approach to support independent innovation while enabling the types of fair evaluations that are critical for driving the field forward.
Submitted 18 March, 2020;
originally announced March 2020.
-
Machine learning inference of the interior structure of low-mass exoplanets
Authors:
Philipp Baumeister,
Sebastiano Padovan,
Nicola Tosi,
Grégoire Montavon,
Nadine Nettelmann,
Jasmine MacKenzie,
Mareike Godolt
Abstract:
We explore the application of machine learning based on mixture density neural networks (MDNs) to the interior characterization of low-mass exoplanets up to 25 Earth masses constrained by mass, radius, and fluid Love number $k_2$. We create a dataset of 900$\:$000 synthetic planets, consisting of an iron-rich core, a silicate mantle, a high-pressure ice shell, and a gaseous H/He envelope, to train a MDN using planetary mass and radius as inputs to the network. For this layered structure, we show that the MDN is able to infer the distribution of possible thicknesses of each planetary layer from mass and radius of the planet. This approach obviates the time-consuming task of calculating such distributions with a dedicated set of forward models for each individual planet. While gas-rich planets may be characterized by compositional gradients rather than distinct layers, the method presented here can be easily extended to any interior structure model. The fluid Love number $k_2$ bears constraints on the mass distribution in the planets' interior and will be measured for an increasing number of exoplanets in the future. Adding $k_2$ as an input to the MDN significantly decreases the degeneracy of the possible interior structures.
Submitted 28 November, 2019;
originally announced November 2019.
-
A Conservative Finite Element ALE Scheme for Mass-Conserving Reaction-Diffusion Equations on Evolving Two-Dimensional Domains
Authors:
John A. Mackenzie,
Christopher F. Rowlatt,
Robert H. Insall
Abstract:
Mass-conservative reaction-diffusion systems have recently been proposed as a general framework to describe intracellular pattern formation. These systems have been used to model the conformational switching of proteins as they cycle from an inactive state in the cell cytoplasm, to an active state at the cell membrane. The active state then acts as input to downstream effectors. The paradigm of activation by recruitment to the membrane underpins a range of biological pathways - including G-protein signalling, growth control through Ras and PI 3-kinase, and cell polarity through Rac and Rho; all activate their targets by recruiting them from the cytoplasm to the membrane. Global mass conservation lies at the heart of these models reflecting the property that the total number of active and inactive forms, and targets, remains constant. Here we present a conservative arbitrary Lagrangian Eulerian (ALE) finite element method for the approximate solution of systems of bulk-surface reaction-diffusion equations on an evolving two-dimensional domain. Fundamental to the success of the method is the robust generation of bulk and surface meshes. For this purpose, we use a moving mesh partial differential equation (MMPDE) approach. Global conservation of the fully discrete finite element solution is established independently of the ALE velocity field and the time step size. The developed method is applied to model problems with known analytical solutions; these experiments indicate that the method is second-order accurate and globally conservative. The method is further applied to a model of a single cell migrating in the presence of an external chemotactic signal.
Submitted 5 October, 2019;
originally announced October 2019.
-
A Moving Mesh Method for Modelling Defects in Nematic Liquid Crystals
Authors:
Craig S. MacDonald,
John A. Mackenzie,
Alison Ramage
Abstract:
The properties of liquid crystals can be modelled using an order parameter which describes the variability of the local orientation of rod-like molecules. Defects in the director field can arise due to external factors such as applied electric or magnetic fields, or the constraining geometry of the cell containing the liquid crystal material. Understanding the formation and dynamics of defects is important in the design and control of liquid crystal devices, and poses significant challenges for numerical modelling. In this paper we consider the numerical solution of a $\bf{Q}$-tensor model of a nematic liquid crystal, where defects arise through rapid changes in the $\bf{Q}$-tensor over a very small physical region in relation to the dimensions of the liquid crystal device. The efficient solution of the resulting six coupled partial differential equations is achieved using a finite element based adaptive moving mesh approach, where an unstructured triangular mesh is adapted towards high activity regions, including those around defects. Spatial convergence studies are presented using a stationary defect as a model test case, and the adaptive method is shown to be optimally convergent using quadratic triangular finite elements. The full effectiveness of the method is then demonstrated using a challenging two-dimensional dynamic Pi-cell problem involving the creation, movement, and annihilation of defects.
Submitted 5 October, 2019;
originally announced October 2019.
-
An $hr$-Adaptive Method for the Cubic Nonlinear Schrödinger Equation
Authors:
J. A. Mackenzie,
W. R. Mekwi
Abstract:
The nonlinear Schrödinger equation (NLSE) is one of the most important equations in quantum mechanics, and appears in a wide range of applications including optical fibre communications, plasma physics and biomolecule dynamics. It is a notoriously difficult problem to solve numerically as solutions have very steep temporal and spatial gradients. Adaptive moving mesh methods ($r$-adaptive) attempt to optimise the accuracy obtained using a fixed number of nodes by moving them to regions of steep solution features. This approach on its own is however limited if the solution becomes more or less difficult to resolve over the period of interest. Mesh refinement methods ($h$-adaptive), where the mesh is locally coarsened or refined, are an alternative adaptive strategy which is popular for time-independent problems. In this paper, we consider the effectiveness of a combined method ($hr$-adaptive) to solve the NLSE in one space dimension. Simulations are presented indicating excellent solution accuracy compared to other moving mesh approaches. The method is also shown to control the spatial error based on the user's input error tolerance. Evidence is also presented indicating second-order spatial convergence using a novel monitor function to generate the adaptive moving mesh.
Submitted 4 July, 2019;
originally announced July 2019.
-
Boosting Search Performance Using Query Variations
Authors:
Rodger Benham,
Joel Mackenzie,
Alistair Moffat,
J. Shane Culpepper
Abstract:
Rank fusion is a powerful technique that allows multiple sources of information to be combined into a single result set. However, to date fusion has not been regarded as being cost-effective in cases where strict per-query efficiency guarantees are required, such as in web search. In this work we propose a novel solution to rank fusion by splitting the computation into two parts -- one phase that is carried out offline to generate pre-computed centroid answers for queries with broadly similar information needs, and then a second online phase that uses the corresponding topic centroid to compute a result page for each query. We explore efficiency improvements to classic fusion algorithms whose costs can be amortized as a pre-processing step, and can then be combined with re-ranking approaches to dramatically improve effectiveness in multi-stage retrieval systems with little efficiency overhead at query time. Experimental results using the ClueWeb12B collection and the UQV100 query variations demonstrate that centroid-based approaches allow improved retrieval effectiveness at little or no loss in query throughput or latency, and with reasonable pre-processing requirements. We additionally show that queries that do not match any of the pre-computed clusters can be accurately identified and efficiently processed in our proposed ranking pipeline.
Submitted 9 November, 2020; v1 submitted 14 November, 2018;
originally announced November 2018.
-
A coupled bulk-surface model for cell polarisation
Authors:
Davide Cusseddu,
Leah Edelstein-Keshet,
John A. Mackenzie,
Stéphanie Portet,
Anotida Madzvamuse
Abstract:
Several cellular activities, such as directed cell migration, are coordinated by an intricate network of biochemical reactions which lead to a polarised state of the cell, in which cellular symmetry is broken, causing the cell to have a well-defined front and back. Recent work on balancing biological complexity with mathematical tractability resulted in the proposal and formulation of a famous minimal model for cell polarisation, known as the wave pinning model. In this study, we present a three-dimensional generalisation of this mathematical framework through the maturing theory of coupled bulk-surface semilinear partial differential equations in which protein compartmentalisation becomes natural. We show how a local perturbation over the surface can trigger propagating reactions, eventually stopped in a stable profile by the interplay with the bulk component. We describe the behavior of the model through asymptotic and local perturbation analysis, in which the role of the geometry is investigated. The bulk-surface finite element method is used to generate numerical simulations over simple and complex geometries, which confirm our analysis, showing pattern formation due to propagation and pinning dynamics. The generality of our mathematical and computational framework allows the study of more complex biochemical reactions and biomechanical properties associated with cell polarisation in multi-dimensions.
Submitted 12 September, 2018;
originally announced September 2018.
-
OGLE-2016-BLG-1190Lb: First Spitzer Bulge Planet Lies Near the Planet/Brown-Dwarf Boundary
Authors:
Y. -H. Ryu,
J. C. Yee,
A. Udalski,
I. A. Bond,
Y. Shvartzvald,
W. Zang,
R. Figuera Jaimes,
U. G. Jorgensen,
W. Zhu,
C. X. Huang,
Y. K. Jung,
M. D. Albrow,
S. -J. Chung,
A. Gould,
C. Han,
K. -H. Hwang,
I. -G. Shin,
S. -M. Cha,
D. -J. Kim,
H. -W. Kim,
S. -L. Kim,
C. -U. Lee,
D. -J. Lee,
Y. Lee,
B. -G. Park
, et al. (85 additional authors not shown)
Abstract:
We report the discovery of OGLE-2016-BLG-1190Lb, which is likely to be the first Spitzer microlensing planet in the Galactic bulge/bar, an assignation that can be confirmed by two epochs of high-resolution imaging of the combined source-lens baseline object. The planet's mass $M_p = 13.4 \pm 0.9\,M_J$ places it right at the deuterium burning limit, i.e., the conventional boundary between "planets" and "brown dwarfs". Its existence raises the question of whether such objects are really "planets" (formed within the disks of their hosts) or "failed stars" (low mass objects formed by gas fragmentation). This question may ultimately be addressed by comparing disk and bulge/bar planets, which is a goal of the Spitzer microlens program. The host is a G dwarf with $M_{\rm host} = 0.89 \pm 0.07\,M_\odot$ and the planet has a semi-major axis $a \sim 2.0$ AU. We use Kepler K2 Campaign 9 microlensing data to break the lens-mass degeneracy that generically impacts parallax solutions from Earth-Spitzer observations alone, which is the first successful application of this approach. The microlensing data, derived primarily from near-continuous, ultra-dense survey observations from OGLE, MOA, and three KMTNet telescopes, contain more orbital information than for any previous microlensing planet, but not quite enough to accurately specify the full orbit. However, these data do permit the first rigorous test of microlensing orbital-motion measurements, which are typically derived from data taken over <1% of an orbital period.
Submitted 20 November, 2017; v1 submitted 26 October, 2017;
originally announced October 2017.
-
High-resolution Imaging of Transiting Extrasolar Planetary systems (HITEP). II. Lucky Imaging results from 2015 and 2016
Authors:
D. F. Evans,
J. Southworth,
B. Smalley,
U. G. Jørgensen,
M. Dominik,
M. I. Andersen,
V. Bozza,
D. M. Bramich,
M. J. Burgdorf,
S. Ciceri,
G. D'Ago,
R. Figuera Jaimes,
S. -H. Gu,
T. C. Hinse,
Th. Henning,
M. Hundertmark,
N. Kains,
E. Kerins,
H. Korhonen,
R. Kokotanekova,
M. Kuffmeier,
P. Longa-Peña,
L. Mancini,
J. MacKenzie,
A. Popovas
, et al. (11 additional authors not shown)
Abstract:
The formation and dynamical history of hot Jupiters are currently debated, with wide stellar binaries having been suggested as a potential formation pathway. Additionally, contaminating light from both binary companions and unassociated stars can significantly bias the results of planet characterisation studies, but can be corrected for if the properties of the contaminating star are known. We search for binary companions to known transiting exoplanet host stars, in order to determine the multiplicity properties of hot Jupiter host stars. We also characterise unassociated stars along the line of sight, allowing photometric and spectroscopic observations of the planetary system to be corrected for contaminating light. We analyse lucky imaging observations of 97 Southern hemisphere exoplanet host stars, using the Two Colour Instrument on the Danish 1.54m telescope. For each detected companion star, we determine flux ratios relative to the planet host star in two passbands, and measure the relative position of the companion. The probability of each companion being physically associated was determined using our two-colour photometry. A catalogue of close companion stars is presented, including flux ratios, position measurements, and estimated companion star temperature. For companions that are potential binary companions, we review archival and catalogue data for further evidence. For WASP-77AB and WASP-85AB, we combine our data with historical measurements to determine the binary orbits, showing them to be moderately eccentric and inclined to the line of sight and planetary orbital axis. Combining our survey with the similar Friends of Hot Jupiters survey, we conclude that known hot Jupiter host stars show a deficit of high mass stellar companions compared to the field star population; however, this may be a result of the biases in detection and target selection by ground-based surveys.
Submitted 13 October, 2017; v1 submitted 21 September, 2017;
originally announced September 2017.
-
Efficient and Effective Tail Latency Minimization in Multi-Stage Retrieval Systems
Authors:
Joel Mackenzie,
J. Shane Culpepper,
Roi Blanco,
Matt Crane,
Charles L. A. Clarke,
Jimmy Lin
Abstract:
Scalable web search systems typically employ multi-stage retrieval architectures, where an initial stage generates a set of candidate documents that are then pruned and re-ranked. Since subsequent stages typically exploit a multitude of features of varying costs using machine-learned models, reducing the number of documents that are considered at each stage improves latency. In this work, we propose and validate a unified framework that can be used to predict a wide range of performance-sensitive parameters which minimize effectiveness loss, while simultaneously minimizing query latency, across all stages of a multi-stage search architecture. Furthermore, our framework can be easily applied in large-scale IR systems, can be trained without explicitly requiring relevance judgments, and can target a variety of different efficiency-effectiveness trade-offs, making it well suited to a wide range of search scenarios. Our results show that we can reliably predict a number of different parameters on a per-query basis, while simultaneously detecting and minimizing the likelihood of tail-latency queries that exceed a pre-specified performance budget. As a proof of concept, we use the prediction framework to help alleviate the problem of tail-latency queries in early stage retrieval. On the standard ClueWeb09B collection and 31k queries, we show that our new hybrid system can reliably achieve a maximum query time of 200 ms with a 99.99% response time guarantee without a significant loss in overall effectiveness. The solutions presented are practical, and can easily be used in large-scale distributed search engine deployments with a small amount of additional overhead.
Submitted 20 April, 2017; v1 submitted 12 April, 2017;
originally announced April 2017.