Search | arXiv e-print repository

A question of Erdös on $3$-powerful numbers and an elliptic curve analogue of the Ankeny-Artin-Chowla conjecture

Abstract: We describe how the Mordell-Weil group of rational points on a certain family of elliptic curves give rise to solutions to a conjecture of Erdös on $3$-powerful numbers, and state a related conjecture which can be viewed as an elliptic curve analogue of the Ankeny-Artin-Chowla conjecture.

Submitted 5 April, 2024; originally announced April 2024.

MSC Class: 11D25

arXiv:2402.00838 [pdf, other]

OLMo: Accelerating the Science of Language Models

Authors: Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, et al. (18 additional authors not shown)

Abstract: Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs. To this end, we have built OLMo, a competitive, truly Open Language Model, to enable the scientific study of language models. Unlike most prior efforts that have only released model weights and inference code, we release OLMo alongside open training data and training and evaluation code. We hope this release will empower the open research community and inspire a new wave of innovation.

Submitted 7 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

arXiv:2402.00159 [pdf, other]

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Authors: Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, et al. (11 additional authors not shown)

Abstract: Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or recipes to reproduce them. As a result, it is challenging to conduct and advance scientific research on language modeling, such as understanding how training data impacts model capabilities and limitations. To facilitate scientific research on language model pretraining, we curate and release Dolma, a three-trillion-token English corpus, built from a diverse mixture of web content, scientific papers, code, public-domain books, social media, and encyclopedic materials. We extensively document Dolma, including its design principles, details about its construction, and a summary of its contents. We present analyses and experimental results on intermediate states of Dolma to share what we have learned about important data curation practices. Finally, we open-source our data curation toolkit to enable reproduction of our work as well as support further research in large-scale data curation.

Submitted 6 June, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

Comments: Accepted at ACL 2024; Dataset: https://hf.co/datasets/allenai/dolma; Code: https://github.com/allenai/dolma

arXiv:2312.10523 [pdf, other]

Paloma: A Benchmark for Evaluating Language Model Fit

Authors: Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Luca Soldaini, Ananya Harsh Jha, Oyvind Tafjord, Dustin Schwenk, Evan Pete Walsh, Yanai Elazar, Kyle Lo, Dirk Groeneveld, Iz Beltagy, Hannaneh Hajishirzi, Noah A. Smith, Kyle Richardson, Jesse Dodge

Abstract: Language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains–varying distributions of language. Rather than assuming perplexity on one distribution extrapolates to others, Perplexity Analysis for Language Model Assessment (Paloma), measures LM fit to 585 text domains, ranging from nytimes.com to r/depression on Reddit. We invite submissions to our benchmark and organize results by comparability based on compliance with guidelines such as removal of benchmark contamination from pretraining. Submissions can also record parameter and training token count to make comparisons of Pareto efficiency for performance as a function of these measures of cost. We populate our benchmark with results from 6 baselines pretrained on popular corpora. In case studies, we demonstrate analyses that are possible with Paloma, such as finding that pretraining without data beyond Common Crawl leads to inconsistent fit to many domains.

Submitted 16 December, 2023; originally announced December 2023.

Comments: Project Page: https://paloma.allen.ai/

arXiv:2312.10253 [pdf, other]

Catwalk: A Unified Language Model Evaluation Framework for Many Datasets

Authors: Dirk Groeneveld, Anas Awadalla, Iz Beltagy, Akshita Bhagia, Ian Magnusson, Hao Peng, Oyvind Tafjord, Pete Walsh, Kyle Richardson, Jesse Dodge

Abstract: The success of large language models has shifted the evaluation paradigms in natural language processing (NLP). The community's interest has drifted towards comparing NLP models across many tasks, domains, and datasets, often at an extreme scale. This imposes new engineering challenges: efforts in constructing datasets and models have been fragmented, and their formats and interfaces are incompatible. As a result, it often takes extensive (re)implementation efforts to make fair and controlled comparisons at scale. Catwalk aims to address these issues. Catwalk provides a unified interface to a broad range of existing NLP datasets and models, ranging from both canonical supervised training and fine-tuning, to more modern paradigms like in-context learning. Its carefully-designed abstractions allow for easy extensions to many others. Catwalk substantially lowers the barriers to conducting controlled experiments at scale. For example, we finetuned and evaluated over 64 models on over 86 datasets with a single command, without writing any code. Maintained by the AllenNLP team at the Allen Institute for Artificial Intelligence (AI2), Catwalk is an ongoing open-source effort: https://github.com/allenai/catwalk.

Submitted 15 December, 2023; originally announced December 2023.

Comments: technical report, work in progress

arXiv:2310.20707 [pdf, other]

What's In My Big Data?

Authors: Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hanna Hajishirzi, Noah A. Smith, Jesse Dodge

Abstract: Large text corpora are the backbone of language models. However, we have a limited understanding of the content of these corpora, including general statistics, quality, social factors, and inclusion of evaluation data (contamination). In this work, we propose What's In My Big Data? (WIMBD), a platform and a set of sixteen analyses that allow us to reveal and compare the contents of large text corpora. WIMBD builds on two basic capabilities -- count and search -- at scale, which allows us to analyze more than 35 terabytes on a standard compute node. We apply WIMBD to ten different corpora used to train popular language models, including C4, The Pile, and RedPajama. Our analysis uncovers several surprising and previously undocumented findings about these corpora, including the high prevalence of duplicate, synthetic, and low-quality content, personally identifiable information, toxic language, and benchmark contamination. For instance, we find that about 50% of the documents in RedPajama and LAION-2B-en are duplicates. In addition, several datasets used for benchmarking models trained on such corpora are contaminated with respect to important benchmarks, including the Winograd Schema Challenge and parts of GLUE and SuperGLUE. We open-source WIMBD's code and artifacts to provide a standard set of evaluations for new text-based corpora and to encourage more analyses and transparency around them.

Submitted 5 March, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

Comments: Published at ICLR 2024 spotlight

arXiv:2307.09701 [pdf, other]

Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation

Authors: Hao Peng, Qingqing Cao, Jesse Dodge, Matthew E. Peters, Jared Fernandez, Tom Sherborne, Kyle Lo, Sam Skjonsberg, Emma Strubell, Darrell Plessas, Iz Beltagy, Evan Pete Walsh, Noah A. Smith, Hannaneh Hajishirzi

Abstract: Rising computational demands of modern natural language processing (NLP) systems have increased the barrier to entry for cutting-edge research while posing serious environmental concerns. Yet, progress on model efficiency has been impeded by practical challenges in model evaluation and comparison. For example, hardware is challenging to control due to disparate levels of accessibility across different institutions. Moreover, improvements in metrics such as FLOPs often fail to translate to progress in real-world applications. In response, we introduce Pentathlon, a benchmark for holistic and realistic evaluation of model efficiency. Pentathlon focuses on inference, which accounts for a majority of the compute in a model's lifecycle. It offers a strictly-controlled hardware platform, and is designed to mirror real-world application scenarios. It incorporates a suite of metrics that target different aspects of efficiency, including latency, throughput, memory overhead, and energy consumption. Pentathlon also comes with a software library that can be seamlessly integrated into any codebase and enable evaluation. As a standardized and centralized evaluation platform, Pentathlon can drastically reduce the workload to make fair and reproducible efficiency comparisons. While initially focused on natural language processing (NLP) models, Pentathlon is designed to allow flexible extension to other fields. We envision Pentathlon will stimulate algorithmic innovations in building efficient models, and foster an increased awareness of the social and environmental implications in the development of future-generation NLP models.

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2305.14864 [pdf, other]

How To Train Your (Compressed) Large Language Model

Authors: Ananya Harsh Jha, Tom Sherborne, Evan Pete Walsh, Dirk Groeneveld, Emma Strubell, Iz Beltagy

Abstract: With the increase in the size of large language models (LLMs), we need compression methods that can reduce the model size while preserving the generality and zero-shot promptability of the model. This goal is more ambitious than the typical compression setup, which reduces the model's size at the expense of specializing it to a specific end-task. To study this, we develop a task-agnostic compression pipeline with a large-scale evaluation comprising language modeling perplexity and 12 zero-shot end-tasks. Our results show that a simple layer-wise pruning followed by continued language model pretraining matches or outperforms three existing state-of-the-art baselines while being 1.5x more computationally efficient. However, unlike typical task-specialized compression, our best-compressed model significantly underperforms a similar-sized model trained from scratch. We posit the half-sized pretrained model as an upper bound for task-agnostic compression and call for future work to bridge this gap under a reasonable token budget. Our findings highlight the inadequacy of existing compression methods for LLMs and establish a requirement for new methods that preserve a model's generality and zero-shot promptability under compression. We release our code and evaluation setup to facilitate reproducibility and help iterate on method design.

Submitted 18 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: 13 pages, 6 figures, 5 tables

arXiv:2304.09570 [pdf, other]

doi 10.3847/1538-4365/acd24b

Slow Solar Wind Connection Science during Solar Orbiter's First Close Perihelion Passage

Authors: Stephanie L. Yardley, Christopher J. Owen, David M. Long, Deborah Baker, David H. Brooks, Vanessa Polito, Lucie M. Green, Sarah Matthews, Mathew Owens, Mike Lockwood, David Stansby, Alexander W. James, Gherado Valori, Alessandra Giunta, Miho Janvier, Nawin Ngampoopun, Teodora Mihailescu, Andy S. H. To, Lidia van Driel-Gesztelyi, Pascal Demoulin, Raffaella D'Amicis, Ryan J. French, Gabriel H. H. Suen, Alexis P. Roulliard, Rui F. Pinto, et al. (54 additional authors not shown)

Abstract: The Slow Solar Wind Connection Solar Orbiter Observing Plan (Slow Wind SOOP) was developed to utilise the extensive suite of remote sensing and in situ instruments on board the ESA/NASA Solar Orbiter mission to answer significant outstanding questions regarding the origin and formation of the slow solar wind. The Slow Wind SOOP was designed to link remote sensing and in situ measurements of slow wind originating at open-closed field boundaries. The SOOP ran just prior to Solar Orbiter's first close perihelion passage during two remote sensing windows (RSW1 and RSW2) between 2022 March 3-6 and 2022 March 17-22, while Solar Orbiter was at a heliocentric distance of 0.55-0.51 and 0.38-0.34 au from the Sun, respectively. Coordinated observation campaigns were also conducted by Hinode and IRIS. The magnetic connectivity tool was used, along with low latency in situ data, and full-disk remote sensing observations, to guide the target pointing of Solar Orbiter. Solar Orbiter targeted an active region complex during RSW1, the boundary of a coronal hole, and the periphery of a decayed active region during RSW2. Post-observation analysis using the magnetic connectivity tool along with in situ measurements from MAG and SWA/PAS, show that slow solar wind, with velocities between 210 and 600 km/s, arrived at the spacecraft originating from two out of the three of the target regions. The Slow Wind SOOP, despite presenting many challenges, was very successful, providing a blueprint for planning future observation campaigns that rely on the magnetic connectivity of Solar Orbiter.

Submitted 20 April, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

Comments: 24 pages, 10 figures

arXiv:2302.06718 [pdf]

doi 10.1109/TASC.2008.920808

Critical Current Longitudinal and Transverse Strain Sensitivities of High JC Nb3Sn Conductors

Authors: Jun Lu, Ke Han, Robert P. Walsh, Todd Atkins, Scott T. Bole

Abstract: Characterizing the critical current IC of Nb3Sn strands as a function of strain is very important for large high field superconducting magnet applications such as the superconducting outsert coil of the series-connected hybrid at the NHMFL and the ITER magnets. Apparatuses for measuring IC versus longitudinal strain and transverse stress have been developed and used at the NHMFL. We have characterized the IC strain sensitivities of a few candidate strands for the series-connected-hybrid. In addition, IC irreversibility strains are measured for the recently developed ITER high JC strands. The different strain sensitivities for different strands are discussed.

Submitted 13 February, 2023; originally announced February 2023.

Comments: 4 pages, 6 figures

Journal ref: IEEE Transactions on Applied Superconductivity, 2008

arXiv:2211.14959 [pdf, other]

Method for in-solution, high-throughput T1 relaxometry using fluorescent nanodiamonds

Authors: Erin S. Grant, Mina Barzegar Amiri Olia, Ella P. Walsh, Liam T. Hall, Gawain McColl, David A. Simpson

Abstract: Fluorescent nanodiamonds (FNDs) have been exploited as sensitive quantum probes for nanoscale chemical and biological sensing applications, with the majority of demonstrations to date relying on the detection of single FNDs. This places significant limits on the measurement time, throughput and statistical significance of a measured result as there is usually marked inhomogeneity within FND samples. Here we have developed a measurement platform that can report the T1 spin relaxation time from a large ensemble of FNDs in solution. We first describe a refined sensing protocol for this modality and then use it to identify the optimal FND size for the detection of paramagnetic targets. Our approach is simple to set up, robust and can be used for rapid material characterisation or a variety of in-situ quantum sensing applications.

Submitted 27 November, 2022; originally announced November 2022.

Comments: 8 pages, 3 figures

arXiv:2211.12149 [pdf, ps, other]

Representing integers as a sum of three cubes

Authors: Jon Grantham, P. G. Walsh

Abstract: In this article we further develop methods for representing integers as a sum of three cubes. In particular, a barrier to solving the case $k=3$, which was outlined in a previous paper of the second author, is overcome. A very recent computation indicates that the method is quite favourable to other methods in terms of time estimates. A hybrid of the method presented here and those in a previous paper is currently underway for unsolved cases.

Submitted 22 November, 2022; originally announced November 2022.

Comments: 4 pages

MSC Class: 11D25

arXiv:2211.12138 [pdf, ps, other]

Lower bounds for ranks using Pell equations

Authors: P. G. Walsh

Abstract: We examine the ranks of a subfamily of curves in a previous article, which are derived from the existence of solutions to certain Pell equations. We exhibit an abundance of curves of moderately large rank, and prove under mild conditions that these curves have rank at least three.

Submitted 22 November, 2022; originally announced November 2022.

Comments: 4 pages

MSC Class: 11G05

Journal ref: published in Rad HAZU Matematicke znanosti (2022)

arXiv:2211.02788 [pdf]

doi 10.1002/apxr.202300036

Designing magnetic properties in CrSBr through hydrostatic pressure and ligand substitution

Authors: Evan J. Telford, Daniel G. Chica, Kaichen Xie, Nicholas S. Manganaro, Chun-Ying Huang, Jordan Cox, Avalon H. Dismukes, Xiaoyang Zhu, James P. S. Walsh, Ting Cao, Cory R. Dean, Xavier Roy, Michael E. Ziebel

Abstract: The ability to control magnetic properties of materials is crucial for fundamental research and underpins many information technologies. In this context, two-dimensional materials are a particularly exciting platform due to their high degree of tunability and ease of implementation into nanoscale devices. Here we report two approaches for manipulating the A-type antiferromagnetic properties of the layered semiconductor CrSBr through hydrostatic pressure and ligand substitution. Hydrostatic pressure compresses the unit cell, increasing the interlayer exchange energy while lowering the Néel temperature. Ligand substitution, realized synthetically through Cl alloying, anisotropically compresses the unit cell and suppresses the Cr-halogen covalency, reducing the magnetocrystalline anisotropy energy and decreasing the Néel temperature. A detailed structural analysis combined with first-principles calculations reveal that alterations in the magnetic properties are intricately related to changes in direct Cr-Cr exchange interactions and the Cr-anion superexchange pathways. Further, we demonstrate that Cl alloying enables chemical tuning of the interlayer coupling from antiferromagnetic to ferromagnetic, which is unique amongst known two-dimensional magnets. The magnetic tunability, combined with a high ordering temperature, chemical stability, and functional semiconducting properties, make CrSBr an ideal candidate for pre- and post-synthetic design of magnetism in two-dimensional materials.

Submitted 4 November, 2022; originally announced November 2022.

Comments: Main text: 17 pages, 4 figures. Supporting Information: 34 pages, 32 figures, 4 tables

arXiv:2210.13643 [pdf, other]

Large-scale optical characterization of solid-state quantum emitters

Authors: Madison Sutula, Ian Christen, Eric Bersin, Michael P. Walsh, Kevin C. Chen, Justin Mallek, Alexander Melville, Michael Titze, Edward S. Bielejec, Scott Hamilton, Danielle Braje, P. Benjamin Dixon, Dirk R. Englund

Abstract: Solid-state quantum emitters have emerged as a leading quantum memory for quantum networking applications. However, standard optical characterization techniques are neither efficient nor repeatable at scale. In this work, we introduce and demonstrate spectroscopic techniques that enable large-scale, automated characterization of color centers. We first demonstrate the ability to track color centers by registering them to a fabricated machine-readable global coordinate system, enabling systematic comparison of the same color center sites over many experiments. We then implement resonant photoluminescence excitation in a widefield cryogenic microscope to parallelize resonant spectroscopy, achieving two orders of magnitude speed-up over confocal microscopy. Finally, we demonstrate automated chip-scale characterization of color centers and devices at room temperature, imaging thousands of microscope fields of view. These tools will enable accelerated identification of useful quantum emitters at chip-scale, enabling advances in scaling up color center platforms for quantum information applications, materials science, and device design and characterization.

Submitted 24 October, 2022; originally announced October 2022.

Comments: 17 pages, 13 figures

arXiv:2210.10258 [pdf, other]

Continued Pretraining for Better Zero- and Few-Shot Promptability

Authors: Zhaofeng Wu, Robert L. Logan IV, Pete Walsh, Akshita Bhagia, Dirk Groeneveld, Sameer Singh, Iz Beltagy

Abstract: Recently introduced language model prompting methods can achieve high accuracy in zero- and few-shot settings while requiring few to no learned task-specific parameters. Nevertheless, these methods still often trail behind full model finetuning. In this work, we investigate if a dedicated continued pretraining stage could improve "promptability", i.e., zero-shot performance with natural language prompts or few-shot performance with prompt tuning. We reveal settings where existing continued pretraining methods lack promptability. We also identify current methodological gaps, which we fill with thorough large-scale experiments. We demonstrate that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, leads to improved promptability in both zero- and few-shot settings compared to existing methods, up to 31% relative. On the other hand, we find that continued pretraining using MAML-style meta-learning, a method that directly optimizes few-shot promptability, yields subpar performance. We validate our findings with two prompt tuning methods, and, based on our results, we provide concrete recommendations to optimize promptability for different use cases.

Submitted 20 October, 2022; v1 submitted 18 October, 2022; originally announced October 2022.

Comments: EMNLP 2022

arXiv:2208.02350 [pdf, other]

doi 10.3847/1538-4357/ac8667

Energy transport during 3D small-scale reconnection driven by anisotropic plasma turbulence

Authors: Jeffersson A. Agudelo Rueda, Daniel Verscharen, Robert T. Wicks, Christopher J. Owen, Georgios Nicolaou, Kai Germaschewski, Andrew P. Walsh, Ioannis Zouganelis, Santiago Vargas Domínguez

Abstract: Energy dissipation in collisionless plasmas is a longstanding fundamental physics problem. Although it is well known that magnetic reconnection and turbulence are coupled and transport energy from system-size scales to sub-proton scales, the details of the energy distribution and energy dissipation channels remain poorly understood. Especially, the energy transfer and transport associated with three dimensional (3D) small-scale reconnection that occurs as a consequence of a turbulent cascade is unknown. We use an explicit fully kinetic particle-in-cell code to simulate 3D small scale magnetic reconnection events forming in anisotropic and Alfvénic decaying turbulence. We identify a highly dynamic and asymmetric reconnection event that involves two reconnecting flux ropes. We use a two-fluid approach based on the Boltzmann equation to study the spatial energy transfer associated with the reconnection event and compare the power density terms in the two-fluid energy equations with standard energy-based damping, heating and dissipation proxies. Our findings suggest that the electron bulk flow transports thermal energy density more efficiently than kinetic energy density. Moreover, in our turbulent reconnection event, the energy-density transfer is dominated by plasma compression. This is consistent with turbulent current sheets and turbulent reconnection events, but not with laminar reconnection.

Submitted 3 August, 2022; originally announced August 2022.

Comments: Accepted for publication in ApJ

arXiv:2203.06211 [pdf, other]

Staged Training for Transformer Language Models

Authors: Sheng Shen, Pete Walsh, Kurt Keutzer, Jesse Dodge, Matthew Peters, Iz Beltagy

Abstract: The current standard approach to scaling transformer language models trains each model size from a different random initialization. As an alternative, we consider a staged training setup that begins with a small model and incrementally increases the amount of compute used for training by applying a "growth operator" to increase the model depth and width. By initializing each stage with the output of the previous one, the training process effectively re-uses the compute from prior stages and becomes more efficient. Our growth operators each take as input the entire training state (including model parameters, optimizer state, learning rate schedule, etc.) and output a new training state from which training continues. We identify two important properties of these growth operators, namely that they preserve both the loss and the "training dynamics" after applying the operator. While the loss-preserving property has been discussed previously, to the best of our knowledge this work is the first to identify the importance of preserving the training dynamics (the rate of decrease of the loss during training). To find the optimal schedule for stages, we use the scaling laws from (Kaplan et al., 2020) to find a precise schedule that gives the most compute saving by starting a new stage when training efficiency starts decreasing. We empirically validate our growth operators and staged training for autoregressive language models, showing up to 22% compute savings compared to a strong baseline trained from scratch. Our code is available at https://github.com/allenai/staged-training.

Submitted 11 March, 2022; originally announced March 2022.

arXiv:2202.12389 [pdf]

Direct evidence of magnetic reconnection onset via the tearing instability

Authors: Mayur R. Bakrania, I. Jonathan Rae, Andrew P. Walsh, Daniel Verscharen, Andy W. Smith, Colin Forsyth, Anna Tenerani

Abstract: Magnetic reconnection is a sporadic process responsible for energy release in space and laboratory plasmas. It is believed that the tearing mode instability may be responsible for the onset of reconnection in the magnetotail. However, due to its elusive nature, there is an absence of in-situ observations of the tearing instability prior to magnetic reconnection in our nearest natural plasma laboratory. Using neural network outlier detection methods in conjunction with Cluster spacecraft data, we find unique electron pitch angle distributions that are consistent with simulation predictions of the tearing instability and the subsequent evolution of plasma electrons and reconnection. We confirm that the events identified via our neural network outlier method are well above the tearing stability threshold based on the criterion detailed in this paper. We find signatures of magnetic reconnection minutes after the majority of tearing observations. Our analysis of the tearing instability provides new insights into the fundamental understanding of the mechanism responsible for reconnection, a process that is ubiquitous in different astrophysical plasma regimes across the universe and in laboratory experiments on Earth.

Submitted 24 February, 2022; originally announced February 2022.

arXiv:2103.13232 [pdf, other]

doi 10.1017/S0022377821000404

Three-dimensional magnetic reconnection in particle-in-cell simulations of anisotropic plasma turbulence

Authors: Jeffersson A. Agudelo Rueda, Daniel Verscharen, Robert T. Wicks, Christopher J. Owen, Georgios Nicolaou, Andrew P. Walsh, Ioannis Zouganelis, Kai Germaschewski, Santiago Vargas Domínguez

Abstract: We use 3D fully kinetic particle-in-cell simulations to study the occurrence of magnetic reconnection in a simulation of decaying turbulence created by anisotropic counter-propagating low-frequency Alfvén waves consistent with critical-balance theory. We observe the formation of small-scale current-density structures such as current filaments and current sheets as well as the formation of magnetic flux ropes as part of the turbulent cascade. The large magnetic structures present in the simulation domain retain the initial anisotropy while the small-scale structures produced by the turbulent cascade are less anisotropic. To quantify the occurrence of reconnection in our simulation domain, we develop a new set of indicators based on intensity thresholds to identify reconnection events in which both ions and electrons are heated and accelerated in 3D particle-in-cell simulations. According to the application of these indicators, we identify the occurrence of reconnection events in the simulation domain and analyse one of these events in detail. The event is related to the reconnection of two flux ropes, and the associated ion and electron exhausts exhibit a complex three-dimensional structure. We study the profiles of plasma and magnetic-field fluctuations recorded along artificial-spacecraft trajectories passing near and through the reconnection region. Our results suggest the presence of particle heating and acceleration related to small-scale reconnection events within magnetic flux ropes produced by the anisotropic Alfvénic turbulent cascade in the solar wind. These events are related to current structures of order a few ion inertial lengths in size.

Submitted 24 March, 2021; originally announced March 2021.

Comments: Accepted for publication in J. Plasma Phys

arXiv:2009.10772 [pdf, other]

doi 10.1051/0004-6361/202038445

The Solar Orbiter Science Activity Plan: translating solar and heliospheric physics questions into action

Authors: I. Zouganelis, A. De Groof, A. P. Walsh, D. R. Williams, D. Mueller, O. C. St Cyr, F. Auchere, D. Berghmans, A. Fludra, T. S. Horbury, R. A. Howard, S. Krucker, M. Maksimovic, C. J. Owen, J. Rodriiguez-Pacheco, M. Romoli, S. K. Solanki, C. Watson, L. Sanchez, J. Lefort, P. Osuna, H. R. Gilbert, T. Nieves-Chinchilla, L. Abbo, O. Alexandrova , et al. (160 additional authors not shown)

Abstract: Solar Orbiter is the first space mission observing the solar plasma both in situ and remotely, from a close distance, in and out of the ecliptic. The ultimate goal is to understand how the Sun produces and controls the heliosphere, filling the Solar System and driving the planetary environments. With six remote-sensing and four in-situ instrument suites, the coordination and planning of the operations are essential to address the following four top-level science questions: (1) What drives the solar wind and where does the coronal magnetic field originate? (2) How do solar transients drive heliospheric variability? (3) How do solar eruptions produce energetic particle radiation that fills the heliosphere? (4) How does the solar dynamo work and drive connections between the Sun and the heliosphere? Maximising the mission's science return requires considering the characteristics of each orbit, including the relative position of the spacecraft to Earth (affecting downlink rates), trajectory events (such as gravitational assist manoeuvres), and the phase of the solar activity cycle. Furthermore, since each orbit's science telemetry will be downloaded over the course of the following orbit, science operations must be planned at mission level, rather than at the level of individual orbits. It is important to explore the way in which those science questions are translated into an actual plan of observations that fits into the mission, thus ensuring that no opportunities are missed. First, the overarching goals are broken down into specific, answerable questions along with the required observations and the so-called Science Activity Plan (SAP) is developed to achieve this. The SAP groups objectives that require similar observations into Solar Orbiter Observing Plans (SOOPs), resulting in a strategic, top-level view of the optimal opportunities for science observations during the mission lifetime.

Submitted 22 September, 2020; originally announced September 2020.

Comments: 20 pages, 1 figure, accepted by Astronomy & Astrophysics

Journal ref: A&A 642, A3 (2020)

arXiv:2009.10466 [pdf, other]

doi 10.3389/fspas.2020.593516

Using dimensionality reduction and clustering techniques to classify space plasma regimes

Authors: Mayur R. Bakrania, I. Jonathan Rae, Andrew P. Walsh, Daniel Verscharen, Andy W. Smith

Abstract: Collisionless space plasma environments are typically characterised by distinct particle populations. Although moments of their velocity distribution functions help in distinguishing different plasma regimes, the distribution functions themselves provide more comprehensive information about the plasma state, especially at times when the distribution function includes non-thermal effects. Unlike moments, however, distribution functions are not easily characterised by a small number of parameters, making their classification more difficult to achieve. In order to perform this classification, we propose to distinguish between the different plasma regions by applying dimensionality reduction and clustering methods to electron distributions in pitch angle and energy space. We utilise four separate algorithms to achieve our plasma classifications: autoencoders, principal component analysis, mean shift, and agglomerative clustering. We test our classification algorithms by applying our scheme to data from the Cluster-PEACE instrument measured in the Earth's magnetotail. Traditionally, it is thought that the Earth's magnetotail is split into three different regions (the plasma sheet, the plasma sheet boundary layer, and the lobes), that are primarily defined by their plasma characteristics. Starting with the ECLAT database with associated classifications based on the plasma parameters, we identify 8 distinct groups of distributions, that are dependent upon significantly more complex plasma and field dynamics. By comparing the average distributions as well as the plasma and magnetic field parameters for each region, we relate several of the groups to different plasma sheet populations, and the rest we attribute to the plasma sheet boundary layer and the lobes. We find clear distinctions between each of our classified regions and the ECLAT results.

Submitted 21 October, 2020; v1 submitted 22 September, 2020; originally announced September 2020.

Comments: Published in Frontiers in Astronomy and Space Sciences, 20 pages, 7 figures

arXiv:2009.02471 [pdf]

Archaeology in a Vacuum: Obstacles to and Solutions for Developing a Real Space Archaeology

Authors: International Space Station Archaeological Project: Alice C. Gorman, Justin St. P. Walsh

Abstract: This paper outlines some of the difficulties faced by archaeologists studying human activity in outer space. The International Space Station Archaeological Project has identified solutions to these problems, including the use of historic photographic archives and documentation of discard practices such as processes associated with the return of space-flown items to Earth.

Submitted 5 September, 2020; originally announced September 2020.

Comments: 19 pages with 5 figures. To be published in Archaeology Outside of the Box: Investigations at the Edge of the Discipline, H. Barnard and R. Boytner, eds. Los Angeles: Cotsen Institute of Archaeology Press

arXiv:2007.12024 [pdf]

doi 10.1103/PhysRevLett.125.077202

Pressure induced collapse of magnetic order in jarosite

Authors: Ryan A. Klein, James P. S. Walsh, Samantha M. Clarke, Zhenxian Liu, E. Ercan Alp, Wenli Bi, Yue Meng, Alison B. Altman, Paul Chow, Yuming Xiao, M. R. Norman, James M. Rondinelli, Steven D. Jacobsen, Danilo Puggioni, Danna E. Freedman

Abstract: We report a pressure-induced phase transition in the frustrated kagomé material jarosite at ~45 GPa, which leads to the disappearance of magnetic order. Using a suite of experimental techniques, we characterize the structural, electronic, and magnetic changes in jarosite through this phase transition. Synchrotron powder X-ray diffraction and Fourier transform infrared spectroscopy experiments, analyzed in aggregate with the results from density functional theory calculations, indicate that the material changes from an R-3m structure to a structure with an R-3c space group. The resulting phase features a rare twisted kagomé lattice in which the integrity of the equilateral Fe$^{3+}$ triangles persists. Based on symmetry arguments we hypothesize that the resulting structural changes alter the magnetic interactions to favor a possible quantum paramagnetic phase at high pressure.

Submitted 23 July, 2020; originally announced July 2020.

Comments: Manuscript and Supplement included

Journal ref: Phys. Rev. Lett. 125, 077202 (2020)

arXiv:2005.12622 [pdf, other]

doi 10.1051/0004-6361/202037840

Statistics of Solar Wind Electron Breakpoint Energies Using Machine Learning Techniques

Authors: Mayur R. Bakrania, I. Jonathan Rae, Andrew P. Walsh, Daniel Verscharen, Andy W. Smith, Téo Bloch, Clare E. J. Watt

Abstract: Solar wind electron velocity distributions at 1 au consist of a thermal "core" population and two suprathermal populations: "halo" and "strahl". The core and halo are quasi-isotropic, whereas the strahl typically travels radially outwards along the parallel and/or anti-parallel direction with respect to the interplanetary magnetic field. With Cluster-PEACE data, we analyse energy and pitch angle distributions and use machine learning techniques to provide robust classifications of these solar wind populations. Initially, we use unsupervised algorithms to classify halo and strahl differential energy flux distributions to allow us to calculate relative number densities, which are of the same order as previous results. Subsequently, we apply unsupervised algorithms to phase space density distributions over ten years to study the variation of halo and strahl breakpoint energies with solar wind parameters. In our statistical study, we find both halo and strahl suprathermal breakpoint energies display a significant increase with core temperature, with the halo exhibiting a more positive correlation than the strahl. We conclude low energy strahl electrons are scattering into the core at perpendicular pitch angles. This increases the number of Coulomb collisions and extends the perpendicular core population to higher energies, resulting in a larger difference between halo and strahl breakpoint energies at higher core temperatures. Statistically, the locations of both suprathermal breakpoint energies decrease with increasing solar wind speed. In the case of halo breakpoint energy, we observe two distinct profiles above and below 500 km/s. We relate this to the difference in origin of fast and slow solar wind.

Submitted 7 July, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

Comments: Published in Astronomy & Astrophysics, 11 pages, 10 figures

Journal ref: A&A 639, A46 (2020)

arXiv:2004.12013 [pdf, other]

Recovering individual-level spatial inference from aggregated binary data

Authors: Nelson B. Walker, Trevor J. Hefley, Anne E. Ballmann, Robin E. Russell, Daniel P. Walsh

Abstract: Binary regression models are commonly used in disciplines such as epidemiology and ecology to determine how spatial covariates influence individuals. In many studies, binary data are shared in a spatially aggregated form to protect privacy. For example, rather than reporting the location and result for each individual that was tested for a disease, researchers may report that a disease was detected or not detected within geopolitical units. Often, the spatial aggregation process obscures the values of response variables, spatial covariates, and locations of each individual, which makes recovering individual-level inference difficult. We show that applying a series of transformations, including a change of support, to a bivariate point process model allows researchers to recover individual-level inference for spatial covariates from spatially aggregated binary data. The series of transformations preserves the convenient interpretation of desirable binary regression models that are commonly applied to individual-level data. Using a simulation experiment, we compare the performance of our proposed method under varying types of spatial aggregation against the performance of standard approaches using the original individual-level data. We illustrate our method by modeling individual-level probability of infection using a data set that has been aggregated to protect an at-risk and endangered species of bats. Our simulation experiment and data illustration demonstrate the utility of the proposed method when access to original non-aggregated data is impractical or prohibited.

Submitted 6 May, 2021; v1 submitted 24 April, 2020; originally announced April 2020.

arXiv:1911.05265 [pdf]

doi 10.1038/s41586-020-2441-3

Large-scale integration of near-indistinguishable artificial atoms in hybrid photonic circuits

Authors: Noel H. Wan, Tsung-Ju Lu, Kevin C. Chen, Michael P. Walsh, Matthew E. Trusheim, Lorenzo De Santis, Eric A. Bersin, Isaac B. Harris, Sara L. Mouradian, Ian R. Christen, Edward S. Bielejec, Dirk Englund

Abstract: A central challenge in developing quantum computers and long-range quantum networks lies in the distribution of entanglement across many individually controllable qubits. Colour centres in diamond have emerged as leading solid-state 'artificial atom' qubits, enabling on-demand remote entanglement, coherent control of over 10 ancillae qubits with minute-long coherence times, and memory-enhanced quantum communication. A critical next step is to integrate large numbers of artificial atoms with photonic architectures to enable large-scale quantum information processing systems. To date, these efforts have been stymied by qubit inhomogeneities, low device yield, and complex device requirements. Here, we introduce a process for the high-yield heterogeneous integration of 'quantum micro-chiplets' (QMCs) -- diamond waveguide arrays containing highly coherent colour centres -- with an aluminium nitride (AlN) photonic integrated circuit (PIC). Our process enables the development of a 72-channel defect-free array of germanium-vacancy (GeV) and silicon-vacancy (SiV) colour centres in a PIC. Photoluminescence spectroscopy reveals long-term stable and narrow average optical linewidths of 54 MHz (146 MHz) for GeV (SiV) emitters, close to the lifetime-limited linewidth of 32 MHz (93 MHz). Additionally, inhomogeneities in the individual qubits can be compensated in situ with integrated tuning of the optical frequencies over 100 GHz. The ability to assemble large numbers of nearly indistinguishable artificial atoms into phase-stable PICs provides an architecture toward multiplexed quantum repeaters and general-purpose quantum computers.

Submitted 12 November, 2019; originally announced November 2019.

Comments: 5 figures

Journal ref: Nature 583, 226-231 (2020)

arXiv:1810.06419 [pdf, other]

doi 10.1073/pnas.1821761116

Effects of Microstructure Formation on the Stability of Vapor Deposited Glasses

Authors: Alex R. Moore, Patrick J. Walsh, Zahra Fakhraai, Robert A. Riggleman

Abstract: Glasses formed by physical vapor deposition (PVD) are an interesting new class of materials, exhibiting properties thought to be equivalent to those of glasses aged for thousands of years. Exerting control over the structure and properties of PVD glasses formed with different types of glass-forming molecules is now an emerging challenge. In this work, we study coarse grained models of organic glass formers containing fluorocarbon tails of increasing length, corresponding to an increased tendency to form microstructures. We use simulated PVD to examine how the presence of the microphase separated domains in the supercooled liquid influences the ability to form stable glasses. This model suggests that increasing molecule tail length results in decreased thermodynamic and kinetic stability of the molecules in PVD films. The reduced stability is further linked to the reduced ability of these molecules to equilibrate at the free surface during PVD. We find that as the tail length is increased, the relaxation times near the surface of the supercooled equilibrium liquid films of these molecules are slowed and become essentially bulk-like, due to the segregation of the fluorocarbon tails to the free surface. Surface diffusion is also markedly reduced due to clustering of the molecules at the surface. Based on these results, we propose a trapping mechanism where tails are unable to move between local phase separated domains on the relevant deposition time scales.

Submitted 15 October, 2018; originally announced October 2018.

arXiv:1701.04701 [pdf]

doi 10.5194/angeo-32-705-2014

Dawn-dusk asymmetries in the coupled solar wind-magnetosphere-ionosphere system: a review

Authors: A. P. Walsh, S. Haaland, C. Forsyth, A. M. Keesee, J. Kissinger, K. Li, A. Runov, J. Soucek, B. M. Walsh, S. Wing, M. G. G. T. Taylor

Abstract: Dawn-dusk asymmetries are ubiquitous features of the coupled solar-wind-magnetosphere-ionosphere system. During the last decades, increasing availability of satellite and ground-based measurements has made it possible to study these phenomena in more detail. Numerous publications have documented the existence of persistent asymmetries in processes, properties and topology of plasma structures in various regions of geospace. In this paper, we present a review of our present knowledge of some of the most pronounced dawn-dusk asymmetries. We focus on four key aspects: (1) the role of external influences such as the solar wind and its interaction with the Earth's magnetosphere; (2) properties of the magnetosphere itself; (3) the role of the ionosphere and (4) feedback and coupling between regions. We have also identified potential inconsistencies and gaps in our understanding of dawn-dusk asymmetries in the Earth's magnetosphere and ionosphere.

Submitted 17 January, 2017; originally announced January 2017.

arXiv:1604.08159 [pdf, ps, other]

doi 10.1002/fld.4209

Fujiwhara interaction of tropical cyclone scale vortices using a weighted residual collocation method

Authors: Raymond P Walsh, Jahrul M Alam

Abstract: The fundamental interaction between tropical cyclones was investigated through a series of water tank experiments by Fujiwhara [20, 21, 22]. However, a complete understanding of tropical cyclones remains an open research challenge although there have been numerous investigations through measurements with aircraft/satellites, as well as with numerical simulations. This article presents a computational model for simulating the interaction between cyclones. The proposed numerical method is presented briefly, where the time integration is performed by projecting the discrete system onto a Krylov subspace. The method filters the large scale fluid dynamics using a multiresolution approximation, and the unresolved dynamics is modeled with a Smagorinsky type subgrid scale parameterization scheme. Numerical experiments with Fujiwhara interactions are considered to verify modeling accuracy. An excellent agreement between the present simulation and a reference simulation at Re = 5000 has been demonstrated. At Re = 37440, the kinetic energy of cyclones is seen consolidated into larger scales with concurrent enstrophy cascade, suggesting a steady increase of energy containing scales, a phenomenon that is typical in two-dimensional turbulence theory. The primary results of this article suggest a novel avenue for addressing some of the computational challenges of mesoscale atmospheric circulations.

Submitted 25 September, 2015; originally announced April 2016.

Comments: 24 pages, 11 figures, submitted

arXiv:1512.06980 [pdf]

doi 10.1038/nphys3565

Cassini in situ observations of long-duration magnetic reconnection in Saturn's magnetotail

Authors: Christopher S. Arridge, Jonathan P. Eastwood, Caitriona M. Jackman, Gang-Kai Poh, James A. Slavin, Michelle F. Thomsen, Nicolas André, Xianzhe Jia, Ariah Kidder, Laurent Lamy, Aikaterina Radioti, Dan B. Reisenfeld, Nick Sergis, Martin Volwerk, Andrew P. Walsh, Philippe Zarka, Andrew J. Coates, Michele K. Dougherty

Abstract: Magnetic reconnection is a fundamental process in solar system and astrophysical plasmas, through which stored magnetic energy associated with current sheets is converted into thermal, kinetic and wave energy. Magnetic reconnection is also thought to be a key process involved in shedding internally produced plasma from the giant magnetospheres at Jupiter and Saturn through topological reconfiguration of the magnetic field. The region where magnetic fields reconnect is known as the diffusion region and in this letter we report on the first encounter of the Cassini spacecraft with a diffusion region in Saturn's magnetotail. The data also show evidence of magnetic reconnection over a period of 19 h revealing that reconnection can, in fact, act for prolonged intervals in a rapidly rotating magnetosphere. We show that reconnection can be a significant pathway for internal plasma loss at Saturn. This counters the view of reconnection as a transient method of internal plasma loss at Saturn. These results, although directly relating to the magnetosphere of Saturn, have applications in the understanding of other rapidly rotating magnetospheres, including that of Jupiter and other astrophysical bodies.

Submitted 22 December, 2015; originally announced December 2015.

Comments: Initially submitted version (submitted 24 March 2015), published online in Nature Physics 30 November 2015

arXiv:1210.6875 [pdf, other]

doi 10.1103/PhysRevSTAB.16.042001

Effect of high temperature heat treatments on the quality factor of a large-grain superconducting radio-frequency niobium cavity

Authors: P. Dhakal, G. Ciovati, G. R. Myneni, K. E. Gray, N. Groll, P. Maheshwari, D. M. McRae, R. Pike, T. Proslier, F. Stevie, R. P. Walsh, Q. Yang, J. Zasadzinzki

Abstract: Large-grain Nb has become a viable alternative to fine-grain Nb for the fabrication of superconducting radio-frequency cavities. In this contribution we report the results from a heat treatment study of a large-grain 1.5 GHz single-cell cavity made of "medium purity" Nb. The baseline surface preparation prior to heat treatment consisted of standard buffered chemical polishing. The heat treatment in the range 800–1400 °C was done in a newly designed vacuum induction furnace. $Q_0$ values of the order of $2\times10^{10}$ at 2.0 K and peak surface magnetic field ($B_p$) of 90 mT were achieved reproducibly. A $Q_0$-value of $(5\pm1)\times10^{10}$ at 2.0 K and $B_p = 90$ mT was obtained after heat treatment at 1400 °C. This is the highest value ever reported at this temperature, frequency and field. Samples heat treated with the cavity at 1400 °C were analyzed by secondary ion mass spectrometry, secondary electron microscopy, energy dispersive X-ray, point contact tunneling and X-ray diffraction and revealed a complex surface composition which includes titanium oxide, increased carbon and nitrogen content but reduced hydrogen concentration compared to a non heat-treated sample.

Submitted 25 March, 2013; v1 submitted 25 October, 2012; originally announced October 2012.

arXiv:1110.0763 [pdf, ps, other]

Non-random walks in monkeys and humans

Authors: Denis Boyer, Margaret C. Crofoot, Peter D. Walsh

Abstract: Principles of self-organization play an increasingly central role in models of human activity. Notably, individual human displacements exhibit strongly recurrent patterns that are characterized by scaling laws and can be mechanistically modelled as self-attracting walks. Recurrence is not, however, unique to human displacements. Here we report that the mobility patterns of wild capuchin monkeys are not random walks and exhibit recurrence properties similar to those of cell phone users, suggesting spatial cognition mechanisms shared with humans. We also show that the highly uneven visitation patterns within monkey home ranges are not entirely self-generated but are forced by spatio-temporal habitat heterogeneities. If models of human mobility are to become useful tools for predictive purposes, they will need to consider the interaction between memory and environmental heterogeneities.

Submitted 27 March, 2012; v1 submitted 4 October, 2011; originally announced October 2011.

Comments: 18 pages, 3 figures

Journal ref: J. R. Soc. Interface 9, 842-847 (2012)

arXiv:1006.0079 [pdf, ps, other]

doi 10.1098/rsta.2010.0275

Modeling the mobility of living organisms in heterogeneous landscapes: Does memory improve foraging success?

Authors: Denis Boyer, Peter D. Walsh

Abstract: Thanks to recent technological advances, it is now possible to track with an unprecedented precision and for long periods of time the movement patterns of many living organisms in their habitat. The increasing amount of data available on single trajectories offers the possibility of understanding how animals move and of testing basic movement models. Random walks have long represented the main description for micro-organisms and have also been useful to understand the foraging behaviour of large animals. Nevertheless, most vertebrates, in particular humans and other primates, rely on sophisticated cognitive tools such as spatial maps, episodic memory and travel cost discounting. These properties call for other modeling approaches of mobility patterns. We propose a foraging framework where a learning mobile agent uses a combination of memory-based and random steps. We investigate how advantageous it is to use memory for exploiting resources in heterogeneous and changing environments. An adequate balance of determinism and random exploration is found to maximize the foraging efficiency and to generate trajectories with an intricate spatio-temporal order. Based on this approach, we propose some tools for analysing the non-random nature of mobility patterns in general.

Submitted 12 October, 2010; v1 submitted 1 June, 2010; originally announced June 2010.

Comments: 14 pages, 4 figures, improved discussion

Showing 1–34 of 34 results for author: Walsh, P