-
Energy-Aware Random Access Networks: Connection-Based versus Packet-Based
Authors:
Anshan Yuan,
Fangming Zhao,
Xinghua Sun
Abstract:
Characterizing and comparing the optimal energy efficiency in energy-aware machine-to-machine (M2M) random access networks remains a challenge due to the distributed nature of the access behavior of nodes. To address this issue, this letter focuses on the energy efficiency limits of two typical random access schemes, i.e., connection-based Aloha and packet-based Aloha, based on which we conducted a performance comparison. Specifically, by integrating limited energy constraints and network throughput, the lifetime throughput can be derived, and further optimized with a guarantee of targeted lifetime via selecting the transmission probability. Then we present a comparative study on the optimal lifetime throughput of packet-based Aloha and connection-based Aloha to characterize criteria for beneficial connection establishment.
Submitted 21 June, 2024;
originally announced June 2024.
-
Who's asking? User personas and the mechanics of latent misalignment
Authors:
Asma Ghandeharioun,
Ann Yuan,
Marius Guerard,
Emily Reif,
Michael A. Lepori,
Lucas Dixon
Abstract:
Despite investments in improving model safety, studies show that misaligned capabilities remain latent in safety-tuned models. In this work, we shed light on the mechanics of this phenomenon. First, we show that even when model generations are safe, harmful content can persist in hidden representations and can be extracted by decoding from earlier layers. Then, we show that whether the model divulges such content depends significantly on its perception of who it is talking to, which we refer to as user persona. In fact, we find manipulating user persona to be even more effective for eliciting harmful content than direct attempts to control model refusal. We study both natural language prompting and activation steering as control methods and show that activation steering is significantly more effective at bypassing safety filters. We investigate why certain personas break model safeguards and find that they enable the model to form more charitable interpretations of otherwise dangerous queries. Finally, we show we can predict a persona's effect on refusal given only the geometry of its steering vector.
Submitted 17 June, 2024;
originally announced June 2024.
-
Improving Speech Decoding from ECoG with Self-Supervised Pretraining
Authors:
Brian A. Yuan,
Joseph G. Makin
Abstract:
Recent work on intracranial brain-machine interfaces has demonstrated that spoken speech can be decoded with high accuracy, essentially by treating the problem as an instance of supervised learning and training deep neural networks to map from neural activity to text. However, such networks pay for their expressiveness with very large numbers of labeled data, a requirement that is particularly burdensome for invasive neural recordings acquired from human patients. On the other hand, these patients typically produce speech outside of the experimental blocks used for training decoders. Making use of such data, and data from other patients, to improve decoding would ease the burden of data collection -- especially onerous for dys- and anarthric patients. Here we demonstrate that this is possible, by reengineering wav2vec -- a simple, self-supervised, fully convolutional model that learns latent representations of audio using a noise-contrastive loss -- for electrocorticographic (ECoG) data. We train this model on unlabelled ECoG recordings, and subsequently use it to transform ECoG from labeled speech sessions into wav2vec's representation space, before finally training a supervised encoder-decoder to map these representations to text. We experiment with various numbers of labeled blocks; for almost all choices, the new representations yield superior decoding performance to the original ECoG data, and in no cases do they yield worse. Performance can also be improved in some cases by pretraining wav2vec on another patient's data. In the best cases, wav2vec's representations decrease word error rates over the original data by upwards of 50%.
Submitted 28 May, 2024;
originally announced May 2024.
-
Maps between spherical group rings
Authors:
Shachar Carmeli,
Thomas Nikolaus,
Allen Yuan
Abstract:
We prove that for finitely generated abelian groups $A$ and $B$, the space of $\mathbb{E}_\infty$-ring maps between the spherical group rings $\mathbb{S}[A] \to \mathbb{S}[B]$ is equivalent to the discrete set of group homomorphisms $A \to B$. We also prove generalizations where the sphere is replaced by other ring spectra, e.g. we give a formula for the strict units in group rings of the form $R[A]$ for $A$ a finite $p$-group and $R$ $p$-completely chromatically complete.
Submitted 10 May, 2024;
originally announced May 2024.
-
Prediction Is All MoE Needs: Expert Load Distribution Goes from Fluctuating to Stabilizing
Authors:
Peizhuang Cong,
Aomufei Yuan,
Shimao Chen,
Yuxuan Tian,
Bowen Ye,
Tong Yang
Abstract:
MoE facilitates the development of large models by making the computational complexity of the model no longer scale linearly with increasing parameters. The learning sparse gating network selects a set of experts for each token to be processed; however, this may lead to differences in the number of tokens processed by each expert over several successive iterations, i.e., the expert load fluctuations, which reduces computational parallelization and resource utilization. To this end, we traced and analyzed loads of each expert in the training iterations for several large language models in this work, and defined the transient state with "obvious load fluctuation" and the stable state with "temporal locality". Moreover, given the characteristics of these two states and the computational overhead, we deployed three classical prediction algorithms that achieve accurate expert load prediction results. For the GPT3 350M model, the average error rates for predicting the expert load proportion over the next 1,000 and 2,000 steps are approximately 1.3% and 1.8%, respectively. This work can provide valuable guidance for expert placement or resource allocation for MoE model training. Based on this work, we will propose an expert placement scheme for transient and stable states in our coming work.
Submitted 25 April, 2024;
originally announced April 2024.
-
Phase sensitive information from a planar Josephson junction
Authors:
Andrew C. Yuan,
Steven A. Kivelson
Abstract:
We analyze both the general symmetry-related and more microscopic considerations that govern the Josephson tunneling across a finite planar junction between a known $s$-wave superconductor and a candidate unconventional superconductor (e.g., $d_{x^2-y^2}$-wave). Due to the finite size of the probe, the Josephson current possesses an edge contribution, which is shown to be the dominant contribution under certain conditions. Thus, the dependence of the edge contribution on the geometry of the junction can serve as a direct probe of the symmetry of the order parameter in the unconventional superconductor.
Submitted 17 April, 2024;
originally announced April 2024.
-
Coalescing sets preserving cospectrality of graphs arising from block similarity matrices
Authors:
Sajid Bin Mahamud,
Steve Butler,
Hannah Graff,
Nick Layman,
Taylor Luck,
Jiah Jin,
Noah Owen,
Angela Yuan
Abstract:
Coalescing involves gluing one or more rooted graphs onto another graph. Under specific conditions, it is possible to start with cospectral graphs that are coalesced in similar ways that will result in new cospectral graphs. We present a sufficient condition for this based on the block structure of similarity matrices, possibly with additional constraints depending on which type of matrix is being considered. The matrices considered in this paper include the adjacency, Laplacian, signless Laplacian, distance, and generalized distance matrix.
Submitted 1 April, 2024;
originally announced April 2024.
-
ConstitutionalExperts: Training a Mixture of Principle-based Prompts
Authors:
Savvas Petridis,
Ben Wedin,
Ann Yuan,
James Wexler,
Nithum Thain
Abstract:
Large language models (LLMs) are highly capable at a variety of tasks given the right prompt, but writing one is still a difficult and tedious process. In this work, we introduce ConstitutionalExperts, a method for learning a prompt consisting of constitutional principles (i.e. rules), given a training dataset. Unlike prior methods that optimize the prompt as a single entity, our method incrementally improves the prompt by surgically editing individual principles. We also show that we can improve overall performance by learning unique prompts for different semantic regions of the training data and using a mixture-of-experts (MoE) architecture to route inputs at inference time. We compare our method to other state of the art prompt-optimization techniques across six benchmark datasets. We also investigate whether MoE improves these other techniques. Our results suggest that ConstitutionalExperts outperforms other prompt optimization techniques by 10.9% (F1) and that mixture-of-experts improves all techniques, suggesting its broad applicability.
Submitted 7 March, 2024;
originally announced March 2024.
-
ForTune: Running Offline Scenarios to Estimate Impact on Business Metrics
Authors:
Georges Dupret,
Konstantin Sozinov,
Carmen Barcena Gonzalez,
Ziggy Zacks,
Amber Yuan,
Benjamin Carterette,
Manuel Mai,
Shubham Bansal,
Gwo Liang Lien,
Andrey Gatash,
Roberto Sanchis Ojeda,
Mounia Lalmas
Abstract:
Making ideal decisions as a product leader in a web-facing company is extremely difficult. In addition to navigating the ambiguity of customer satisfaction and achieving business goals, one must also pave a path forward for one's products and services to remain relevant, desirable, and profitable. Data and experimentation to test product hypotheses are key to informing product decisions. Online controlled experiments by A/B testing may provide the best data to support such decisions with high confidence, but can be time-consuming and expensive, especially when one wants to understand impact to key business metrics such as retention or long-term value. Offline experimentation allows one to rapidly iterate and test, but often cannot provide the same level of confidence, and cannot easily shine a light on impact on business metrics. We introduce a novel, lightweight, and flexible approach to investigating hypotheses, called scenario analysis, that aims to support product leaders' decisions using data about users and estimates of business metrics. Its strengths are that it can provide guidance on trade-offs that are incurred by growing or shifting consumption, estimate trends in long-term outcomes like retention and other important business metrics, and can generate hypotheses about relationships between metrics at scale.
Submitted 29 February, 2024;
originally announced March 2024.
-
Future Lens: Anticipating Subsequent Tokens from a Single Hidden State
Authors:
Koyena Pal,
Jiuding Sun,
Andrew Yuan,
Byron C. Wallace,
David Bau
Abstract:
We conjecture that hidden state vectors corresponding to individual input tokens encode information sufficient to accurately predict several tokens ahead. More concretely, in this paper we ask: Given a hidden (internal) representation of a single token at position $t$ in an input, can we reliably anticipate the tokens that will appear at positions $\geq t + 2$? To test this, we measure linear approximation and causal intervention methods in GPT-J-6B to evaluate the degree to which individual hidden states in the network contain signal rich enough to predict future hidden states and, ultimately, token outputs. We find that, at some layers, we can approximate a model's output with more than 48% accuracy with respect to its prediction of subsequent tokens through a single hidden state. Finally we present a "Future Lens" visualization that uses these methods to create a new view of transformer states.
Submitted 8 November, 2023;
originally announced November 2023.
-
Breaking the Degrees-of-Freedom Limit of Holographic MIMO Communications: A 3-D Antenna Array Topology
Authors:
Shuai S. A. Yuan,
Jie Wu,
Hongjing Xu,
Tengjiao Wang,
Da Li,
Xiaoming Chen,
Chongwen Huang,
Sheng Sun,
Shilie Zheng,
Xianmin Zhang,
Er-Ping Li,
Wei E. I. Sha
Abstract:
The performance of holographic multiple-input multiple-output (MIMO) communications, employing two-dimensional (2-D) planar antenna arrays, is typically compromised by finite degrees-of-freedom (DOF) stemming from limited array size. The DOF constraint becomes significant when the element spacing approaches approximately half a wavelength, thereby restricting the overall performance of MIMO systems. To break this inherent limitation, we propose a novel three-dimensional (3-D) antenna array that strategically explores the untapped vertical dimension. We investigate the performance of MIMO systems utilizing 3-D arrays across different multi-path scenarios, encompassing Rayleigh channels with varying angular spreads and the 3rd generation partnership project (3GPP) channels. We subsequently showcase the advantages of these 3-D arrays over their 2-D counterparts with the same aperture sizes. As a proof of concept, a practical dipole-based 3-D array, facilitated by an electromagnetic band-gap (EBG) reflecting surface, is conceived, constructed, and evaluated. The experimental results align closely with full-wave simulations, and channel simulations substantiate that the DOF and capacity constraints of traditional holographic MIMO systems can be surpassed by adopting such a 3-D array configuration.
Submitted 27 February, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Relative cyclotomic structures and equivariant complex cobordism
Authors:
Andrew J. Blumberg,
Michael A. Mandell,
Allen Yuan
Abstract:
We describe a structure on a commutative ring (pre)cyclotomic spectrum $R$ that gives rise to a (pre)cyclotomic structure on topological Hochschild homology ($THH$) relative to its underlying commutative ring spectrum. This lets us construct $TC$ relative to $R$, denoted $TC^{R}$, and we prove some descent results relating $TC^{R}$ and $TC$. We explore several examples of this structure on familiar $\mathbb{T}$-equivariant commutative ring spectra including the periodic $\mathbb{T}$-equivariant complex cobordism spectrum $MUP_{\mathbb{T}}$ and a new (connective) equivariant version of the complex cobordism spectrum $MU$.
Submitted 3 October, 2023;
originally announced October 2023.
-
An Empirical Study on the Use of Static Analysis Tools in Open Source Embedded Software
Authors:
Mingjie Shen,
Akul Pillai,
Brian A. Yuan,
James C. Davis,
Aravind Machiry
Abstract:
This paper performs the first study to understand the prevalence, challenges, and effectiveness of using Static Application Security Testing (SAST) tools on Open-Source Embedded Software (EMBOSS) repositories. We collect a corpus of 258 of the most popular EMBOSS projects, representing 13 distinct categories such as real-time operating systems, network stacks, and applications. To understand the current use of SAST tools on EMBOSS, we measured this corpus and surveyed developers. To understand the challenges and effectiveness of using SAST tools on EMBOSS projects, we applied these tools to the projects in our corpus. We report that almost none of these projects (just 3%) use SAST tools beyond those baked into the compiler, and developers give rationales such as ineffectiveness and false positives. In applying SAST tools ourselves, we show that minimal engineering effort and project expertise are needed to apply many tools to a given EMBOSS project. GitHub's CodeQL was the most effective SAST tool -- using its built-in security checks we found a total of 540 defects (with a false positive rate of 23%) across the 258 projects, with 399 (74%) likely security vulnerabilities, including in projects maintained by Microsoft, Amazon, and the Apache Foundation. EMBOSS engineers have confirmed 273 (51%) of these defects, mainly by accepting our pull requests. Two CVEs were issued. In summary, we urge EMBOSS engineers to adopt the current generation of SAST tools, which offer low false positive rates and are effective at finding security-relevant defects.
Submitted 29 September, 2023;
originally announced October 2023.
-
Dropout Attacks
Authors:
Andrew Yuan,
Alina Oprea,
Cheng Tan
Abstract:
Dropout is a common operator in deep learning, aiming to prevent overfitting by randomly dropping neurons during training. This paper introduces a new family of poisoning attacks against neural networks named DROPOUTATTACK. DROPOUTATTACK attacks the dropout operator by manipulating the selection of neurons to drop instead of selecting them uniformly at random. We design, implement, and evaluate four DROPOUTATTACK variants that cover a broad range of scenarios. These attacks can slow or stop training, destroy prediction accuracy of target classes, and sabotage either precision or recall of a target class. In our experiments of training a VGG-16 model on CIFAR-100, our attack can reduce the precision of the victim class by 34.6% (from 81.7% to 47.1%) without incurring any degradation in model accuracy.
Submitted 4 September, 2023;
originally announced September 2023.
-
Exactly Solvable Model of Randomly Coupled Twisted Superconducting Bilayers
Authors:
Andrew C. Yuan
Abstract:
Motivated by recent experiments on twisted junctions of cuprate superconductors (SC), it was proposed [1] that at zero temperature, a random first order Josephson coupling $J_1(\textbf{r}) \cos φ$ generates an "effective" global second order coupling, $J_2\cos(2φ)$, with a sign that favors $φ= \pm π/2$, i.e., spontaneous breaking of time reversal symmetry (TRS). To obtain a more controlled understanding of the suggested "disorder-induced-order" mechanism, we construct an exactly solvable lattice mean field model and prove that when the disorder-average $\bar{J}_1=0$, the model exhibits a TRS breaking phase for all temperatures below the SC transition, i.e., $T_c = T_{\mathrm{TRSB}}$, regardless of the specific form of disorder. In the presence of nonzero $\bar{J}_1\ne 0$, we show that the two transitions split linearly for small $\bar{J}_1 \ll κ$ (where $κ$ is the in-plane SC stiffness), and that $T_{\mathrm{TRSB}}$ vanishes for $\bar J_1> J_c$ where $ J_c= \overline{J^2_1}/κ$ in the weak disorder limit.
[1] A. C. Yuan, Y. Vituri, E. Berg, B. Spivak, and S. A. Kivelson, Inhomogeneity-induced time-reversal symmetry breaking in cuprate twist-junctions, arXiv preprint arXiv:2305.15472 (2023)
Submitted 2 December, 2023; v1 submitted 29 August, 2023;
originally announced August 2023.
-
Observation of gamma rays up to 320 TeV from the middle-aged TeV pulsar wind nebula HESS J1849$-$000
Authors:
M. Amenomori,
S. Asano,
Y. W. Bao,
X. J. Bi,
D. Chen,
T. L. Chen,
W. Y. Chen,
Xu Chen,
Y. Chen,
Cirennima,
S. W. Cui,
Danzengluobu,
L. K. Ding,
J. H. Fang,
K. Fang,
C. F. Feng,
Zhaoyang Feng,
Z. Y. Feng,
Qi Gao,
A. Gomi,
Q. B. Gou,
Y. Q. Guo,
Y. Y. Guo,
Y. Hayashi,
H. H. He
, et al. (93 additional authors not shown)
Abstract:
Gamma rays from HESS J1849$-$000, a middle-aged TeV pulsar wind nebula (PWN), are observed by the Tibet air shower array and the muon detector array. The detection significance of gamma rays reaches $4.0\, σ$ and $4.4\, σ$ levels above 25 TeV and 100 TeV, respectively, in units of Gaussian standard deviation $σ$. The energy spectrum measured between $40\, {\rm TeV} < E < 320\, {\rm TeV}$ for the first time is described with a simple power-law function of ${\rm d}N/{\rm d}E = (2.86 \pm 1.44) \times 10^{-16}(E/40\, {\rm TeV})^{-2.24 \pm 0.41}\, {\rm TeV}^{-1}\, {\rm cm}^{-2}\, {\rm s}^{-1}$. The gamma-ray energy spectrum from the sub-TeV ($E < 1\, {\rm TeV}$) to sub-PeV ($100\, {\rm TeV} < E < 1\, {\rm PeV}$) ranges including the results of previous studies can be modeled with the leptonic scenario, inverse Compton scattering by high-energy electrons accelerated by the PWN of PSR J1849$-$0001. On the other hand, the gamma-ray energy spectrum can also be modeled with the hadronic scenario in which gamma rays are generated from the decay of neutral pions produced by collisions between accelerated cosmic-ray protons and the ambient molecular cloud found in the gamma-ray emitting region. The cutoff energy of cosmic-ray protons $E_{\rm p,\, cut}$ is estimated at ${\rm log}_{10}(E_{\rm p,\, cut}/{\rm TeV}) = 3.73^{+2.98}_{-0.66}$, suggesting that protons are accelerated up to the PeV energy range. Our study thus proposes that HESS J1849$-$000 should be further investigated as a new candidate for a Galactic PeV cosmic-ray accelerator, PeVatron.
Submitted 26 August, 2023;
originally announced August 2023.
-
Measurement of the Gamma-Ray Energy Spectrum beyond 100 TeV from the HESS J1843$-$033 Region
Authors:
M. Amenomori,
S. Asano,
Y. W. Bao,
X. J. Bi,
D. Chen,
T. L. Chen,
W. Y. Chen,
Xu Chen,
Y. Chen,
Cirennima,
S. W. Cui,
Danzengluobu,
L. K. Ding,
J. H. Fang,
K. Fang,
C. F. Feng,
Zhaoyang Feng,
Z. Y. Feng,
Qi Gao,
A. Gomi,
Q. B. Gou,
Y. Q. Guo,
Y. Y. Guo,
H. H. He,
Z. T. He
, et al. (91 additional authors not shown)
Abstract:
HESS J1843$-$033 is a very-high-energy gamma-ray source whose origin remains unidentified. This work presents, for the first time, the energy spectrum of gamma rays beyond $100\, {\rm TeV}$ from the HESS J1843$-$033 region using the data recorded by the Tibet air shower array and its underground muon detector array. A gamma-ray source with an extension of $0.34^{\circ} \pm 0.12^{\circ}$ is successfully detected above $25\, {\rm TeV}$ at $(α,\, δ) = (281.09^{\circ}\pm 0.10^{\circ},\, -3.76^{\circ}\pm 0.09^{\circ})$ near HESS J1843$-$033 with a statistical significance of $6.2\, σ$, and the source is named TASG J1844$-$038. The position of TASG J1844$-$038 is consistent with those of HESS J1843$-$033, eHWC J1842$-$035, and LHAASO J1843$-$0338. The measured gamma-ray energy spectrum in $25\, {\rm TeV} < E < 130\, {\rm TeV}$ is described with ${\rm d}N/{\rm d}E = (9.70\pm 1.89)\times 10^{-16} (E/40\, {\rm TeV})^{-3.26\pm 0.30}\, {\rm TeV}^{-1} {\rm cm}^{-2} {\rm s}^{-1}$, and the spectral fit to the combined spectra of HESS J1843$-$033, LHAASO J1843$-$0338, and TASG J1844$-$038 implies the existence of a cutoff at $49.5\pm 9.0\, {\rm TeV}$. Associations of TASG J1844$-$038 with SNR G28.6$-$0.1 and PSR J1844$-$0346 are also discussed in detail for the first time.
Submitted 26 August, 2023;
originally announced August 2023.
-
Project Aria: A New Tool for Egocentric Multi-Modal AI Research
Authors:
Jakob Engel,
Kiran Somasundaram,
Michael Goesele,
Albert Sun,
Alexander Gamino,
Andrew Turner,
Arjang Talattof,
Arnie Yuan,
Bilal Souti,
Brighid Meredith,
Cheng Peng,
Chris Sweeney,
Cole Wilson,
Dan Barnes,
Daniel DeTone,
David Caruso,
Derek Valleroy,
Dinesh Ginjupalli,
Duncan Frost,
Edward Miller,
Elias Mueggler,
Evgeniy Oleinik,
Fan Zhang,
Guruprasad Somasundaram,
Gustavo Solaira
, et al. (49 additional authors not shown)
Abstract:
Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception. These future devices will need to be all-day wearable in a socially acceptable form-factor to support always available, context-aware and personalized AI applications. Our team at Meta Reality Labs Research built the Aria device, an egocentric, multi-modal data recording and streaming device with the goal to foster and accelerate research in this area. In this paper, we describe the Aria device hardware including its sensor configuration and the corresponding software tools that enable recording and processing of such data.
Submitted 1 October, 2023; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Absence of Floating Phase in Superconductors with Time-reversal Symmetry Breaking on any Lattice
Authors:
Andrew C. Yuan
Abstract:
Due to the interplay of multi-component order parameters (e.g., a twisted bilayer superconductor with inter-layer Josephson coupling or a frustrated ($n\ge 3$)-band superconductor), a superconductor can possess a $U(1)\times \mathbb{Z}_2$ symmetry, corresponding to the superconducting $T_c$ and time-reversal symmetry breaking transition $T_\text{TRSB}$, respectively. It was then conjectured that in this class of Hamiltonians, there exists a vast parameter regime $\mathcal{O}$ such that the system exhibits vestigial TRSB, i.e., $T_\text{TRSB} > T_c$, while at the boundary $\partial \mathcal{O}$, the system possesses a single phase transition $T_\text{TRSB}=T_c$. In this paper, we provide evidence towards this conjecture by mathematically eliminating the possibility of a floating phase, i.e., $T_\text{TRSB} < T_c$, for the strong coupling regime. More specifically, we prove that the correlation functions of $U(1)$ spins are bounded above by that of $\mathbb{Z}_2$ spins for all temperatures and lattice structures (e.g., $\mathbb{Z}^d$ for all $d$). In particular, this guarantees the existence of high-$T_c$ TRSB (and consequently topological) superconductivity in a large class of Hamiltonians. Note that the same property can also be proven for a certain parameter regime ($Δ\ge 4/5$) of the generalized XY model on any lattice structure, despite belonging to an entirely distinct class of $U(1)\times \mathbb{Z}_2$ Hamiltonians.
Submitted 27 February, 2024; v1 submitted 14 August, 2023;
originally announced August 2023.
-
DiLogics: Creating Web Automation Programs With Diverse Logics
Authors:
Kevin Pu,
Jim Yang,
Angel Yuan,
Minyi Ma,
Rui Dong,
Xinyu Wang,
Yan Chen,
Tovi Grossman
Abstract:
Knowledge workers frequently encounter repetitive web data entry tasks, like updating records or placing orders. Web automation increases productivity, but translating tasks to web actions accurately and extending to new specifications is challenging. Existing tools can automate tasks that perform the same logical trace of UI actions (e.g., input text in each field in order), but do not support tasks requiring different executions based on varied input conditions. We present DiLogics, a programming-by-demonstration system that utilizes NLP to assist users in creating web automation programs that handle diverse specifications. DiLogics first semantically segments input data to structured task steps. By recording user demonstrations for each step, DiLogics generalizes the web macros to novel but semantically similar task requirements. Our evaluation showed that non-experts can effectively use DiLogics to create automation programs that fulfill diverse input instructions. DiLogics provides an efficient, intuitive, and expressive method for developing web automation programs satisfying diverse specifications.
Submitted 18 August, 2023; v1 submitted 10 August, 2023;
originally announced August 2023.
-
Inhomogeneity-Induced Time-Reversal Symmetry Breaking in Cuprate Twist-Junctions
Authors:
Andrew C. Yuan,
Yaar Vituri,
Erez Berg,
Boris Spivak,
Steven A. Kivelson
Abstract:
The lowest order Josephson coupling, $J_1(θ)\cos(φ)$, between two d-wave superconductors with phase-difference $φ$ across the junction vanishes when their relative orientation is rotated by $θ=π/4$. However, in the presence of inhomogeneity, $J_{1}(\mathbf{r})$ is non-zero locally, with a sign that fluctuates in space. We show that such a random $J_1$ generates a global second-harmonic Josephson coupling, $J_2\cos(2φ)$, with a sign that favors $φ= \pm π/2$, i.e., spontaneous breaking of time reversal symmetry. The magnitude of $J_2$ is substantially enhanced if the spatial correlations of $J_1(\mathbf{r})$ extend over large distances, such as would be expected in the presence of large amplitude twist-angle disorder or significant local electronic nematicity. We argue that this effect likely accounts for the recent observations in twisted Josephson junctions between high temperature superconductors.
Submitted 24 May, 2023;
originally announced May 2023.
-
Effects of Mutual Coupling on Degree of Freedom and Antenna Efficiency in Holographic MIMO Communications
Authors:
Shuai S. A. Yuan,
Xiaoming Chen,
Chongwen Huang,
Wei E. I. Sha
Abstract:
Holographic multiple-input-multiple-output (MIMO) communications refers to MIMO systems built with ultra-dense antenna arrays, whose channel models and potential applications have recently attracted increasing attention. When the spacing between adjacent array elements is larger than half a wavelength, the effect of mutual coupling can generally be neglected in current antenna designs. However, in holographic MIMO communications, the influence of strong mutual coupling on antenna characteristics is inevitable, resulting in distorted radiation patterns and low radiation efficiencies. In this paper, starting from the analytical correlation and efficiency models, we investigate how the mutual coupling affects the capacity of a space-constrained MIMO system from the aspects of degree of freedom (DOF) and antenna efficiency. The involved fundamental concepts of correlation, DOF, efficiency and mutual coupling are crucial for both antenna and wireless-communication engineers when designing emerging MIMO communication systems.
Submitted 14 March, 2023;
originally announced March 2023.
-
Gradient-Based Automated Iterative Recovery for Parameter-Efficient Tuning
Authors:
Maximilian Mozes,
Tolga Bolukbasi,
Ann Yuan,
Frederick Liu,
Nithum Thain,
Lucas Dixon
Abstract:
Pretrained large language models (LLMs) are able to solve a wide variety of tasks through transfer learning. Various explainability methods have been developed to investigate their decision making process. TracIn (Pruthi et al., 2020) is one such gradient-based method which explains model inferences based on the influence of training examples. In this paper, we explore the use of TracIn to improve model performance in the parameter-efficient tuning (PET) setting. We develop conversational safety classifiers via the prompt-tuning PET method and show how the unique characteristics of the PET regime enable TracIn to identify the cause for certain misclassifications by LLMs. We develop a new methodology for using gradient-based explainability techniques to improve model performance, G-BAIR: gradient-based automated iterative recovery. We show that G-BAIR can recover LLM performance on benchmarks after manually corrupting training labels. This suggests that influence methods like TracIn can be used to automatically perform data cleaning, and introduces the potential for interactive debugging and relabeling for PET-based transfer learning methods.
Submitted 13 February, 2023;
originally announced February 2023.
-
Towards Agile Text Classifiers for Everyone
Authors:
Maximilian Mozes,
Jessica Hoffmann,
Katrin Tomanek,
Muhamed Kouate,
Nithum Thain,
Ann Yuan,
Tolga Bolukbasi,
Lucas Dixon
Abstract:
Text-based safety classifiers are widely used for content moderation and increasingly to tune generative language model behavior - a topic of growing concern for the safety of digital assistants and chatbots. However, different policies require different classifiers, and safety policies themselves improve from iteration and adaptation. This paper introduces and evaluates methods for agile text classification, whereby classifiers are trained using small, targeted datasets that can be quickly developed for a particular policy. Experimenting with 7 datasets from three safety-related domains, comprising 15 annotation schemes, led to our key finding: prompt-tuning large language models, like PaLM 62B, with a labeled dataset of as few as 80 examples can achieve state-of-the-art performance. We argue that this enables a paradigm shift for text classification, especially for models supporting safer online discourse. Instead of collecting millions of examples to attempt to create universal safety classifiers over months or years, classifiers could be tuned using small datasets, created by individuals or small organizations, tailored for specific use cases, and iterated on and adapted in the time-span of a day.
Submitted 21 October, 2023; v1 submitted 13 February, 2023;
originally announced February 2023.
-
$G$-spectra of cyclic defect
Authors:
Tony Feng,
David Treumann,
Allen Yuan
Abstract:
Broué's Abelian Defect Conjecture predicts interesting derived equivalences between derived categories of modular representations of finite groups. We investigate a generalization of Broué's Conjecture to ring spectrum coefficients and prove this generalization in the cyclic defect case, following an argument of Rouquier.
Submitted 9 January, 2023;
originally announced January 2023.
-
Assigning Agents to Increase Network-Based Neighborhood Diversity
Authors:
Zirou Qiu,
Andrew Yuan,
Chen Chen,
Madhav V. Marathe,
S. S. Ravi,
Daniel J. Rosenkrantz,
Richard E. Stearns,
Anil Vullikanti
Abstract:
Motivated by real-world applications such as the allocation of public housing, we examine the problem of assigning a group of agents to vertices (e.g., spatial locations) of a network so that the diversity level is maximized. Specifically, agents are of two types (characterized by features), and we measure diversity by the number of agents who have at least one neighbor of a different type. This problem is known to be NP-hard, and we focus on developing approximation algorithms with provable performance guarantees. We first present a local-improvement algorithm for general graphs that provides an approximation factor of 1/2. For the special case where the sizes of agent subgroups are similar, we present a randomized approach based on semidefinite programming that yields an approximation factor better than 1/2. Further, we show that the problem can be solved efficiently when the underlying graph is treewidth-bounded and obtain a polynomial time approximation scheme (PTAS) for the problem on planar graphs. Lastly, we conduct experiments to evaluate the performance of the proposed algorithms on synthetic and real-world networks.
Submitted 29 March, 2024; v1 submitted 7 January, 2023;
originally announced January 2023.
-
Creative Writing with an AI-Powered Writing Assistant: Perspectives from Professional Writers
Authors:
Daphne Ippolito,
Ann Yuan,
Andy Coenen,
Sehmon Burnam
Abstract:
Recent developments in natural language generation (NLG) using neural language models have brought us closer than ever to the goal of building AI-powered creative writing tools. However, most prior work on human-AI collaboration in the creative writing domain has evaluated new systems with amateur writers, typically in contrived user studies of limited scope. In this work, we commissioned 13 professional, published writers from a diverse set of creative writing backgrounds to craft stories using Wordcraft, a text editor with built-in AI-powered writing assistance tools. Using interviews and participant journals, we discuss the potential of NLG to have significant impact in the creative writing domain--especially with respect to brainstorming, generation of story details, world-building, and research assistance. Experienced writers, more so than amateurs, typically have well-developed systems and methodologies for writing, as well as distinctive voices and target audiences. Our work highlights the challenges in building for these writers; NLG technologies struggle to preserve style and authorial voice, and they lack deep understanding of story contents. In order for AI-powered writing assistants to realize their full potential, it is essential that they take into account the diverse goals and expertise of human writers.
Submitted 9 November, 2022;
originally announced November 2022.
-
Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization
Authors:
Chris Junchi Li,
Angela Yuan,
Gauthier Gidel,
Quanquan Gu,
Michael I. Jordan
Abstract:
We propose a new first-order optimization algorithm -- AcceleratedGradient-OptimisticGradient (AG-OG) Descent Ascent -- for separable convex-concave minimax optimization. The main idea of our algorithm is to carefully leverage the structure of the minimax problem, performing Nesterov acceleration on the individual component and optimistic gradient on the coupling component. Equipped with proper restarting, we show that AG-OG achieves the optimal convergence rate (up to a constant) for a variety of settings, including bilinearly coupled strongly convex-strongly concave minimax optimization (bi-SC-SC), bilinearly coupled convex-strongly concave minimax optimization (bi-C-SC), and bilinear games. We also extend our algorithm to the stochastic setting and achieve the optimal convergence rate in both bi-SC-SC and bi-C-SC settings. AG-OG is the first single-call algorithm with optimal convergence rates in both deterministic and stochastic settings for bilinearly coupled minimax optimization problems.
Submitted 14 August, 2023; v1 submitted 31 October, 2022;
originally announced October 2022.
-
Electromagnetic Effective-Degree-of-Freedom Limit of a MIMO System in 2-D Inhomogeneous Environment
Authors:
Shuai S. A. Yuan,
Zi He,
Sheng Sun,
Xiaoming Chen,
Chongwen Huang,
Wei E. I. Sha
Abstract:
Compared with a single-input-single-output (SISO) wireless communication system, the benefit of multiple-input-multiple-output (MIMO) technology originates from its extra degree of freedom (DOF), also referred to as scattering channels or spatial electromagnetic (EM) modes, brought by spatial multiplexing. When the physical sizes of transmitting and receiving arrays are fixed, and there are sufficient antennas (typically with half-wavelength spacings), the DOF limit is only dependent on the propagating environment. Analytical methods can be used to estimate this limit in free space, and some approximate models are adopted in stochastic environments, such as Clarke's model and ray-tracing methods. However, this DOF limit in a certain inhomogeneous environment has not been well discussed with rigorous full-wave numerical methods. In this work, volume integral equation (VIE) is implemented for investigating the limit of MIMO effective degree of freedom (EDOF) in three representative two-dimensional (2-D) inhomogeneous environments. Moreover, we clarify the relation between the performance of a MIMO system and the scattering characteristics of its propagating environment.
Submitted 18 October, 2022;
originally announced October 2022.
-
A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning
Authors:
Zixiang Chen,
Chris Junchi Li,
Angela Yuan,
Quanquan Gu,
Michael I. Jordan
Abstract:
With the increasing need for handling large state and action spaces, general function approximation has become a key technique in reinforcement learning (RL). In this paper, we propose a general framework that unifies model-based and model-free RL, and an Admissible Bellman Characterization (ABC) class that subsumes nearly all Markov Decision Process (MDP) models in the literature for tractable RL. We propose a novel estimation function with decomposable structural properties for optimization-based exploration and the functional eluder dimension as a complexity measure of the ABC class. Under our framework, a new sample-efficient algorithm namely OPtimization-based ExploRation with Approximation (OPERA) is proposed, achieving regret bounds that match or improve over the best-known results for a variety of MDP models. In particular, for MDPs with low Witness rank, under a slightly stronger assumption, OPERA improves the state-of-the-art sample complexity results by a factor of $dH$. Our framework provides a generic interface to design and analyze new RL models and algorithms.
Submitted 30 September, 2022;
originally announced September 2022.
-
Multiband mean-field theory of the $d+ig$ superconductivity scenario in Sr$_2$RuO$_4$
Authors:
Andrew C. Yuan,
Erez Berg,
Steven A. Kivelson
Abstract:
Many seemingly contradictory experimental findings concerning the superconducting state in Sr$_2$RuO$_4$ can be accounted for on the basis of a conjectured accidental degeneracy between two patterns of pairing that are unrelated to each other under the $(D_{4h})$ symmetry of the crystal: a $d_{x^2-y^2}$-wave $(B_{1g})$ and a $g_{xy(x^2-y^2)}$-wave $(A_{2g})$ superconducting state. In this paper, we propose a generic multi-band model in which the $g$-wave pairing involving the $xz$ and $yz$ orbitals arises from second-nearest-neighbor interactions. Even if time-reversal symmetry is broken in a $d+ig$ state, such a superconductor remains gapless with a Bogoliubov Fermi surface that approximates a (vertical) line node. The model gives rise to a strain-dependent splitting between the critical temperature $T_c$ and the time-reversal symmetry-breaking temperature $T_\text{trsb}$ that is qualitatively similar to some of the experimental observations in Sr$_2$RuO$_4$.
Submitted 26 July, 2023; v1 submitted 28 September, 2022;
originally announced September 2022.
-
The sphere of semiadditive height 1
Authors:
Allen Yuan
Abstract:
We construct a lift of the $p$-complete sphere to the universal height $1$ higher semiadditive stable $\infty$-category tsade-$1$ of Carmeli--Schlank--Yanovski, providing a counterexample, at height $1$, to their conjecture that the natural functor from tsade-$n$ to $\mathrm{Sp}_{T(n)}$ is an equivalence. We then record some consequences of the construction, including an observation of T. Schlank that this gives a conceptual proof of a classical theorem of Lee on the stable cohomotopy of Eilenberg--MacLane spaces.
Submitted 26 August, 2022;
originally announced August 2022.
-
The Chromatic Nullstellensatz
Authors:
Robert Burklund,
Tomer M. Schlank,
Allen Yuan
Abstract:
We show that Lubin--Tate theories attached to algebraically closed fields are characterized among $T(n)$-local $\mathbb{E}_{\infty}$-rings as those that satisfy an analogue of Hilbert's Nullstellensatz. Furthermore, we show that for every $T(n)$-local $\mathbb{E}_{\infty}$-ring $R$, the collection of $\mathbb{E}_\infty$-ring maps from $R$ to such Lubin-Tate theories jointly detect nilpotence. In particular, we deduce that every non-zero $T(n)$-local $\mathbb{E}_{\infty}$-ring $R$ admits an $\mathbb{E}_\infty$-ring map to such a Lubin-Tate theory. As consequences, we construct $\mathbb{E}_{\infty}$ complex orientations of algebraically closed Lubin-Tate theories, compute the strict Picard spectra of such Lubin-Tate theories, and prove redshift for the algebraic $\mathrm{K}$-theory of arbitrary $\mathbb{E}_{\infty}$-rings.
Submitted 20 July, 2022;
originally announced July 2022.
-
LaMPost: Design and Evaluation of an AI-assisted Email Writing Prototype for Adults with Dyslexia
Authors:
Steven M. Goodman,
Erin Buehler,
Patrick Clary,
Andy Coenen,
Aaron Donsbach,
Tiffanie N. Horne,
Michal Lahav,
Robert Macdonald,
Rain Breaw Michaels,
Ajit Narayanan,
Mahima Pushkarna,
Joel Riley,
Alex Santana,
Lei Shi,
Rachel Sweeney,
Phil Weaver,
Ann Yuan,
Meredith Ringel Morris
Abstract:
Prior work has explored the writing challenges experienced by people with dyslexia, and the potential for new spelling, grammar, and word retrieval technologies to address these challenges. However, the capabilities for natural language generation demonstrated by the latest class of large language models (LLMs) highlight an opportunity to explore new forms of human-AI writing support tools. In this paper, we introduce LaMPost, a prototype email-writing interface that explores the potential for LLMs to power writing support tools that address the varied needs of people with dyslexia. LaMPost draws from our understanding of these needs and introduces novel AI-powered features for email-writing, including: outlining main ideas, generating a subject line, suggesting changes, rewriting a selection. We evaluated LaMPost with 19 adults with dyslexia, identifying many promising routes for further exploration (including the popularity of the "rewrite" and "subject line" features), but also finding that the current generation of LLMs may not surpass the accuracy and quality thresholds required to meet the needs of writers with dyslexia. Surprisingly, we found that participants' awareness of the AI had no effect on their perception of the system, nor on their feelings of autonomy, expression, and self-efficacy when writing emails. Our findings yield further insight into the benefits and drawbacks of using LLMs as writing support for adults with dyslexia and provide a foundation to build upon in future research.
Submitted 5 July, 2022;
originally announced July 2022.
-
The Case for a Single Model that can Both Generate Continuations and Fill in the Blank
Authors:
Daphne Ippolito,
Liam Dugan,
Emily Reif,
Ann Yuan,
Andy Coenen,
Chris Callison-Burch
Abstract:
The task of inserting text into a specified position in a passage, known as fill in the blank (FitB), is useful for a variety of applications where writers interact with a natural language generation (NLG) system to craft text. While previous work has tackled this problem with models trained specifically to do the fill-in-the-blank task, a more useful model is one that can effectively perform _both_ FitB and continuation. In this work, we evaluate the feasibility of using a single model to do both tasks. We show that models pre-trained with a FitB-style objective are capable of both tasks, while models pre-trained for continuation are not. Finally, we show how FitB models can be easily finetuned to allow for fine-grained control over the length and word choice of the generation.
Submitted 30 June, 2022; v1 submitted 9 June, 2022;
originally announced June 2022.
-
Tight Lower Bounds on Worst-Case Guarantees for Zero-Shot Learning with Attributes
Authors:
Alessio Mazzetto,
Cristina Menghini,
Andrew Yuan,
Eli Upfal,
Stephen H. Bach
Abstract:
We develop a rigorous mathematical analysis of zero-shot learning with attributes. In this setting, the goal is to label novel classes with no training data, only detectors for attributes and a description of how those attributes are correlated with the target classes, called the class-attribute matrix. We develop the first non-trivial lower bound on the worst-case error of the best map from attributes to classes for this setting, even with perfect attribute detectors. The lower bound characterizes the theoretical intrinsic difficulty of the zero-shot problem based on the available information -- the class-attribute matrix -- and the bound is practically computable from it. Our lower bound is tight, as we show that we can always find a randomized map from attributes to classes whose expected error is upper bounded by the value of the lower bound. We show that our analysis can be predictive of how standard zero-shot methods behave in practice, including which classes will likely be confused with others.
Submitted 28 November, 2022; v1 submitted 25 May, 2022;
originally announced May 2022.
-
Electromagnetic Effective Degree of Freedom of a MIMO System in Free Space
Authors:
Shuai S. A. Yuan,
Zi He,
Xiaoming Chen,
Chongwen Huang,
Wei E. I. Sha
Abstract:
Effective degree of freedom (EDOF) of a multiple-input-multiple-output (MIMO) system represents its equivalent number of independent single-input-single-output (SISO) systems, which directly characterizes the communication performance. Traditional EDOF only considers single polarization, where the full polarized components degrade into two independent transverse components under the far-field approximation. However, the traditional model is not applicable to complex scenarios especially for the near-field region. Based on an electromagnetic (EM) channel model built from the dyadic Green's function, we first calculate the EM EDOF to estimate the performance of an arbitrary MIMO system with full polarizations in free space. Then, we clarify the relations between the limit of EDOF and the optimal number of sources/receivers. Finally, potential benefits of near-field MIMO communications are demonstrated with the EM EDOF, in which the contribution of the longitudinally polarized source is taken into account. This work establishes a fundamental EM framework for MIMO wireless communications.
Submitted 1 January, 2022; v1 submitted 15 December, 2021;
originally announced December 2021.
-
Examples of chromatic redshift in algebraic $K$-theory
Authors:
Allen Yuan
Abstract:
We give a simple argument to detect chromatic redshift in the algebraic $K$-theory of $\mathbb{E}_{\infty}$-ring spectra and give two applications: we show for $n\geq 1$ that $K(E_n)$, the algebraic $K$-theory of any height $n$ Lubin-Tate theory, has nontrivial $T(n+1)$-localization, and that $K^{(n)}(k)$, the $n$-fold iterated algebraic $K$-theory of a field $k$ of characteristic different from $p$, has nontrivial $T(n)$-localization.
Submitted 21 November, 2021;
originally announced November 2021.
-
SynthBio: A Case Study in Human-AI Collaborative Curation of Text Datasets
Authors:
Ann Yuan,
Daphne Ippolito,
Vitaly Nikolaev,
Chris Callison-Burch,
Andy Coenen,
Sebastian Gehrmann
Abstract:
NLP researchers need more, higher-quality text datasets. Human-labeled datasets are expensive to collect, while datasets collected via automatic retrieval from the web such as WikiBio are noisy and can include undesired biases. Moreover, data sourced from the web is often included in datasets used to pretrain models, leading to inadvertent cross-contamination of training and test sets. In this work we introduce a novel method for efficient dataset curation: we use a large language model to provide seed generations to human raters, thereby changing dataset authoring from a writing task to an editing task. We use our method to curate SynthBio - a new evaluation set for WikiBio - composed of structured attribute lists describing fictional individuals, mapped to natural language biographies. We show that our dataset of fictional biographies is less noisy than WikiBio, and also more balanced with respect to gender and nationality.
Submitted 12 January, 2022; v1 submitted 11 November, 2021;
originally announced November 2021.
-
Perspective-taking to Reduce Affective Polarization on Social Media
Authors:
Martin Saveski,
Nabeel Gillani,
Ann Yuan,
Prashanth Vijayaraghavan,
Deb Roy
Abstract:
The intensification of affective polarization worldwide has raised new questions about how social media platforms might be further fracturing an already-divided public sphere. As opposed to ideological polarization, affective polarization is defined less by divergent policy preferences and more by strong negative emotions towards opposing political groups, and thus arguably poses a formidable threat to rational democratic discourse. We explore if prompting perspective-taking on social media platforms can help enhance empathy between opposing groups as a first step towards reducing affective polarization. Specifically, we deploy a randomized field experiment through a browser extension to 1,611 participants on Twitter, which enables participants to randomly replace their feeds with those belonging to accounts whose political views either agree with or diverge from their own. We find that simply exposing participants to "outgroup" feeds enhances engagement, but not an understanding of why others hold their political views. On the other hand, framing the experience in familiar, empathic terms by prompting participants to recall a disagreement with a friend does not affect engagement, but does increase their ability to understand opposing views. Our findings illustrate how social media platforms might take simple steps that align with business objectives to reduce affective polarization.
Submitted 11 October, 2021;
originally announced October 2021.
-
Chromatic convergence for the algebraic K-theory of the sphere spectrum
Authors:
Andrew J. Blumberg,
Michael A. Mandell,
Allen Yuan
Abstract:
We show that the map from $K({\mathbb S})$ to its chromatic completion is a connective cover and identify the fiber in $K$-theoretic terms. We combine this with recent work of Land-Mathew-Meier-Tamme to prove a form of "Waldhausen's Chromatic Convergence Conjecture": we show that the map $K({\mathbb S}_{(p)})_{(p)}\to \mathop{\rm holim} K(L^{f}_{n}{\mathbb S})_{(p)}$ is the inclusion of a wedge summand.
Submitted 7 October, 2021;
originally announced October 2021.
-
Higher semiadditive Grothendieck-Witt theory and the $K(1)$-local sphere
Authors:
Shachar Carmeli,
Allen Yuan
Abstract:
We develop a higher semiadditive version of Grothendieck-Witt theory. We then apply the theory in the case of a finite field to study the higher semiadditive structure of the $K(1)$-local sphere at the prime $2$. As a further application, we compute and clarify certain power operations in the homotopy of the $K(1)$-local sphere.
Submitted 28 December, 2022; v1 submitted 24 September, 2021;
originally announced September 2021.
-
A Recipe For Arbitrary Text Style Transfer with Large Language Models
Authors:
Emily Reif,
Daphne Ippolito,
Ann Yuan,
Andy Coenen,
Chris Callison-Burch,
Jason Wei
Abstract:
In this paper, we leverage large language models (LMs) to perform zero-shot text style transfer. We present a prompting method that we call augmented zero-shot learning, which frames style transfer as a sentence rewriting task and requires only a natural language instruction, without model fine-tuning or exemplars in the target style. Augmented zero-shot learning is simple and demonstrates promising results not just on standard style transfer tasks such as sentiment, but also on arbitrary transformations such as "make this melodramatic" or "insert a metaphor."
Submitted 31 March, 2022; v1 submitted 8 September, 2021;
originally announced September 2021.
-
Potential PeVatron supernova remnant G106.3+2.7 seen in the highest-energy gamma rays
Authors:
M. Amenomori,
Y. W. Bao,
X. J. Bi,
D. Chen,
T. L. Chen,
W. Y. Chen,
Xu Chen,
Y. Chen,
Cirennima,
S. W. Cui,
Danzengluobu,
L. K. Ding,
J. H. Fang,
K. Fang,
C. F. Feng,
Zhaoyang Feng,
Z. Y. Feng,
Qi Gao,
Q. B. Gou,
Y. Q. Guo,
Y. Y. Guo,
H. H. He,
Z. T. He,
K. Hibino,
N. Hotta, et al. (70 additional authors not shown)
Abstract:
Cosmic rays (protons and other atomic nuclei) are believed to gain energies of petaelectronvolts (PeV) and beyond at astrophysical particle accelerators called 'PeVatrons' inside our Galaxy. Although a characteristic feature of a PeVatron is expected to be a hard gamma-ray energy spectrum that extends beyond 100 teraelectronvolts (TeV) without a cutoff, none of the currently known sources exhibits such a spectrum due to the low maximum energy of accelerated cosmic rays or insufficient detector sensitivity around 100 TeV. Here we report the observation of gamma-ray emission from the supernova remnant G106.3+2.7 above 10 TeV. This work provides flux data points up to and above 100 TeV and indicates that the very-high-energy gamma-ray emission above 10 TeV is well correlated with a molecular cloud rather than the pulsar PSR J2229+6114. Regarding the gamma-ray emission mechanism of G106.3+2.7, this morphological feature appears to favor a hadronic origin via the $\pi^0$ decay caused by accelerated relativistic protons over a leptonic one via the inverse-Compton scattering by relativistic electrons. Furthermore, we point out that an X-ray flux upper limit on the synchrotron spectrum would provide important information to firmly establish the hadronic scenario as the mechanism of particle acceleration at the source.
Submitted 7 September, 2021;
originally announced September 2021.
-
Wordcraft: a Human-AI Collaborative Editor for Story Writing
Authors:
Andy Coenen,
Luke Davis,
Daphne Ippolito,
Emily Reif,
Ann Yuan
Abstract:
As neural language models grow in effectiveness, they are increasingly being applied in real-world settings. However, these applications tend to be limited in the modes of interaction they support. In this extended abstract, we propose Wordcraft, an AI-assisted editor for story writing in which a writer and a dialog system collaborate to write a story. Our novel interface uses few-shot learning and the natural affordances of conversation to support a variety of interactions. Our editor provides a sandbox for writers to probe the boundaries of transformer-based language models and paves the way for future human-in-the-loop training pipelines and novel evaluation methods.
Submitted 15 July, 2021;
originally announced July 2021.
-
Gamma-ray Observation of the Cygnus Region in the 100 TeV Energy Region
Authors:
M. Amenomori,
Y. W. Bao,
X. J. Bi,
D. Chen,
T. L. Chen,
W. Y. Chen,
Xu Chen,
Y. Chen,
Cirennima,
S. W. Cui,
Danzengluobu,
L. K. Ding,
J. H. Fang,
K. Fang,
C. F. Feng,
Zhaoyang Feng,
Z. Y. Feng,
Qi Gao,
A. Gomi,
Q. B. Gou,
Y. Q. Guo,
Y. Y. Guo,
H. H. He,
Z. T. He,
K. Hibino, et al. (88 additional authors not shown)
Abstract:
We report observations of gamma-ray emissions with energies in the 100 TeV energy region from the Cygnus region in our Galaxy. Two sources are significantly detected in the directions of the Cygnus OB1 and OB2 associations. Based on their positional coincidences, we associate one with a pulsar PSR J2032+4127 and the other mainly with a pulsar wind nebula PWN G75.2+0.1 with the pulsar moving away from its original birthplace situated around the centroid of the observed gamma-ray emission. This work would stimulate further studies of particle acceleration mechanisms at these gamma-ray sources.
Submitted 2 July, 2021;
originally announced July 2021.
-
Approaching the Fundamental Limit of Orbital Angular Momentum Multiplexing Through a Hologram Metasurface
Authors:
Shuai S. A. Yuan,
Jie Wu,
Menglin L. N. Chen,
Zhihao Lan,
Liang Zhang,
Sheng Sun,
Zhixiang Huang,
Xiaoming Chen,
Shilie Zheng,
Li Jun Jiang,
Xianmin Zhang,
Wei E. I. Sha
Abstract:
Establishing and approaching the fundamental limit of orbital angular momentum (OAM) multiplexing are necessary and increasingly urgent for current multiple-input multiple-output research. In this work, we elaborate the fundamental limit in terms of independent scattering channels (or degrees of freedom of scattered fields) through angular-spectral analysis, in conjunction with a rigorous Green function method. The scattering channel limit is universal for arbitrary spatial mode multiplexing launched by a planar electromagnetic device, such as an antenna or a metasurface, with a predefined physical size. As a proof of concept, we demonstrate the limit both theoretically and experimentally with a metasurface hologram that transforms orthogonal OAM modes to plane-wave modes scattered at critically separated angular-spectral regions. In particular, a minimax optimization algorithm is applied to suppress angular spectrum aliasing, achieving good performance in both full-wave simulation and experimental measurement at microwave frequencies. This work offers a theoretical upper bound and a corresponding approach route for engineering designs of OAM multiplexing.
Submitted 1 January, 2022; v1 submitted 29 June, 2021;
originally announced June 2021.
-
Context-aware Heterogeneous Graph Attention Network for User Behavior Prediction in Local Consumer Service Platform
Authors:
Peiyuan Zhu,
Xiaofeng Wang,
Zisen Sang,
Aiquan Yuan,
Guodong Cao
Abstract:
As a new type of e-commerce platform developed in recent years, the local consumer service platform, such as Groupon and Koubei, provides users with software to consume services at a nearby store or at home. Unlike other common e-commerce platforms, the behavior of users on a local consumer service platform is closely related to their real-time local context information. Therefore, building a context-aware user behavior prediction system can provide both merchants and users with better service on local consumer service platforms. However, most previous work simply feeds the contextual information into the prediction model as an ordinary feature to obtain the prediction list under a specific context, which ignores the fact that the interest of a user in different contexts is often significantly different. Hence, in this paper, we propose a context-aware heterogeneous graph attention network (CHGAT) to dynamically generate the representation of the user and to estimate the probability of future behavior. Specifically, we first construct meta-path based heterogeneous graphs from the historical behaviors of multiple sources and represent the heterogeneous vertices in the graph with a novel unified knowledge representation approach. Next, a multi-level attention mechanism is introduced for context-aware aggregation over graph vertices, which contains a vertex-level attention network and a path-level attention network. Both aim to capture the semantic correlation between the information contained in the graph and the outside real-time contextual information in the search system. The proposed model then aggregates specific graphs with their corresponding context features, obtains the representation of user interest under a specific context, and inputs it into the prediction network to obtain the predicted probability of user behavior.
Submitted 29 June, 2021; v1 submitted 23 June, 2021;
originally announced June 2021.
-
Tests for partial correlation between repeatedly observed nonstationary nonlinear timeseries
Authors:
Kenneth D. Harris,
Alex E. Yuan
Abstract:
We describe two families of statistical tests to detect partial correlation in vectorial timeseries. The tests measure whether an observed timeseries Y can be predicted from a second series X, even after accounting for a third series Z which may correlate with X. They do not make any assumptions on the nature of these timeseries, such as stationarity or linearity, but they do require that multiple statistically independent recordings of the 3 series are available. Intuitively, the tests work by asking if the series Y recorded on one experiment can be better predicted from X recorded on the same experiment than on a different experiment, after accounting for the prediction from Z recorded on both experiments.
Submitted 24 April, 2024; v1 submitted 13 June, 2021;
originally announced June 2021.
-
Strain-induced time reversal breaking and half quantum vortices near a putative superconducting tetra-critical point in Sr$_2$RuO$_4$
Authors:
Andrew C. Yuan,
Erez Berg,
Steven A. Kivelson
Abstract:
It has been shown [1] that many seemingly contradictory experimental findings concerning the superconducting state in Sr$_2$RuO$_4$ can be accounted for as resulting from the existence of an assumed tetra-critical point at near ambient pressure at which $d_{x^2-y^2}$ and $g_{xy(x^2-y^2)}$ superconducting states are degenerate. We perform both a Landau-Ginzburg and a microscopic mean-field analysis of the effect of spatially varying strain on such a state. In the presence of finite $xy$ shear strain, the superconducting state consists of two possible symmetry-related time-reversal symmetry (TRS) preserving states: $d \pm g$. However, at domain walls between two such regions, TRS can be broken, resulting in a $d+ig$ state. More generally, we find that various natural patterns of spatially varying strain induce a rich variety of superconducting textures, including half-quantum fluxoids. These results may resolve some of the apparent inconsistencies between the theoretical proposal and various experimental observations, including the suggestive evidence of half-quantum vortices [2].
[1] Steven A Kivelson, Andrew C Yuan, BJ Ramshaw, and Ronny Thomale, "A proposal for reconciling diverse experiments on the superconducting state in Sr$_2$RuO$_4$," npj Quantum Mater 5 (2020).
[2] J Jang, DG Ferguson, V Vakaryuk, Raffi Budakian, SB Chung, PM Goldbart, and Y Maeno, "Observation of half-height magnetization steps in Sr$_2$RuO$_4$," Science 331, 186-188 (2011).
Submitted 2 June, 2021;
originally announced June 2021.