-
Combinatorial Reasoning: Selecting Reasons in Generative AI Pipelines via Combinatorial Optimization
Authors:
Mert Esencan,
Tarun Advaith Kumar,
Ata Akbari Asanjan,
P. Aaron Lott,
Masoud Mohseni,
Can Unlu,
Davide Venturelli,
Alan Ho
Abstract:
Recent Large Language Models (LLMs) have demonstrated impressive capabilities at tasks that require human intelligence and are a significant step towards human-like artificial intelligence (AI). Yet the performance of LLMs at reasoning tasks has been subpar and the reasoning capability of LLMs is a matter of significant debate. While it has been shown that the choice of the prompting technique to the LLM can alter its performance on a multitude of tasks, including reasoning, the best performing techniques require human-made prompts with the knowledge of the tasks at hand. We introduce a framework for what we call Combinatorial Reasoning (CR), a fully-automated prompting method, where reasons are sampled from an LLM pipeline and mapped into a Quadratic Unconstrained Binary Optimization (QUBO) problem. The framework investigates whether QUBO solutions can be profitably used to select a useful subset of the reasons to construct a Chain-of-Thought style prompt. We explore the acceleration of CR with specialized solvers. We also investigate the performance of simpler zero-shot strategies such as linear majority rule or random selection of reasons. Our preliminary study indicates that coupling a combinatorial solver to generative AI pipelines is an interesting avenue for AI reasoning and elucidates design principles for future CR methods.
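As a rough illustration of selecting a subset of reasons through a QUBO, the sketch below minimizes x^T Q x over binary vectors by exhaustive search. The matrix Q and its values are hypothetical: the abstract does not state how reasons are mapped to coefficients, and a brute-force loop stands in for the specialized solvers it mentions. Here negative diagonal entries are assumed to reward including a reason, and a positive off-diagonal entry is assumed to penalize selecting two reasons together.

```python
from itertools import product


def solve_qubo_brute_force(Q):
    """Return the binary vector x minimizing sum_ij Q[i][j] * x[i] * x[j]."""
    n = len(Q)
    best_x, best_energy = None, float("inf")
    # Enumerate all 2^n assignments; only practical for very small n.
    for bits in product((0, 1), repeat=n):
        energy = sum(
            Q[i][j] * bits[i] * bits[j] for i in range(n) for j in range(n)
        )
        if energy < best_energy:
            best_x, best_energy = bits, energy
    return best_x, best_energy


# Hypothetical coefficients for three sampled reasons (not from the paper).
Q = [
    [-2.0, 3.0, 0.0],   # reason 0 is useful but conflicts with reason 1
    [0.0, -1.5, 0.0],
    [0.0, 0.0, -1.0],
]

selection, energy = solve_qubo_brute_force(Q)
print(selection, energy)  # reasons with bit 1 would go into the prompt
```

With these made-up coefficients the search keeps reasons 0 and 2 and drops reason 1, because the pairwise penalty outweighs the benefit of including both 0 and 1.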
Submitted 19 June, 2024;
originally announced July 2024.
-
Estimating Idea Production: A Methodological Survey
Authors:
Ege Erdil,
Tamay Besiroglu,
Anson Ho
Abstract:
Accurately modeling the production of new ideas is crucial for innovation theory and endogenous growth models. This paper provides a comprehensive methodological survey of strategies for estimating idea production functions. We explore various methods, including naive approaches, linear regression, maximum likelihood estimation, and Bayesian inference, each suited to different data availability settings. Through case studies ranging from total factor productivity to software R&D, we show how to apply these methodologies in practice. Our synthesis provides researchers with guidance on strategies for characterizing idea production functions and highlights obstacles that must be addressed through further empirical validation.
Submitted 16 May, 2024;
originally announced May 2024.
-
The MOST Hosts Survey: spectroscopic observation of the host galaxies of ~40,000 transients using DESI
Authors:
Maayane T. Soumagnac,
Peter Nugent,
Robert A. Knop,
Anna Y. Q. Ho,
William Hohensee,
Autumn Awbrey,
Alexis Andersen,
Greg Aldering,
Matan Ventura,
Jessica N. Aguilar,
Steven Ahlen,
Segev Y. Benzvi,
David Brooks,
Dillon Brout,
Todd Claybaugh,
Tamara M. Davis,
Kyle Dawson,
Axel de la Macorra,
Arjun Dey,
Biprateep Dey,
Peter Doel,
Kelly A. Douglass,
Jaime E. Forero-Romero,
Enrique Gaztanaga,
Satya Gontcho A Gontcho
, et al. (32 additional authors not shown)
Abstract:
We present the MOST Hosts survey (Multi-Object Spectroscopy of Transient Hosts). The survey is planned to run throughout the five years of operation of the Dark Energy Spectroscopic Instrument (DESI) and will generate a spectroscopic catalog of the hosts of most transients observed to date, in particular all the supernovae observed by most public, untargeted, wide-field, optical surveys (PTF/iPTF, SDSS II, ZTF, DECAT, DESIRT). Scientific questions for which the MOST Hosts survey will be useful include Type Ia supernova cosmology, fundamental plane and peculiar velocity measurements, and the understanding of the correlations between transients and their host galaxy properties. Here, we present the first release of the MOST Hosts survey: 21,931 hosts of 20,235 transients. These numbers represent 36% of the final MOST Hosts sample, consisting of 60,212 potential host galaxies of 38,603 transients (a transient can be assigned multiple potential hosts). Of these galaxies, 40% do not appear in the DESI primary target list and therefore require a specific program like MOST Hosts. Of all the transients in the MOST Hosts list, only 26.7% have existing classifications, and so the survey will provide redshifts (and luminosities) for nearly 30,000 transients. A preliminary Hubble diagram and a transient luminosity-duration diagram are shown as examples of future potential uses of the MOST Hosts survey. The survey will also provide a training sample of spectroscopically observed transients for photometry-only classifiers, as we enter an era when most newly observed transients will lack spectroscopic classification. The MOST Hosts DESI survey data will be released through the Wiserep platform on a rolling cadence and updated to match the DESI releases. Dates of future releases and updates are available through the https://mosthosts.desi.lbl.gov website.
Submitted 6 May, 2024;
originally announced May 2024.
-
Hybridizable discontinuous Galerkin methods for solving the two-fluid plasma model
Authors:
Andrew Ho,
Uri Shumlak
Abstract:
The two-fluid plasma model has a wide range of timescales which must all be numerically resolved regardless of the timescale on which plasma dynamics occurs. The answer to solving numerically stiff systems is generally to utilize unconditionally stable implicit time advance methods. Hybridizable discontinuous Galerkin (HDG) methods have emerged as a powerful tool for solving stiff partial differential equations. The HDG framework combines the advantages of the discontinuous Galerkin (DG) method, such as high-order accuracy and flexibility in handling mixed hyperbolic/parabolic PDEs with the advantage of classical continuous finite element methods for constructing small numerically stable global systems which can be solved implicitly. In this research we quantify the numerical stability conditions for the two-fluid equations and demonstrate how HDG can be used to avoid the strict stability requirements while maintaining high order accurate results.
Submitted 3 May, 2024;
originally announced May 2024.
-
Elastic properties and thermodynamic anomalies of supersolids
Authors:
Milan Rakic,
Andrew F. Ho,
Derek K. K. Lee
Abstract:
We study a supersolid in the context of a Gross-Pitaevskii theory with a non-local effective potential. We employ a homogenisation technique which allows us to calculate the elastic moduli, supersolid fraction and other state variables of the system. Our methodology is verified against numerical simulations of elastic deformations. We can also verify that the long-wavelength Goldstone modes that emerge from this technique agree with Bogoliubov theory. We find a thermodynamic anomaly that the supersolid does not obey the thermodynamic relation $\partial P / \partial V \bigr|_N = - n \, \partial P / \partial N \bigr|_V$, which we claim is a feature unique to supersolids.
Submitted 20 March, 2024;
originally announced March 2024.
-
Context-aware LLM-based Safe Control Against Latent Risks
Authors:
Quan Khanh Luu,
Xiyu Deng,
Anh Van Ho,
Yorie Nakahira
Abstract:
It is challenging for autonomous control systems to perform complex tasks in the presence of latent risks. Motivated by this challenge, this paper proposes an integrated framework that involves Large Language Models (LLMs), stochastic gradient descent (SGD), and optimization-based control. In the first phase, the proposed framework breaks down complex tasks into a sequence of smaller subtasks, whose specifications account for contextual information and latent risks. In the second phase, these subtasks and their parameters are refined through a dual process involving LLMs and SGD. LLMs are used to generate rough guesses and failure explanations, and SGD is used to fine-tune parameters. The proposed framework is tested using simulated case studies of robots and vehicles. The experiments demonstrate that the proposed framework can mediate actions based on the context and latent risks and learn complex behaviors efficiently.
Submitted 18 March, 2024;
originally announced March 2024.
-
Algorithmic progress in language models
Authors:
Anson Ho,
Tamay Besiroglu,
Ege Erdil,
David Owen,
Robi Rahman,
Zifan Carl Guo,
David Atkinson,
Neil Thompson,
Jaime Sevilla
Abstract:
We investigate the rate at which algorithms for pre-training language models have improved since the advent of deep learning. Using a dataset of over 200 language model evaluations on Wikitext and Penn Treebank spanning 2012-2023, we find that the compute required to reach a set performance threshold has halved approximately every 8 months, with a 95% confidence interval of around 5 to 14 months, substantially faster than hardware gains per Moore's Law. We estimate augmented scaling laws, which enable us to quantify algorithmic progress and determine the relative contributions of scaling models versus innovations in training algorithms. Despite the rapid pace of algorithmic progress and the development of new architectures such as the transformer, our analysis reveals that the increase in compute made an even larger contribution to overall performance improvements over this time period. Though limited by noisy benchmark data, our analysis quantifies the rapid progress in language modeling, shedding light on the relative contributions from compute and algorithms.
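To make the halving-time figure concrete, the sketch below converts it into a cumulative reduction factor: if the compute needed for a fixed performance level halves every h months, then after t months it has fallen by a factor of 2^(t/h). The function name and the 10-year horizon are illustrative assumptions; only the 8-month central estimate and the 5 to 14 month interval come from the abstract.

```python
def compute_reduction_factor(months, halving_time_months=8.0):
    """Factor by which required compute falls after `months`,
    assuming a constant halving time (a simplifying assumption)."""
    return 2.0 ** (months / halving_time_months)


# Illustrative horizon of 10 years (120 months), not a figure from the paper.
horizon = 120
for halving_time in (5.0, 8.0, 14.0):  # lower bound, central estimate, upper bound
    factor = compute_reduction_factor(horizon, halving_time)
    print(f"halving time {halving_time:>4} months -> {factor:,.0f}x less compute")
```

The wide spread between the three printed factors shows how sensitive long-horizon extrapolations are to the uncertainty in the halving time.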
Submitted 9 March, 2024;
originally announced March 2024.
-
ROSE: Rotation-based Squeezing Robotic Gripper toward Universal Handling of Objects
Authors:
Son Tien Bui,
Shinya Kawano,
Van Anh Ho
Abstract:
Robotic hands/grippers nowadays are not limited to manufacturing lines; instead, they are widely utilized in cluttered environments, such as restaurants, farms, and warehouses. In such scenarios, they need to deal with high uncertainty of the grasped objects' shapes, postures, surfaces, and material properties, which requires complex integration of sensing and decision-making processes. On the other hand, integrating soft materials into the gripper's design may tolerate the above uncertainties and reduce complexity in control. In this paper, we introduce ROSE, a novel soft gripper that can embrace the object and squeeze it by buckling a funnel-like thin-walled soft membrane around the object by simple rotation of the base. Thanks to this design, the ROSE hand can adapt to a wide range of objects that can fit in the funnel and handle them with gentle gripping force. Despite this, ROSE can generate a high lift force (up to 33 kgf) while significantly reducing the normal pressure on the gripped objects. In our experiment, a 198 g ROSE can be integrated into a robot arm with a single actuation and successfully lift various types of objects, even after 400,000 trials. The embracing mechanism helps reduce the dependence on friction between the object and the membrane, as ROSE could pick up a chicken egg submerged inside an olive oil tank. We also report a feasible design for equipping the ROSE hand with tactile sensing while appealing to the scalability of the design to fit a wide range of objects. Video: https://youtu.be/E1wAI09LaoY
Submitted 10 February, 2024;
originally announced February 2024.
-
Complements of locally flat submanifolds are finite CW complexes
Authors:
Andrew Ho
Abstract:
We show that if $Y$ is a compact topological manifold and $X$ is a locally flat submanifold, then the complement $Y - X$ is homotopy equivalent to a finite CW complex. This is a direct proof, and does not rely on much of the theory of topological manifolds.
Submitted 5 February, 2024;
originally announced February 2024.
-
EuroPED-NN: Uncertainty aware surrogate model
Authors:
A. Panera Alvarez,
A. Ho,
A. Jarvinen,
S. Saarelma,
S. Wiesen,
JET Contributors
Abstract:
This work successfully generates uncertainty-aware surrogate models, via the Bayesian neural network with noise contrastive prior (BNN-NCP) technique, of the EuroPED plasma pedestal model using data from the JET-ILW pedestal database and subsequent model evaluations. Together, these constitute EuroPED-NN. The BNN-NCP technique is shown to be a good fit for uncertainty-aware surrogate models, matching the output of a regular neural network, providing the prediction's confidence as uncertainties, and highlighting the out-of-distribution (OOD) regions using surrogate model uncertainties. This provides critical insights into model robustness and reliability. EuroPED-NN has been physically validated, first by analyzing electron density $n_e\!\left(ψ_{\text{pol}}=0.94\right)$ with respect to increasing plasma current, $I_p$, and second by validating the $Δ-β_{p,ped}$ relation associated with the EuroPED model, affirming the robustness of the underlying physics learned by the surrogate model.
Submitted 1 February, 2024;
originally announced February 2024.
-
AT2019pim: A Luminous Orphan Afterglow from a Moderately Relativistic Outflow
Authors:
Daniel A. Perley,
Anna Y. Q. Ho,
Michael Fausnaugh,
Gavin P. Lamb,
Mansi M. Kasliwal,
Tomas Ahumada,
Shreya Anand,
Igor Andreoni,
Eric Bellm,
Varun Bhalerao,
Bryce Bolin,
Thomas G. Brink,
Eric Burns,
S. Bradley Cenko,
Alessandra Corsi,
Alexei V. Filippenko,
Dmitry Frederiks,
Adam Goldstein,
Rachel Hamburg,
Rahul Jayaraman,
Peter G. Jonker,
Erik C. Kool,
Shrinivas Kulkarni,
Harsh Kumar,
Russ Laher
, et al. (12 additional authors not shown)
Abstract:
Classical gamma-ray bursts (GRBs) have two distinct emission episodes: prompt emission from ultra-relativistic ejecta and afterglow from shocked circumstellar material. While both components are extremely luminous in known GRBs, a variety of scenarios predict the existence of luminous afterglow emission with little or no associated high-energy prompt emission. We present AT 2019pim, the first secure example of this phenomenon to be identified. Serendipitously discovered during follow-up observations of a gravitational-wave trigger and located in a contemporaneous TESS sector, it is hallmarked by a fast-rising (t ~ 2 hr), luminous (M_UV,peak ~ -24.4 mag) optical transient with accompanying luminous X-ray and radio emission. No gamma-ray emission consistent with the time and location of the transient was detected by Fermi-GBM or by Konus, placing strong limits on an accompanying GRB. We investigate several independent observational aspects of the afterglow in the context of constraints on relativistic motion and find all of them are consistent with an initial Lorentz factor of Gamma_0 ~ 30-50, significantly lower than in any well-observed GRB and consistent with the theoretically-predicted "dirty fireball" scenario in which the high-energy prompt emission is stifled by pair production. However, we cannot rule out a structured jet model in which only the line-of-sight material was ejected at low-Gamma, off-axis from a classical high-Gamma jet core. This event represents a milestone in orphan afterglow searches, demonstrating that luminous afterglows with weak or no detectable gamma-ray radiation exist in nature and can be discovered by high-cadence optical surveys.
Submitted 29 January, 2024;
originally announced January 2024.
-
Evaluating Language-Model Agents on Realistic Autonomous Tasks
Authors:
Megan Kinniment,
Lucas Jun Koba Sato,
Haoxing Du,
Brian Goodrich,
Max Hasin,
Lawrence Chan,
Luke Harold Miles,
Tao R. Lin,
Hjalmar Wijk,
Joel Burget,
Aaron Ho,
Elizabeth Barnes,
Paul Christiano
Abstract:
In this report, we explore the ability of language model agents to acquire resources, create copies of themselves, and adapt to novel challenges they encounter in the wild. We refer to this cluster of capabilities as "autonomous replication and adaptation" or ARA. We believe that systems capable of ARA could have wide-reaching and hard-to-anticipate consequences, and that measuring and forecasting ARA may be useful for informing measures around security, monitoring, and alignment. Additionally, once a system is capable of ARA, placing bounds on a system's capabilities may become significantly more difficult.
We construct four simple example agents that combine language models with tools that allow them to take actions in the world. We then evaluate these agents on 12 tasks relevant to ARA. We find that these language model agents can only complete the easiest tasks from this list, although they make some progress on the more challenging tasks. Unfortunately, these evaluations are not adequate to rule out the possibility that near-future agents will be capable of ARA. In particular, we do not think that these evaluations provide good assurance that the "next generation" of language models (e.g. 100x effective compute scaleup on existing models) will not yield agents capable of ARA, unless intermediate evaluations are performed during pretraining. Relatedly, we expect that fine-tuning of the existing models could produce substantially more competent agents, even if the fine-tuning is not directly targeted at ARA.
Submitted 4 January, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Limits to the Energy Efficiency of CMOS Microprocessors
Authors:
Anson Ho,
Ege Erdil,
Tamay Besiroglu
Abstract:
CMOS microprocessors have achieved massive energy efficiency gains but may reach limits soon. This paper presents an approach to estimating the limits on the maximum floating point operations per Joule (FLOP/J) for CMOS microprocessors. We analyze the three primary sources of energy dissipation: transistor switching, interconnect capacitances and leakage power. Using first-principles calculations of minimum energy costs based on Landauer's principle, prior estimates of relevant parameters, and empirical data on hardware, we derive the energy cost per FLOP for each component. Combining these yields a geometric mean estimate of 4.7e15 FP4/J for the maximum CMOS energy efficiency, roughly two hundred-fold more efficient than current microprocessors.
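For orientation, the sketch below evaluates the textbook Landauer bound, k_B T ln 2 per erased bit, and the current efficiency implied by the abstract's headline numbers. It is not the paper's estimation procedure, which combines several dissipation sources. The 300 K operating temperature is an assumption, and the "implied current efficiency" is simply the stated limit divided by the stated factor of roughly two hundred.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def landauer_limit_joules(temperature_kelvin):
    """Minimum energy to erase one bit at the given temperature."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


ROOM_TEMPERATURE_K = 300.0  # assumed temperature, not taken from the abstract
landauer_j = landauer_limit_joules(ROOM_TEMPERATURE_K)

LIMIT_FP4_PER_J = 4.7e15      # geometric mean estimate quoted in the abstract
IMPROVEMENT_FACTOR = 200.0    # "roughly two hundred-fold"
implied_current_fp4_per_j = LIMIT_FP4_PER_J / IMPROVEMENT_FACTOR

print(f"Landauer bound at 300 K: {landauer_j:.3e} J per bit")
print(f"Implied current efficiency: {implied_current_fp4_per_j:.2e} FP4/J")
```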
Submitted 13 December, 2023;
originally announced December 2023.
-
Minutes-duration Optical Flares with Supernova Luminosities
Authors:
Anna Y. Q. Ho,
Daniel A. Perley,
Ping Chen,
Steve Schulze,
Vik Dhillon,
Harsh Kumar,
Aswin Suresh,
Vishwajeet Swain,
Michael Bremer,
Stephen J. Smartt,
Joseph P. Anderson,
G. C. Anupama,
Supachai Awiphan,
Sudhanshu Barway,
Eric C. Bellm,
Sagi Ben-Ami,
Varun Bhalerao,
Thomas de Boer,
Thomas G. Brink,
Rick Burruss,
Poonam Chandra,
Ting-Wan Chen,
Wen-Ping Chen,
Jeff Cooke,
Michael W. Coughlin
, et al. (52 additional authors not shown)
Abstract:
In recent years, certain luminous extragalactic optical transients have been observed to last only a few days. Their short observed duration implies a different powering mechanism from the most common luminous extragalactic transients (supernovae) whose timescale is weeks. Some short-duration transients, most notably AT2018cow, display blue optical colours and bright radio and X-ray emission. Several AT2018cow-like transients have shown hints of a long-lived embedded energy source, such as X-ray variability, prolonged ultraviolet emission, a tentative X-ray quasiperiodic oscillation, and large energies coupled to fast (but subrelativistic) radio-emitting ejecta. Here we report observations of minutes-duration optical flares in the aftermath of an AT2018cow-like transient, AT2022tsd (the "Tasmanian Devil"). The flares occur over a period of months, are highly energetic, and are likely nonthermal, implying that they arise from a near-relativistic outflow or jet. Our observations confirm that in some AT2018cow-like transients the embedded energy source is a compact object, either a magnetar or an accreting black hole.
Submitted 16 November, 2023;
originally announced November 2023.
-
Efficient training sets for surrogate models of tokamak turbulence with Active Deep Ensembles
Authors:
L. Zanisi,
A. Ho,
T. Madula,
J. Barr,
J. Citrin,
S. Pamela,
J. Buchanan,
F. Casson,
V. Gopakumar,
JET contributors
Abstract:
Model-based plasma scenario development lies at the heart of the design and operation of future fusion powerplants. Including turbulent transport in integrated models is essential for delivering a successful roadmap towards operation of ITER and the design of DEMO-class devices. Given the highly iterative nature of integrated models, fast machine-learning-based surrogates of turbulent transport are fundamental to fulfil the pressing need for faster simulations opening up pulse design, optimization, and flight simulator applications. A significant bottleneck is the generation of suitably large training datasets covering a large volume in parameter space, which can be prohibitively expensive to obtain for higher fidelity codes.
In this work, we propose ADEPT (Active Deep Ensembles for Plasma Turbulence), a physics-informed, two-stage Active Learning strategy to ease this challenge. Active Learning queries a given model by means of an acquisition function that identifies regions where additional data would improve the surrogate model. We provide a benchmark study using available data from the literature for the QuaLiKiz quasilinear transport model. We demonstrate quantitatively that the physics-informed nature of the proposed workflow reduces the need to perform simulations in stable regions of the parameter space, resulting in significantly improved data efficiency. We show up to a factor-of-20 reduction in the training dataset size needed to achieve the same performance as random sampling. We then validate the surrogates on multichannel integrated modelling of ITG-dominated JET scenarios and demonstrate that they recover the performance of QuaLiKiz to better than 10%. This matches the performance obtained in previous work, but with two orders of magnitude fewer training data points.
Submitted 13 October, 2023;
originally announced October 2023.
-
Building Footprint Extraction in Dense Areas using Super Resolution and Frame Field Learning
Authors:
Vuong Nguyen,
Anh Ho,
Duc-Anh Vu,
Nguyen Thi Ngoc Anh,
Tran Ngoc Thang
Abstract:
Despite notable results on standard aerial datasets, current state-of-the-art methods fail to produce accurate building footprints in dense areas due to the challenging properties of these areas and limited data availability. In this paper, we propose a framework to address such issues in polygonal building extraction. First, super resolution is employed to enhance the spatial resolution of aerial images, allowing finer details to be captured. This enhanced imagery serves as input to a multitask learning module, which consists of a segmentation head and a frame field learning head to effectively handle the irregular building structures. Our model is supervised by adaptive loss weighting, enabling extraction of sharp edges and fine-grained polygons, which is difficult due to overlapping buildings and low data quality. Extensive experiments on a slum area in India that mimics a dense area demonstrate that our proposed approach outperforms the current state-of-the-art methods by a large margin.
Submitted 4 September, 2023;
originally announced September 2023.
-
The On-axis Jetted Tidal Disruption Event AT2022cmc: X-ray Observations and Broadband Spectral Modeling
Authors:
Yuhan Yao,
Wenbin Lu,
Fiona Harrison,
S. R. Kulkarni,
Suvi Gezari,
Muryel Guolo,
S. Bradley Cenko,
Anna Y. Q. Ho
Abstract:
AT2022cmc was recently reported as the first on-axis jetted tidal disruption event (TDE) discovered in the last decade, and the fourth on-axis jetted TDE candidate known so far. In this work, we present NuSTAR hard X-ray (3--30 keV) observations of AT2022cmc, as well as soft X-ray (0.3--6 keV) observations obtained by NICER, Swift, and XMM-Newton. Our analysis reveals that the broadband X-ray spectra can be well described by a broken power-law with $f_ν\propto ν^{-0.5}$ ($f_ν\propto ν^{-1}$) below (above) the rest-frame break energy of $E_{\rm bk}\sim 10$ keV at observer-frame $t_{\rm obs}=7.8$ and 17.6 days since discovery. At $t_{\rm obs} = 36.2$ days, the X-ray spectrum is consistent with either a single power-law or a broken power-law. By modeling the spectral energy distribution evolution from radio to hard X-ray across the three NuSTAR observing epochs, we find that the sub-millimeter/radio emission originates from external shocks at large distances $\gtrsim\! 10^{17}$ cm from the black hole, the UV/optical light comes from a thermal envelope with radius $\sim\!10^{15}$ cm, and the X-ray emission is consistent with synchrotron radiation powered by energy dissipation at intermediate radii within the (likely magnetically dominated) jet. We constrain the bulk Lorentz factor of the jet to be of the order 10--100. Our interpretation differs from the model proposed by Pasham et al. (2023) where both the radio and X-rays come from the same emitting zone in a matter-dominated jet. Our model for the jet X-ray emission has broad implications on the nature of relativistic jets in other sources such as gamma-ray bursts.
Submitted 20 February, 2024; v1 submitted 18 August, 2023;
originally announced August 2023.
-
Gamma-ray Transient Network Science Analysis Group Report
Authors:
Eric Burns,
Michael Coughlin,
Kendall Ackley,
Igor Andreoni,
Marie-Anne Bizouard,
Floor Broekgaarden,
Nelson L. Christensen,
Filippo D'Ammando,
James DeLaunay,
Henrike Fleischhack,
Raymond Frey,
Chris L. Fryer,
Adam Goldstein,
Bruce Grossan,
Rachel Hamburg,
Dieter H. Hartmann,
Anna Y. Q. Ho,
Eric J. Howell,
C. Michelle Hui,
Leah Jenks,
Alyson Joens,
Stephen Lesage,
Andrew J. Levan,
Amy Lien,
Athina Meli
, et al. (12 additional authors not shown)
Abstract:
The Interplanetary Network (IPN) is a detection, localization and alert system that utilizes the arrival time of transient signals in gamma-ray detectors on spacecraft separated by planetary baselines to geometrically locate the origin of these transients. Due to the changing astrophysical landscape and the new emphasis on time domain and multi-messenger astrophysics (TDAMM) from the Pathways to Discovery in Astronomy and Astrophysics for the 2020s, this Gamma-ray Transient Network Science Analysis Group was tasked to understand the role of the IPN and high-energy monitors in this new era. The charge includes describing the science made possible with these facilities, tracing the corresponding requirements and capabilities, and highlighting where improved operations of existing instruments and the IPN would enhance TDAMM science. While this study considers the full multiwavelength and multimessenger context, the findings are specific to space-based high-energy monitors. These facilities are important both for full characterization of these transients as well as facilitating follow-up observations through discovery and localization. The full document reports a brief history of this field, followed by our detailed analyses and findings in some 68 pages, providing a holistic overview of the role of the IPN and high-energy monitors in the coming decades.
Submitted 5 October, 2023; v1 submitted 8 August, 2023;
originally announced August 2023.
-
Millimeter Observations of the Type II SN2023ixf: Constraints on the Proximate Circumstellar Medium
Authors:
Edo Berger,
Garrett K. Keating,
Raffaella Margutti,
Keiichi Maeda,
Kate D. Alexander,
Yvette Cendes,
Tarraneh Eftekhari,
Mark Gurwell,
Daichi Hiramatsu,
Anna Y. Q. Ho,
Tanmoy Laskar,
Ramprasad Rao,
Peter K. G. Williams
Abstract:
We present 1.3 mm (230 GHz) observations of the recent and nearby Type II supernova, SN2023ixf, obtained with the Submillimeter Array (SMA) at 2.6-18.6 days after explosion. The observations were obtained as part of the SMA Large Program POETS (Pursuit of Extragalactic Transients with the SMA). We do not detect any emission at the location of SN2023ixf, with the deepest limits of $L_ν(230\,{\rm GHz})\lesssim 8.6\times 10^{25}$ erg s$^{-1}$ Hz$^{-1}$ at 2.7 and 7.7 days, and $L_ν(230\,{\rm GHz})\lesssim 3.4\times 10^{25}$ erg s$^{-1}$ Hz$^{-1}$ at 18.6 days. These limits are about a factor of 2 dimmer than the mm emission from SN2011dh (IIb), about an order of magnitude dimmer compared to SN1993J (IIb) and SN2018ivc (IIL), and about 30 times dimmer than the most luminous non-relativistic SNe in the mm-band (Type IIb/Ib/Ic). Using these limits in the context of analytical models that include synchrotron self-absorption and free-free absorption we place constraints on the proximate circumstellar medium around the progenitor star, to a scale of $\sim 2\times 10^{15}$ cm, excluding the range $\dot{M}\sim {\rm few}\times 10^{-6}-10^{-2}$ M$_\odot$ yr$^{-1}$ (for a wind velocity, $v_w=115$ km s$^{-1}$, and ejecta velocity, $v_{\rm eje}\sim (1-2)\times 10^4$ km s$^{-1}$). These results are consistent with an inference of the mass loss rate based on optical spectroscopy ($\sim 2\times 10^{-2}$ M$_\odot$ yr$^{-1}$ for $v_w=115$ km s$^{-1}$), but are in tension with the inference from hard X-rays ($\sim 7\times 10^{-4}$ M$_\odot$ yr$^{-1}$ for $v_w=115$ km s$^{-1}$). This tension may be alleviated by a non-homogeneous and confined CSM, consistent with results from high-resolution optical spectroscopy.
Submitted 15 June, 2023;
originally announced June 2023.
-
Probing pre-supernova mass loss in double-peaked Type Ibc supernovae from the Zwicky Transient Facility
Authors:
Kaustav K. Das,
Mansi M. Kasliwal,
Jesper Sollerman,
Christoffer Fremling,
I. Irani,
Shing-Chi Leung,
Sheng Yang,
Samantha Wu,
Jim Fuller,
Shreya Anand,
Igor Andreoni,
C. Barbarino,
Thomas G. Brink,
Kishalay De,
Alison Dugas,
Steven L. Groom,
George Helou,
K-Ryan Hinds,
Anna Y. Q. Ho,
Viraj Karambelkar,
S. R. Kulkarni,
Daniel A. Perley,
Josiah Purdum,
Nicolas Regnault,
Steve Schulze
, et al. (11 additional authors not shown)
Abstract:
Eruptive mass loss of massive stars prior to supernova (SN) explosion is key to understanding their evolution and end fate. An observational signature of pre-SN mass loss is the detection of an early, short-lived peak prior to the radioactive-powered peak in the lightcurve of the SN. This is usually attributed to the SN shock passing through an extended envelope or circumstellar medium (CSM). Such an early peak is common for double-peaked Type IIb SNe with an extended Hydrogen envelope but is uncommon for normal Type Ibc SNe with very compact progenitors. In this paper, we systematically study a sample of 14 double-peaked Type Ibc SNe out of 475 Type Ibc SNe detected by the Zwicky Transient Facility. The rate of these events is ~ 3-9 % of Type Ibc SNe. A strong correlation is seen between the peak brightness of the first and the second peak. We perform a holistic analysis of this sample's photometric and spectroscopic properties. We find that six SNe have ejecta mass less than 1.5 Msun. Based on the nebular spectra and lightcurve properties, we estimate that the progenitor masses for these are less than ~ 12 Msun. The rest have an ejecta mass > 2.4 Msun and a higher progenitor mass. This sample suggests that the SNe with low progenitor masses undergo late-time binary mass transfer. Meanwhile, the SNe with higher progenitor masses are consistent with wave-driven mass loss or pulsation-pair instability-driven mass loss simulations.
Submitted 19 June, 2023; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Leveraging Auxiliary Domain Parallel Data in Intermediate Task Fine-tuning for Low-resource Translation
Authors:
Shravan Nayak,
Surangika Ranathunga,
Sarubi Thillainathan,
Rikki Hung,
Anthony Rinaldi,
Yining Wang,
Jonah Mackey,
Andrew Ho,
En-Shiun Annie Lee
Abstract:
NMT systems trained on Pre-trained Multilingual Sequence-Sequence (PMSS) models flounder when sufficient amounts of parallel data are not available for fine-tuning. This specifically holds for languages missing/under-represented in these models. The problem gets aggravated when the data comes from different domains. In this paper, we show that intermediate-task fine-tuning (ITFT) of PMSS models is extremely beneficial for domain-specific NMT, especially when target domain data is limited/unavailable and the considered languages are missing or under-represented in the PMSS model. We quantify the domain-specific result variations using a domain-divergence test, and show that ITFT can mitigate the impact of domain divergence to some extent.
Submitted 23 September, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
Long-rising Type II Supernovae in the Zwicky Transient Facility Census of the Local Universe
Authors:
Tawny Sit,
Mansi M. Kasliwal,
Anastasios Tzanidakis,
Kishalay De,
Christoffer Fremling,
Jesper Sollerman,
Avishay Gal-Yam,
Adam A. Miller,
Scott Adams,
Robert Aloisi,
Igor Andreoni,
Matthew Chu,
David Cook,
Kaustav Kashyap Das,
Alison Dugas,
Steven L. Groom,
Anna Y. Q. Ho,
Viraj Karambelkar,
James D. Neill,
Frank J. Masci,
Michael S. Medford,
Josiah Purdum,
Yashvi Sharma,
Roger Smith,
Robert Stein
, et al. (3 additional authors not shown)
Abstract:
SN 1987A was an unusual hydrogen-rich core-collapse supernova originating from a blue supergiant star. Similar blue supergiant explosions remain a small family of events, and are broadly characterized by their long rises to peak. The Zwicky Transient Facility (ZTF) Census of the Local Universe (CLU) experiment aims to construct a spectroscopically complete sample of transients occurring in galaxies from the CLU galaxy catalog. We identify 13 long-rising (>40 days) Type II supernovae from the volume-limited CLU experiment during a 3.5 year period from June 2018 to December 2021, approximately doubling the previously known number of these events. We present photometric and spectroscopic data of these 13 events, finding peak r-band absolute magnitudes ranging from -15.6 to -17.5 mag and the tentative detection of Ba II lines in 9 events. Using our CLU sample of events, we derive a long-rising Type II supernova rate of $1.37^{+0.26}_{-0.30}\times10^{-6}$ Mpc$^{-3}$ yr$^{-1}$, $\approx$1.4% of the total core-collapse supernova rate. This is the first volumetric rate of these events estimated from a large, systematic, volume-limited experiment.
Submitted 12 March, 2024; v1 submitted 1 June, 2023;
originally announced June 2023.
-
1100 days in the life of the supernova 2018ibb -- The best pair-instability supernova candidate, to date
Authors:
Steve Schulze,
Claes Fransson,
Alexandra Kozyreva,
Ting-Wan Chen,
Ofer Yaron,
Anders Jerkstrand,
Avishay Gal-Yam,
Jesper Sollerman,
Lin Yan,
Tuomas Kangas,
Giorgos Leloudas,
Conor M. B. Omand,
Stephen J. Smartt,
Yi Yang,
Matt Nicholl,
Nikhil Sarin,
Yuhan Yao,
Thomas G. Brink,
Amir Sharon,
Andrea Rossi,
** Chen,
Zhihao Chen,
Aleksandar Cikota,
Kishalay De,
Andrew J. Drake
, et al. (41 additional authors not shown)
Abstract:
Abridged - Stars with ZAMS masses between 140 and $260 M_\odot$ are thought to explode as pair-instability supernovae (PISNe). During their thermonuclear runaway, PISNe can produce up to several tens of solar masses of radioactive nickel, resulting in luminous transients similar to some superluminous supernovae (SLSNe). Yet, no unambiguous PISN has been discovered so far. SN2018ibb is a H-poor SLSN at $z=0.166$ that evolves extremely slowly compared to the hundreds of known SLSNe. Between mid 2018 and early 2022, we monitored its photometric and spectroscopic evolution from the UV to the NIR with 2-10m class telescopes. SN2018ibb radiated $>3\times10^{51} \rm erg$ during its evolution, and its bolometric light curve reached $>2\times10^{44} \rm erg\,s^{-1}$ at peak. The long-lasting rise of $>93$ rest-frame days implies a long diffusion time, which requires a very high total ejected mass. The PISN mechanism naturally provides both the energy source ($^{56}$Ni) and the long diffusion time. Theoretical models of PISNe make clear predictions for their photometric and spectroscopic properties. SN2018ibb complies with most tests on the light curves, nebular spectra and host galaxy, potentially all tests with the interpretation we propose. Both the light curve and the spectra require 25-44 $M_\odot$ of freshly nucleosynthesised $^{56}$Ni, pointing to the explosion of a metal-poor star with a He-core mass of 120-130 $M_\odot$ at the time of death. This interpretation is also supported by the tentative detection of [Co II]$λ$1.025$μ$m, which has never been observed in any other PISN candidate or SLSN before. Powering by a central engine, such as a magnetar or a black hole, can be excluded with high confidence. This makes SN2018ibb by far the best candidate for being a PISN, to date.
Submitted 24 November, 2023; v1 submitted 9 May, 2023;
originally announced May 2023.
-
Collapsars as Sites of r-process Nucleosynthesis: Systematic Near-Infrared Follow-up of Type Ic-BL Supernovae
Authors:
Shreya Anand,
Jennifer Barnes,
Sheng Yang,
Mansi M. Kasliwal,
Michael W. Coughlin,
Jesper Sollerman,
Kishalay De,
Christoffer Fremling,
Alessandra Corsi,
Anna Y. Q. Ho,
Arvind Balasubramanian,
Conor Omand,
Gokul P. Srinivasaragavan,
S. Bradley Cenko,
Tomas Ahumada,
Igor Andreoni,
Aishwarya Dahiwale,
Kaustav Kashyap Das,
Jacob Jencson,
Viraj Karambelkar,
Harsh Kumar,
Brian D. Metzger,
Daniel Perley,
Nikhil Sarin,
Tassilo Schweyer
, et al. (19 additional authors not shown)
Abstract:
One of the open questions following the discovery of GW170817 is whether neutron star mergers are the only astrophysical sites capable of producing $r$-process elements. Simulations have shown that 0.01-0.1M$_\odot$ of $r$-process material could be generated in the outflows originating from the accretion disk surrounding the rapidly rotating black hole that forms as a remnant to both neutron star mergers and collapsing massive stars associated with long-duration gamma-ray bursts (collapsars). The hallmark signature of $r$-process nucleosynthesis in the binary neutron star merger GW170817 was its long-lasting near-infrared emission, thus motivating a systematic photometric study of the light curves of broadlined stripped-envelope (Ic-BL) supernovae (SNe) associated with collapsars. We present the first systematic study of 25 SNe Ic-BL -- including 18 observed with the Zwicky Transient Facility and 7 from the literature -- in the optical/near-infrared bands to determine what quantity of $r$-process material, if any, is synthesized in these explosions. Using semi-analytic models designed to account for $r$-process production in SNe Ic-BL, we perform light curve fitting to derive constraints on the $r$-process mass for these SNe. We also perform independent light curve fits to models without $r$-process. We find that the $r$-process-free models are a better fit to the light curves of the objects in our sample. Thus we find no compelling evidence of $r$-process enrichment in any of our objects. Further high-cadence infrared photometric studies and nebular spectroscopic analysis would be sensitive to smaller quantities of $r$-process ejecta mass or indicate whether all collapsars are completely devoid of $r$-process nucleosynthesis.
Submitted 12 February, 2024; v1 submitted 17 February, 2023;
originally announced February 2023.
-
The accuracy of mutual potential approximations in simulations of binary asteroids
Authors:
Alex Ho,
Margrethe Wold,
Mohammad Poursina,
John T. Conway
Abstract:
Simulations of asteroid binaries commonly use mutual gravitational potentials approximated by series expansions, leading to truncation errors, and also preventing correct computations of the forces and torques when the bodies are close. We make use of a recently developed method where the mutual potential is calculated with the use of surface integrals and is exact for bodies of ellipsoidal shapes. The solutions produced by the surface integration method are compared with an approach that expands the mutual potential, truncated at second and fourth order. The approximate solutions are generated with the ``General Use Binary Asteroid Simulator'' (gubas). We find that the differences in the forces and torques are the largest when the bodies are nearly touching. These differences can exceed 1000% if the shape of the primary is highly elongated. Long term simulations show more than 100% difference in the dynamics if the bodies are initially close, while the differences are negligible if the bodies are initially far apart. For simulations with two triaxial ellipsoids, the computational efficiency of the surface integral method is comparable to fourth order approximations with gubas, and superior to potentials truncated to order eight or higher.
Submitted 24 January, 2023;
originally announced January 2023.
-
Unique determination of localized basis in molecular spin
Authors:
Le Tuan Anh Ho,
Liviu Ungur
Abstract:
Localized basis plays an important role in comprehending the magnetic dynamics in molecular spins from a physics perspective. Nonetheless, the uniqueness and rigor of its determination have received limited attention. In this study, we propose a new determination of the localized basis applicable to both non-Kramers and Kramers molecular spin systems, leveraging the time-reversal symmetry of the spin Hamiltonian and the molecular spin's main magnetic axis. By introducing this, we establish a distinct and practical means of determining the localized basis, enabling the association of a molecular spin wave function with either an "up" or "down" magnetic moment orientation in molecular spins. This finding facilitates a comprehensive interpretation of magnetic dynamics and simplifies the construction of theoretical models for materials analysis.
Submitted 7 September, 2023; v1 submitted 30 December, 2022;
originally announced December 2022.
-
Dissipative Landau-Zener transition with decoherence rate
Authors:
Le Tuan Anh Ho,
Liviu Ungur,
Liviu F. Chibotaru
Abstract:
An innovative microscopic model with a minimal number of parameters (tunneling splitting gap, external field sweeping velocity, and decoherence rate) is used to describe the dynamics of the dissipative Landau-Zener transition in the presence of decoherence. In limiting cases, the derived equation of motion gives rise to the well-known Landau-Zener and Kayanuma formulas. In the general case, the description demonstrates a non-monotonic flipping probability with respect to the sweeping velocity, which is also found in some other models. This non-monotony can be explained by considering the competition and timescale of the quantum tunneling, crossing period, and decoherence process. The simplicity and robustness of the theory offer a practical and novel description of the Landau-Zener transition. In addition, it promises an alternative method to the electron paramagnetic resonance in measuring the effective decoherence rate of relevant quantum systems.
Submitted 26 December, 2022;
originally announced December 2022.
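For orientation, the coherent (decoherence-free) limit named in the abstract above is the textbook Landau-Zener formula, which can be evaluated in a few lines. The sketch below is not the paper's dissipative model. It assumes the conventional two-level setup in which `gap` is the full tunneling splitting at the avoided crossing and `sweep_rate` is the rate of change of the diabatic energy difference; both names, and the helper functions themselves, are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s


def lz_diabatic_probability(gap, sweep_rate, hbar=HBAR):
    """Probability of staying on the diabatic branch (no flip).

    Textbook Landau-Zener result: P = exp(-pi * gap**2 / (2 * hbar * sweep_rate)),
    with `gap` in energy units and `sweep_rate` in energy per unit time.
    """
    if sweep_rate <= 0:
        raise ValueError("sweep_rate must be positive")
    return math.exp(-math.pi * gap ** 2 / (2.0 * hbar * sweep_rate))


def lz_flip_probability(gap, sweep_rate, hbar=HBAR):
    """Probability of following the adiabatic branch (a flip)."""
    return 1.0 - lz_diabatic_probability(gap, sweep_rate, hbar)


if __name__ == "__main__":
    # Dimensionless units (hbar = 1) keep the numbers readable.
    for v in (0.1, 1.0, 10.0):
        p = lz_flip_probability(gap=1.0, sweep_rate=v, hbar=1.0)
        print(f"sweep_rate = {v:5.1f} -> flip probability = {p:.4f}")
```

In this coherent limit the flip probability falls monotonically as the sweep rate grows; the non-monotonic behaviour described in the abstract arises only once decoherence is included, which this sketch does not model.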
-
Coherence/incoherence transition temperature in molecular spin
Authors:
Le Tuan Anh Ho,
Liviu Ungur,
Liviu F. Chibotaru
Abstract:
We examine the coherence/incoherence transition temperature of a generic molecular spin. Our results demonstrate that a molecular spin with a high coherence/incoherence transition temperature should possess a low spin number and low axiality, or high spin number and high axiality. Interestingly, the latter is better protected from magnetic noise than the former and is thus the best candidate for a robust electron-based molecular spin qubit/qudit. The transition temperature can be further optimized if a large non-axial component of the spin Hamiltonian exists.
Submitted 26 December, 2022;
originally announced December 2022.
-
Quantum tunneling of magnetization in molecular spin
Authors:
Le Tuan Anh Ho,
Liviu Ungur,
Liviu F. Chibotaru
Abstract:
We examine the quantum tunneling of magnetization in molecular spin in weak interaction with a bath subject to the Redfield master equation. By designing a microscopic model for a multilevel spin system using only a generic Hamiltonian and applying the stationary approximation for excited doublets/singlets, we derive a key equation of motion for the quantum tunneling of magnetization process which is applicable in the whole temperature domain. From this equation, we find that in general three tunneling rates are needed to accurately describe the quantum tunneling process. More importantly, the behavior of the quantum tunneling in the intermediate temperature domain, where there exists a transition between incoherent and coherent quantum tunneling, is also unraveled for the first time. Limiting cases at low and high temperature and/or low magnetic field are also worked out where some popular well-known results are reproduced. Last but not least, a new interpretation of the quantum tunneling of magnetization is proposed where we reveal the similarity between this relaxation process and a driven damped harmonic oscillator.
Submitted 26 December, 2022;
originally announced December 2022.
-
Non-monotonic temperature dependence and first-order phase transition of relaxation times in molecular spin
Authors:
Le Tuan Anh Ho,
Liviu Ungur,
Liviu F. Chibotaru
Abstract:
We derive a simple system of equations to describe the magnetization relaxation of a molecular spin in weak interaction with a thermal bath for the whole temperature domain. Using this for the intermediate temperature domain where the transition from coherent to incoherent relaxation occurs, we find that the slowest relaxation mode shows a first-order phase transition. Associated with this transition, an unusual non-monotonic temperature-dependence of the relaxation rate of this mode is also demonstrated. Contrary to the popular belief, this non-monotony gives rise to a peculiar but observable behavior where increasing temperature will not only result in a smaller rate of the slowest relaxation mode but also may lead to a slower decaying of the magnetization after some relaxing time. Additionally, it is also shown that magnetization relaxation in this intermediate temperature domain can only be accurately described by a bi- or tri-exponential form. The physical reason underlying these features can be attributed to the role of the quantum tunneling effect and different but comparative relaxation modes. A simple experiment to confirm our findings on the first-order phase transition and the non-monotony of the relaxation rate is accordingly proposed.
Submitted 26 December, 2022;
originally announced December 2022.
-
Searching for Supernovae in HETDEX Data Release 3
Authors:
J. Vinko,
B. P. Thomas,
J. C. Wheeler,
A. Y. Q. Ho,
E. Mentuch Cooper,
K. Gebhardt,
R. Ciardullo,
D. J. Farrow,
G. J. Hill,
Z. Jager,
W. Kollatschny,
C. Liu,
E. Regos,
K. Sarneczky
Abstract:
We have extracted 636 spectra taken at the positions of 583 transient sources from the third Data Release of the Hobby-Eberly Telescope Dark Energy eXperiment (HETDEX). The transients were discovered by the Zwicky Transient Facility (ZTF) during 2018 - 2022. The HETDEX spectra are useful to classify a large number of objects found by photometric surveys for free. We attempt to explore and classify the spectra by utilizing machine learning (ML) and template matching techniques. We have identified two transient sources, ZTF20aatpoos = AT2020fiz and ZTF19abdkelq as supernova candidates. We classify AT2020fiz as a Type IIP supernova observed ~10 days after explosion, and we propose ZTF19abdkelq as a likely Type Ia SN caught ~40 days after maximum light. ZTF photometry of these two sources is consistent with their classification as supernovae. Besides these two objects, we have confirmed several ZTF transients as variable AGNs based on their spectral appearance, and also determined the host galaxy types for several other ZTF transients.
Submitted 16 December, 2022;
originally announced December 2022.
-
The prevalence and influence of circumstellar material around hydrogen-rich supernova progenitors
Authors:
Rachel J. Bruch,
Avishay Gal-Yam,
Ofer Yaron,
** Chen,
Nora L. Strotjohann,
Ido Irani,
Erez Zimmerman,
Steve Schulze,
Yi Yang,
Young-Lo Kim,
Mattia Bulla,
Jesper Sollerman,
Mickael Rigault,
Eran Ofek,
Maayane Soumagnac,
Frank J. Masci,
Christoffer Fremling,
Daniel Perley,
Jakob Nordin,
S. Bradley Cenko,
Anna Y. Q. Ho,
S. Adams,
Igor Andreoni,
Eric C. Bellm,
Nadia Blagorodnova
, et al. (22 additional authors not shown)
Abstract:
Narrow transient emission lines (flash-ionization features) in early supernova (SN) spectra trace the presence of circumstellar material (CSM) around the massive progenitor stars of core-collapse SNe. The lines disappear within days after the SN explosion, suggesting that this material is spatially confined, and originates from enhanced mass loss shortly (months to a few years) prior to explosion. We performed a systematic survey of H-rich (Type II) SNe discovered within less than two days from explosion during the first phase of the Zwicky Transient Facility (ZTF) survey (2018-2020), finding thirty events for which a first spectrum was obtained within $< 2$ days from explosion. The measured fraction of events showing flash ionisation features ($>36\%$ at $95\%$ confidence level) confirms that elevated mass loss in massive stars prior to SN explosion is common. We find that SNe II showing flash ionisation features are not significantly brighter, nor bluer, nor more slowly rising than those without. This implies that CSM interaction does not contribute significantly to their early continuum emission, and that the CSM is likely optically thin. We measured the persistence duration of flash ionisation emission and find that most SNe show flash features for $\approx 5 $ days. Rarer events, with persistence timescales $>10$ days, are brighter and rise longer, suggesting these may be intermediate between regular SNe II and strongly-interacting SNe IIn.
Submitted 13 December, 2022; v1 submitted 6 December, 2022;
originally announced December 2022.
-
A very luminous jet from the disruption of a star by a massive black hole
Authors:
Igor Andreoni,
Michael W. Coughlin,
Daniel A. Perley,
Yuhan Yao,
Wenbin Lu,
S. Bradley Cenko,
Harsh Kumar,
Shreya Anand,
Anna Y. Q. Ho,
Mansi M. Kasliwal,
Antonio de Ugarte Postigo,
Ana Sagues-Carracedo,
Steve Schulze,
D. Alexander Kann,
S. R. Kulkarni,
Jesper Sollerman,
Nial Tanvir,
Armin Rest,
Luca Izzo,
Jean J. Somalwar,
David L. Kaplan,
Tomas Ahumada,
G. C. Anupama,
Katie Auchettl,
Sudhanshu Barway
, et al. (56 additional authors not shown)
Abstract:
Tidal disruption events (TDEs) are bursts of electromagnetic energy released when supermassive black holes (SMBHs) at the centers of galaxies violently disrupt a star that passes too close. TDEs provide a new window to study accretion onto SMBHs; in some rare cases, this accretion leads to launching of a relativistic jet, but the necessary conditions are not fully understood. The best studied jetted TDE to date is Swift J1644+57, which was discovered in gamma-rays, but was too obscured by dust to be seen at optical wavelengths. Here we report the optical discovery of AT2022cmc, a rapidly fading source at cosmological distance (redshift z=1.19325) whose unique lightcurve transitioned into a luminous plateau within days. Observations of a bright counterpart at other wavelengths, including X-rays, sub-millimeter, and radio, support the interpretation of AT2022cmc as a jetted TDE containing a synchrotron "afterglow", likely launched by a SMBH with spin $a \gtrsim 0.3$. Using 4 years of Zwicky Transient Facility (ZTF) survey data, we calculate a rate of $0.02 ^{+ 0.04 }_{- 0.01 }$ Gpc$^{-3}$ yr$^{-1}$ for on-axis jetted TDEs based on the luminous, fast-fading red component, thus providing a measurement complementary to the rates derived from X-ray and radio observations. Correcting for the beaming angle effects, this rate confirms that about 1% of TDEs have relativistic jets. Optical surveys can use AT2022cmc as a prototype to unveil a population of jetted TDEs.
Submitted 29 November, 2022;
originally announced November 2022.
-
Volumetric rates of Luminous Red Novae and Intermediate Luminosity Red Transients with the Zwicky Transient Facility
Authors:
Viraj R. Karambelkar,
Mansi M. Kasliwal,
Nadejda Blagorodnova,
Jesper Sollerman,
Robert Aloisi,
Shreya G. Anand,
Igor Andreoni,
Thomas G. Brink,
Rachel Bruch,
David Cook,
Kaustav Kashyap Das,
Kishalay De,
Andrew Drake,
Alexei V. Filippenko,
Christoffer Fremling,
George Helou,
Anna Ho,
Jacob Jencson,
David Jones,
Russ R. Laher,
Frank J. Masci,
Kishore C. Patra,
Josiah Purdum,
Alexander Reedy,
Tawny Sit
, et al. (5 additional authors not shown)
Abstract:
Luminous red novae (LRNe) are transients characterized by low luminosities and expansion velocities, and are associated with mergers or common envelope ejections in stellar binaries. Intermediate-luminosity red transients (ILRTs) are an observationally similar class with unknown origins, but generally believed to either be electron capture supernovae (ECSN) in super-AGB stars, or outbursts in dusty luminous blue variables (LBVs). In this paper, we present a systematic sample of 8 LRNe and 8 ILRTs detected as part of the Census of the Local Universe (CLU) experiment on the Zwicky Transient Facility (ZTF). The CLU experiment spectroscopically classifies ZTF transients associated with nearby ($<150$ Mpc) galaxies, achieving 80% completeness for m$_{r}<20$\,mag. Using the ZTF-CLU sample, we derive the first systematic LRNe volumetric-rate of 7.8$^{+6.5}_{-3.7}\times10^{-5}$ Mpc$^{-3}$ yr$^{-1}$ in the luminosity range $-16\leq$M$_{\rm{r}}$$\leq -11$ mag. We find that in this luminosity range, the LRN rate scales as dN/dL $\propto L^{-2.5\pm0.3}$ - significantly steeper than the previously derived scaling of $L^{-1.4\pm0.3}$ for lower luminosity LRNe (M$_{V}\geq-10$). The steeper power law for LRNe at high luminosities is consistent with the massive merger rates predicted by binary population synthesis models. We find that the rates of the brightest LRNe (M$_{r}\leq-13$ mag) are consistent with a significant fraction of them being progenitors of double compact objects (DCOs) that merge within a Hubble time. For ILRTs, we derive a volumetric rate of $2.6^{+1.8}_{-1.4}\times10^{-6}$ Mpc$^{-3}$yr$^{-1}$ for M$_{\rm{r}}\leq-13.5$, that scales as dN/dL $\propto L^{-2.5\pm0.5}$. This rate is $\approx1-5\%$ of the local core-collapse supernova rate, and is consistent with theoretical ECSN rate estimates.
Submitted 9 November, 2022;
originally announced November 2022.
-
Will we run out of data? Limits of LLM scaling based on human-generated data
Authors:
Pablo Villalobos,
Anson Ho,
Jaime Sevilla,
Tamay Besiroglu,
Lennart Heim,
Marius Hobbhahn
Abstract:
We investigate the potential constraints on LLM scaling posed by the availability of public human-generated text data. We forecast the growing demand for training data based on current trends and estimate the total stock of public human text data. Our findings indicate that if current LLM development trends continue, models will be trained on datasets roughly equal in size to the available stock of public human text data between 2026 and 2032, or slightly earlier if models are overtrained. We explore how progress in language modeling can continue when human-generated text datasets cannot be scaled any further. We argue that synthetic data generation, transfer learning from data-rich domains, and data efficiency improvements might support further progress.
Submitted 4 June, 2024; v1 submitted 25 October, 2022;
originally announced November 2022.
-
Soft Robotic Link with Controllable Transparency for Vision-based Tactile and Proximity Sensing
Authors:
Quan Khanh Luu,
Dinh Quang Nguyen,
Nhan Huu Nguyen,
Van Anh Ho
Abstract:
Robots have been brought to work close to humans in many scenarios. For coexistence and collaboration, robots should be safe and pleasant for humans to interact with. To this end, the robots could be both physically soft with multimodal sensing/perception, so that the robots could have better awareness of the surrounding environment, as well as to respond properly to humans' action/intention. This paper introduces a novel soft robotic link, named ProTac, that possesses multiple sensing modes: tactile and proximity sensing, based on computer vision and a functional material. These modalities come from a layered structure of a soft transparent silicon skin, a polymer dispersed liquid crystal (PDLC) film, and reflective markers. Here, the PDLC film can switch actively between the opaque and the transparent state, from which the tactile sensing and proximity sensing can be obtained by using cameras solely built inside the ProTac link. In this paper, inference algorithms for tactile proximity perception are introduced. Evaluation results of two sensing modalities demonstrated that, with a simple activation strategy, ProTac link could effectively perceive useful information from both approaching and in-contact obstacles. The proposed sensing device is expected to bring in ultimate solutions for design of robots with softness, whole-body and multimodal sensing, and safety control strategies.
Submitted 6 November, 2022;
originally announced November 2022.
-
A search for relativistic ejecta in a sample of ZTF broad-lined Type Ic supernovae
Authors:
Alessandra Corsi,
Anna Y. Q. Ho,
S. Bradley Cenko,
Shrinivas R. Kulkarni,
Shreya Anand,
Sheng Yang,
Jesper Sollerman,
Gokul P. Srinivasaragavan,
Conor M. B. Omand,
Arvind Balasubramanian,
Dale A. Frail,
Christoffer Fremling,
Daniel A. Perley,
Yuhan Yao,
Aishwarya S. Dahiwale,
Kishalay De,
Alison Dugas,
Matthew Hankins,
Jacob Jencson,
Mansi M. Kasliwal,
Anastasios Tzanidakis,
Eric C. Bellm,
Russ R. Laher,
Frank J. Masci,
Josiah N. Purdum
, et al. (1 additional author not shown)
Abstract:
The dividing line between gamma-ray bursts (GRBs) and ordinary stripped-envelope core-collapse supernovae (SNe) is yet to be fully understood. Observationally mapping the variety of ejecta outcomes (ultra-relativistic, mildly-relativistic or non-relativistic) in SNe of Type Ic with broad lines (Ic-BL) can provide a key test to stellar explosion models. However, this requires large samples of the rare Ic-BL events with follow-up observations in the radio, where fast ejecta can be probed largely free of geometry and viewing angle effects. Here, we present the results of a radio (and X-ray) follow-up campaign of 16 SNe Ic-BL detected by the Zwicky Transient Facility (ZTF). Our radio campaign resulted in 4 counterpart detections and 12 deep upper limits. None of the events in our sample is as relativistic as SN 1998bw and we constrain the fraction of SN 1998bw-like explosions to $< 19\%$ (3$σ$ Gaussian equivalent), a factor of $\approx 2$ smaller than previously established. We exclude relativistic ejecta with radio luminosity densities in between $\approx 5\times10^{27}$ erg s$^{-1}$ Hz$^{-1}$ and $\approx 10^{29}$ erg s$^{-1}$ Hz$^{-1}$ at $t\gtrsim 20$ d since explosion for $\approx 60\%$ of the events in our sample. This shows that SNe Ic-BL similar to the GRB-associated SN 1998bw, SN 2003lw, SN 2010dh, or to the relativistic SN 2009bb and iPTF17cw, are rare. Our results also exclude an association of the SNe Ic-BL in our sample with largely off-axis GRBs with energies $E\gtrsim 10^{50}$ erg. The parameter space of SN2006aj-like events (faint and fast-peaking radio emission) is, on the other hand, left largely unconstrained and systematically exploring it represents a promising line of future research.
Submitted 17 October, 2022;
originally announced October 2022.
-
Electro-nuclear transition into a spatially modulated magnetic state in YbRh$_2$Si$_2$
Authors:
J. Knapp,
L. V. Levitin,
J. Nyéki,
A. F. Ho,
B. Cowan,
J. Saunders,
M. Brando,
C. Geibel,
K. Kliemt,
C. Krellner
Abstract:
The nature of the antiferromagnetic order in the heavy fermion metal YbRh$_2$Si$_2$, its quantum criticality, and superconductivity, which appears at low mK temperatures, remain open questions. We report measurements of the heat capacity over the wide temperature range 180 $μ$K - 80 mK, using current sensing noise thermometry. In zero magnetic field we observe a remarkably sharp heat capacity anomaly at 1.5 mK, which we identify as an electro-nuclear transition into a state with spatially modulated electronic magnetic order of maximum amplitude 0.1$μ_B$. We also report results of measurements in magnetic fields in the range 0 to 70 mT, applied perpendicular to the c-axis, which show eventual suppression of this order. These results demonstrate a coexistence of a large moment antiferromagnet with putative superconductivity.
Submitted 19 February, 2023; v1 submitted 7 October, 2022;
originally announced October 2022.
-
Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks
Authors:
Tilman Räuker,
Anson Ho,
Stephen Casper,
Dylan Hadfield-Menell
Abstract:
The last decade of machine learning has seen drastic increases in scale and capabilities. Deep neural networks (DNNs) are increasingly being deployed in the real world. However, they are difficult to analyze, raising concerns about using them without a rigorous understanding of how they function. Effective tools for interpreting them will be important for building more trustworthy AI by helping to identify problems, fix bugs, and improve basic understanding. In particular, "inner" interpretability techniques, which focus on explaining the internal components of DNNs, are well-suited for developing a mechanistic understanding, guiding manual modifications, and reverse engineering solutions.
Much recent work has focused on DNN interpretability, and rapid progress has thus far made a thorough systematization of methods difficult. In this survey, we review over 300 works with a focus on inner interpretability tools. We introduce a taxonomy that classifies methods by what part of the network they help to explain (weights, neurons, subnetworks, or latent representations) and whether they are implemented during (intrinsic) or after (post hoc) training. To our knowledge, we are also the first to survey a number of connections between interpretability research and work in adversarial robustness, continual learning, modularity, network compression, and studying the human visual system. We discuss key challenges and argue that the status quo in interpretability research is largely unproductive. Finally, we highlight the importance of future work that emphasizes diagnostics, debugging, adversaries, and benchmarking in order to make interpretability tools more useful to engineers in practical applications.
Submitted 18 August, 2023; v1 submitted 26 July, 2022;
originally announced July 2022.
-
Human Mobility Networks Manifest Dissimilar Resilience Characteristics at Macroscopic, Substructure, and Microscopic Scales
Authors:
Chia-Wei Hsu,
Matthew Alexander Ho,
Ali Mostafavi
Abstract:
Human mobility networks can reveal insights into resilience phenomena, such as population response to, impacts on, and recovery from crises. The majority of human mobility network resilience characterizations, however, focus mainly on macroscopic network properties; little is known about variation in measured resilience characteristics (i.e., the extent of impact and recovery duration) across macroscopic, substructure (motif), and microscopic mobility scales. To address this gap, in this study, we examine the human mobility network in eight parishes in Louisiana (USA) impacted by the 2021 Hurricane Ida. We constructed human mobility networks using location-based data and examined three sets of measures: (1) macroscopic measures, such as network density, giant component size, and modularity; (2) substructure measures, such as motif distribution; and (3) microscopic mobility measures, such as the radius of gyration and average travel distance. To determine the extent of impact and duration of recovery, for each measure, we established the baseline values and examined the fluctuation of measures during the perturbation caused by Hurricane Ida. The results reveal the variation of impact extent and recovery duration obtained from different sets of measures at different scales. Macroscopic measures, such as giant components, tend to recover more quickly than substructure and microscopic measures. In fact, microscopic measures tend to recover more slowly than measures in other scales. These findings suggest that resilience characteristics in human mobility networks are scale-variant, and thus, a single measure at a particular scale may not be representative of the perturbation impacts and recovery duration in the network as a whole. These results spotlight the need to use measures at different scales to properly characterize resilience in human mobility networks.
Submitted 21 July, 2022;
originally announced July 2022.
-
Machine Learning Model Sizes and the Parameter Gap
Authors:
Pablo Villalobos,
Jaime Sevilla,
Tamay Besiroglu,
Lennart Heim,
Anson Ho,
Marius Hobbhahn
Abstract:
We study trends in model size of notable machine learning systems over time using a curated dataset. From 1950 to 2018, model size in language models increased steadily by seven orders of magnitude. The trend then accelerated, with model size increasing by another five orders of magnitude in just 4 years from 2018 to 2022. Vision models grew at a more constant pace, totaling 7 orders of magnitude of growth between 1950 and 2022.
We also identify that, since 2020, there have been many language models below 20B parameters, many models above 70B parameters, but a scarcity of models in the 20-70B parameter range. We refer to that scarcity as the parameter gap.
We provide some stylized facts about the parameter gap and propose a few hypotheses to explain it. The explanations we favor are: (a) increasing model size beyond 20B parameters requires adopting different parallelism techniques, which makes mid-sized models less cost-effective, (b) GPT-3 was one order of magnitude larger than previous language models, and researchers afterwards primarily experimented with bigger models to outperform it. While these dynamics likely exist, and we believe they play some role in generating the gap, we don't have high confidence that there are no other, more important dynamics at play.
Submitted 5 July, 2022;
originally announced July 2022.
-
Dynamics of asteroid systems post rotational fission
Authors:
Alex Ho,
Margrethe Wold,
Mohammad Poursina,
John T. Conway
Abstract:
Asteroid binaries found amongst the Near-Earth objects are believed to have formed from rotational fission. In this paper, we aim to study the dynamical evolution of asteroid systems the moment after fission. The initial condition is modelled as a contact binary, similar to that of Boldrin et al. (2016). Both bodies are modelled as ellipsoids, and the secondary is given an initial rotation angle about its body-fixed $y$-axis. Moreover, we consider six different cases, three where the density of the secondary varies, and three where we vary its shape. The simulations consider 45 different initial tilt angles of the secondary, each with 37 different mass ratios. We start the dynamical simulations at the moment the contact binary reaches a spin fission limit, and our model ensures that the closest distance between the surfaces of the two bodies is always kept at 1 cm. The forces, torques and gravitational potential between the two bodies are modelled using a newly developed surface integration scheme, giving exact results for two ellipsoids. We find that more than 80% of the simulations end with the two bodies impacting, and collisions between the bodies are more common when the density of the secondary is lower, or when it becomes more elongated. When comparing with data on asteroid pairs from Pravec et al. (2019) we find that variations in density and shape of the secondary can account for some of the spread seen in the rotation period for observed pairs. Furthermore, the secondary may also reach a spin limit for surface disruption, creating a ternary/multiple system. We find that secondary fission typically occurs within the first five hours after the contact binary separates, and is more common when the secondary is less dense or more elongated.
Submitted 23 June, 2022;
originally announced June 2022.
-
SN 2019zrk, a bright SN 2009ip analog with a precursor
Authors:
Claes Fransson,
Jesper Sollerman,
Nora L. Strotjohann,
Sheng Yang,
Steve Schulze,
Cristina Barbarino,
Erik C. Kool,
Eran O. Ofek,
Arien Crellin-Quick,
Kishalay De,
Andrew J. Drake,
Christoffer Fremling,
Avishay Gal-Yam,
Anna Y. Q. Ho,
Mansi M. Kasliwal
Abstract:
We present photometric and spectroscopic observations of the Type IIn supernova SN 2019zrk (also known as ZTF20aacbyec). The SN shows a $\gtrsim$ 100 day precursor, with a slow rise, followed by a rapid rise to M $\sim -19.2$ in the $r$ and $g$ bands. The post-peak light-curve decline is well fit with an exponential decay with a timescale of $\sim 39$ days, but it shows prominent undulations, with an amplitude of $\sim 1$ mag. Both the light curve and spectra are dominated by an interaction with a dense circumstellar medium (CSM), probably from previous mass ejections. The spectra evolve from a scattering-dominated Type IIn spectrum to a spectrum with strong P-Cygni absorptions. The expansion velocity is high, $\sim 16,000$ km s$^{-1}$, even in the last spectra. The last spectrum $\sim 110$ days after the main eruption reveals no evidence for advanced nucleosynthesis. From analysis of the spectra and light curves, we estimate the mass-loss rate to be $\sim 4 \times 10^{-2}$ M$_\odot$ yr$^{-1}$ for a CSM velocity of 100 km s$^{-1}$, and a CSM mass of $\gtrsim 1$ M$_\odot$. We find strong similarities for both the precursor, general light curve, and spectral evolution with SN 2009ip and similar SNe, although SN 2019zrk displays a brighter peak magnitude. Different scenarios for the nature of the 09ip-class of SNe, based on pulsational pair instability eruptions, wave heating, and mergers, are discussed.
Submitted 13 June, 2022;
originally announced June 2022.
-
Models of Millimeter and Radio Emission from Interacting Supernovae
Authors:
Nitika Yadlapalli,
Vikram Ravi,
Anna Y. Q. Ho
Abstract:
This work utilizes established models of synchrotron-powered light curves for core-collapse supernovae in dense circumstellar environments, namely type IIn and Ibn, to demonstrate the potential for detecting millimeter emission from these events. The progenitor types of these supernovae are still an open question, but using the synchrotron light curves as probes for the circumstellar environments could shed light on the mass-loss histories of the progenitors and discern between different theories. Observations in millimeter bands are particularly fruitful, as they probe regions at smaller radii and higher ambient densities, where centimeter emission tends to be self-absorbed. In our application of these light curves, we explore a diversity of progenitor types and mass-loss profiles to understand their effects on the light curve shapes. Additionally, we fit model parameters to the 8\,GHz light curve of type IIn supernova 2006jd and then create millimeter light curves using these parameters to show the possibility of detecting an early millimeter peak from such an event. We predict that next generation millimeter surveys will possess the capability to detect nearby and extreme events. However, there is a pressing need for millimeter follow-up of optically discovered interacting supernovae to more completely sample the true population.
Submitted 7 June, 2022;
originally announced June 2022.
-
Causal Analysis of Generic Time Series Data Applied for Market Prediction
Authors:
Anton Kolonin,
Ali Raheman,
Mukul Vishwas,
Ikram Ansari,
Juan Pinzon,
Alice Ho
Abstract:
We explore the applicability of causal analysis based on temporally shifted (lagged) Pearson correlation to diverse time series of different natures, in the context of financial market prediction. A theoretical discussion is followed by a description of a practical approach for time series data of diverse nature and sparsity, as found in financial markets. The data involve various financial metrics computable from raw market data, such as real-time trades and snapshots of the limit order book, as well as metrics derived from social media news streams, such as sentiment and different cognitive distortions. The approach is supported by an algorithmic framework for data acquisition and analysis, followed by experimental results and a summary pointing to the possibility of discriminating causal connections between different sorts of real field market data, with further discussion of open issues and possible directions for future work.
Submitted 22 April, 2022;
originally announced April 2022.
-
The long-active afterglow of GRB 210204A: Detection of the most delayed flares in a Gamma-Ray Burst
Authors:
Harsh Kumar,
Rahul Gupta,
Divita Saraogi,
Tomás Ahumada,
Igor Andreoni,
G. C. Anupama,
Amar Aryan,
Sudhanshu Barway,
Varun Bhalerao,
Poonam Chandra,
Michael W. Coughlin,
Dimple,
Anirban Dutta,
Ankur Ghosh,
Anna Y. Q. Ho,
E. C. Kool,
Amit Kumar,
Michael S. Medford,
Kuntal Misra,
Shashi B. Pandey,
Daniel A. Perley,
Reed Riddle,
Amit Kumar Ror,
Jason M. Setiadi,
Yuhan Yao
Abstract:
We present results from extensive broadband follow-up of GRB 210204A over a period of thirty days. We detect optical flares in the afterglow at 7.6 x 10^5 s and 1.1 x 10^6 s after the burst: the most delayed flaring ever detected in a GRB afterglow. At the source redshift of 0.876, the rest-frame delay is 5.8 x 10^5 s (6.71 d). We investigate possible causes for this flaring and conclude that the most likely cause is a refreshed shock in the jet. The prompt emission of the GRB is within the range of typical long bursts: it shows three disjoint emission episodes, which all follow the typical GRB correlations. This suggests that GRB 210204A might not have any special properties that caused late-time flaring, and the lack of such detections for other afterglows might result from the paucity of late-time observations. Systematic late-time follow-up of a larger sample of GRBs can shed more light on such afterglow behaviour. Further analysis of GRB 210204A shows that the late-time bump in the light curve is highly unlikely to be due to underlying SNe at redshift (z) = 0.876 and is more likely due to the late-time flaring activity. The cause of this variability is not clearly quantifiable because of the lack of multi-band data at late times, which was constrained by bad weather conditions. The flare of GRB 210204A is the latest flare detected to date.
Submitted 15 April, 2022;
originally announced April 2022.
-
Pulse Profiles and Polarization of Terzan 5 Pulsars
Authors:
Ashley R. Martsen,
Scott M. Ransom,
Megan E. DeCesar,
Paulo C. C. Freire,
Jason W. T. Hessels,
Anna Y. Q. Ho,
Ryan S. Lynch,
Ingrid H. Stairs,
Yuankun Wang
Abstract:
Terzan 5 is a rich globular cluster within the galactic bulge that contains 39 known millisecond pulsars, the largest known population of any globular cluster. The Terzan 5 pulsars are faint, so that individual observations of most of the pulsars have too little signal-to-noise (S/N) to measure reliable flux density or polarization information. We combined over 5.2\,days of archival data, at each of 1.5\,GHz and 2.0\,GHz, taken with the Green Bank Telescope over the past 11\,years. We created high S/N profiles for 32 of the pulsars and determined precise rotation measures (RMs) for 28 of them. We used the RMs, and the known pulsar positions and dispersion measures (DMs), to map the projected parallel component of the Galactic magnetic field toward the cluster. The $\langle B_{||}\rangle$ shows a rough gradient of $\sim$6\,nG/arcsec ($\sim$160\,nG/parsec), or fractionally, a change of $\sim$20$\%$ in the right ascension direction across the cluster, implying Galactic magnetic field variability at sub-parsec scales. We also measured average flux densities $S_ν$ for the pulsars, ranging from $\sim$10\,$μ$Jy to $\sim$2\,mJy, and an average spectral index $α= -1.35$, where $S_ν\propto ν^α$. This spectral index is flatter than most known pulsars, likely a selection effect due to the high frequencies used in pulsar searches to mitigate dispersion and scattering. The inferred pulsar luminosity function is roughly power-law, with slope $(d\log N)/(d\log L) = -1$ at the high-luminosity end. At the low-luminosity end, there are incompleteness effects implying that Terzan 5 contains many more pulsars to be found.
Submitted 11 November, 2022; v1 submitted 12 April, 2022;
originally announced April 2022.
-
In search of short gamma-ray burst optical counterpart with the Zwicky Transient Facility
Authors:
Tomás Ahumada,
Shreya Anand,
Michael W. Coughlin,
Igor Andreoni,
Erik C. Kool,
Harsh Kumar,
Simeon Reusch,
Ana Sagués-Carracedo,
Robert Stein,
S. Bradley Cenko,
Mansi M. Kasliwal,
Leo P. Singer,
Rachel Dunwoody,
Joseph Mangan,
Varun Bhalerao,
Mattia Bulla,
Eric Burns,
Matthew J. Graham,
David L. Kaplan,
Daniel Perley,
Mouza Almualla,
Joshua S. Bloom,
Virginia Cunningham,
Kishalay De,
Pradip Gatkine
, et al. (24 additional authors not shown)
Abstract:
The Fermi Gamma-ray Burst Monitor (GBM) triggers on-board in response to $\sim$ 40 short gamma-ray bursts (SGRBs) per year; however, their large localization regions have made the search for optical counterparts a challenging endeavour. We have developed and executed an extensive program with the wide field of view of the Zwicky Transient Facility (ZTF) camera, mounted on the Palomar 48 inch Oschin telescope (P48), to perform target-of-opportunity (ToO) observations on 10 Fermi-GBM SGRBs during 2018 and 2020-2021. Bridging the large sky areas with small field of view optical telescopes in order to track the evolution of potential candidates, we look for the elusive SGRB afterglows and kilonovae (KNe) associated with these high-energy events. No counterpart has yet been found, even though more than 10 ground based telescopes, part of the Global Relay of Observatories Watching Transients Happen (GROWTH) network, have taken part in these efforts. The candidate selection procedure and the follow-up strategy have shown that ZTF is an efficient instrument for searching for poorly localized SGRBs, retrieving a reasonable number of candidates to follow up and showing promising capabilities as the community approaches the multi-messenger era. Based on the median limiting magnitude of ZTF, our searches would have been able to retrieve a GW170817-like event up to $\sim$ 200 Mpc and SGRB afterglows to z = 0.16 or 0.4, depending on the assumed underlying energy model. Future ToOs will expand the horizon to z = 0.2 and 0.7 respectively.
Submitted 22 March, 2022;
originally announced March 2022.
-
A Mass Formula For Artin--Schreier Curves Over Finite Fields
Authors:
Anne M. Ho,
Rachel Pries
Abstract:
We study a mass formula for Artin--Schreier curves of genus $g$ defined over a finite field $k$ of characteristic $p$. For an odd prime $p$ and for small $g$, we determine the number of $k$-isomorphism classes of Artin--Schreier curves of genus $g$, weighted by the order of the centralizer of the Artin--Schreier involution in the automorphism group. This extends earlier results by several authors in characteristic $p=2$.
Keywords: Artin--Schreier curve, finite field, automorphism, mass formula, moduli space, arithmetic statistics
Submitted 18 March, 2022;
originally announced March 2022.
-
Snowmass 2021 CMB-S4 White Paper
Authors:
Kevork Abazajian,
Arwa Abdulghafour,
Graeme E. Addison,
Peter Adshead,
Zeeshan Ahmed,
Marco Ajello,
Daniel Akerib,
Steven W. Allen,
David Alonso,
Marcelo Alvarez,
Mustafa A. Amin,
Mandana Amiri,
Adam Anderson,
Behzad Ansarinejad,
Melanie Archipley,
Kam S. Arnold,
Matt Ashby,
Han Aung,
Carlo Baccigalupi,
Carina Baker,
Abhishek Bakshi,
Debbie Bard,
Denis Barkats,
Darcy Barron,
Peter S. Barry
, et al. (331 additional authors not shown)
Abstract:
This Snowmass 2021 White Paper describes the Cosmic Microwave Background Stage 4 project CMB-S4, which is designed to cross critical thresholds in our understanding of the origin and evolution of the Universe, from the highest energies at the dawn of time through the growth of structure to the present day. We provide an overview of the science case, the technical design, and project plan.
Submitted 15 March, 2022;
originally announced March 2022.