-
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Authors:
Sina J. Semnani,
Violet Z. Yao,
Heidi C. Zhang,
Monica S. Lam
Abstract:
This paper presents the first few-shot LLM-based chatbot that almost never hallucinates and has high conversationality and low latency. WikiChat is grounded on the English Wikipedia, the largest curated free-text corpus.
WikiChat generates a response from an LLM, retains only the grounded facts, and combines them with additional information it retrieves from the corpus to form factual and engaging responses. We distill WikiChat based on GPT-4 into a 7B-parameter LLaMA model with minimal loss of quality, to significantly improve its latency, cost and privacy, and facilitate research and deployment.
Using a novel hybrid human-and-LLM evaluation methodology, we show that our best system achieves 97.3% factual accuracy in simulated conversations. It significantly outperforms all retrieval-based and LLM-based baselines, and by 3.9%, 38.6% and 51.0% on head, tail and recent knowledge compared to GPT-4. Compared to previous state-of-the-art retrieval-based chatbots, WikiChat is also significantly more informative and engaging, just like an LLM.
WikiChat achieves 97.9% factual accuracy in conversations with human users about recent topics, 55.0% better than GPT-4, while receiving significantly higher user ratings and more favorable comments.
Submitted 27 October, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
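The WikiChat abstract above describes three steps: generate a draft with an LLM, keep only the grounded facts, and combine them with retrieved information. A minimal sketch of that control flow follows. Everything in it is a hypothetical stand-in: `grounded_response`, the `is_supported` check, and the toy strings are assumptions for illustration, not the authors' implementation, and the "1999" claim is a deliberately false toy sentence playing the role of a hallucination.

```python
from typing import Callable, List


def grounded_response(
    draft_claims: List[str],
    retrieved_facts: List[str],
    is_supported: Callable[[str], bool],
) -> List[str]:
    """Keep only draft claims that pass the grounding check, then add retrieved facts."""
    # Step 1: filter the LLM draft down to claims the corpus supports.
    kept = [claim for claim in draft_claims if is_supported(claim)]
    # Step 2: append retrieved facts, skipping any that are already present.
    merged = list(kept)
    for fact in retrieved_facts:
        if fact not in merged:
            merged.append(fact)
    return merged


# Toy stand-ins (assumptions): a one-sentence "corpus" and exact-match grounding.
corpus = {"WikiChat is grounded on the English Wikipedia."}
draft = [
    "WikiChat is grounded on the English Wikipedia.",
    "WikiChat was released in 1999.",  # deliberately false toy claim
]
retrieved = ["WikiChat is distilled into a 7B-parameter LLaMA model."]

result = grounded_response(draft, retrieved, lambda claim: claim in corpus)
for sentence in result:
    print(sentence)
```

In a real system the exact-match check would be replaced by a retrieval-plus-verification step; the sketch only shows where the filtering and merging happen.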
-
Fine-tuned LLMs Know More, Hallucinate Less with Few-Shot Sequence-to-Sequence Semantic Parsing over Wikidata
Authors:
Silei Xu,
Shicheng Liu,
Theo Culhane,
Elizaveta Pertseva,
Meng-Hsi Wu,
Sina J. Semnani,
Monica S. Lam
Abstract:
While large language models (LLMs) can answer many questions correctly, they can also hallucinate and give wrong answers. Wikidata, with its over 12 billion facts, can be used to ground LLMs to improve their factuality. This paper presents WikiWebQuestions, a high-quality question answering benchmark for Wikidata. Ported over from WebQuestions for Freebase, it consists of real-world data with SPARQL annotation. This paper presents a few-shot sequence-to-sequence semantic parser for Wikidata. We modify SPARQL to use the unique domain and property names instead of their IDs. We train the parser to use either the results from an entity linker or mentions in the query. We fine-tune LLaMA by adding the few-shot training data to that used to fine-tune Alpaca. Our experimental results demonstrate the effectiveness of this methodology, establishing a strong baseline of 76% and 65% answer accuracy in the dev and test sets of WikiWebQuestions, respectively. By pairing our semantic parser with GPT-3, we combine verifiable results with qualified GPT-3 guesses to provide useful answers to 96% of the questions in dev. We also show that our method outperforms the state-of-the-art for the QALD-7 Wikidata dataset by 3.6% in F1 score.
Submitted 5 November, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Rapidly Evolving Transients in Archival ZTF Public Alerts
Authors:
Wenxiong Li,
Iair Arcavi,
Ehud Nakar,
Alexei V. Filippenko,
Thomas G. Brink,
WeiKang Zheng,
Marco C. Lam,
Ido Keinan,
Seán J. Brennan,
Noi Shitrit
Abstract:
We search the archival Zwicky Transient Facility public survey for rapidly evolving transient (RET) candidates based on well-defined criteria between 2018 May and 2021 December. The search yielded 19 bona-fide RET candidates, corresponding to a discovery rate of $\sim 5.2$ events per year. Even with a Galactic latitude cut of $20^\circ$, 8 of the 19 events ($\sim 42$%) are Galactic, including one with a light-curve shape closely resembling that of the GW170817 kilonova (KN). An additional event is a nova in M31. Four out of the 19 events ($\sim 21$%) are confirmed extragalactic RETs (one confirmed here for the first time) and the origin of 6 additional events cannot be determined. We did not find any extragalactic events resembling the GW170817 KN, from which we obtain an upper limit on the volumetric rate of GW170817-like KNe of $R \le$ 2400 Gpc$^{-3}$ yr$^{-1}$ (95% confidence). These results can be used for quantifying contaminants to RET searches in transient alert streams, specifically when searching for kilonovae independently of gravitational-wave and gamma-ray-burst triggers.
Submitted 22 May, 2023;
originally announced May 2023.
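The counts quoted in the rapidly-evolving-transient abstract above can be checked with a few lines of arithmetic. The sketch below assumes the search window runs from 2018 May through 2021 December inclusive (44 months); the abstract does not state how the window is counted, so that is an assumption, as are the variable names.

```python
# Survey window from the abstract, assumed inclusive of both end months.
months = (2021 - 2018) * 12 + (12 - 5) + 1
years = months / 12

# Event counts quoted in the abstract.
n_events = 19
breakdown = {
    "galactic": 8,
    "nova_in_M31": 1,
    "extragalactic": 4,
    "undetermined": 6,
}

rate_per_year = n_events / years
galactic_pct = 100 * breakdown["galactic"] / n_events
extragalactic_pct = 100 * breakdown["extragalactic"] / n_events

print(f"window: {months} months, rate: {rate_per_year:.2f} events/yr")
print(f"galactic: {galactic_pct:.1f}%, extragalactic: {extragalactic_pct:.1f}%")
print("breakdown sums to", sum(breakdown.values()))
```

Under that assumption the four categories add up to the 19 candidates and the rate rounds to the quoted value of about 5.2 events per year.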
-
AT 2021loi: A Bowen Fluorescence Flare with a Rebrightening Episode, Occurring in a Previously-Known AGN
Authors:
Lydia Makrygianni,
Benny Trakhtenbrot,
Iair Arcavi,
Claudio Ricci,
Marco C. Lam,
Assaf Horesh,
Itai Sfaradi,
K. Azalee Bostroem,
Griffin Hosseinzadeh,
D. Andrew Howell,
Craig Pellegrino,
Rob Fender,
David A. Green,
David R. A. Williams,
Joe Bright
Abstract:
AT 2021loi is an optical-ultraviolet transient located at the center of its host galaxy. Its spectral features identify it as a member of the ``Bowen Fluorescence Flare'' (BFF) class. The first member of this class was considered to be related to a tidal disruption event, but enhanced accretion onto an already active supermassive black hole was suggested as an alternative explanation. AT 2021loi, having occurred in a previously-known unobscured AGN, strengthens the latter interpretation. Its light curve is similar to those of previous BFFs, showing a rebrightening approximately one year after the main peak (which was not explicitly identified, but might be the case, in all previous BFFs). An emission feature around 4680 Å, seen in the pre-flare spectrum, strengthens by a factor of $\sim$2 around the optical peak of the flare, and is clearly seen as a double-peaked feature then, suggesting a blend of NIII $\lambda4640$ with HeII $\lambda4686$ as its origin. The appearance of OIII $\lambda3133$ and possible NIII $\lambda\lambda4097,4103$ (blended with H$\delta$) during the flare further supports a Bowen Fluorescence classification. Here, we present ZTF, ATLAS, Keck, Las Cumbres Observatory, NEOWISE-R, $Swift$, AMI and VLA observations of AT 2021loi, making it one of the best observed BFFs to date. AT 2021loi thus provides some clarity on the nature of BFFs but also further demonstrates the diversity of nuclear transients.
Submitted 2 May, 2023;
originally announced May 2023.
-
On the Twistor Theory of Almost-Grassmannian Manifolds
Authors:
Matthew Lam
Abstract:
In this document we present a twistor correspondence for half-flat almost-Grassmannian structures on real and complex manifolds. We provide foundational results regarding local theory in the complex setting and a global correspondence when the underlying manifold is a real Grassmannian of 2-planes. Whereas twistor constructions typically involve moduli of closed curves in a complex manifold, we utilize and expand upon the more flexible approach pioneered by LeBrun and Mason using moduli of curves-with-boundary.
Submitted 16 April, 2023;
originally announced April 2023.
-
The James Webb Space Telescope Mission
Authors:
Jonathan P. Gardner,
John C. Mather,
Randy Abbott,
James S. Abell,
Mark Abernathy,
Faith E. Abney,
John G. Abraham,
Roberto Abraham,
Yasin M. Abul-Huda,
Scott Acton,
Cynthia K. Adams,
Evan Adams,
David S. Adler,
Maarten Adriaensen,
Jonathan Albert Aguilar,
Mansoor Ahmed,
Nasif S. Ahmed,
Tanjira Ahmed,
Rüdeger Albat,
Loïc Albert,
Stacey Alberts,
David Aldridge,
Mary Marsha Allen,
Shaune S. Allen,
Martin Altenburg
, et al. (983 additional authors not shown)
Abstract:
Twenty-six years ago a small committee report, building on earlier studies, expounded a compelling and poetic vision for the future of astronomy, calling for an infrared-optimized space telescope with an aperture of at least $4m$. With the support of their governments in the US, Europe, and Canada, 20,000 people realized that vision as the $6.5m$ James Webb Space Telescope. A generation of astronomers will celebrate their accomplishments for the life of the mission, potentially as long as 20 years, and beyond. This report and the scientific discoveries that follow are extended thank-you notes to the 20,000 team members. The telescope is working perfectly, with much better image quality than expected. In this and accompanying papers, we give a brief history, describe the observatory, outline its objectives and current observing program, and discuss the inventions and people who made it possible. We cite detailed reports on the design and the measured performance on orbit.
Submitted 10 April, 2023;
originally announced April 2023.
-
Evolved Massive Stars at Low-metallicity V. Mass-Loss Rate of Red Supergiant Stars in the Small Magellanic Cloud
Authors:
Ming Yang,
Alceste Z. Bonanos,
Biwei Jiang,
Emmanouil Zapartas,
Jian Gao,
Yi Ren,
Man I Lam,
Tianding Wang,
Grigoris Maravelias,
Panagiotis Gavras,
Shu Wang,
Xiaodian Chen,
Frank Tramper,
Stephan de Wit,
Bingqiu Chen,
Jing Wen,
Jiaming Liu,
Hao Tian,
Konstantinos Antoniadis,
Changqing Luo
Abstract:
We assemble the most complete and clean red supergiant (RSG) sample (2,121 targets) so far in the Small Magellanic Cloud (SMC) with 53 different bands of data to study the MLR of RSGs. In order to match the observed spectral energy distributions (SEDs), a theoretical grid of 17,820 Oxygen-rich models (``normal'' and ``dusty'' grids are half-and-half) is created by the radiatively-driven wind model of the DUSTY code, covering a wide range of dust parameters. We select the best model for each target by calculating the minimal modified chi-square and visual inspection. The resulting MLRs from DUSTY are converted to real MLRs based on the scaling relation, for which a total MLR of $6.16\times10^{-3}$ $M_\odot$ yr$^{-1}$ is measured (corresponding to a dust-production rate of $\sim6\times10^{-6}$ $M_\odot$ yr$^{-1}$), with a typical MLR of $\sim10^{-6}$ $M_\odot$ yr$^{-1}$ for the general population of the RSGs. The complexity of mass-loss estimation based on the SED is fully discussed for the first time, indicating large uncertainties based on the photometric data (potentially up to one order of magnitude or more). The Hertzsprung-Russell and luminosity versus median absolute deviation diagrams of the sample indicate the positive relation between luminosity and MLR. Meanwhile, the luminosity versus MLR diagrams show a ``knee-like'' shape with enhanced mass-loss occurring above $\log_{10}(L/L_\odot)\approx4.6$, which may be due to the degeneracy of luminosity, pulsation, low surface gravity, convection, and other factors. We derive our MLR relation by using a third-order polynomial to fit the sample and compare our result with previous empirical MLR prescriptions. Given that our MLR prescription is based on a much larger sample than previous determinations, it provides a more accurate relation at the cool and luminous region of the H-R diagram at low-metallicity compared to previous studies.
Submitted 4 April, 2023;
originally announced April 2023.
-
Searching for continuous Gravitational Waves in the second data release of the International Pulsar Timing Array
Authors:
M. Falxa,
S. Babak,
P. T. Baker,
B. Bécsy,
A. Chalumeau,
S. Chen,
Z. Chen,
N. J. Cornish,
L. Guillemot,
J. S. Hazboun,
C. M. F. Mingarelli,
A. Parthasarathy,
A. Petiteau,
N. S. Pol,
A. Sesana,
S. B. Spolaor,
S. R. Taylor,
G. Theureau,
M. Vallisneri,
S. J. Vigeland,
C. A. Witt,
X. Zhu,
J. Antoniadis,
Z. Arzoumanian,
M. Bailes
, et al. (102 additional authors not shown)
Abstract:
The International Pulsar Timing Array 2nd data release is the combination of datasets from worldwide collaborations. In this study, we search for continuous waves: gravitational wave signals produced by individual supermassive black hole binaries in the local universe. We consider binaries on circular orbits and neglect the evolution of orbital frequency over the observational span. We find no evidence for such signals and set sky-averaged 95% upper limits on their amplitude $h_{95}$. The most sensitive frequency is 10 nHz with $h_{95} = 9.1 \times 10^{-15}$. We achieved the best upper limit to date at low and high frequencies of the PTA band thanks to improved effective cadence of observations. In our analysis, we have taken into account the recently discovered common red noise process, which has an impact at low frequencies. We also find that the peculiar noise features present in some pulsars' data must be taken into account to reduce false alarms. We show that using custom noise models is essential in searching for continuous gravitational wave signals and setting the upper limit.
Submitted 19 March, 2023;
originally announced March 2023.
-
Model Sketching: Centering Concepts in Early-Stage Machine Learning Model Design
Authors:
Michelle S. Lam,
Zixian Ma,
Anne Li,
Izequiel Freitas,
Dakuo Wang,
James A. Landay,
Michael S. Bernstein
Abstract:
Machine learning practitioners often end up tunneling on low-level technical details like model architectures and performance metrics. Could early model development instead focus on high-level questions of which factors a model ought to pay attention to? Inspired by the practice of sketching in design, which distills ideas to their minimal representation, we introduce model sketching: a technical framework for iteratively and rapidly authoring functional approximations of a machine learning model's decision-making logic. Model sketching refocuses practitioner attention on composing high-level, human-understandable concepts that the model is expected to reason over (e.g., profanity, racism, or sarcasm in a content moderation task) using zero-shot concept instantiation. In an evaluation with 17 ML practitioners, model sketching reframed thinking from implementation to higher-level exploration, prompted iteration on a broader range of model designs, and helped identify gaps in the problem formulation$\unicode{x2014}$all in a fraction of the time ordinarily required to build a model.
Submitted 5 March, 2023;
originally announced March 2023.
-
Design and Mechanics of Cable-Driven Rolling Diaphragm Transmission for High-Transparency Robotic Motion
Authors:
Hoi Man Lam,
W. Jared Walker,
Lucas Jonasch,
Dimitri Schreiber,
Michael C. Yip
Abstract:
Applications of rolling diaphragm transmissions for medical and teleoperated robotics are of great interest, due to the low friction of rolling diaphragms combined with the power density and stiffness of hydraulic transmissions. However, the stiffness-enabling pressure preloads can form a tradeoff against bearing loading in some rolling diaphragm layouts, and transmission setup can be difficult. Utilization of cable drives complements the rolling diaphragm transmission's advantages, but maintaining cable tension is crucial for optimal and consistent performance. In this paper, a coaxial opposed rolling diaphragm layout with cable drive and an electronic transmission control system are investigated, with a focus on system reliability and scalability. Mechanical features are proposed which enable force balancing, decoupling of transmission pressure from bearing loads, and maintenance of cable tension. Key considerations and procedures for automation of transmission setup, phasing, and operation are also presented. We also present an analysis of system stiffness to identify key compliance contributors, and conduct experiments to validate prototype design performance.
Submitted 24 February, 2023;
originally announced February 2023.
-
Zero and Few-Shot Localization of Task-Oriented Dialogue Agents with a Distilled Representation
Authors:
Mehrad Moradshahi,
Sina J. Semnani,
Monica S. Lam
Abstract:
Task-oriented Dialogue (ToD) agents are mostly limited to a few widely-spoken languages, mainly due to the high cost of acquiring training data for each language. Existing low-cost approaches that rely on cross-lingual embeddings or naive machine translation sacrifice a lot of accuracy for data efficiency, and largely fail in creating a usable dialogue agent. We propose automatic methods that use ToD training data in a source language to build a high-quality functioning dialogue agent in another target language that has no training data (i.e. zero-shot) or a small training set (i.e. few-shot). Unlike most prior work in cross-lingual ToD that only focuses on Dialogue State Tracking (DST), we build an end-to-end agent.
We show that our approach closes the accuracy gap between few-shot and existing full-shot methods for ToD agents. We achieve this by (1) improving the dialogue data representation, (2) improving entity-aware machine translation, and (3) automatic filtering of noisy translations.
We evaluate our approach on the recent bilingual dialogue dataset BiToD. In Chinese to English transfer, in the zero-shot setting, our method achieves 46.7% and 22.0% in Task Success Rate (TSR) and Dialogue Success Rate (DSR) respectively. In the few-shot setting where 10% of the data in the target language is used, we improve the state-of-the-art by 15.2% and 14.0%, coming within 5% of full-shot training.
Submitted 18 February, 2023;
originally announced February 2023.
-
Modeling of parallel power MOSFETs in steady-state
Authors:
Minh Nhat Huynh,
Minh Khoi Nguyen Tien,
Cong Toai Truong,
Minh Tri Nguyen,
Quoc Minh Lam,
Van Tu Duong,
Huy Hung Nguyen,
Tan Tien Nguyen
Abstract:
In high-power applications, multiple power MOSFETs are connected in parallel and treated as a single switch in order to handle much larger total currents. In this paper, a model of parallel power MOSFETs, covering the interval from the turn-off state until they reach steady state, is introduced. The model represents the relationship between each power MOSFET's gate voltage and the current distribution among them. The study's key purpose is to use the model to address the asymmetry in current sharing and power loss between these semiconductor devices in the steady-state region.
Submitted 15 February, 2023;
originally announced February 2023.
-
Antibacterial Activity of Zinc Oxide Thin Films by Atomic Layer Deposition for Personal Protective Equipment Applications
Authors:
Li Tao,
Kinga Vojnits,
Man In Lam,
Xuejun Lu,
Sepideh Pakpour,
Jian Liu
Abstract:
The global pandemic has significantly increased the demand for personal protective equipment (PPE). Antimicrobial coatings have been broadly applied to PPE to improve its prevention capability, especially after prolonged usage. However, antimicrobial coating by traditional methods, such as chemical vapor deposition, spraying, and slurry coating, suffers from drawbacks such as low efficiency, poor coverage, and loose adhesion to PPE. To overcome these limitations, this work adopted an atomic layer deposition (ALD) technique to deposit a zinc oxide (ZnO) thin film (~312.8 nm thick) as an antimicrobial coating and proved its advantage for depositing uniform ZnO coating on PPE with a fabric structure. Analysis by X-ray diffraction, Raman spectroscopy, and X-ray photoelectron spectroscopy confirmed the crystal structure and chemical composition of the ALD-ZnO. Ultraviolet-visible spectra disclosed a high absorption level of about 4.8 from 200 nm to 380 nm wavelength for ALD-ZnO, in contrast to 3.8 for commercial ZnO powders. Moreover, the ALD-ZnO exhibited strong antimicrobial properties when tested against Escherichia coli (E. coli) in contrast to the control and bare glass samples. The colony-forming unit count (CFU/ml) remained zero for all ALD-ZnO samples while varying between $2.30\times10^{9}$ and $4.97\times10^{9}$ with a median of $4.36\times10^{9}$ for the control and between $9.10\times10^{8}$ and $3.27\times10^{9}$ with a median of $2.04\times10^{9}$ for the bare glass. Statistical analysis using null-hypothesis significance testing revealed that the calculated P values between bare glass and control, ALD-ZnO and control, and bare glass and control were all smaller than 0.0001, well below the 0.05 alpha level, suggesting a high confidence level of ALD-ZnO as the main factor for preventing E. coli growth.
Submitted 3 February, 2023;
originally announced February 2023.
-
GPU-based Private Information Retrieval for On-Device Machine Learning Inference
Authors:
Maximilian Lam,
Jeff Johnson,
Wenjie Xiong,
Kiwan Maeng,
Udit Gupta,
Yang Li,
Liangzhen Lai,
Ilias Leontiadis,
Minsoo Rhu,
Hsien-Hsin S. Lee,
Vijay Janapa Reddi,
Gu-Yeon Wei,
David Brooks,
G. Edward Suh
Abstract:
On-device machine learning (ML) inference can enable the use of private user data on user devices without revealing them to remote servers. However, a pure on-device solution to private ML inference is impractical for many applications that rely on embedding tables that are too large to be stored on-device. In particular, recommendation models typically use multiple embedding tables each on the order of 1-10 GBs of data, making them impractical to store on-device. To overcome this barrier, we propose the use of private information retrieval (PIR) to efficiently and privately retrieve embeddings from servers without sharing any private information. As off-the-shelf PIR algorithms are usually too computationally intensive to directly use for latency-sensitive inference tasks, we 1) propose novel GPU-based acceleration of PIR, and 2) co-design PIR with the downstream ML application to obtain further speedup. Our GPU acceleration strategy improves system throughput by more than $20 \times$ over an optimized CPU PIR implementation, and our PIR-ML co-design provides an over $5 \times$ additional throughput improvement at fixed model quality. Together, for various on-device ML applications such as recommendation and language modeling, our system on a single V100 GPU can serve up to $100,000$ queries per second -- a $>100 \times$ throughput improvement over a CPU-based baseline -- while maintaining model accuracy.
Submitted 25 September, 2023; v1 submitted 25 January, 2023;
originally announced January 2023.
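To make the idea of private information retrieval in the abstract above concrete, here is a toy sketch of the classic two-server XOR scheme, in which neither server alone learns which row was requested. This is a swapped-in textbook construction chosen for brevity; the abstract does not say which PIR protocol the authors accelerate, and the function names, table contents, and the two-non-colluding-servers assumption are all illustrative.

```python
import secrets
from typing import List, Tuple


def make_queries(n_rows: int, index: int) -> Tuple[List[int], List[int]]:
    """Build two bit vectors that differ only at the wanted row."""
    q0 = [secrets.randbelow(2) for _ in range(n_rows)]  # uniformly random bits
    q1 = list(q0)
    q1[index] ^= 1  # flip one bit; each vector on its own still looks random
    return q0, q1


def answer(table: List[bytes], query: List[int]) -> bytes:
    """Server side: XOR together every row whose query bit is set."""
    acc = bytes(len(table[0]))  # all-zero accumulator of row width
    for row, bit in zip(table, query):
        if bit:
            acc = bytes(a ^ b for a, b in zip(acc, row))
    return acc


def reconstruct(a0: bytes, a1: bytes) -> bytes:
    """Client side: rows selected by both queries cancel, leaving the wanted row."""
    return bytes(x ^ y for x, y in zip(a0, a1))


# Toy "embedding table" (assumption): 8 rows of 4 bytes each.
table = [bytes([i] * 4) for i in range(8)]
wanted = 5
q0, q1 = make_queries(len(table), wanted)
row = reconstruct(answer(table, q0), answer(table, q1))
print(row == table[wanted])
```

Each server scans the whole table per query, which is why throughput is the bottleneck the abstract targets. Its quoted gains of more than $20 \times$ and more than $5 \times$ multiply to more than $100 \times$, matching the combined figure it reports.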
-
The NANOGrav 12.5-year Data Set: Bayesian Limits on Gravitational Waves from Individual Supermassive Black Hole Binaries
Authors:
Zaven Arzoumanian,
Paul T. Baker,
Laura Blecha,
Harsha Blumer,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Bence Bécsy,
J. Andrew Casey-Clyde,
Maria Charisi,
Shami Chatterjee,
Siyuan Chen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Megan E. DeCesar,
Paul B. Demorest,
Timothy Dolch,
Brendan Drachler,
Justin A. Ellis,
E. C. Ferrara,
William Fiore,
Emmanuel Fonseca,
Gabriel E. Freedman
, et al. (53 additional authors not shown)
Abstract:
Pulsar timing array collaborations, such as the North American Nanohertz Observatory for Gravitational Waves (NANOGrav), are seeking to detect nanohertz gravitational waves emitted by supermassive black hole binaries formed in the aftermath of galaxy mergers. We have searched for continuous waves from individual circular supermassive black hole binaries using the NANOGrav's recent 12.5-year data set. We created new methods to accurately model the uncertainties on pulsar distances in our analysis, and we implemented new techniques to account for a common red noise process in pulsar timing array data sets while searching for deterministic gravitational wave signals, including continuous waves. As we found no evidence for continuous waves in our data, we placed 95\% upper limits on the strain amplitude of continuous waves emitted by these sources. At our most sensitive frequency of 7.65 nanohertz, we placed a sky-averaged limit of $h_0 < $ $(6.82 \pm 0.35) \times 10^{-15}$, and $h_0 <$ $(2.66 \pm 0.15) \times 10^{-15}$ in our most sensitive sky location. Finally, we placed a multi-messenger limit of $\mathcal{M} <$ $(1.41 \pm 0.02) \times 10^9 M_\odot$ on the chirp mass of the supermassive black hole binary candidate 3C~66B.
Submitted 6 June, 2023; v1 submitted 9 January, 2023;
originally announced January 2023.
-
Decreasing behavior of the depth functions of edge ideals
Authors:
Ha Thi Thu Hien,
Ha Minh Lam,
Ngo Viet Trung
Abstract:
Let $I$ be the edge ideal of a connected non-bipartite graph and $R$ the base polynomial ring. Then $\operatorname{depth} R/I \ge 1$ and $\operatorname{depth} R/I^t = 0$ for $t \gg 1$. We give combinatorial conditions for $\operatorname{depth} R/I^t = 1$ for some $t$ in between and show that the depth function is non-increasing thereafter. Especially, the depth function quickly decreases to 0 after reaching 1. We show that if $\operatorname{depth} R/I = 1$ then $\operatorname{depth} R/I^2 = 0$ and if $\operatorname{depth} R/I^2 = 1$ then $\operatorname{depth} R/I^5 = 0$. Other similar results suggest that if $\operatorname{depth} R/I^t = 1$ then $\operatorname{depth} R/I^{t+3} = 0$. This is a surprising phenomenon because the depth of a power can determine a smaller depth of another power. Furthermore, we are able to give a simple combinatorial criterion for $\operatorname{depth} R/I^{(t)} = 1$ for $t \gg 1$ and show that the condition $\operatorname{depth} R/I^{(t)} = 1$ is persistent, where $I^{(t)}$ denotes the $t$-th symbolic power of $I$.
Submitted 22 January, 2023; v1 submitted 30 December, 2022;
originally announced December 2022.
-
Nonradial stability of expanding Goldreich-Weber stars
Authors:
Mahir Hadžić,
Juhi Jang,
King Ming Lam
Abstract:
Goldreich-Weber solutions constitute a finite-parameter family of expanding and collapsing solutions to the mass-critical Euler-Poisson system. Two subclasses of this family correspond to compactly supported density profiles suitably modulated by the dynamic radius of the star that expands at the self-similar rate $λ(t)_{t\to\infty}\sim t^{\frac23}$ and linear rate $λ(t)_{t\to\infty}\sim t$ respectively. We prove two results: any linearly expanding Goldreich-Weber star is nonlinearly stable, while any given self-similarly expanding Goldreich-Weber star is codimension-4 nonlinearly stable against irrotational perturbations.
The codimension-4 condition in the latter result is optimal and reflects the presence of 4 unstable directions in the linearised dynamics in self-similar coordinates, which are induced by the conservation of the energy and the momentum. This result can be viewed as a codimension-1 nonlinear stability of the moduli space of self-similarly expanding Goldreich-Weber stars against irrotational perturbations.
Submitted 11 May, 2024; v1 submitted 21 December, 2022;
originally announced December 2022.
-
The wide-field, multiplexed, spectroscopic facility WEAVE: Survey design, overview, and simulated implementation
Authors:
Shoko Jin,
Scott C. Trager,
Gavin B. Dalton,
J. Alfonso L. Aguerri,
J. E. Drew,
Jesús Falcón-Barroso,
Boris T. Gänsicke,
Vanessa Hill,
Angela Iovino,
Matthew M. Pieri,
Bianca M. Poggianti,
D. J. B. Smith,
Antonella Vallenari,
Don Carlos Abrams,
David S. Aguado,
Teresa Antoja,
Alfonso Aragón-Salamanca,
Yago Ascasibar,
Carine Babusiaux,
Marc Balcells,
R. Barrena,
Giuseppina Battaglia,
Vasily Belokurov,
Thomas Bensby,
Piercarlo Bonifacio
, et al. (190 additional authors not shown)
Abstract:
WEAVE, the new wide-field, massively multiplexed spectroscopic survey facility for the William Herschel Telescope, will see first light in late 2022. WEAVE comprises a new 2-degree field-of-view prime-focus corrector system, a nearly 1000-multiplex fibre positioner, 20 individually deployable 'mini' integral field units (IFUs), and a single large IFU. These fibre systems feed a dual-beam spectrograph covering the wavelength range 366$-$959\,nm at $R\sim5000$, or two shorter ranges at $R\sim20\,000$. After summarising the design and implementation of WEAVE and its data systems, we present the organisation, science drivers and design of a five- to seven-year programme of eight individual surveys to: (i) study our Galaxy's origins by completing Gaia's phase-space information, providing metallicities to its limiting magnitude for $\sim$3 million stars and detailed abundances for $\sim1.5$ million brighter field and open-cluster stars; (ii) survey $\sim0.4$ million Galactic-plane OBA stars, young stellar objects and nearby gas to understand the evolution of young stars and their environments; (iii) perform an extensive spectral survey of white dwarfs; (iv) survey $\sim400$ neutral-hydrogen-selected galaxies with the IFUs; (v) study properties and kinematics of stellar populations and ionised gas in $z<0.5$ cluster galaxies; (vi) survey stellar populations and kinematics in $\sim25\,000$ field galaxies at $0.3\lesssim z \lesssim 0.7$; (vii) study the cosmic evolution of accretion and star formation using $>1$ million spectra of LOFAR-selected radio sources; (viii) trace structures using intergalactic/circumgalactic gas at $z>2$. Finally, we describe the WEAVE Operational Rehearsals using the WEAVE Simulator.
Submitted 31 October, 2023; v1 submitted 7 December, 2022;
originally announced December 2022.
-
An unusual pulse shape change event in PSR J1713+0747 observed with the Green Bank Telescope and CHIME
Authors:
Ross J. Jennings,
James M. Cordes,
Shami Chatterjee,
Maura A. McLaughlin,
Paul B. Demorest,
Zaven Arzoumanian,
Paul T. Baker,
Harsha Blumer,
Paul R. Brook,
Tyler Cohen,
Fronefield Crawford,
H. Thankful Cromartie,
Megan E. DeCesar,
Timothy Dolch,
Elizabeth C. Ferrara,
Emmanuel Fonseca,
Deborah C. Good,
Jeffrey S. Hazboun,
Megan L. Jones,
David L. Kaplan,
Michael T. Lam,
T. Joseph W. Lazio,
Duncan R. Lorimer,
Jing Luo,
Ryan S. Lynch
, et al. (19 additional authors not shown)
Abstract:
The millisecond pulsar J1713+0747 underwent a sudden and significant pulse shape change between April 16 and 17, 2021 (MJDs 59320 and 59321). Subsequently, the pulse shape gradually recovered over the course of several months. We report the results of continued multi-frequency radio observations of the pulsar made using the Canadian Hydrogen Intensity Mapping Experiment (CHIME) and the 100-meter Green Bank Telescope (GBT) in a three-year period encompassing the shape change event, between February 2020 and February 2023. As of February 2023, the pulse shape had returned to a state similar to that seen before the event, but with measurable changes remaining. The amplitude of the shape change and the accompanying TOA residuals display a strong non-monotonic dependence on radio frequency, demonstrating that the event is neither a glitch (the effects of which should be independent of radio frequency, $ν$) nor a change in dispersion measure (DM) alone (which would produce a delay proportional to $ν^{-2}$). However, it does bear some resemblance to the two previous "chromatic timing events" observed in J1713+0747 (Demorest et al. 2013; Lam et al. 2016), as well as to a similar event observed in PSR J1643-1224 in 2015 (Shannon et al. 2016).
Submitted 31 January, 2024; v1 submitted 21 October, 2022;
originally announced October 2022.
-
Telecom-wavelength spectra of a Rydberg state in a hot vapor
Authors:
Wenfang Li,
Jinjin Du,
Mark Lam,
Wenhui Li
Abstract:
We study telecom-wavelength spectra of a Rydberg state in an atomic vapor with a three-photon excitation scheme. Two lasers of 780 nm and 776 nm are used to pump Rubidium-85 atoms in a vapor cell to the $5D_{\mathrm{5/2}}$ state, from which a probe beam of 1292 nm in the O-band telecommunication wavelength drives a transition to the $21F_{\mathrm{7/2}}$ Rydberg state. We investigate how the probe spectra vary with the power of the pump lasers. A simulation based on a 4-level theoretical model captures the main features of the experimental results. This spectroscopic study paves the way for future experiments that make a direct link between fiber optics and radio transmission via Rydberg atoms.
Submitted 19 September, 2022;
originally announced September 2022.
-
Heliosphere Meets Interstellar Medium, in a Galactic Context
Authors:
Stella Koch Ocker,
James Cordes,
Shami Chatterjee,
Jeffrey Hazboun,
Timothy Dolch,
Daniel Stinebring,
Dustin Madison,
Stephen White,
Gregory Taylor,
Natalia Lewandowska,
Michael Lam
Abstract:
The physical conditions within our heliosphere are driven by the Sun's motion through an evolving interstellar environment that remains largely unexplored. The next generation of outer heliosphere and interstellar explorers will answer fundamental questions about the heliosphere's relationship with the very local interstellar medium (VLISM) by diving deeper into the Sun's interstellar surroundings. The impact of these future missions will be vastly enhanced by concurrent, interdisciplinary studies that examine the direct connections between conditions within the heliosphere, the heliosphere's immediate interstellar environment, and the larger-scale Galactic ISM. Comparisons of the heliosphere and VLISM to their analogs across the Galaxy will constrain the global processes shaping both stellar astrospheres and their sustained impact on the ISM.
Submitted 24 August, 2022;
originally announced August 2022.
-
Linear Stability of liquid Lane-Emden stars
Authors:
King Ming Lam
Abstract:
We establish various qualitative properties of liquid Lane-Emden stars in $\mathbb{R}^d$, including bounds on the density profile $ρ$ and radius $R$. Using them, we prove that against radial perturbations, the liquid Lane-Emden stars are linearly stable when $γ\geq 2(d-1)/d$; linearly stable when $γ<2(d-1)/d$ for stars with small relative central density $ρ(0)-ρ(R)$; and linearly unstable when $γ<2(d-1)/d$ for stars with large central density. Such dependence on central density is not seen in the gaseous Lane-Emden stars.
Submitted 9 May, 2023; v1 submitted 13 August, 2022;
originally announced August 2022.
-
The Science Performance of JWST as Characterized in Commissioning
Authors:
Jane Rigby,
Marshall Perrin,
Michael McElwain,
Randy Kimble,
Scott Friedman,
Matt Lallo,
René Doyon,
Lee Feinberg,
Pierre Ferruit,
Alistair Glasse,
Marcia Rieke,
George Rieke,
Gillian Wright,
Chris Willott,
Knicole Colon,
Stefanie Milam,
Susan Neff,
Christopher Stark,
Jeff Valenti,
Jim Abell,
Faith Abney,
Yasin Abul-Huda,
D. Scott Acton,
Evan Adams,
David Adler
, et al. (601 additional authors not shown)
Abstract:
This paper characterizes the actual science performance of the James Webb Space Telescope (JWST), as determined from the six-month commissioning period. We summarize the performance of the spacecraft, telescope, science instruments, and ground system, with an emphasis on differences from pre-launch expectations. Commissioning has made clear that JWST is fully capable of achieving the discoveries for which it was built. Moreover, almost across the board, the science performance of JWST is better than expected; in most cases, JWST will go deeper faster than expected. The telescope and instrument suite have demonstrated the sensitivity, stability, image quality, and spectral range that are necessary to transform our understanding of the cosmos through observations spanning from near-Earth asteroids to the most distant galaxies.
Submitted 10 April, 2023; v1 submitted 12 July, 2022;
originally announced July 2022.
-
WDPhotTools -- A White Dwarf Photometric Toolkit in Python
Authors:
M. C. Lam,
K. W. Yuen,
M. J. Green,
W. Li
Abstract:
Going from data collection to photometric fitting and analysis of white dwarfs, and then to generating a white dwarf luminosity function, requires extensive astrophysical, mathematical and computational domain knowledge. The steep learning curve makes it difficult to enter the field, and often individuals have to reinvent the wheel to perform identical data reduction and analysis tasks. We have gathered a wide range of publicly available white dwarf cooling models and synthetic photometry to provide a toolkit that allows (1) visualisation of various models, (2) photometric fitting of a white dwarf with or without distance and reddening, and (3) the computing of white dwarf luminosity functions with a choice of initial mass function, main sequence evolution model, star formation history, initial-final mass relation, and white dwarf cooling model. We have recomputed and compared the effective temperature of the white dwarfs from the Gaia EDR3 white dwarf catalogue. The two independent works show excellent agreement in the temperature solutions.
Submitted 22 October, 2022; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Theory for constructing effective models for electrons in generic bilayer graphene
Authors:
H. Minh Lam,
V. Nam Do
Abstract:
We present and discuss in detail practical techniques in formulating effective models to describe the dynamics of low-energy electrons in generic bilayer graphene. Starting from a tight-binding model using the $p_z$ orbital of carbon atoms as a representation basis set, we reformulate it into the problem of coupling between Bloch states defined in each graphene layer. This approach allows transferring the original problem into the determination of Bloch states in two independent material layers and coupling rules of such states. We show two schemes to parameterize coupled Bloch state vectors. For the bilayer graphene configurations of small twist angle in which the long wavelength approximation is applicable, we show that an effective Hamiltonian can be written in the canonical form of a kinetic term defined by the momentum operator and a potential term defined by the position operator. The validity of effective models of different sophistication levels and their potential application in treating various physical aspects are numerically discussed.
Submitted 25 May, 2022;
originally announced May 2022.
-
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
Authors:
Rongjie Huang,
Max W. Y. Lam,
Jun Wang,
Dan Su,
Dong Yu,
Yi Ren,
Zhou Zhao
Abstract:
Denoising diffusion probabilistic models (DDPMs) have recently achieved leading performance in many generative tasks. However, the inherent cost of the iterative sampling process has hindered their application to speech synthesis. This paper proposes FastDiff, a fast conditional diffusion model for high-quality speech synthesis. FastDiff employs a stack of time-aware location-variable convolutions of diverse receptive field patterns to efficiently model long-term time dependencies with adaptive conditions. A noise schedule predictor is also adopted to reduce the sampling steps without sacrificing the generation quality. Based on FastDiff, we design an end-to-end text-to-speech synthesizer, FastDiff-TTS, which generates high-fidelity speech waveforms without any intermediate feature (e.g., Mel-spectrogram). Our evaluation of FastDiff demonstrates state-of-the-art results with higher-quality (MOS 4.28) speech samples. Also, FastDiff enables a sampling speed 58x faster than real-time on a V100 GPU, making diffusion models practically applicable to speech synthesis deployment for the first time. We further show that FastDiff generalizes well to the mel-spectrogram inversion of unseen speakers, and that FastDiff-TTS outperforms other competing methods in end-to-end text-to-speech synthesis. Audio samples are available at https://FastDiff.github.io/.
Submitted 21 April, 2022;
originally announced April 2022.
-
Learning to Deblur using Light Field Generated and Real Defocus Images
Authors:
Lingyan Ruan,
Bin Chen,
Jizhou Li,
Miuling Lam
Abstract:
Defocus deblurring is a challenging task due to the spatially varying nature of defocus blur. While deep learning approaches show great promise in solving image restoration problems, defocus deblurring demands accurate training data consisting of all-in-focus and defocus image pairs, which are difficult to collect. Naive two-shot capturing cannot achieve pixel-wise correspondence between the defocused and all-in-focus image pairs. Synthetic aperture of light fields is suggested to be a more reliable way to generate accurate image pairs. However, the defocus blur generated from light field data is different from that of the images captured with a traditional digital camera. In this paper, we propose a novel deep defocus deblurring network that leverages the strength and overcomes the shortcoming of light fields. We first train the network on a light field-generated dataset for its highly accurate image correspondence. Then, we fine-tune the network using feature loss on another dataset collected by the two-shot method to alleviate the differences between the defocus blur that exists in the two domains. This strategy proves to be highly effective and achieves state-of-the-art performance both quantitatively and qualitatively on multiple test sets. Extensive ablation studies have been conducted to analyze the effect of each network module on the final performance.
Submitted 1 April, 2022;
originally announced April 2022.
-
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Authors:
Max W. Y. Lam,
Jun Wang,
Dan Su,
Dong Yu
Abstract:
Diffusion probabilistic models (DPMs) and their extensions have emerged as competitive generative models yet confront challenges of efficient sampling. We propose a new bilateral denoising diffusion model (BDDM) that parameterizes both the forward and reverse processes with a schedule network and a score network, which can train with a novel bilateral modeling objective. We show that the new surrogate objective can achieve a lower bound of the log marginal likelihood tighter than a conventional surrogate. We also find that BDDM allows inheriting pre-trained score network parameters from any DPMs and consequently enables speedy and stable learning of the schedule network and optimization of a noise schedule for sampling. Our experiments demonstrate that BDDMs can generate high-fidelity audio samples with as few as three sampling steps. Moreover, compared to other state-of-the-art diffusion-based neural vocoders, BDDMs produce comparable or higher quality samples indistinguishable from human speech, notably with only seven sampling steps (143x faster than WaveGrad and 28.6x faster than DiffWave). We release our code at https://github.com/tencent-ailab/bddm.
Submitted 25 March, 2022;
originally announced March 2022.
-
ThingTalk: An Extensible, Executable Representation Language for Task-Oriented Dialogues
Authors:
Monica S. Lam,
Giovanni Campagna,
Mehrad Moradshahi,
Sina J. Semnani,
Silei Xu
Abstract:
Task-oriented conversational agents rely on semantic parsers to translate natural language to formal representations. In this paper, we propose the design and rationale of the ThingTalk formal representation, and how the design improves the development of transactional task-oriented agents.
ThingTalk is built on four core principles: (1) representing user requests directly as executable statements, covering all the functionality of the agent, (2) representing dialogues formally and succinctly to support accurate contextual semantic parsing, (3) standardizing types and interfaces to maximize reuse between agents, and (4) allowing multiple, independently-developed agents to be composed in a single virtual assistant. ThingTalk is developed as part of the Genie Framework that allows developers to quickly build transactional agents given a database and APIs.
We compare ThingTalk to existing representations: SMCalFlow, SGD, TreeDST. Compared to the others, the ThingTalk design is both more general and more cost-effective. Evaluated on the MultiWOZ benchmark, using ThingTalk and associated tools yields a new state of the art accuracy of 79% turn-by-turn.
Submitted 23 March, 2022;
originally announced March 2022.
-
Tabula: Efficiently Computing Nonlinear Activation Functions for Secure Neural Network Inference
Authors:
Maximilian Lam,
Michael Mitzenmacher,
Vijay Janapa Reddi,
Gu-Yeon Wei,
David Brooks
Abstract:
Multiparty computation approaches to secure neural network inference commonly rely on garbled circuits for securely executing nonlinear activation functions. However, garbled circuits require excessive communication between server and client, impose significant storage overheads, and incur large runtime penalties. To reduce these costs, we propose an alternative to garbled circuits: Tabula, an algorithm based on secure lookup tables. Our approach precomputes lookup tables during an offline phase that contain the results of all possible nonlinear function calls. Because these tables incur exponential storage costs in the number of operands and the precision of the input values, we use quantization to reduce these storage costs to make this approach practical. This enables an online phase where securely computing the result of a nonlinear function requires just a single round of communication, with communication cost equal to twice the number of bits of the input to the nonlinear function. In practice our approach costs 2 bytes of communication per nonlinear function call in the online phase. Compared to garbled circuits with 8-bit quantized inputs, when computing individual nonlinear functions during the online phase, experiments show Tabula with 8-bit activations uses between $280$-$560 \times$ less communication, is over $100\times$ faster, and uses a comparable (within a factor of 2) amount of storage; compared against other state-of-the-art protocols Tabula achieves greater than $40\times$ communication reduction. This leads to significant performance gains over garbled circuits with quantized inputs during the online phase of secure inference of neural networks: Tabula reduces end-to-end inference communication by up to $9 \times$ and achieves an end-to-end inference speedup of up to $50 \times$, while imposing comparable storage and offline preprocessing costs.
Submitted 16 June, 2024; v1 submitted 5 March, 2022;
originally announced March 2022.
-
Jury Learning: Integrating Dissenting Voices into Machine Learning Models
Authors:
Mitchell L. Gordon,
Michelle S. Lam,
Joon Sung Park,
Kayur Patel,
Jeffrey T. Hancock,
Tatsunori Hashimoto,
Michael S. Bernstein
Abstract:
Whose labels should a machine learning (ML) algorithm learn to emulate? For ML tasks ranging from online comment toxicity to misinformation detection to medical diagnosis, different groups in society may have irreconcilable disagreements about ground truth labels. Supervised ML today resolves these label disagreements implicitly using majority vote, which overrides minority groups' labels. We introduce jury learning, a supervised ML approach that resolves these disagreements explicitly through the metaphor of a jury: defining which people or groups, in what proportion, determine the classifier's prediction. For example, a jury learning model for online toxicity might centrally feature women and Black jurors, who are commonly targets of online harassment. To enable jury learning, we contribute a deep learning architecture that models every annotator in a dataset, samples from annotators' models to populate the jury, then runs inference to classify. Our architecture enables juries that dynamically adapt their composition, explore counterfactuals, and visualize dissent.
Submitted 7 February, 2022;
originally announced February 2022.
-
Identifying Blue Large Amplitude Pulsators from Gaia DR2 & ZTF DR3
Authors:
Paul Ross McWhirter,
Marco C. Lam
Abstract:
Blue Large Amplitude Pulsators (BLAPs) are hot, subluminous stars undergoing rapid variability with periods of under 60 mins. They have been linked with the early stages of pre-white dwarfs and hot subdwarfs. They are a rare class of variable star due to their evolutionary history within interacting binary systems and the short timescales relative to their lifetime in which they are pulsationally unstable. All currently known BLAPs are relatively faint (15-19 mag) and are located in the Galactic plane. These stars have intrinsically blue colours but the large interstellar extinction in the Galactic plane prevents them from swift identification using colour-based selection criteria. In this paper, we correct the Gaia $G$-band apparent magnitude and $G_{\mathrm{BP}}-G_{\mathrm{RP}}$ colours of 89.6 million sources brighter than 19 mag in the Galactic plane with good quality photometry combined with supplementary all-sky data totalling 162.3 million sources. Selecting sources with colours consistent with the known population of BLAPs and performing a cross-match with the Zwicky Transient Facility (ZTF) DR3, we identify 98 short period candidate variables. Manual inspection of the period-folded light curves reveals 22 candidate BLAPs. Of these targets, 6 are consistent with the observed periods and light curves of the known BLAPs, 10 are within the theoretical period range of BLAPs and 6 are candidate high-gravity BLAPs. We present follow-up spectra of 21 of these candidate sources and propose to classify 1 of them as a BLAP, and tentatively assign an additional 8 of them as BLAPs for future population studies.
Submitted 27 January, 2022;
originally announced January 2022.
-
The International Pulsar Timing Array second data release: Search for an isotropic Gravitational Wave Background
Authors:
J. Antoniadis,
Z. Arzoumanian,
S. Babak,
M. Bailes,
A. -S. Bak Nielsen,
P. T. Baker,
C. G. Bassa,
B. Becsy,
A. Berthereau,
M. Bonetti,
A. Brazier,
P. R. Brook,
M. Burgay,
S. Burke-Spolaor,
R. N. Caballero,
J. A. Casey-Clyde,
A. Chalumeau,
D. J. Champion,
M. Charisi,
S. Chatterjee,
S. Chen,
I. Cognard,
J. M. Cordes,
N. J. Cornish,
F. Crawford
, et al. (101 additional authors not shown)
Abstract:
We searched for an isotropic stochastic gravitational wave background in the second data release of the International Pulsar Timing Array, a global collaboration synthesizing decadal-length pulsar-timing campaigns in North America, Europe, and Australia. In our reference search for a power law strain spectrum of the form $h_c = A(f/1\,\mathrm{yr}^{-1})^α$, we found strong evidence for a spectrally-similar low-frequency stochastic process of amplitude $A = 3.8^{+6.3}_{-2.5}\times10^{-15}$ and spectral index $α= -0.5 \pm 0.5$, where the uncertainties represent 95\% credible regions, using information from the auto- and cross-correlation terms between the pulsars in the array. For a spectral index of $α= -2/3$, as expected from a population of inspiralling supermassive black hole binaries, the recovered amplitude is $A = 2.8^{+1.2}_{-0.8}\times10^{-15}$. Nonetheless, no significant evidence of the Hellings-Downs correlations that would indicate a gravitational-wave origin was found. We also analyzed the constituent data from the individual pulsar timing arrays in a consistent way, and clearly demonstrate that the combined international data set is more sensitive. Furthermore, we demonstrate that this combined data set produces comparable constraints to recent single-array data sets which have more data than the constituent parts of the combination. Future international data releases will deliver increased sensitivity to gravitational wave radiation, and significantly increase the detection probability.
Submitted 11 January, 2022;
originally announced January 2022.
-
Single-lens mass measurement in the high-magnification microlensing event Gaia19bld located in the Galactic disc
Authors:
K. A. Rybicki,
Ł. Wyrzykowski,
E. Bachelet,
A. Cassan,
P. Zieliński,
A. Gould,
S. Calchi Novati,
J. C. Yee,
Y. -H. Ryu,
M. Gromadzki,
P. Mikołajczyk,
N. Ihanec,
K. Kruszyńska,
F. -J. Hambsch,
S. Zoła,
S. J. Fossey,
S. Awiphan,
N. Nakharutai,
F. Lewis,
F. Olivares E.,
S. Hodgkin,
A. Delgado,
E. Breedt,
D. L. Harrison,
M. vanLeeuwen
, et al. (44 additional authors not shown)
Abstract:
We present the photometric analysis of Gaia19bld, a high-magnification ($A\approx60$) microlensing event located in the southern Galactic plane, which exhibited finite source and microlensing parallax effects. Due to a prompt detection by the Gaia satellite and the very high brightness of $I = 9.05~$mag at the peak, it was possible to collect a complete and unique set of multi-channel follow-up observations, which allowed us to determine all parameters vital for the characterisation of the lens and the source in the microlensing event. Gaia19bld was discovered by the Gaia satellite and was subsequently intensively followed up with a network of ground-based observatories and the Spitzer Space Telescope. We collected multiple high-resolution spectra with Very Large Telescope (VLT)/X-Shooter to characterise the source star. The event was also observed with VLT Interferometer (VLTI)/PIONIER during the peak. Here we focus on the photometric observations and model the light curve composed of data from Gaia, Spitzer, and multiple optical, ground-based observatories. We find the best-fitting solution with parallax and finite source effects. We derived the limit on the luminosity of the lens based on the blended light model and spectroscopic distance. We compute the mass of the lens to be $1.13 \pm 0.03~M_{\odot}$ and derive its distance to be $5.52^{+0.35}_{-0.64}~\mathrm{kpc}$. The lens is likely a main sequence star; however, its true nature has yet to be verified by future high-resolution observations. Our results are consistent with interferometric measurements of the angular Einstein radius, emphasising that interferometry can be a new channel for determining the masses of objects that would otherwise remain undetectable, including stellar-mass black holes.
Submitted 2 December, 2021;
originally announced December 2021.
-
Bayesian Solar Wind Modeling with Pulsar Timing Arrays
Authors:
Jeffrey S. Hazboun,
Joseph Simon,
Dustin R. Madison,
Zaven Arzoumanian,
Kathryn Crowter,
Megan E. DeCesar,
Paul B. Demorest,
Timothy Dolch,
Justin A. Ellis,
Robert D. Ferdman,
Elizabeth C. Ferrara,
Emmanuel Fonseca,
Peter A. Gentile,
Glenn Jones,
Megan L. Jones,
Michael T. Lam,
Lina Levin,
Duncan R. Lorimer,
Ryan S. Lynch,
Maura A. McLaughlin,
Cherry Ng,
David J. Nice,
Timothy T. Pennucci,
Scott M. Ransom,
Paul S. Ray
, et al. (5 additional authors not shown)
Abstract:
Using Bayesian analyses we study the solar electron density with the NANOGrav 11-year pulsar timing array (PTA) dataset. Our model of the solar wind is incorporated into a global fit starting from pulse times-of-arrival. We introduce new tools developed for this global fit, including analytic expressions for solar electron column densities and open source models for the solar wind that port into existing PTA software. We perform an ab initio recovery of various solar wind model parameters. We then demonstrate the richness of information about the solar electron density, $n_E$, that can be gleaned from PTA data, including higher order corrections to the simple $1/r^2$ model associated with a free-streaming wind (which are informative probes of coronal acceleration physics), quarterly binned measurements of $n_E$ and a continuous time-varying model for $n_E$ spanning approximately one solar cycle period. Finally, we discuss the importance of our model for chromatic noise mitigation in gravitational-wave analyses of pulsar timing data and the potential of developing synergies between sophisticated PTA solar electron density models and those developed by the solar physics community.
Submitted 17 November, 2021;
originally announced November 2021.
-
The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage
Authors:
Daniel Galvez,
Greg Diamos,
Juan Ciro,
Juan Felipe Cerón,
Keith Achorn,
Anjali Gopi,
David Kanter,
Maximilian Lam,
Mark Mazumder,
Vijay Janapa Reddi
Abstract:
The People's Speech is a free-to-download 30,000-hour and growing supervised conversational English speech recognition dataset licensed for academic and commercial usage under CC-BY-SA (with a CC-BY subset). The data is collected via searching the Internet for appropriately licensed audio data with existing transcriptions. We describe our data collection methodology and release our data collection system under the Apache 2.0 license. We show that a model trained on this dataset achieves a 9.98% word error rate on Librispeech's test-clean test set. Finally, we discuss the legal and ethical issues surrounding the creation of a sizable machine learning corpus and plans for continued maintenance of the project under MLCommons's sponsorship.
Submitted 17 November, 2021;
originally announced November 2021.
-
Contextual Semantic Parsing for Multilingual Task-Oriented Dialogues
Authors:
Mehrad Moradshahi,
Victoria Tsai,
Giovanni Campagna,
Monica S. Lam
Abstract:
Robust state tracking for task-oriented dialogue systems currently remains restricted to a few popular languages. This paper shows that given a large-scale dialogue data set in one language, we can automatically produce an effective semantic parser for other languages using machine translation. We propose automatic translation of dialogue datasets with alignment to ensure faithful translation of slot values and eliminate costly human supervision used in previous benchmarks. We also propose a new contextual semantic parsing model, which encodes the formal slots and values, and only the last agent and user utterances. We show that the succinct representation reduces the compounding effect of translation errors, without harming the accuracy in practice.
We evaluate our approach on several dialogue state tracking benchmarks. On RiSAWOZ, CrossWOZ, CrossWOZ-EN, and MultiWOZ-ZH datasets we improve the state of the art by 11%, 17%, 20%, and 0.3% in joint goal accuracy. We present a comprehensive error analysis for all three datasets showing erroneous annotations can lead to misguided judgments on the quality of the model.
Finally, we present RiSAWOZ English and German datasets, created using our translation methodology. On these datasets, accuracy is within 11% of the original showing that high-accuracy multilingual dialogue datasets are possible without relying on expensive human annotations. We release our datasets and software open source.
Submitted 18 February, 2023; v1 submitted 3 November, 2021;
originally announced November 2021.
-
Automated SpectroPhotometric Image REDuction (ASPIRED)
Authors:
Marco C. Lam,
Robert J. Smith,
Iair Arcavi,
Iain A. Steele,
Josh Veitch-Michaelis,
Lukasz Wyrzykowski
Abstract:
We provide a suite of public open-source spectral data-reduction software to rapidly obtain scientific products from all forms of long-slit-like spectroscopic observations. Automated SpectroPhotometric REDuction (ASPIRED) is a Python-based spectral data-reduction toolkit. It is designed to be a general toolkit with high flexibility for users to refine and optimize their data-reduction routines for the individual characteristics of their instruments. The default configuration is suitable for low-resolution long-slit spectrometers and provides a quick-look quality output. However, for repeatable science-ready reduced spectral data, some moderate one-time effort is necessary to modify the configuration. Fine-tuning and additional (pre)processing may be required to extend the reduction to systems with more complex setups. It is important to emphasize that although only a few parameters need updating, ensuring their correctness and suitability for generalization to the instrument can take time due to factors such as instrument stability. We compare some example spectra reduced with ASPIRED to published data processed with iraf-based and STARLINK-based pipelines, and find no loss in the quality of the final product. The Python-based, iraf-free ASPIRED can significantly ease the effort of an astronomer in constructing their own data-reduction workflow, enabling simpler solutions to data-reduction automation. This availability of near-real-time science-ready data will allow adaptive observing strategies, particularly important in, but not limited to, time-domain astronomy.
Submitted 14 June, 2023; v1 submitted 3 November, 2021;
originally announced November 2021.
-
Empirical LiK excited state potentials: connecting short range and near dissociation expansions
Authors:
Sofia Botsi,
Anbang Yang,
Mark M. Lam,
Sambit B. Pal,
Sunil Kumar,
Markus Debatin,
Kai Dieckmann
Abstract:
We report on a high-resolution spectroscopic survey of ${}^{6}\textrm{Li}{}^{40}\textrm{K}$ molecules near the $2\textrm{S}+4\textrm{P}$ dissociation threshold and produce a fully empirical representation for the $\textrm{B}^{1}Π$ potential by connecting available short- and long-range data. The purpose is to identify a suitable intermediate state for a coherent Raman transfer to the absolute ground state, and the creation of a molecular gas with dipolar interactions. Starting from weakly bound ultracold Feshbach molecules, the transition frequencies to twenty-six vibrational states are determined. Our data are combined with long-range measurements [Ridinger et al., EPL, 2011, 96, 33001], and near-dissociation expansions for the spin-orbit coupled potentials are fitted to extract the $C_6$ dispersion coefficients. A suitable vibrational level is identified by resolving its Zeeman structure and by comparing the experimentally attained g-factor to our theoretical prediction. Using mass-scaling of the short-range data for the $\textrm{B}^{1}Π$ [Pashov et al., Chem. Phys. Lett., 1998, 292, 615-620] and an updated value for its depth, we model the short- and the long-range data simultaneously and produce a Rydberg-Klein-Rees curve covering the entire range.
Submitted 14 October, 2021;
originally announced October 2021.
-
The NANOGrav 12.5-year data set: Search for Non-Einsteinian Polarization Modes in the Gravitational-Wave Background
Authors:
Zaven Arzoumanian,
Paul T. Baker,
Harsha Blumer,
Bence Becsy,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Maria Charisi,
Shami Chatterjee,
Siyuan Chen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Megan E. DeCesar,
Dallas M. DeGan,
Paul B. Demorest,
Timothy Dolch,
Brendan Drachler,
Justin A. Ellis,
Elizabeth C. Ferrara,
William Fiore,
Emmanuel Fonseca,
Nathan Garver-Daniels,
Peter A. Gentile
, et al. (46 additional authors not shown)
Abstract:
We search NANOGrav's 12.5-year data set for evidence of a gravitational wave background (GWB) with all the spatial correlations allowed by general metric theories of gravity. We find no substantial evidence in favor of the existence of such correlations in our data. We find that scalar-transverse (ST) correlations yield signal-to-noise ratios and Bayes factors that are higher than quadrupolar (tensor transverse, TT) correlations. Specifically, we find ST correlations with a signal-to-noise ratio of 2.8 that are preferred over TT correlations (Hellings and Downs correlations) with Bayesian odds of about 20:1. However, the significance of ST correlations is reduced dramatically when we include modeling of the Solar System ephemeris systematics and/or remove pulsar J0030$+$0451 entirely from consideration. Even taking the nominal signal-to-noise ratios at face value, analyses of simulated data sets show that such values are not extremely unlikely to be observed in cases where only the usual TT modes are present in the GWB. In the absence of a detection of any polarization mode of gravity, we place upper limits on their amplitudes for a spectral index of $γ= 5$ and a reference frequency of $f_\text{yr} = 1 \text{yr}^{-1}$. Among the upper limits for eight general families of metric theories of gravity, we find the values of $A^{95\%}_{TT} = (9.7 \pm 0.4)\times 10^{-16}$ and $A^{95\%}_{ST} = (1.4 \pm 0.03)\times 10^{-15}$ for the family of metric spacetime theories that contain both TT and ST modes.
Submitted 29 September, 2021;
originally announced September 2021.
-
Bilateral Denoising Diffusion Models
Authors:
Max W. Y. Lam,
Jun Wang,
Rongjie Huang,
Dan Su,
Dong Yu
Abstract:
Denoising diffusion probabilistic models (DDPMs) have emerged as competitive generative models yet brought challenges to efficient sampling. In this paper, we propose novel bilateral denoising diffusion models (BDDMs), which take significantly fewer steps to generate high-quality samples. From a bilateral modeling objective, BDDMs parameterize the forward and reverse processes with a score network and a scheduling network, respectively. We show that a new lower bound tighter than the standard evidence lower bound can be derived as a surrogate objective for training the two networks. In particular, BDDMs are efficient, simple-to-train, and capable of further improving any pre-trained DDPM by optimizing the inference noise schedules. Our experiments demonstrated that BDDMs can generate high-fidelity samples with as few as 3 sampling steps and produce comparable or even higher quality samples than DDPMs using 1000 steps with only 16 sampling steps (a 62x speedup).
Submitted 14 September, 2021; v1 submitted 26 August, 2021;
originally announced August 2021.
-
ARCSnake: Reconfigurable Snake-Like Robot with Archimedean Screw Propulsion for Multi-Domain Mobility
Authors:
Florian Richter,
Peter V. Gavrilov,
Hoi Man Lam,
Amir Degani,
Michael C. Yip
Abstract:
Exploring and navigating in extreme environments, such as caves, oceans, and planetary bodies, are often too hazardous for humans, and as such, robots are possible surrogates. These robots are met with significant locomotion challenges that require traversing a wide range of surface roughnesses and topologies. Previous locomotion strategies, involving wheels or ambulatory motion, such as snake platforms, have success on specific surfaces but fail on others, which could be detrimental in exploration and navigation missions. In this paper, we present a novel approach that combines snake-like robots with an Archimedean screw locomotion mechanism to provide multiple, effective mobility strategies in a large range of environments, including those that are difficult to traverse for wheeled and ambulatory robots. This work develops a robotic system called ARCSnake to demonstrate this locomotion principle and tests it in a variety of different terrains and environments in order to prove its controllable, multi-domain navigation capabilities. These tests show a wide breadth of scenarios that ARCSnake can handle, hence demonstrating its ability to traverse extreme terrains.
Submitted 30 July, 2021;
originally announced July 2021.
-
Searching Water Megamasers By Using Mid-infrared Spectroscopy (I): Possible Mid-infrared Indicators
Authors:
Man I Lam,
C. Jakob Walcher,
Feng Gao,
Ming Yang,
Huan Li,
Lei Hao
Abstract:
Water megamasers at 22 GHz with a gas disk configuration in galaxies provide the most precise measurements of supermassive black hole masses, as well as independent constraints on the Hubble constant in the nearby universe. The existence of other maser types, such as jet or outflow masers, represents another tracer for AGN science. However, the detection rate of water megamasers in galaxies is extremely low. Over 40 years, only $\sim$ 160 galaxies have been found to harbour maser emission, and $\sim$ 30\% of them show features in their maser emission that indicate a disk-like geometry. Therefore, increasing the detection rate of masers is a crucial task to allow expanding on maser studies. We present a comparison of mid-infrared spectroscopic data between a maser galaxy sample and a Seyfert 2 control sample. We find that maser galaxies show significant peculiarities in their mid-infrared spectra: (1) maser galaxies tend to present stronger silicate absorption at $τ$ 9.7 $μ$m than the control sample, (2) PAH 11.3 $μ$m emission in maser galaxies is much weaker than in the control sample, (3) spectral indices at 20-30 $μ$m are steeper in maser galaxies than in the control sample, and maser galaxies tend to be a mid-infrared enhanced population. We conclude that there may be good indicators in the mid-infrared and far-infrared which could differentiate maser and non-maser Seyfert 2 galaxies. Upcoming infrared facilities, such as the James Webb Space Telescope, may be able to exploit these and other useful criteria and tracers for water megamaser observations.
Submitted 20 July, 2021;
originally announced July 2021.
-
Gradient Disaggregation: Breaking Privacy in Federated Learning by Reconstructing the User Participant Matrix
Authors:
Maximilian Lam,
Gu-Yeon Wei,
David Brooks,
Vijay Janapa Reddi,
Michael Mitzenmacher
Abstract:
We show that aggregated model updates in federated learning may be insecure. An untrusted central server may disaggregate user updates from sums of updates across participants given repeated observations, enabling the server to recover privileged information about individual users' private training data via traditional gradient inference attacks. Our method revolves around reconstructing participant information (e.g., which rounds of training users participated in) from aggregated model updates by leveraging summary information from device analytics commonly used to monitor, debug, and manage federated learning systems. Our attack is parallelizable and we successfully disaggregate user updates on settings with up to thousands of participants. We quantitatively and qualitatively demonstrate significant improvements in the capability of various inference attacks on the disaggregated updates. Our attack enables the attribution of learned properties to individual users, violating anonymity, and shows that a determined central server may undermine the secure aggregation protocol to break individual users' data privacy in federated learning.
Submitted 10 June, 2021;
originally announced June 2021.
-
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
Authors:
Max W. Y. Lam,
Jun Wang,
Chao Weng,
Dan Su,
Dong Yu
Abstract:
End-to-end speech recognition generally uses hand-engineered acoustic features as input and excludes the feature extraction module from its joint optimization. To extract learnable and adaptive features and mitigate information loss, we propose a new encoder that adopts globally attentive locally recurrent (GALR) networks and directly takes raw waveform as input. We observe improved ASR performance and robustness by applying GALR on different window lengths to aggregate fine-grain temporal information into multi-scale acoustic features. Experiments are conducted on a benchmark dataset AISHELL-2 and two large-scale Mandarin speech corpora of 5,000 hours and 21,000 hours. With faster speed and comparable model size, our proposed multi-scale GALR waveform encoder achieved consistent character error rate reductions (CERRs) from 7.9% to 28.1% relative over strong baselines, including Conformer and TDNN-Conformer. In particular, our approach demonstrated notably greater robustness than the traditional handcrafted features and outperformed the baseline MFCC-based TDNN-Conformer model by a 15.2% CERR on a music-mixed real-world speech test set.
Submitted 8 June, 2021;
originally announced June 2021.
-
Widening Access to Applied Machine Learning with TinyML
Authors:
Vijay Janapa Reddi,
Brian Plancher,
Susan Kennedy,
Laurence Moroney,
Pete Warden,
Anant Agarwal,
Colby Banbury,
Massimo Banzi,
Matthew Bennett,
Benjamin Brown,
Sharad Chitlangia,
Radhika Ghosal,
Sarah Grafman,
Rupert Jaeger,
Srivatsan Krishnan,
Maximilian Lam,
Daniel Leiker,
Cara Mann,
Mark Mazumder,
Dominic Pajak,
Dhilan Ramaprasad,
J. Evan Smith,
Matthew Stewart,
Dustin Tingley
Abstract:
Broadening access to both computational and educational resources is critical to diffusing machine-learning (ML) innovation. However, today, most ML resources and experts are siloed in a few countries and organizations. In this paper, we describe our pedagogical approach to increasing access to applied ML through a massive open online course (MOOC) on Tiny Machine Learning (TinyML). We suggest that TinyML, ML on resource-constrained embedded devices, is an attractive means to widen access because TinyML both leverages low-cost and globally accessible hardware, and encourages the development of complete, self-contained applications, from data collection to deployment. To this end, a collaboration between academia (Harvard University) and industry (Google) produced a four-part MOOC that provides application-oriented instruction on how to develop solutions using TinyML. The series is openly available on the edX MOOC platform, has no prerequisites beyond basic programming, and is designed for learners from a global variety of backgrounds. It introduces pupils to real-world applications, ML algorithms, data-set engineering, and the ethical considerations of these technologies via hands-on programming and deployment of TinyML applications in both the cloud and their own microcontrollers. To facilitate continued learning, community building, and collaboration beyond the courses, we launched a standalone website, a forum, a chat, and an optional course-project competition. We also released the course materials publicly, hoping they will inspire the next generation of ML practitioners and educators and further broaden access to cutting-edge ML technologies.
Submitted 9 June, 2021; v1 submitted 7 June, 2021;
originally announced June 2021.
-
Searching For Gravitational Waves From Cosmological Phase Transitions With The NANOGrav 12.5-year dataset
Authors:
Zaven Arzoumanian,
Paul T. Baker,
Harsha Blumer,
Bence Bécsy,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Maria Charisi,
Shami Chatterjee,
Siyuan Chen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Megan E. DeCesar,
Paul B. Demorest,
Timothy Dolch,
Justin A. Ellis,
Elizabeth C. Ferrara,
William Fiore,
Emmanuel Fonseca,
Nathan Garver-Daniels,
Peter A. Gentile,
Deborah C. Good,
Jeffrey S. Hazboun
, et al. (40 additional authors not shown)
Abstract:
We search for a first-order phase transition gravitational wave signal in 45 pulsars from the NANOGrav 12.5 year dataset. We find that the data can be modeled in terms of a strong first-order phase transition taking place at temperatures below the electroweak scale. However, we do not observe any strong preference for a phase-transition interpretation of the signal over the standard astrophysical interpretation in terms of supermassive black hole mergers; but we expect to gain additional discriminating power with future datasets, improving the signal-to-noise ratio and extending the sensitivity window to lower frequencies. An interesting open question is how well gravitational wave observatories could separate such signals.
Submitted 11 January, 2022; v1 submitted 28 April, 2021;
originally announced April 2021.
-
Learning adaptive coarse spaces of BDDC algorithms for stochastic elliptic problems with oscillatory and high contrast coefficients
Authors:
Eric Chung,
Hyea Hyun Kim,
Ming Fai Lam,
Lina Zhao
Abstract:
In this paper, we consider the balancing domain decomposition by constraints (BDDC) algorithm with adaptive coarse spaces for a class of stochastic elliptic problems. The key ingredient in the construction of the coarse space is the solutions of local spectral problems, which depend on the coefficient of the PDE. This poses a significant challenge for stochastic coefficients as it is computationally expensive to solve the local spectral problems for every realisation of the coefficient. To tackle this computational burden, we propose a machine learning approach. Our method is based on the use of a deep neural network (DNN) to approximate the relation between the stochastic coefficients and the coarse spaces. For the input of the DNN, we apply the Karhunen-Loève expansion and use the first few dominant terms in the expansion. The output of the DNN is the resulting coarse space, which is then applied with the standard adaptive BDDC algorithm. We will present some numerical results with oscillatory and high contrast coefficients to show the efficiency and robustness of the proposed scheme.
Submitted 19 April, 2021;
originally announced April 2021.
-
The NANOGrav 12.5-Year Data Set: Polarimetry and Faraday Rotation Measures from Observations of Millisecond Pulsars with the Green Bank Telescope
Authors:
Haley M. Wahl,
Maura McLaughlin,
Peter A. Gentile,
Megan L. Jones,
Renée Spiewak,
Zaven Arzoumanian,
Kathryn Crowter,
Paul Demorest,
Megan E. DeCesar,
Timothy Dolch,
Justin A. Ellis,
Robert D. Ferdman,
Elizabeth C. Ferrara,
Emmanuel Fonseca,
Nate Garver-Daniels,
Glenn Jones,
Michael T. Lam,
Lina Levin,
Natalia Lewandowska,
Duncan Lorimer,
Ryan S. Lynch,
Dustin R. Madison,
Cherry Ng,
David J. Nice,
Timothy T. Pennucci
, et al. (6 additional authors not shown)
Abstract:
In this work, we present polarization profiles for 23 millisecond pulsars observed at 820 MHz and 1500 MHz with the Green Bank Telescope as part of the NANOGrav pulsar timing array. We calibrate the data using Mueller matrix solutions calculated from observations of PSRs B1929+10 and J1022+1001. We discuss the polarization profiles, which can be used to constrain pulsar emission geometry, and present both the first published radio polarization profiles for nine pulsars and the discovery of very low intensity average profile components ("microcomponents") in four pulsars. We measure the Faraday rotation measure for each pulsar and use it to calculate the Galactic magnetic field parallel to the line of sight for different lines of sight through the interstellar medium. We fit for linear and sinusoidal trends in time in the dispersion measure and Galactic magnetic field and detect magnetic field variations with a period of one year in some pulsars, but overall find that the variations in these parameters are more consistent with a stochastic origin.
Submitted 6 December, 2022; v1 submitted 12 April, 2021;
originally announced April 2021.
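The NANOGrav abstract above derives a line-of-sight magnetic field from Faraday rotation measures. A minimal sketch of the standard relation, assuming the usual convention that the electron-density-weighted mean field in microgauss is 1.232 × RM / DM, with RM in rad m^-2 and DM in pc cm^-3. The function name and the sample numbers are hypothetical and are not values from the paper.

```python
# Standard conversion constant for RM in rad m^-2 and DM in pc cm^-3,
# giving the mean parallel field in microgauss.
K_MICROGAUSS = 1.232

def mean_parallel_field_uG(rm_rad_m2: float, dm_pc_cm3: float) -> float:
    """Electron-density-weighted mean of B along the line of sight.

    A positive result means the field points toward the observer
    (the usual sign convention for RM).
    """
    if dm_pc_cm3 <= 0:
        raise ValueError("dispersion measure must be positive")
    return K_MICROGAUSS * rm_rad_m2 / dm_pc_cm3

# Hypothetical pulsar: RM = 10 rad m^-2, DM = 12.32 pc cm^-3
b_par = mean_parallel_field_uG(10.0, 12.32)
```

Because both RM and DM are integrals along the same path, their ratio removes the path length, which is why the estimate needs no distance to the pulsar.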
-
Observing crossover between quantum speed limits
Authors:
Gal Ness,
Manolo R. Lam,
Wolfgang Alt,
Dieter Meschede,
Yoav Sagi,
Andrea Alberti
Abstract:
Quantum mechanics sets fundamental limits on how fast quantum states can be transformed in time. Two well-known quantum speed limits are the Mandelstam-Tamm and the Margolus-Levitin bounds, which relate the maximum speed of evolution to the system's energy uncertainty and mean energy, respectively. Here, we test both limits concurrently in a multi-level system by following the motion of a single atom in an optical trap using fast matter wave interferometry. Our data reveal two different regimes: one where the Mandelstam-Tamm limit constrains the evolution at all times, and a second where a crossover to the Margolus-Levitin limit is manifested at longer times. We take a geometric approach to quantify the deviation from the speed limit, measuring how much the matter wave's quantum evolution deviates from the geodesic path in the Hilbert space of the multi-level system. Our results, establishing quantum speed limits beyond the simple two-level system, are important for understanding the ultimate performance of quantum computing devices and related advanced quantum technologies.
Submitted 19 October, 2021; v1 submitted 12 April, 2021;
originally announced April 2021.
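For reference, the two bounds named in the abstract above are commonly written in their textbook form as lower limits on the time $\tau_\perp$ needed to reach an orthogonal state. This is a sketch of the standard statements, not necessarily the exact formulation used in the paper; the symbols $\Delta E$ (energy uncertainty), $\langle E \rangle$ (mean energy) and $E_0$ (ground-state energy) are the usual ones.

```latex
\begin{align}
  \tau_\perp &\ge \tau_{\mathrm{MT}} = \frac{\pi\hbar}{2\,\Delta E}
    && \text{(Mandelstam-Tamm)} \\
  \tau_\perp &\ge \tau_{\mathrm{ML}} = \frac{\pi\hbar}{2\,\bigl(\langle E\rangle - E_0\bigr)}
    && \text{(Margolus-Levitin)} \\
  \tau_\perp &\ge \max\bigl(\tau_{\mathrm{MT}},\, \tau_{\mathrm{ML}}\bigr)
    && \text{(combined bound)}
\end{align}
```

Which of the two terms is larger depends on whether the energy spread or the mean energy above the ground state is the smaller quantity, so a system can be limited by one bound in one regime and by the other elsewhere.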