-
Tweetorial Hooks: Generative AI Tools to Motivate Science on Social Media
Authors:
Tao Long,
Dorothy Zhang,
Grace Li,
Batool Taraif,
Samia Menon,
Kynnedy Simone Smith,
Sitong Wang,
Katy Ilonka Gero,
Lydia B. Chilton
Abstract:
Communicating science and technology is essential for the public to understand and engage in a rapidly changing world. Tweetorials are an emerging phenomenon where experts explain STEM topics on social media in creative and engaging ways. However, STEM experts struggle to write an engaging "hook" in the first tweet that captures the reader's attention. We propose methods to use large language models (LLMs) to help users scaffold their process of writing a relatable hook for complex scientific topics. We demonstrate that LLMs can help writers find everyday experiences that are relatable and interesting to the public, avoid jargon, and spark curiosity. Our evaluation shows that the system reduces cognitive load and helps people write better hooks. Lastly, we discuss the importance of interactivity with LLMs to preserve the correctness, effectiveness, and authenticity of the writing.
Submitted 5 December, 2023; v1 submitted 20 May, 2023;
originally announced May 2023.
-
Final Report for SAG 22: A Target Star Archive for Exoplanet Science
Authors:
Natalie R. Hinkel,
Joshua Pepper,
Christopher C. Stark,
Jennifer A. Burt,
David R. Ciardi,
Kevin K. Hardegree-Ullman,
Jacob Lustig-Yaeger,
Ravi Kopparapu,
Lokesh Mishra,
Karan Molaverdikhani,
Ilaria Pascucci,
Tyler Richey-Yowell,
E. J. Safron,
David J. Wilson,
Galen Bergsten,
Tabetha S. Boyajian,
J. A. Caballero,
K. Cunha,
Alyssa Columbus,
Shawn D. Domagal-Goldman,
Chuanfei Dong,
R. M. Elowitz,
Devanshu Jha,
Archit Kalra,
David W. Latham
, et al. (11 additional authors not shown)
Abstract:
Present and upcoming NASA missions will be intensively observing a selected, partially overlapping set of stars for exoplanet studies. Key physical and chemical information about these stars and their systems is needed for planning observations and interpreting the results. A target star archive of such data would benefit a wide cross-section of the exoplanet community by enhancing the chances of mission success and improving the efficiency of mission observatories. It would also provide a common, accessible resource for scientific analysis based on standardized assumptions, while revealing gaps or deficiencies in existing knowledge of stellar properties necessary for exoplanetary system characterization.
Submitted 8 December, 2021;
originally announced December 2021.
-
Fast and Accurate Computation of Vertical Modes
Authors:
Jeffrey J. Early,
M. Pascale Lelong,
K. Shafer Smith
Abstract:
The vertical modes of linearized equations of motion are widely used by the oceanographic community in numerous theoretical and observational contexts. However, the standard approach for solving the generalized eigenvalue problem using second-order finite difference matrices produces $O(1)$ errors for all but the few lowest modes, and increasing resolution quickly becomes too slow as the computational complexity of eigenvalue algorithms increases as $O(n^3)$. Existing methods are therefore inadequate for computing a full spectrum of internal waves, such as is needed for initializing a numerical model with a full internal wave spectrum. Here we show that rewriting the eigenvalue problem in stretched coordinates and projecting onto Chebyshev polynomials results in substantially more accurate modes than finite-differencing at a fraction of the computational cost. We also compute the surface quasigeostrophic modes using the same methods. All spectral and finite difference algorithms are made available in a suite of Matlab classes that have been validated against known analytical solutions in constant and exponential stratification.
Submitted 31 October, 2019;
originally announced October 2019.
-
Energy-conserving Galerkin approximations for quasigeostrophic dynamics
Authors:
Matthew Watwood,
Ian Grooms,
Keith Julien,
K. Shafer Smith
Abstract:
A method is presented for constructing energy-conserving Galerkin approximations in the vertical coordinate of the full quasigeostrophic model with active surface buoyancy. The derivation generalizes the approach of Rocha et al. (2016) to allow for general bases. Details are then presented for a specific set of bases: Legendre polynomials for potential vorticity and a recombined Legendre basis from Shen (1994) for the streamfunction. The method is tested in the context of linear baroclinic instability calculations, where it is compared to the standard second-order finite-difference method and to a Chebyshev collocation method. The Galerkin scheme is quite accurate even for a small number of degrees of freedom $N$, and growth rates converge much more quickly with increasing $N$ for the Galerkin scheme than for the finite-difference scheme. The Galerkin scheme is at least as accurate as finite differences and can in some cases achieve the same accuracy as the finite-difference scheme with ten times fewer degrees of freedom. The energy-conserving Galerkin scheme is of comparable accuracy to the Chebyshev collocation scheme in most linear stability calculations, but not in the Eady problem, where the Chebyshev scheme is significantly more accurate. Finally, the three methods are compared in the context of a simplified version of the nonlinear equations: the two-surface model with zero potential vorticity. The Chebyshev scheme is the most accurate, followed by the Galerkin scheme and then the finite-difference scheme. All three methods conserve energy with similar accuracy, despite not having any a priori guarantee of energy conservation for the Chebyshev scheme. Further nonlinear tests with non-zero potential vorticity to assess the merits of the methods will be performed in future work.
Submitted 24 January, 2019; v1 submitted 24 October, 2018;
originally announced October 2018.
-
Assessing Crosslingual Discourse Relations in Machine Translation
Authors:
Karin Sim Smith,
Lucia Specia
Abstract:
In an attempt to improve overall translation quality, there has been an increasing focus on integrating more linguistic elements into Machine Translation (MT). While significant progress has been achieved, especially recently with neural models, automatically evaluating the output of such systems is still an open problem. Current practice in MT evaluation relies on a single reference translation, even though there are many ways of translating a particular text, and it tends to disregard higher-level information such as discourse. We propose a novel approach that assesses the translated output based on the source text rather than the reference translation, and measures the extent to which the semantics of the discourse elements (discourse relations, in particular) in the source text are preserved in the MT output. The challenge is to detect the discourse relations in the source text and determine whether these relations are correctly transferred crosslingually to the target language -- without a reference translation. This methodology could be used independently for discourse-level evaluation, or as a component in other metrics, at a time when substantial amounts of MT are online and would benefit from evaluation where the source text serves as a benchmark.
Submitted 7 October, 2018;
originally announced October 2018.
-
The effect of surface buoyancy gradients on oceanic Rossby wave propagation
Authors:
Xiao Xiao,
K. Shafer Smith,
Shane R. Keating
Abstract:
Motivated by the discrepancy between satellite observations of coherent westward propagating surface features and Rossby wave theory, this paper revisits the planetary wave propagation problem, taking into account the effects of lateral buoyancy gradients at the ocean's surface. The standard theory for long baroclinic Rossby waves is based on an expansion of the quasigeostrophic stretching operator in normal modes, $φ_n(z)$, satisfying a Neumann boundary condition at the surface, $φ_n'(0) = 0$. Buoyancy gradients are, by thermal wind balance, proportional to the vertical derivative of the streamfunction; thus such modes are unable to represent ubiquitous lateral buoyancy gradients in the ocean's mixed layer. Here, we re-derive the wave propagation problem in terms of an expansion in a recently-developed "surface-aware" (SA) basis that can account for buoyancy anomalies at the ocean's surface. The problem is studied in the context of an idealized Charney-like baroclinic wave problem set in an oceanic context, where a surface mean buoyancy gradient interacts with a constant interior potential vorticity gradient that results from both $β$ and the curvature of the mean shear. The wave frequencies, growth rates and phases are systematically compared to those computed from a two-layer model, a truncated expansion in standard baroclinic modes and to a high-vertical resolution calculation that represents the true solution. The full solution generally shows faster wave propagation when lateral surface gradients are present. Moreover, the wave problem in the SA basis best captures the full solution, even with just two or three modes.
Submitted 30 July, 2014;
originally announced July 2014.
-
High-resolution transcriptome analysis with long-read RNA sequencing
Authors:
Hyunghoon Cho,
Joe Davis,
Xin Li,
Kevin S. Smith,
Alexis Battle,
Stephen B. Montgomery
Abstract:
RNA sequencing (RNA-seq) enables characterization and quantification of individual transcriptomes as well as detection of patterns of allelic expression and alternative splicing. Current RNA-seq protocols depend on high-throughput short-read sequencing of cDNA. However, as ongoing advances are rapidly yielding increasing read lengths, a technical hurdle remains in identifying the degree to which differences in read length influence various transcriptome analyses. In this study, we generated two paired-end RNA-seq datasets of differing read lengths (2x75 bp and 2x262 bp) for lymphoblastoid cell line GM12878 and compared the effect of read length on transcriptome analyses, including read-mapping performance, gene and transcript quantification, and detection of allele-specific expression (ASE) and allele-specific alternative splicing (ASAS) patterns. Our results indicate that, while the current long-read protocol is considerably more expensive than short-read sequencing, there are important benefits that can only be achieved with longer read length, including lower mapping bias and reduced ambiguity in assigning reads to genomic elements, such as mRNA transcripts. We show that these benefits ultimately lead to improved detection of cis-acting regulatory and splicing variation effects within individuals.
Submitted 28 May, 2014;
originally announced May 2014.
-
A surface-aware projection basis for quasigeostrophic flow
Authors:
K. S. Smith,
J. Vanneste
Abstract:
Recent studies indicate that altimetric observations of the ocean's mesoscale eddy field reflect the combined influence of surface buoyancy and interior potential vorticity anomalies. The former have a surface-trapped structure, while the latter have a more grave form. To assess the relative importance of each contribution to the signal, it is useful to project the observed field onto a set of modes that separates their influence in a natural way. However, the surface-trapped dynamics are not well-represented by standard baroclinic modes; moreover, they are dependent on horizontal scale.
Here we derive a modal decomposition that results from the simultaneous diagonalization of the energy and a generalisation of potential enstrophy that includes contributions from the surface buoyancy fields. This approach yields a family of orthonormal bases that depend on two parameters: the standard baroclinic modes are recovered in a limiting case, while other choices provide modes that represent surface and interior dynamics in an efficient way.
For constant stratification, these modes consist of symmetric and antisymmetric exponential modes that capture the surface dynamics, and a series of oscillating modes that represent the interior dynamics. Motivated by the ocean, where shears are concentrated near the upper surface, we also consider the special case of a quiescent lower surface. In this case, the interior modes are independent of wavenumber, and there is a single exponential surface mode that replaces the barotropic mode. We demonstrate the use and effectiveness of these modes by projecting the energy in a set of simulations of baroclinic turbulence.
Submitted 20 June, 2012;
originally announced June 2012.