-
Understanding Editing Behaviors in Multilingual Wikipedia
Authors:
Suin Kim,
Sungjoon Park,
Scott A. Hale,
Sooyoung Kim,
Jeongmin Byun,
Alice Oh
Abstract:
Multilingualism is common offline, but we have a more limited understanding of the ways multilingualism is displayed online and the roles that multilinguals play in the spread of content between speakers of different languages. We take a computational approach to studying multilingualism using one of the largest user-generated content platforms, Wikipedia. We study multilingualism by collecting and analyzing a large dataset of the content written by multilingual editors of the English, German, and Spanish editions of Wikipedia. This dataset contains over two million paragraphs edited by over 15,000 multilingual users from July 8 to August 9, 2013. We analyze these multilingual editors in terms of their engagement, interests, and language proficiency in their primary and non-primary (secondary) languages and find that the English edition of Wikipedia displays different dynamics from the Spanish and German editions. Users primarily editing the Spanish and German editions make more complex edits than users who edit these editions as a second language. In contrast, users editing the English edition as a second language make edits that are just as complex as the edits by users who primarily edit the English edition. In this way, English serves a special role bringing together content written by multilinguals from many language editions. Nonetheless, language remains a formidable hurdle to the spread of content: we find evidence for a complexity barrier whereby editors are less likely to edit complex content in a second language. In addition, we find that multilinguals are less engaged and show lower levels of language proficiency in their second languages. We also examine the topical interests of multilingual editors and find that there is no significant difference between primary and non-primary editors in each language.
Submitted 28 August, 2015;
originally announced August 2015.
-
Looking for non-Gaussianity in all the right places: A new basis for non-separable bispectra
Authors:
Joyce Byun,
Nishant Agarwal,
Rachel Bean,
Richard Holman
Abstract:
Non-Gaussianity in the distribution of inflationary perturbations, measurable in statistics of the cosmic microwave background (CMB) and large scale structure fluctuations, can be used to probe non-trivial initial quantum states for these perturbations. The bispectrum shapes predicted for generic non-Bunch-Davies initial states are non-factorizable ("non-separable") and are highly oscillatory functions of the three constituent wavenumbers. This can make the computation of CMB bispectra, in particular, computationally intractable. To efficiently compare with CMB data one needs to construct a separable template that has a significant similarity with the actual shape in momentum space. In this paper we consider a variety of inflationary scenarios, with different non-standard initial conditions, and how best to construct viable template matches. In addition to implementing commonly used separable polynomial and Fourier bases, we introduce a basis of localized piecewise spline functions. The spline basis is naturally nearly orthogonal, making it easy to implement and to extend to many modes. We show that, in comparison to existing techniques, the spline basis can provide better fits to the true bispectrum, as measured by the cosine between shapes, for sectors of the theory space of general initial states. As such, it offers a useful approach to investigate non-trivial features generated by fundamental properties of the inflationary Universe.
Submitted 6 April, 2015;
originally announced April 2015.
-
Non-Gaussian Shape Discrimination with Spectroscopic Galaxy Surveys
Authors:
Joyce Byun,
Rachel Bean
Abstract:
[Abridged] We consider how galaxy clustering data, from Mpc to Gpc scales, from upcoming large scale structure surveys, such as Euclid and DESI, can provide discriminating information about the bispectrum shape arising from a variety of inflationary scenarios. Through exploring in detail the weighting of shape properties in the calculation of the halo bias and halo mass function we show how they probe a broad range of configurations, beyond those in the squeezed limit, that can help distinguish between shapes with similar large scale bias behaviors.
We assess the impact, on constraints for a diverse set of non-Gaussian shapes, of galaxy clustering information in the mildly non-linear regime, and surveys that span multiple redshifts and employ different galactic tracers of the dark matter distribution. Fisher forecasts are presented for a Euclid-like spectroscopic survey of H$α$-selected emission line galaxies (ELGs) using recent revisions of the expected H$α$ luminosity function, and a DESI-like survey, of luminous red galaxies (LRGs) and [O-II] doublet-selected ELGs, in combination with Planck-like CMB temperature and polarization data.
While ELG samples provide better probes of shapes that are divergent in the squeezed limit, LRG constraints, centered below $z<1$, yield stronger constraints on shapes with scale-independent large-scale halo biases, such as the equilateral template. The ELG and LRG samples provide complementary degeneracy directions for distinguishing between different shapes. If the Gaussian galaxy bias is constrained to better than a percent level, such as can be determined from the galaxy bispectrum or weak lensing, then the LSS and CMB data could provide complementary constraints that will enable differentiation of bispectra with distinct theoretical origins but with similar large scale, squeezed-limit properties.
Submitted 2 October, 2014; v1 submitted 18 September, 2014;
originally announced September 2014.
-
Non-Gaussian Shape Recognition
Authors:
Joyce Byun,
Rachel Bean
Abstract:
A detection of primordial non-Gaussianity could transform our understanding of the fundamental theory of inflation. The precision promised by upcoming CMB and large-scale structure surveys raises a natural question: if a detection given a particular template is made, what does this truly tell us about the underlying theory? In this paper we present a systematic way to constrain a wide range of non-Gaussian shapes, including general single and multi-field models and models with excited initial states. We present a separable, divergent basis able to recreate many shapes in the literature to high accuracy with between three and seven basis functions. The basis allows shapes to be grouped into broad "template classes", satisfying theoretically-relevant priors on their divergence properties in the squeezed limit. We forecast how well a Planck-like CMB survey could not only detect a general non-Gaussian signal but discern more about its shape, using existing templates and new ones we propose. This approach offers an opportunity to tie together minimal theoretical priors with observational constraints on the shape in general, and in the squeezed limit, to gain a deeper insight into what drove inflation.
Submitted 3 March, 2014; v1 submitted 12 March, 2013;
originally announced March 2013.
-
HETDEX pilot survey for emission-line galaxies - I. Survey design, performance, and catalog
Authors:
Joshua J. Adams,
Guillermo A. Blanc,
Gary J. Hill,
Karl Gebhardt,
Niv Drory,
Lei Hao,
Ralf Bender,
Joyce Byun,
Robin Ciardullo,
Mark E. Cornell,
Steven L. Finkelstein,
Alex Fry,
Eric Gawiser,
Caryl Gronwall,
Ulrich Hopp,
Donghui Jeong,
Andreas Kelz,
Ralf Kelzenberg,
Eiichiro Komatsu,
Phillip J. MacQueen,
Jeremy Murphy,
P. Samuel Odoms,
Martin Roth,
Donald P. Schneider,
Joseph R. Tufts
, et al. (1 additional author not shown)
Abstract:
We present a catalog of emission-line galaxies selected solely by their emission-line fluxes using a wide-field integral field spectrograph. This work is partially motivated as a pilot survey for the upcoming Hobby-Eberly Telescope Dark Energy Experiment (HETDEX). We describe the observations, reductions, detections, redshift classifications, line fluxes, and counterpart information for 397 emission-line galaxies detected over 169 sq.arcmin with a 3500-5800 Ang. bandpass under 5 Ang. full-width-half-maximum (FWHM) spectral resolution. The survey's best sensitivity for unresolved objects under photometric conditions is between 4-20 E-17 erg/s/sq.cm depending on the wavelength, and Ly-alpha luminosities between 3-6 E42 erg/s are detectable. This survey method complements narrowband and color-selection techniques in the search for high redshift galaxies with its different selection properties and large volume probed. The four survey fields within the COSMOS, GOODS-N, MUNICS, and XMM-LSS areas are rich with existing, complementary data. We find 104 galaxies via their high redshift Ly-alpha emission at 1.9<z<3.8, and the majority of the remainder objects are low redshift [OII]3727 emitters at z<0.56. The classification between low and high redshift objects depends on rest frame equivalent width, as well as other indicators, where available. Based on matches to X-ray catalogs, the active galactic nuclei (AGN) fraction amongst the Ly-alpha emitters (LAEs) is 6%. We also analyze the survey's completeness and contamination properties through simulations. We find five high-z, highly-significant, resolved objects with full-width-half-maximum sizes >44 sq.arcsec which appear to be extended Ly-alpha nebulae. We also find three high-z objects with rest frame Ly-alpha equivalent widths above the level believed to be achievable with normal star formation, EW(rest)>240 Ang.
Submitted 1 November, 2010;
originally announced November 2010.
-
Automorphism groups of domains that depend on fewer than the maximal number of parameters
Authors:
Jisoo Byun,
Steven G. Krantz
Abstract:
We study domains in complex $n$-space with automorphism group that does not depend on the full $n$ dimensions of the ambient space. A sufficient geometric condition is obtained to guarantee that a domain has such a "thin" automorphism group. Examples are provided to illustrate the ideas.
Submitted 25 October, 2008;
originally announced October 2008.
-
A Generalization of Connes-Kreimer Hopf Algebra
Authors:
Jungyoon Byun
Abstract:
``Bonsai'' Hopf algebras, introduced here, are generalizations of Connes-Kreimer Hopf algebras, which are motivated by Feynman diagrams and renormalization. We show that we can find operad structure on the set of bonsais. We introduce a new differential on these bonsai Hopf algebras, which is inspired by the tree differential. The cohomologies of these are computed here, and the relationship of this differential with the appending operation $*$ of Connes-Kreimer Hopf algebras is investigated.
Submitted 24 May, 2005;
originally announced May 2005.