-
PRISM: A Design Framework for Open-Source Foundation Model Safety
Authors:
Terrence Neumann,
Bryan Jones
Abstract:
The rapid advancement of open-source foundation models has brought transparency and accessibility to this groundbreaking technology. However, this openness has also enabled the development of highly-capable, unsafe models, as exemplified by recent instances such as WormGPT and FraudGPT, which are specifically designed to facilitate criminal activity. As the capabilities of open foundation models continue to grow, potentially outpacing those of closed-source models, the risk of misuse by bad actors poses an increasingly serious threat to society. This paper addresses the critical question of how open foundation model developers should approach model safety in light of these challenges. Our analysis reveals that open-source foundation model companies often provide less restrictive acceptable use policies (AUPs) compared to their closed-source counterparts, likely due to the inherent difficulties in enforcing such policies once the models are released. To tackle this issue, we introduce PRISM, a design framework for open-source foundation model safety that emphasizes Private, Robust, Independent Safety measures, at Minimal marginal cost of compute. The PRISM framework proposes the use of modular functions that moderate prompts and outputs independently of the core language model, offering a more adaptable and resilient approach to safety compared to the brittle reinforcement learning methods currently used for value alignment. By focusing on identifying AUP violations and engaging the developer community in establishing consensus around safety design decisions, PRISM aims to create a safer open-source ecosystem that maximizes the potential of these powerful technologies while minimizing the risks to individuals and society as a whole.
Submitted 14 June, 2024;
originally announced June 2024.
-
Diverse, but Divisive: LLMs Can Exaggerate Gender Differences in Opinion Related to Harms of Misinformation
Authors:
Terrence Neumann,
Sooyong Lee,
Maria De-Arteaga,
Sina Fazelpour,
Matthew Lease
Abstract:
The pervasive spread of misinformation and disinformation poses a significant threat to society. Professional fact-checkers play a key role in addressing this threat, but the vast scale of the problem forces them to prioritize their limited resources. This prioritization may consider a range of factors, such as varying risks of harm posed to specific groups of people. In this work, we investigate potential implications of using a large language model (LLM) to facilitate such prioritization. Because fact-checking impacts a wide range of diverse segments of society, it is important that diverse views are represented in the claim prioritization process. This paper examines whether an LLM can reflect the views of various groups when assessing the harms of misinformation, focusing on gender as a primary variable. We pose two central questions: (1) To what extent do prompts with explicit gender references reflect gender differences in opinion in the United States on topics of social relevance? and (2) To what extent do gender-neutral prompts align with gendered viewpoints on those topics? To analyze these questions, we present the TopicMisinfo dataset, containing 160 fact-checked claims from diverse topics, supplemented by nearly 1600 human annotations with subjective perceptions and annotator demographics. Analyzing responses to gender-specific and neutral prompts, we find that GPT 3.5-Turbo reflects empirically observed gender differences in opinion but amplifies the extent of these differences. These findings illuminate AI's complex role in moderating online communication, with implications for fact-checkers, algorithm designers, and the use of crowd-workers as annotators. We also release the TopicMisinfo dataset to support continuing research in the community.
Submitted 29 January, 2024;
originally announced January 2024.
-
The Duck's Brain: Training and Inference of Neural Networks in Modern Database Engines
Authors:
Maximilian E. Schüle,
Thomas Neumann,
Alfons Kemper
Abstract:
Although database systems perform well in data access and manipulation, their relational model hinders data scientists from formulating machine learning algorithms in SQL. Nevertheless, we argue that modern database systems perform well for machine learning algorithms expressed in relational algebra. To overcome the barrier of the relational model, this paper shows how to transform data into a relational representation for training neural networks in SQL: We first describe building blocks for data transformation, model training and inference in SQL-92 and their counterparts using an extended array data type. Then, we compare the implementation for model training and inference using array data types to the one using a relational representation in SQL-92 only. The evaluation in terms of runtime and memory consumption proves the suitability of modern database systems for matrix algebra, although specialised array data types perform better than matrices in relational representation.
Submitted 28 December, 2023;
originally announced December 2023.
-
Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization
Authors:
Thilo von Neumann,
Christoph Boeddeker,
Tobias Cord-Landwehr,
Marc Delcroix,
Reinhold Haeb-Umbach
Abstract:
We propose a modular pipeline for the single-channel separation, recognition, and diarization of meeting-style recordings and evaluate it on the Libri-CSS dataset. Using a Continuous Speech Separation (CSS) system with a TF-GridNet separation architecture, followed by a speaker-agnostic speech recognizer, we achieve state-of-the-art recognition performance in terms of Optimal Reference Combination Word Error Rate (ORC WER). Then, a d-vector-based diarization module is employed to extract speaker embeddings from the enhanced signals and to assign the CSS outputs to the correct speaker. Here, we propose a syntactically informed diarization using sentence- and word-level boundaries of the ASR module to support speaker turn detection. This results in a state-of-the-art Concatenated minimum-Permutation Word Error Rate (cpWER) for the full meeting recognition pipeline.
Submitted 6 May, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Mixture Encoder Supporting Continuous Speech Separation for Meeting Recognition
Authors:
Peter Vieting,
Simon Berger,
Thilo von Neumann,
Christoph Boeddeker,
Ralf Schlüter,
Reinhold Haeb-Umbach
Abstract:
Many real-life applications of automatic speech recognition (ASR) require processing of overlapped speech. A common method involves first separating the speech into overlap-free streams and then performing ASR on the resulting signals. Recently, the inclusion of a mixture encoder in the ASR model has been proposed. This mixture encoder leverages the original overlapped speech to mitigate the effect of artifacts introduced by the speech separation. Previously, however, the method only addressed two-speaker scenarios. In this work, we extend this approach to more natural meeting contexts featuring an arbitrary number of speakers and dynamic overlaps. We evaluate the performance using different speech separators, including the powerful TF-GridNet model. Our experiments show state-of-the-art performance on the LibriCSS dataset and highlight the advantages of the mixture encoder. Furthermore, they demonstrate the strong separation of TF-GridNet which largely closes the gap between previous methods and oracle separation.
Submitted 15 September, 2023;
originally announced September 2023.
-
Third order QCD predictions for fiducial W-boson production
Authors:
John Campbell,
Tobias Neumann
Abstract:
Measurements of W-boson production at the LHC have reached percent-level precision and impose challenging demands on theoretical predictions. Such predictions directly limit the precision of measurements of fundamental quantities such as the W-boson mass and the weak mixing angle. A dominant source of uncertainty in predictions is from higher-order QCD effects. We present a calculation of W-boson production at the level of $α_s^3$ at fixed order and including transverse-momentum resummation. We further show predictions for a direct comparison with low-pileup ATLAS transverse-momentum and fiducial cross-section measurements at $\sqrt{s}=5.02\text{ TeV}$. We discuss in detail the impact of modern PDFs. Our calculation including the matching to W+jet production at NNLO will be publicly available in the upcoming CuTe-MCFM release and allows for theory-data comparison at the state-of-the-art level.
Submitted 29 August, 2023;
originally announced August 2023.
-
MeetEval: A Toolkit for Computation of Word Error Rates for Meeting Transcription Systems
Authors:
Thilo von Neumann,
Christoph Boeddeker,
Marc Delcroix,
Reinhold Haeb-Umbach
Abstract:
MeetEval is an open-source toolkit to evaluate all kinds of meeting transcription systems. It provides a unified interface for the computation of commonly used Word Error Rates (WERs), specifically cpWER, ORC-WER and MIMO-WER along with other WER definitions. We extend the cpWER computation by a temporal constraint to ensure that words are identified as correct only when the temporal alignment is plausible. This leads to a better quality of the matching of the hypothesis string to the reference string that more closely resembles the actual transcription quality, and a system is penalized if it provides poor time annotations. Since word-level timing information is often not available, we present a way to approximate exact word-level timings from segment-level timings (e.g., a sentence) and show that the approximation leads to a similar WER as a matching with exact word-level annotations. At the same time, the time constraint leads to a speedup of the matching algorithm, which outweighs the additional overhead caused by processing the time stamps.
Submitted 25 January, 2024; v1 submitted 21 July, 2023;
originally announced July 2023.
-
DashQL -- Complete Analysis Workflows with SQL
Authors:
André Kohn,
Dominik Moritz,
Thomas Neumann
Abstract:
We present DashQL, a language that describes complete analysis workflows in self-contained scripts. DashQL combines SQL, the grammar of relational database systems, with a grammar of graphics in a grammar of analytics. It supports preparing and visualizing arbitrarily complex SQL statements in a single coherent language. The proximity to SQL facilitates holistic optimizations of analysis workflows covering data input, encoding, transformations, and visualizations. These optimizations use model and query metadata for visualization-driven aggregation, remote predicate pushdown, and adaptive materialization. We introduce the DashQL language as an extension of SQL and describe the efficient and interactive processing of text-based analysis workflows.
Submitted 7 June, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Does AI-Assisted Fact-Checking Disproportionately Benefit Majority Groups Online?
Authors:
Terrence Neumann,
Nicholas Wolczynski
Abstract:
In recent years, algorithms have been incorporated into fact-checking pipelines. They are used not only to flag previously fact-checked misinformation, but also to provide suggestions about which trending claims should be prioritized for fact-checking - a paradigm called `check-worthiness.' While several studies have examined the accuracy of these algorithms, none have investigated how the benefits from these algorithms (via reduction in exposure to misinformation) are distributed amongst various online communities. In this paper, we investigate how diverse representation across multiple stages of the AI development pipeline affects the distribution of benefits from AI-assisted fact-checking for different online communities. We simulate information propagation through the network using our novel Topic-Aware, Community-Impacted Twitter (TACIT) simulator on a large Twitter followers network, tuned to produce realistic cascades of true and false information across multiple topics. Finally, using simulated data as a test bed, we implement numerous algorithmic fact-checking interventions that explicitly account for notions of diversity. We find that both representative and egalitarian methods for sampling and labeling check-worthiness model training data can lead to network-wide benefit concentrated in majority communities, while incorporating diversity into how fact-checkers use algorithmic recommendations can actively reduce inequalities in benefits between majority and minority communities. These findings contribute to an important conversation around the responsible implementation of AI-assisted fact-checking by social media platforms and fact-checking organizations.
Submitted 9 February, 2023; v1 submitted 7 February, 2023;
originally announced February 2023.
-
Jet-veto resummation at N$^3$LL$_\text{p}$+NNLO in boson production processes
Authors:
John M. Campbell,
R. Keith Ellis,
Tobias Neumann,
Satyajit Seth
Abstract:
Vetoing energetic jet activity is a crucial tool for suppressing backgrounds and enabling new physics searches at the LHC, but the introduction of a veto scale can introduce large logarithms that may need to be resummed. We present an implementation of jet-veto resummation for color-singlet processes at the level of N$^3$LL$_\text{p}$ matched to fixed-order NNLO predictions. Our public code MCFM allows for predictions of a single boson, such as $Z/γ^*$, $W^{\pm}$ or $H$, or with a pair of vector bosons, such as $W^+W^-$, $W^{\pm} Z$ or $ZZ$. The implementation relies on recent calculations of the soft and beam functions in the presence of a jet veto over all rapidities, with jets defined using a sequential recombination algorithm with jet radius $R$. However one of the ingredients that is required to reach full N$^3$LL accuracy is only known approximately, hence N$^3$LL$_\text{p}$. We describe in detail our formalism and compare with previous public codes that operate at the level of NNLL. Our higher-order predictions improve significantly upon NNLL calculations by reducing theoretical uncertainties. We demonstrate this by comparing our predictions with ATLAS and CMS results.
Submitted 9 April, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems
Authors:
Thilo von Neumann,
Christoph Boeddeker,
Keisuke Kinoshita,
Marc Delcroix,
Reinhold Haeb-Umbach
Abstract:
We propose a general framework to compute the word error rate (WER) of ASR systems that process recordings containing multiple speakers at their input and that produce multiple output word sequences (MIMO). Such ASR systems are typically required, e.g., for meeting transcription. We provide an efficient implementation based on a dynamic programming search in a multi-dimensional Levenshtein distance tensor under the constraint that a reference utterance must be matched consistently with one hypothesis output. This also results in an efficient implementation of the ORC WER which previously suffered from exponential complexity. We give an overview of commonly used WER definitions for multi-speaker scenarios and show that they are specializations of the above MIMO WER tuned to particular application scenarios. We conclude with a discussion of the pros and cons of the various WER definitions and a recommendation when to use which.
Submitted 21 July, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Transverse momentum resummation at N3LL+NNLO for diboson processes
Authors:
John M. Campbell,
R. Keith Ellis,
Tobias Neumann,
Satyajit Seth
Abstract:
Diboson processes are one of the most accessible and stringent probes of the Standard Model's electroweak gauge structure at the LHC. They will be probed at the percent level at the high-luminosity LHC, challenging current theory predictions. We present transverse momentum resummed calculations at N3LL+NNLO for the processes $ZZ$, $WZ$, $WH$ and $ZH$, compare our predictions with most recent LHC data and present predictions at 13.6 TeV including theory uncertainty estimates. For $W^+W^-$ production we further present jet-veto resummed results at N3LLp+NNLO. Our calculations will be made publicly available in the upcoming MCFM release and allow future analyses to take advantage of improved predictions.
Submitted 19 October, 2022;
originally announced October 2022.
-
MMS-MSG: A Multi-purpose Multi-Speaker Mixture Signal Generator
Authors:
Tobias Cord-Landwehr,
Thilo von Neumann,
Christoph Boeddeker,
Reinhold Haeb-Umbach
Abstract:
The scope of speech enhancement has changed from a monolithic view of single, independent tasks, to a joint processing of complex conversational speech recordings. Training and evaluation of these single tasks requires synthetic data with access to intermediate signals that is as close as possible to the evaluation scenario. As such data often is not available, many works instead use specialized databases for the training of each system component, e.g., WSJ0-mix for source separation. We present a Multi-purpose Multi-Speaker Mixture Signal Generator (MMS-MSG) for generating a variety of speech mixture signals based on any speech corpus, ranging from classical anechoic mixtures (e.g., WSJ0-mix) over reverberant mixtures (e.g., SMS-WSJ) to meeting-style data. Its highly modular and flexible structure allows for the simulation of diverse environments and dynamic mixing, while simultaneously enabling an easy extension and modification to generate new scenarios and mixture types. These meetings can be used for prototyping, evaluation, or training purposes. We provide example evaluation data and baseline results for meetings based on the WSJ corpus. Further, we demonstrate the usefulness for realistic scenarios by using MMS-MSG to provide training data for the LibriCSS database.
Submitted 23 September, 2022;
originally announced September 2022.
-
Report of the Topical Group on Top quark physics and heavy flavor production for Snowmass 2021
Authors:
Reinhard Schwienhorst,
Doreen Wackeroth,
Kaustubh Agashe,
Simone Alioli,
Javier Aparisi,
Giuseppe Bevilacqua,
Huan-Yu Bi,
Raymond Brock,
Abel Gutierrez Camacho,
Fernando Febres Cordero,
Jorge de Blas,
Regina Demina,
Yong Du,
Gauthier Durieux,
Jarrett Fein,
Roberto Franceschini,
Juan Fuster,
Maria Vittoria Garzelli,
Alessandro Gavardi,
Jason Gombas,
Christoph Grojean,
Jiale Gu,
Marco Guzzi,
Heribertus Bayu Hartanto,
Andre Hoang
, et al. (46 additional authors not shown)
Abstract:
This report summarizes the work of the Energy Frontier Topical Group on EW Physics: Heavy flavor and top quark physics (EF03) of the 2021 Community Summer Study (Snowmass). It aims to highlight the physics potential of top-quark studies and heavy-flavor production processes (bottom and charm) at the HL-LHC and possible future hadron and lepton colliders and running scenarios.
Submitted 6 November, 2022; v1 submitted 22 September, 2022;
originally announced September 2022.
-
Utterance-by-utterance overlap-aware neural diarization with Graph-PIT
Authors:
Keisuke Kinoshita,
Thilo von Neumann,
Marc Delcroix,
Christoph Boeddeker,
Reinhold Haeb-Umbach
Abstract:
Recent speaker diarization studies showed that integration of end-to-end neural diarization (EEND) and clustering-based diarization is a promising approach for achieving state-of-the-art performance on various tasks. Such an approach first divides an observed signal into fixed-length segments, then performs segment-level local diarization based on an EEND module, and merges the segment-level results via clustering to form a final global diarization result. The segmentation is done to limit the number of speakers in each segment since the current EEND cannot handle a large number of speakers. In this paper, we argue that such an approach involving the segmentation has several issues; for example, it inevitably faces a dilemma that larger segment sizes increase both the context available for enhancing the performance and the number of speakers for the local EEND module to handle. To resolve such a problem, this paper proposes a novel framework that performs diarization without segmentation. However, it can still handle challenging data containing many speakers and a significant amount of overlapping speech. The proposed method can take an entire meeting for inference and perform utterance-by-utterance diarization that clusters utterance activities in terms of speakers. To this end, we leverage a neural network training scheme called Graph-PIT proposed recently for neural source separation. Experiments with simulated active-meeting-like data and CALLHOME data show the superiority of the proposed approach over the conventional methods.
Submitted 28 July, 2022;
originally announced July 2022.
-
Fiducial Drell-Yan production at the LHC improved by transverse-momentum resummation at N$^4$LL+N$^3$LO
Authors:
Tobias Neumann,
John Campbell
Abstract:
Drell-Yan production is one of the precision cornerstones of the LHC, serving as calibration for measurements such as the $W$-boson mass. Its extreme precision at the level of 1% challenges theory predictions at the highest level. We present the first independent calculation of Drell-Yan production at order $α_s^3$ in transverse-momentum ($q_T$) resummation improved perturbation theory. Our calculation reaches the state-of-the-art through inclusion of the recently published four loop rapidity anomalous dimension and three loop massive axial-vector contributions. We compare to the most recent data from CMS with fiducial and differential cross-section predictions and find excellent agreement at the percent level. Our resummed calculation including the matching to $Z$+jet production at NNLO is publicly available in the upcoming CuTe-MCFM 10.3 release and allows for theory-data comparison at an unprecedented level.
Submitted 10 November, 2022; v1 submitted 14 July, 2022;
originally announced July 2022.
-
A Meeting Transcription System for an Ad-Hoc Acoustic Sensor Network
Authors:
Tobias Gburrek,
Christoph Boeddeker,
Thilo von Neumann,
Tobias Cord-Landwehr,
Joerg Schmalenstroeer,
Reinhold Haeb-Umbach
Abstract:
We propose a system that transcribes the conversation of a typical meeting scenario that is captured by a set of initially unsynchronized microphone arrays at unknown positions. It consists of subsystems for signal synchronization, including both sampling rate and sampling time offset estimation, diarization based on speaker and microphone array position estimation, multi-channel speech enhancement, and automatic speech recognition. With the estimated diarization information, a spatial mixture model is initialized that is used to estimate beamformer coefficients for source separation. Simulations show that the speech recognition accuracy can be improved by synchronizing and combining multiple distributed microphone arrays compared to a single compact microphone array. Furthermore, the proposed informed initialization of the spatial mixture model delivers a clear performance advantage over random initialization.
Submitted 2 May, 2022;
originally announced May 2022.
-
Justice in Misinformation Detection Systems: An Analysis of Algorithms, Stakeholders, and Potential Harms
Authors:
Terrence Neumann,
Maria De-Arteaga,
Sina Fazelpour
Abstract:
Faced with the scale and surge of misinformation on social media, many platforms and fact-checking organizations have turned to algorithms for automating key parts of misinformation detection pipelines. While offering a promising solution to the challenge of scale, the ethical and societal risks associated with algorithmic misinformation detection are not well-understood. In this paper, we employ and extend upon the notion of informational justice to develop a framework for explicating issues of justice relating to representation, participation, distribution of benefits and burdens, and credibility in the misinformation detection pipeline. Drawing on the framework: (1) we show how injustices materialize for stakeholders across three algorithmic stages in the pipeline; (2) we suggest empirical measures for assessing these injustices; and (3) we identify potential sources of these harms. This framework should help researchers, policymakers, and practitioners reason about potential harms or risks associated with these algorithms and provide conceptual guidance for the design of algorithmic fairness audits in this domain.
Submitted 29 April, 2022; v1 submitted 28 April, 2022;
originally announced April 2022.
-
Computational challenges for multi-loop collider phenomenology
Authors:
Fernando Febres Cordero,
Andreas von Manteuffel,
Tobias Neumann
Abstract:
Precision measurements at the LHC and future colliders require theory predictions with uncertainties at the percent level for many observables. Theory uncertainties due to the perturbative truncation are particularly relevant and must be reduced to fully exploit the physics potential of collider experiments. In recent years the theoretical high energy physics community has made tremendous analytical and numerical advances to address this challenge. In this white paper, we survey state-of-the-art calculations in perturbative quantum field theory for collider phenomenology with a particular focus on the computational requirements at high perturbative orders. We show that these calculations can have specific high-performance-computing (HPC) profiles that should be taken into account in future HPC resource planning.
Submitted 8 April, 2022;
originally announced April 2022.
-
An Initialization Scheme for Meeting Separation with Spatial Mixture Models
Authors:
Christoph Boeddeker,
Tobias Cord-Landwehr,
Thilo von Neumann,
Reinhold Haeb-Umbach
Abstract:
Spatial mixture model (SMM) supported acoustic beamforming has been extensively used for the separation of simultaneously active speakers. However, it has hardly been considered for the separation of meeting data, which are characterized by long recordings and only partially overlapping speech. In this contribution, we show that the fact that often only a single speaker is active can be utilized for a clever initialization of an SMM that employs time-varying class priors. In experiments on LibriCSS we show that the proposed initialization scheme achieves a significantly lower Word Error Rate (WER) on a downstream speech recognition task than a random initialization of the class probabilities by drawing from a Dirichlet distribution. With the only requirement that the number of speakers has to be known, we obtain a WER of 5.9 %, which is comparable to the best reported WER on this data set. Furthermore, the estimated speaker activity from the mixture model serves as a diarization based on spatial information.
Submitted 4 April, 2022;
originally announced April 2022.
-
Event Generators for High-Energy Physics Experiments
Authors:
J. M. Campbell,
M. Diefenthaler,
T. J. Hobbs,
S. Höche,
J. Isaacson,
F. Kling,
S. Mrenna,
J. Reuter,
S. Alioli,
J. R. Andersen,
C. Andreopoulos,
A. M. Ankowski,
E. C. Aschenauer,
A. Ashkenazi,
M. D. Baker,
J. L. Barrow,
M. van Beekveld,
G. Bewick,
S. Bhattacharya,
C. Bierlich,
E. Bothmann,
P. Bredt,
A. Broggio,
A. Buckley,
A. Butter
, et al. (186 additional authors not shown)
Abstract:
We provide an overview of the status of Monte-Carlo event generators for high-energy particle physics. Guided by the experimental needs and requirements, we highlight areas of active development, and opportunities for future improvements. Particular emphasis is given to physics models and algorithms that are employed across a variety of experiments. These common themes in event generator development lead to a more comprehensive understanding of physics at the highest energies and intensities, and allow models to be tested against a wealth of data that have been accumulated over the past decades. A cohesive approach to event generator development will allow these models to be further improved and systematic uncertainties to be reduced, directly contributing to future experimental success. Event generators are part of a much larger ecosystem of computational tools. They typically involve a number of unknown model parameters that must be tuned to experimental data, while maintaining the integrity of the underlying physics models. Making both these data, and the analyses with which they have been obtained accessible to future users is an essential aspect of open science and data preservation. It ensures the consistency of physics models across a variety of experiments.
Submitted 23 January, 2024; v1 submitted 21 March, 2022;
originally announced March 2022.
-
TecCoBot: Technology-aided support for self-regulated learning
Authors:
Norbert Pengel,
Anne Martin,
Roy Meissner,
Tamar Arndt,
Alexander Tobias Neumann,
Peter de Lange,
Heinz-Werner Wollersheim
Abstract:
In addition to formal learning at universities, like in lecture halls and seminar rooms, students are regularly confronted with self-study activities. Instead of being left to their own devices, students might benefit from a proper design of such activities, including pedagogical interventions. Such designs can increase the degree of activity and the contribution of self-study activities to the achievement of learning outcomes.
Especially in times of a global pandemic, self-study activities are increasingly executed at home, where students already use technology-enhanced materials, processes, and digital platforms. Thus we pick up these building blocks and introduce TecCoBot within this paper. TecCoBot is not only a chatbot, supporting students in reading texts by offering writing assignments and providing automated feedback on these, but also implements a design for self-study activities, typically only offered to a few students as face-to-face mentoring.
Submitted 23 November, 2021;
originally announced November 2021.
-
Monaural source separation: From anechoic to reverberant environments
Authors:
Tobias Cord-Landwehr,
Christoph Boeddeker,
Thilo von Neumann,
Catalin Zorila,
Rama Doddipatla,
Reinhold Haeb-Umbach
Abstract:
Impressive progress in neural network-based single-channel speech source separation has been made in recent years. But those improvements have been mostly reported on anechoic data, a situation that is hardly met in practice. Taking the SepFormer as a starting point, which achieves state-of-the-art performance on anechoic mixtures, we gradually modify it to optimize its performance on reverberant mixtures. Although this leads to a word error rate improvement by 7 percentage points compared to the standard SepFormer implementation, the system ends up with only marginally better performance than a PIT-BLSTM separation system, that is optimized with rather straightforward means. This is surprising and at the same time sobering, challenging the practical usefulness of many improvements reported in recent years for monaural source separation on nonreverberant data.
Submitted 10 May, 2022; v1 submitted 15 November, 2021;
originally announced November 2021.
-
SA-SDR: A novel loss function for separation of meeting style data
Authors:
Thilo von Neumann,
Keisuke Kinoshita,
Christoph Boeddeker,
Marc Delcroix,
Reinhold Haeb-Umbach
Abstract:
Many state-of-the-art neural network-based source separation systems use the averaged Signal-to-Distortion Ratio (SDR) as a training objective function. The basic SDR is, however, undefined if the network reconstructs the reference signal perfectly or if the reference signal contains silence, e.g., when a two-output separator processes a single-speaker recording. Many modifications to the plain SDR have been proposed that trade-off between making the loss more robust and distorting its value. We propose to switch from a mean over the SDRs of each individual output channel to a global SDR over all output channels at the same time, which we call source-aggregated SDR (SA-SDR). This makes the loss robust against silence and perfect reconstruction as long as at least one reference signal is not silent. We experimentally show that our proposed SA-SDR is more stable and preferable over other well-known modifications when processing meeting-style data that typically contains many silent or single-speaker regions.
Submitted 21 April, 2022; v1 submitted 29 October, 2021;
originally announced October 2021.
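As a rough illustration of the idea described in the SA-SDR abstract above, the sketch below contrasts a per-channel SDR with a source-aggregated one. The function names, the toy signals, and the plain energy-ratio form of the SDR are assumptions made for this example and are not taken from the authors' code.

```python
import math


def sdr_db(reference, estimate):
    """Plain SDR in dB for one channel (energy ratio of reference to error).

    Fails for a silent reference (log of zero) or a perfect estimate
    (division by zero), which is the weakness the abstract points out.
    """
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10 * math.log10(signal / error)


def sa_sdr_db(references, estimates):
    """Source-aggregated SDR: one ratio over the energies of all channels."""
    signal = sum(r * r for ref in references for r in ref)
    error = sum(
        (r - e) ** 2
        for ref, est in zip(references, estimates)
        for r, e in zip(ref, est)
    )
    return 10 * math.log10(signal / error)


# Toy two-output example: the second reference channel is silent.
references = [[1.0, -1.0, 1.0, -1.0], [0.0, 0.0, 0.0, 0.0]]
estimates = [[0.9, -1.1, 1.1, -0.9], [0.1, -0.1, 0.0, 0.0]]

try:
    sdr_db(references[1], estimates[1])
except ValueError:
    print("per-channel SDR is undefined for the silent channel")

print("SA-SDR in dB:", sa_sdr_db(references, estimates))
```

In this sketch the aggregated ratio stays finite because the first channel contributes signal energy, matching the abstract's condition that at least one reference signal is not silent.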
-
Testing parton distribution functions with t-channel single-top-quark production
Authors:
John Campbell,
Tobias Neumann,
Zack Sullivan
Abstract:
The production of single top-quarks in the t-channel at hadron colliders imposes strong analytic constraints on parton distribution functions (PDFs) through its double deeply inelastic scattering (DDIS) form. We exploit this to provide novel consistency checks between LO, NLO and NNLO PDF fits and propose to include it as a constraint in future PDF fits. Furthermore, while it is well-known that the b-quark PDF is highly sensitive to the b-quark mass, we show that the treatment of this systematic uncertainty is still incomplete, fragmented or outright missing at the moment. Consequently, we conclude that the b-quark mass uncertainty is the dominant but so far broadly neglected theory uncertainty for this process.
Submitted 29 November, 2021; v1 submitted 21 September, 2021;
originally announced September 2021.
-
Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers
Authors:
Thilo von Neumann,
Keisuke Kinoshita,
Christoph Boeddeker,
Marc Delcroix,
Reinhold Haeb-Umbach
Abstract:
Automatic transcription of meetings requires handling of overlapped speech, which calls for continuous speech separation (CSS) systems. The uPIT criterion was proposed for utterance-level separation with neural networks and introduces the constraint that the total number of speakers must not exceed the number of output channels. When processing meeting-like data in a segment-wise manner, i.e., by separating overlapping segments independently and stitching adjacent segments to continuous output streams, this constraint has to be fulfilled for any segment. In this contribution, we show that this constraint can be significantly relaxed. We propose a novel graph-based PIT criterion, which casts the assignment of utterances to output channels as a graph coloring problem. It only requires that the number of concurrently active speakers must not exceed the number of output channels. As a consequence, the system can process an arbitrary number of speakers and arbitrarily long segments and thus can handle more diverse scenarios. Further, the stitching algorithm for obtaining a consistent output order in neighboring segments is of less importance and can even be eliminated completely, which not least reduces the computational effort. Experiments on meeting-style WSJ data show improvements in recognition performance over using the uPIT criterion.
Submitted 20 September, 2021; v1 submitted 30 July, 2021;
originally announced July 2021.
-
Speeding Up Permutation Invariant Training for Source Separation
Authors:
Thilo von Neumann,
Christoph Boeddeker,
Keisuke Kinoshita,
Marc Delcroix,
Reinhold Haeb-Umbach
Abstract:
Permutation invariant training (PIT) is a widely used training criterion for neural network-based source separation, used for both utterance-level separation with utterance-level PIT (uPIT) and separation of long recordings with the recently proposed Graph-PIT. When implemented naively, both suffer from an exponential complexity in the number of utterances to separate, rendering them unusable for large numbers of speakers or long realistic recordings. We present a decomposition of the PIT criterion into the computation of a matrix and a strictly monotonically increasing function so that the permutation or assignment problem can be solved efficiently with several search algorithms. The Hungarian algorithm can be used for uPIT and we introduce various algorithms for the Graph-PIT assignment problem to reduce the complexity to be polynomial in the number of utterances.
△ Less
Submitted 30 July, 2021;
originally announced July 2021.
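The decomposition mentioned in the abstract above can be sketched roughly as follows: the pairwise losses between references and estimates are computed once as a matrix, and the permutation search then works only on that matrix. The mean squared error loss, the summation over channels, and all function names are assumptions chosen for illustration. The search shown here is still a brute-force loop over permutations; the abstract's point is that this step can be handed to an assignment solver such as the Hungarian algorithm (SciPy's `scipy.optimize.linear_sum_assignment` solves the same linear assignment problem).

```python
from itertools import permutations


def mse(reference, estimate):
    """Mean squared error between two equally long signals."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def pairwise_loss_matrix(references, estimates, loss=mse):
    """matrix[i][j] is the loss of assigning estimate j to reference i."""
    return [[loss(ref, est) for est in estimates] for ref in references]


def best_assignment(matrix):
    """Brute-force search over permutations, using only the precomputed matrix."""
    n = len(matrix)

    def total(perm):
        return sum(matrix[i][perm[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return best, total(best)


# Toy example: the two estimates are the references in swapped order.
references = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
estimates = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

matrix = pairwise_loss_matrix(references, estimates)
assignment, loss_value = best_assignment(matrix)
print(assignment, loss_value)
```

Separating the matrix computation from the search means the signal-level loss is evaluated once per reference-estimate pair rather than once per permutation.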
-
The Diphoton $q_T$ spectrum at N$^3$LL$^\prime$+NNLO
Authors:
Tobias Neumann
Abstract:
We present a $q_T$-resummed calculation of diphoton production at order N$^3$LL$^\prime$+NNLO. To reach the primed level of accuracy we have implemented the recently published three-loop $\mathcal{O}(α_s^3)$ virtual corrections in the $q\bar{q}$ channel and the three-loop transverse momentum dependent beam functions and combined them with the existing infrastructure of CuTe-MCFM, a code performing resummation at order N$^3$LL. While the primed predictions are parametrically not more accurate, one typically observes from lower orders and other processes that they are the dominant effect of the next order. We include in both the $q\bar{q}$ and loop-induced $gg$ channel the hard contributions consistently together at order $α_s^3$ and find that the resummed $q\bar{q}$ channel without matching stabilizes indeed. Due to large matching corrections and large contributions and uncertainties from the $gg$ channel, the overall improvements are small though. We furthermore study the effect of hybrid-cone photon isolation and hard-scale choice on our fully matched results to describe the ATLAS 8 TeV data and find that the hybrid-cone isolation destroys agreement at small $q_T$.
Submitted 29 November, 2021; v1 submitted 26 July, 2021;
originally announced July 2021.
-
Machine-learning based methodologies for 3d x-ray measurement, characterization and optimization for buried structures in advanced ic packages
Authors:
Ramanpreet S Pahwa,
Soon Wee Ho,
Ren Qin,
Richard Chang,
Oo Zaw Min,
Wang Jie,
Vempati Srinivasa Rao,
Tin Lay Nwe,
Yanjing Yang,
Jens Timo Neumann,
Ramani Pichumani,
Thomas Gregorich
Abstract:
For over 40 years lithographic silicon scaling has driven circuit integration and performance improvement in the semiconductor industry. As silicon scaling slows down, the industry is increasingly dependent on IC package technologies to contribute to further circuit integration and performance improvements. This is a paradigm shift and requires the IC package industry to reduce the size and increase the density of internal interconnects on a scale which has never been done before. Traditional package characterization and process optimization rely on destructive techniques such as physical cross-sections and delayering to extract data from internal package features. These destructive techniques are not practical with today's advanced packages. In this paper we will demonstrate how data acquired non-destructively with a 3D X-ray microscope can be enhanced and optimized using machine learning, and can then be used to measure, characterize and optimize the design and production of buried interconnects in advanced IC packages. Test vehicles replicating 2.5D and HBM construction were designed and fabricated, and digital data was extracted from these test vehicles using 3D X-ray and machine learning techniques. The extracted digital data was used to characterize and optimize the design and production of the interconnects and demonstrates a superior alternative to destructive physical analysis. We report an mAP of 0.96 for 3D object detection, a dice score of 0.92 for 3D segmentation, and an average error of 2.1 µm for 3D metrology on the test dataset. This paper is the first part of a multi-part report.
Submitted 19 May, 2021; v1 submitted 8 March, 2021;
originally announced March 2021.
-
Single-top-quark production in the $t$-channel at NNLO
Authors:
John Campbell,
Tobias Neumann,
Zack Sullivan
Abstract:
We present a calculation of t-channel single-top-quark production and decay in the five-flavor scheme at NNLO. Our results resolve a disagreement between two previous calculations of this process that found a difference in the inclusive cross section at the level of the NNLO coefficient itself. We compare in detail with the previous calculations at the inclusive, differential and fiducial level including b-quark tagging at a fixed scale $μ=m_t$. In addition, we advocate the use of double deep inelastic scattering (DDIS) scales ($μ^2=Q^2$ for the light-quark line and $μ^2=Q^2+m_t^2$ for the heavy-quark line) that maximize perturbative stability and allow for robust scale uncertainties. All NNLO and NLO$\,\otimes\,$NLO contributions for production and decay are included in the on-shell and vertex-function approximation. We present fiducial and differential results for a variety of observables used in Standard Model and Beyond Standard Model analyses, and find an important difference between the NLO and NNLO predictions of exclusive $t+n$-jet cross sections. Overall we find that NNLO corrections are crucial for a precise identification of the t-channel process.
Submitted 18 February, 2021; v1 submitted 2 December, 2020;
originally announced December 2020.
-
Magnetic proximity effect on excitonic spin states in Mn-doped layered hybrid perovskites
Authors:
Timo Neumann,
Sascha Feldmann,
Philipp Moser,
Jonathan Zerhoch,
Tim van de Goor,
Alex Delhomme,
Thomas Winkler,
Jonathan J. Finley,
Clément Faugeras,
Martin S. Brandt,
Andreas V. Stier,
Felix Deschler
Abstract:
Materials combining the optoelectronic functionalities of semiconductors with control of the spin degree of freedom are highly sought after for the advancement of quantum technology devices. Here, we report the paramagnetic Ruddlesden-Popper hybrid perovskite Mn:(PEA)2PbI4 (PEA = phenethylammonium) in which the interaction of isolated Mn2+ ions with magnetically brightened excitons leads to circularly polarized photoluminescence. Using a combination of superconducting quantum interference device (SQUID) magnetometry and magneto-optical experiments, we find that the Brillouin-shaped polarization curve of the photoluminescence follows the magnetization of the material. This indicates coupling between localized manganese magnetic moments and exciton spins via a magnetic proximity effect. The saturation polarization of 15% at 4 K and 6 T indicates a highly imbalanced spin population and demonstrates that manganese doping enables efficient control of excitonic spin states in Ruddlesden-Popper perovskites. Our finding constitutes the first example of polarization control in magnetically doped hybrid perovskites and will stimulate research on this highly tuneable material platform that promises tailored interactions between magnetic moments and electronic states.
Submitted 29 September, 2020;
originally announced September 2020.
-
Fiducial $q_T$ resummation of color-singlet processes at N$^3$LL+NNLO
Authors:
Thomas Becher,
Tobias Neumann
Abstract:
We present a framework for $q_T$ resummation at N$^3$LL+NNLO accuracy for arbitrary color-singlet processes based on a factorization theorem in SCET. Our implementation CuTe-MCFM is fully differential in the Born kinematics and matches to large-$q_T$ fixed-order predictions at relative order $α_s^2$. It provides an efficient way to estimate uncertainties from fixed-order truncation, resummation, and parton distribution functions. In addition to $W^\pm$, $Z$ and $H$ production, also the diboson processes $γγ,Zγ,ZH$ and $W^\pm H$ are available, including decays. We discuss and exemplify the framework with several direct comparisons to experimental measurements as well as inclusive benchmark results. In particular, we present novel results for $γγ$ and $Zγ$ at N$^3$LL+NNLO and discuss in detail the power corrections induced by photon isolation requirements.
Submitted 23 September, 2020;
originally announced September 2020.
-
HL-LHC Computing Review: Common Tools and Community Software
Authors:
HEP Software Foundation,
Thea Aarrestad,
Simone Amoroso,
Markus Julian Atkinson,
Joshua Bendavid,
Tommaso Boccali,
Andrea Bocci,
Andy Buckley,
Matteo Cacciari,
Paolo Calafiura,
Philippe Canal,
Federico Carminati,
Taylor Childers,
Vitaliano Ciulli,
Gloria Corti,
Davide Costanzo,
Justin Gage Dezoort,
Caterina Doglioni,
Javier Mauricio Duarte,
Agnieszka Dziurda,
Peter Elmer,
Markus Elsing,
V. Daniel Elvira,
Giulio Eulisse
, et al. (85 additional authors not shown)
Abstract:
Common and community software packages, such as ROOT, Geant4 and event generators have been a key part of the LHC's success so far and continued development and optimisation will be critical in the future. The challenges are driven by an ambitious physics programme, notably the LHC accelerator upgrade to high-luminosity, HL-LHC, and the corresponding detector upgrades of ATLAS and CMS. In this document we address the issues for software that is used in multiple experiments (usually even more widely than ATLAS and CMS) and maintained by teams of developers who are either not linked to a particular experiment or who contribute to common software within the context of their experiment activity. We also give space to general considerations for future software and projects that tackle upcoming challenges, no matter who writes it, which is an area where community convergence on best practice is extremely useful.
Submitted 31 August, 2020;
originally announced August 2020.
-
Mechanism of carrier localization in doped perovskite nanocrystals for bright emission
Authors:
Sascha Feldmann,
Mahesh Gangishetty,
Ivona Bravic,
Timo Neumann,
Bo Peng,
Thomas Winkler,
Richard H. Friend,
Bartomeu Monserrat,
Daniel N. Congreve,
Felix Deschler
Abstract:
Nanocrystals based on metal-halide perovskites offer a promising material platform for highly efficient lighting. Using transient optical spectroscopy, we study excitation recombination dynamics in manganese-doped CsPb(Cl,Br)3 perovskite nanocrystals. We find an increase in the intrinsic excitonic radiative recombination rate upon doping, which is typically a challenging material property to tailor. Supported by ab initio calculations, we can attribute the enhanced emission rates to increased exciton localization through lattice periodicity breaking from Mn dopants, which increases exciton effective masses and overlap of electron and hole wavefunctions and thus the oscillator strength. Our report of a fundamental strategy for improving luminescence efficiencies in perovskite nanocrystals will be valuable for maximizing efficiencies in light-emitting applications.
Submitted 26 August, 2020;
originally announced August 2020.
-
Hadronic vacuum polarization using gradient flow
Authors:
Robert V. Harlander,
Fabian Lange,
Tobias Neumann
Abstract:
The gradient-flow operator product expansion for QCD current correlators including operators up to mass dimension four is calculated through NNLO. This paves an alternative way for efficient lattice evaluations of hadronic vacuum polarization functions. In addition, flow-time evolution equations for flowed composite operators are derived. Their explicit form for the non-trivial dimension-four operators of QCD is given through order $α_s^3$.
Submitted 25 August, 2021; v1 submitted 2 July, 2020;
originally announced July 2020.
-
Multi-path RNN for hierarchical modeling of long sequential data and its application to speaker stream separation
Authors:
Keisuke Kinoshita,
Thilo von Neumann,
Marc Delcroix,
Tomohiro Nakatani,
Reinhold Haeb-Umbach
Abstract:
Recently, source separation performance was greatly improved by time-domain audio source separation based on the dual-path recurrent neural network (DPRNN). DPRNN is a simple but effective model for long sequential data. While DPRNN is quite efficient in modeling sequential data of the length of an utterance, i.e., about 5 to 10 seconds of data, it is harder to apply it to longer sequences such as whole conversations consisting of multiple utterances. This is simply because, in such a case, the number of time steps consumed by its internal module called inter-chunk RNN becomes extremely large. To mitigate this problem, this paper proposes a multi-path RNN (MPRNN), a generalized version of DPRNN, that models the input data in a hierarchical manner. In the MPRNN framework, the input data is represented at several (>3) time-resolutions, each of which is modeled by a specific RNN sub-module. For example, the RNN sub-module that deals with the finest resolution may model temporal relationships only within a phoneme, while the RNN sub-module handling the coarsest resolution may capture only the relationship between utterances such as speaker information. We perform experiments using simulated dialogue-like mixtures and show that MPRNN has greater model capacity, and it outperforms the current state-of-the-art DPRNN framework especially in online processing scenarios.
Submitted 24 June, 2020;
originally announced June 2020.
-
Benchmarking Learned Indexes
Authors:
Ryan Marcus,
Andreas Kipf,
Alexander van Renen,
Mihail Stoian,
Sanchit Misra,
Alfons Kemper,
Thomas Neumann,
Tim Kraska
Abstract:
Recent advancements in learned index structures propose replacing existing index structures, like B-Trees, with approximate learned models. In this work, we present a unified benchmark that compares well-tuned implementations of three learned index structures against several state-of-the-art "traditional" baselines. Using four real-world datasets, we demonstrate that learned index structures can indeed outperform non-learned indexes in read-only in-memory workloads over a dense array. We also investigate the impact of caching, pipelining, dataset size, and key size. We study the performance profile of learned index structures, and build an explanation for why learned models achieve such good performance. Finally, we investigate other important properties of learned index structures, such as their performance in multi-threaded systems and their build times.
Submitted 29 June, 2020; v1 submitted 23 June, 2020;
originally announced June 2020.
-
Multi-talker ASR for an unknown number of sources: Joint training of source counting, separation and ASR
Authors:
Thilo von Neumann,
Christoph Boeddeker,
Lukas Drude,
Keisuke Kinoshita,
Marc Delcroix,
Tomohiro Nakatani,
Reinhold Haeb-Umbach
Abstract:
Most approaches to multi-talker overlapped speech separation and recognition assume that the number of simultaneously active speakers is given, but in realistic situations, it is typically unknown. To cope with this, we extend an iterative speech extraction system with mechanisms to count the number of sources and combine it with a single-talker speech recognizer to form the first end-to-end multi-talker automatic speech recognition system for an unknown number of active speakers. Our experiments show very promising performance in counting accuracy, source separation and speech recognition on simulated clean mixtures from WSJ0-2mix and WSJ0-3mix. Among others, we set a new state-of-the-art word error rate on the WSJ0-2mix database. Furthermore, our system generalizes well to a larger number of speakers than it ever saw during training, as shown in experiments with the WSJ0-4mix database.
Submitted 21 December, 2020; v1 submitted 4 June, 2020;
originally announced June 2020.
-
RadixSpline: A Single-Pass Learned Index
Authors:
Andreas Kipf,
Ryan Marcus,
Alexander van Renen,
Mihail Stoian,
Alfons Kemper,
Tim Kraska,
Thomas Neumann
Abstract:
Recent research has shown that learned models can outperform state-of-the-art index structures in size and lookup performance. While this is a very promising result, existing learned structures are often cumbersome to implement and are slow to build. In fact, most approaches that we are aware of require multiple training passes over the data.
We introduce RadixSpline (RS), a learned index that can be built in a single pass over the data and is competitive with state-of-the-art learned index models, like RMI, in size and lookup performance. We evaluate RS using the SOSD benchmark and show that it achieves competitive results on all datasets, despite the fact that it only has two parameters.
Submitted 22 May, 2020; v1 submitted 29 April, 2020;
originally announced April 2020.
-
Challenges in Monte Carlo event generator software for High-Luminosity LHC
Authors:
The HSF Physics Event Generator WG,
Andrea Valassi,
Efe Yazgan,
Josh McFayden,
Simone Amoroso,
Joshua Bendavid,
Andy Buckley,
Matteo Cacciari,
Taylor Childers,
Vitaliano Ciulli,
Rikkert Frederix,
Stefano Frixione,
Francesco Giuli,
Alexander Grohsjean,
Christian Gütschow,
Stefan Höche,
Walter Hopkins,
Philip Ilten,
Dmitri Konstantinov,
Frank Krauss,
Qiang Li,
Leif Lönnblad,
Fabio Maltoni,
Michelangelo Mangano
, et al. (16 additional authors not shown)
Abstract:
We review the main software and computing challenges for the Monte Carlo physics event generators used by the LHC experiments, in view of the High-Luminosity LHC (HL-LHC) physics programme. This paper has been prepared by the HEP Software Foundation (HSF) Physics Event Generator Working Group as an input to the LHCC review of HL-LHC computing, which has started in May 2020.
Submitted 18 February, 2021; v1 submitted 28 April, 2020;
originally announced April 2020.
-
End-to-end training of time domain audio separation and recognition
Authors:
Thilo von Neumann,
Keisuke Kinoshita,
Lukas Drude,
Christoph Boeddeker,
Marc Delcroix,
Tomohiro Nakatani,
Reinhold Haeb-Umbach
Abstract:
The rising interest in single-channel multi-speaker speech separation sparked development of End-to-End (E2E) approaches to multi-speaker speech recognition. However, up until now, state-of-the-art neural network-based time domain source separation has not yet been combined with E2E speech recognition. We here demonstrate how to combine a separation module based on a Convolutional Time domain Audio Separation Network (Conv-TasNet) with an E2E speech recognizer and how to train such a model jointly by distributing it over multiple GPUs or by approximating truncated back-propagation for the convolutional front-end. To put this work into perspective and illustrate the complexity of the design space, we provide a compact overview of single-channel multi-speaker recognition systems. Our experiments show a word error rate of 11.0% on WSJ0-2mix and indicate that our joint time domain model can yield substantial improvements over cascade DNN-HMM and monolithic E2E frequency domain systems proposed so far.
Submitted 13 April, 2020; v1 submitted 18 December, 2019;
originally announced December 2019.
-
SOSD: A Benchmark for Learned Indexes
Authors:
Andreas Kipf,
Ryan Marcus,
Alexander van Renen,
Mihail Stoian,
Alfons Kemper,
Tim Kraska,
Thomas Neumann
Abstract:
A groundswell of recent work has focused on improving data management systems with learned components. Specifically, work on learned index structures has proposed replacing traditional index structures, such as B-trees, with learned models. Given the decades of research committed to improving index structures, there is significant skepticism about whether learned indexes actually outperform state-of-the-art implementations of traditional structures on real-world data. To answer this question, we propose a new benchmarking framework that comes with a variety of real-world datasets and baseline implementations to compare against. We also show preliminary results for selected index structures, and find that learned models indeed often outperform state-of-the-art implementations, and are therefore a promising direction for future research.
Submitted 29 November, 2019;
originally announced November 2019.
-
Precision phenomenology with MCFM
Authors:
John Campbell,
Tobias Neumann
Abstract:
Without proper control of numerical and methodological errors in theoretical predictions at the per mille level it is not possible to study the effect of input parameters in current hadron-collider measurements at the required precision. We present a new version of the parton-level code MCFM that achieves this requirement through its highly-parallelized nature, significant performance improvements and new features. An automatic differential cutoff extrapolation is introduced to assess the cutoff dependence of all results, thus ensuring their reliability and potentially improving fixed-cutoff results by an order of magnitude. The efficient differential study of PDF uncertainties and PDF set differences at NNLO, for multiple PDF sets simultaneously, is achieved by exploiting correlations. We use these improvements to study uncertainties and PDF sensitivity at NNLO, using 371 PDF set members. The work described here permits NNLO studies that were previously prohibitively expensive, and lays the groundwork necessary for a future implementation of NNLO calculations with a jet at Born level in MCFM.
Submitted 19 September, 2019;
originally announced September 2019.
-
GeoBlocks: A Query-Cache Accelerated Data Structure for Spatial Aggregation over Polygons
Authors:
Christian Winter,
Andreas Kipf,
Christoph Anneser,
Eleni Tzirita Zacharatou,
Thomas Neumann,
Alfons Kemper
Abstract:
As individual traffic and public transport in cities are changing, city authorities need to analyze urban geospatial data to improve transportation and infrastructure. To that end, they highly rely on spatial aggregation queries that extract summarized information from point data (e.g., Uber rides) contained in a given polygonal region (e.g., a city neighborhood). To support such queries, current analysis tools either allow only predefined aggregates on predefined regions and are thus unsuitable for exploratory analyses, or access the raw data to compute aggregate results on-the-fly, which severely limits the interactivity. At the same time, existing pre-aggregation techniques are inadequate since they maintain aggregates over rectangular regions. As a result, when applied over arbitrary polygonal regions, they induce an approximation error that cannot be bounded. In this paper, we introduce GeoBlocks, a novel pre-aggregating data structure that supports spatial aggregation over arbitrary polygons. GeoBlocks closely approximate polygons using a set of fine-grained grid cells and, in contrast to prior work, allow to bound the approximation error by adjusting the cell size. Furthermore, GeoBlocks employ a trie-like cache that caches aggregate results of frequently queried regions, thereby dynamically adapting to the skew inherently present in query workloads and improving performance over time. In summary, GeoBlocks outperform on-the-fly aggregation by up to three orders of magnitude, achieving the sub-second query latencies required for interactive exploratory analytics.
Submitted 16 March, 2021; v1 submitted 21 August, 2019;
originally announced August 2019.
-
DeepSPACE: Approximate Geospatial Query Processing with Deep Learning
Authors:
Dimitri Vorona,
Andreas Kipf,
Thomas Neumann,
Alfons Kemper
Abstract:
The amount of the available geospatial data grows at an ever faster pace. This leads to the constantly increasing demand for processing power and storage in order to provide data analysis in a timely manner. At the same time, a lot of geospatial processing is visual and exploratory in nature, thus having bounded precision requirements. We present DeepSPACE, a deep learning-based approximate geospatial query processing engine which combines modest hardware requirements with the ability to answer flexible aggregation queries while keeping the required state to a few hundred KiBs.
Submitted 14 June, 2019;
originally announced June 2019.
-
On the Impact of Memory Allocation on High-Performance Query Processing
Authors:
Dominik Durner,
Viktor Leis,
Thomas Neumann
Abstract:
Somewhat surprisingly, the behavior of analytical query engines is crucially affected by the dynamic memory allocator used. Memory allocators highly influence performance, scalability, memory efficiency and memory fairness to other processes. In this work, we provide the first comprehensive experimental analysis on the impact of memory allocation for high-performance query engines. We test five state-of-the-art dynamic memory allocators and discuss their strengths and weaknesses within our DBMS. The right allocator can increase the performance of TPC-DS (SF 100) by 2.7x on a 4-socket Intel Xeon server.
Submitted 3 May, 2019;
originally announced May 2019.
-
Results and techniques for higher order calculations within the gradient-flow formalism
Authors:
Johannes Artz,
Robert V. Harlander,
Fabian Lange,
Tobias Neumann,
Mario Prausa
Abstract:
We describe in detail the implementation of a systematic perturbative approach to observables in the QCD gradient-flow formalism. This includes a collection of all relevant Feynman rules of the five-dimensional field theory and the composite operators considered in this paper. Tools from standard perturbative calculations are used to obtain Green's functions at finite flow time $t$ at higher orders in perturbation theory. The three-loop results for the quark condensate at finite $t$ and the conversion factor for the "ringed" quark fields to the $\overline{\mbox{MS}}$ scheme are presented as applications. We also re-evaluate an earlier result for the three-loop gluon condensate, improving on its accuracy.
Submitted 13 September, 2019; v1 submitted 2 May, 2019;
originally announced May 2019.
-
Estimating Cardinalities with Deep Sketches
Authors:
Andreas Kipf,
Dimitri Vorona,
Jonas Müller,
Thomas Kipf,
Bernhard Radke,
Viktor Leis,
Peter Boncz,
Thomas Neumann,
Alfons Kemper
Abstract:
We introduce Deep Sketches, which are compact models of databases that allow us to estimate the result sizes of SQL queries. Deep Sketches are powered by a new deep learning approach to cardinality estimation that can capture correlations between columns, even across tables. Our demonstration allows users to define such sketches on the TPC-H and IMDb datasets, monitor the training process, and run ad-hoc queries against trained sketches. We also estimate query cardinalities with HyPer and PostgreSQL to visualize the gains over traditional cardinality estimators.
Submitted 17 April, 2019;
originally announced April 2019.
-
Persistent Memory I/O Primitives
Authors:
Alexander van Renen,
Lukas Vogel,
Viktor Leis,
Thomas Neumann,
Alfons Kemper
Abstract:
I/O latency and throughput is one of the major performance bottlenecks for disk-based database systems. Upcoming persistent memory (PMem) technologies, like Intel's Optane DC Persistent Memory Modules, promise to bridge the gap between NAND-based flash (SSD) and DRAM, and thus eliminate the I/O bottleneck. In this paper, we provide one of the first performance evaluations of PMem in terms of bandwidth and latency. Based on the results, we develop guidelines for efficient PMem usage and two essential I/O primitives tuned for PMem: log writing and block flushing.
Submitted 6 June, 2019; v1 submitted 2 April, 2019;
originally announced April 2019.
-
Off-shell single-top-quark production in the Standard Model Effective Field Theory
Authors:
Tobias Neumann,
Zack Sullivan
Abstract:
We present a fully differential and spin-dependent $t$-channel single-top-quark calculation at next-to-leading order (NLO) in QCD including off-shell effects by using the complex mass scheme in the Standard Model (SM) and in the Standard Model Effective Field Theory (SMEFT). We include all relevant SMEFT operators at $1/Λ^2$ that contribute at NLO in QCD for a fully consistent comparison to the SM at NLO. In addition, we include chirality flipping operators that do not interfere with the SM amplitude and contribute only at $1/Λ^4$ with a massless $b$-quark. Such higher order effects are usually captured by considering anomalous right-handed $Wtb$ and left-handed $Wtb$ tensor couplings. Despite their formal suppression in the SMEFT, they describe an important class of models for new physics. Our calculation and analysis framework is publicly available in MCFM.
Submitted 17 June, 2019; v1 submitted 26 March, 2019;
originally announced March 2019.