-
Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter
Authors:
M. Aamir,
B. Acar,
G. Adamov,
T. Adams,
C. Adloff,
S. Afanasiev,
C. Agrawal,
C. Agrawal,
A. Ahmad,
H. A. Ahmed,
S. Akbar,
N. Akchurin,
B. Akgul,
B. Akgun,
R. O. Akpinar,
E. Aktas,
A. AlKadhim,
V. Alexakhin,
J. Alimena,
J. Alison,
A. Alpana,
W. Alshehri,
P. Alvarez Dominguez,
M. Alyari,
C. Amendola
, et al. (550 additional authors not shown)
Abstract:
A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles read out by SiPMs, and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadronic section. The shower reconstruction method is based on graph neural networks and makes use of a dynamic reduction network architecture. It is shown that the algorithm is able to capture and mitigate the main effects that normally hinder the reconstruction of hadronic showers using classical reconstruction methods, by compensating for fluctuations in the multiplicity, energy, and spatial distributions of the shower's constituents. The performance of the algorithm is evaluated using test beam data collected in 2018 with a prototype of the CMS HGCAL accompanied by a section of the CALICE AHCAL prototype. The capability of the method to mitigate the impact of energy leakage from the calorimeter is also demonstrated.
Submitted 30 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Simulation, Modelling and Classification of Wiki Contributors: Spotting The Good, The Bad, and The Ugly
Authors:
Silvia García Méndez,
Fátima Leal,
Benedita Malheiro,
Juan Carlos Burguillo Rial,
Bruno Veloso,
Adriana E. Chis,
Horacio González Vélez
Abstract:
Data crowdsourcing is a data acquisition process where groups of voluntary contributors feed platforms with highly relevant data ranging from news, comments, and media to knowledge and classifications. It typically processes user-generated data streams to provide and refine popular services such as wikis, collaborative maps, e-commerce sites, and social networks. Nevertheless, this modus operandi raises severe concerns regarding ill-intentioned data manipulation in adversarial environments. This paper presents a simulation, modelling, and classification approach to automatically identify human and non-human (bots) as well as benign and malign contributors by using data fabrication to balance classes within experimental data sets, data stream modelling to build and update contributor profiles and, finally, autonomic data stream classification. By employing WikiVoyage - a free worldwide wiki travel guide open to contribution from the general public - as a testbed, our approach proves to significantly boost the confidence and quality of the classifier by using a class-balanced data stream, comprising both real and synthetic data. Our empirical results show that the proposed method distinguishes between benign and malign bots as well as human contributors with a classification accuracy of up to 92 %.
Submitted 29 May, 2024;
originally announced May 2024.
-
A System for Automatic English Text Expansion
Authors:
Silvia García Méndez,
Milagros Fernández Gavilanes,
Enrique Costa Montenegro,
Jonathan Juncal Martínez,
Francisco Javier González Castaño,
Ehud Reiter
Abstract:
We present an automatic text expansion system to generate English sentences, which performs automatic Natural Language Generation (NLG) by combining linguistic rules with statistical approaches. Here, "automatic" means that the system can generate coherent and correct sentences from a minimum set of words. From its inception, the design is modular and adaptable to other languages. This adaptability is one of its greatest advantages. For English, we have created the highly precise aLexiE lexicon with wide coverage, which represents a contribution on its own. We have evaluated the resulting NLG library in an Augmentative and Alternative Communication (AAC) proof of concept, both directly (by regenerating corpus sentences) and manually (from annotations) using a popular corpus in the NLG field. We performed a second analysis by comparing the quality of text expansion in English to Spanish, using an ad-hoc Spanish-English parallel corpus. The system might also be applied to other domains such as report and news generation.
Submitted 28 May, 2024;
originally announced May 2024.
-
Interpretable classification of wiki-review streams
Authors:
Silvia García Méndez,
Fátima Leal,
Benedita Malheiro,
Juan Carlos Burguillo Rial
Abstract:
Wiki articles are created and maintained by a crowd of editors, producing a continuous stream of reviews. Reviews can take the form of additions, reverts, or both. This crowdsourcing model is exposed to manipulation since neither reviews nor editors are automatically screened and purged. To protect articles against vandalism or damage, the stream of reviews can be mined to classify reviews and profile editors in real-time. The goal of this work is to anticipate and explain which reviews to revert. This way, editors are informed why their edits will be reverted. The proposed method employs stream-based processing, updating the profiling and classification models on each incoming event. The profiling uses side and content-based features employing Natural Language Processing, and editor profiles are incrementally updated based on their reviews. Since the proposed method relies on self-explainable classification algorithms, it is possible to understand why a review has been classified as a revert or a non-revert. In addition, this work contributes an algorithm for generating synthetic data for class balancing, making the final classification fairer. The proposed online method was tested with a real data set from Wikivoyage, which was balanced through the aforementioned synthetic data generation. The results attained near-90 % values for all evaluation metrics (accuracy, precision, recall, and F-measure).
Submitted 28 May, 2024;
originally announced May 2024.
-
Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph
Authors:
Runsong Jia,
Bowen Zhang,
Sergio J. Rodríguez Méndez,
Pouya G. Omran
Abstract:
The proposed research aims to develop an innovative semantic query processing system that enables users to obtain comprehensive information about research works produced by Computer Science (CS) researchers at the Australian National University (ANU). The system integrates Large Language Models (LLMs) with the ANU Scholarly Knowledge Graph (ASKG), a structured repository of all research-related artifacts produced at ANU in the CS field. Each artifact and its parts are represented as textual nodes stored in a Knowledge Graph (KG).
To address the limitations of traditional scholarly KG construction and utilization methods, which often fail to capture fine-grained details, we propose a novel framework that integrates the Deep Document Model (DDM) for comprehensive document representation and the KG-enhanced Query Processing (KGQP) for optimized complex query handling. DDM enables a fine-grained representation of the hierarchical structure and semantic relationships within academic papers, while KGQP leverages the KG structure to improve query accuracy and efficiency with LLMs.
By combining the ASKG with LLMs, our approach enhances knowledge utilization and natural language understanding capabilities. The proposed system employs an automatic LLM-SPARQL fusion to retrieve relevant facts and textual nodes from the ASKG. Initial experiments demonstrate that our framework is superior to baseline methods in terms of retrieval accuracy and query efficiency.
We showcase the practical application of our framework in academic research scenarios, highlighting its potential to revolutionize scholarly knowledge management and discovery. This work empowers researchers to acquire and utilize knowledge from documents more effectively and provides a foundation for developing precise and reliable interactions with LLMs.
Submitted 24 May, 2024;
originally announced May 2024.
-
How a table modulates the risk of airborne transmission between facing individuals
Authors:
Oğuzhan Kaplan,
Manouk Abkarian,
Simon Mendez
Abstract:
Airborne transmission has been recognized as an important route of transmission for SARS-CoV-2, the virus responsible for the COVID-19 pandemic. While coughing and sneezing are spectacular sources of production of infected aerosols with far-reaching airflows, the prevalence of asymptomatic transmissions highlighted the importance of social activities. Gathering around a table, a common scenario of human interactions, not only fixes a typical distance between static interlocutors, but influences airborne transmission, by serving both as a flow diverter and a surface for droplet deposition. Here, we use high-fidelity large-eddy simulations to characterize short-range airborne transmission when two people face each other at a typical table. We show that compared to the natural distance travelled by free buoyant puffs and jets, the distance between the table and the emission constitutes a new length scale that modifies downward exhaled flows, common during nose breathing, speech, and laughter. When the table is close to the emitter, its main effect is to restrict the forward spread of emitted particles. However, if the distance between individuals is too short, particles reaching the recipient become more concentrated, raising transmission risks. Additionally, simulations of forceful exhalations, like laughter, demonstrate that the table acts as a filter that collects medium-sized particles that would have remained in the free jet otherwise, but can in that case be involved in the fomite transmission route. The table introduces a cut-off size for particles that depends on the inertia of the exhaled material, thereby modifying the size distribution of particles suspended in the air.
Submitted 8 May, 2024;
originally announced May 2024.
-
Physics-Guided Neural Networks for Intraventricular Vector Flow Mapping
Authors:
Hang Jung Ling,
Salomé Bru,
Julia Puig,
Florian Vixège,
Simon Mendez,
Franck Nicoud,
Pierre-Yves Courand,
Olivier Bernard,
Damien Garcia
Abstract:
Intraventricular vector flow mapping (iVFM) seeks to enhance and quantify color Doppler in cardiac imaging. In this study, we propose novel alternatives to the traditional iVFM optimization scheme by utilizing physics-informed neural networks (PINNs) and a physics-guided nnU-Net-based supervised approach. When evaluated on simulated color Doppler images derived from a patient-specific computational fluid dynamics model and in vivo Doppler acquisitions, both approaches demonstrate comparable reconstruction performance to the original iVFM algorithm. The efficiency of PINNs is boosted through dual-stage optimization and pre-optimized weights. On the other hand, the nnU-Net method excels in generalizability and real-time capabilities. Notably, nnU-Net shows superior robustness on sparse and truncated Doppler data while maintaining independence from explicit boundary conditions. Overall, our results highlight the effectiveness of these methods in reconstructing intraventricular vector blood flow. The study also suggests potential applications of PINNs in ultrafast color Doppler imaging and the incorporation of fluid dynamics equations to derive biomarkers for cardiovascular diseases based on blood flow.
Submitted 27 June, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
Viral transmission in pedestrian crowds: Coupling an open-source code assessing the risks of airborne contagion with diverse pedestrian dynamics models
Authors:
Alexandre Nicolas,
Simon Mendez
Abstract:
We study viral transmission in crowds via the short-ranged airborne pathway using a purely model-based approach. Our goal is two-pronged. Firstly, we illustrate with a concrete and pedagogical case study how to estimate the risks of new viral infections by coupling pedestrian simulations with the transmission algorithm that we recently released as open-source code. The algorithm hinges on pre-computed viral concentration maps derived from computational fluid dynamics (CFD) simulations. Secondly, we investigate to what extent the transmission risk predictions depend on the pedestrian dynamics model in use. For the simple bidirectional flow under consideration, the predictions are found to be surprisingly stable across initial conditions and models, despite the different microscopic arrangements of the simulated crowd, as long as the crowd evolves in a qualitatively similar way. On the other hand, when major changes are observed in the crowd's behaviour, notably whenever a jam occurs at the centre of the channel, the estimated risks surge drastically.
Submitted 4 December, 2023;
originally announced December 2023.
-
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Authors:
Tuan Dung Nguyen,
Yuan-Sen Ting,
Ioana Ciucă,
Charlie O'Neill,
Ze-Chang Sun,
Maja Jabłońska,
Sandor Kruk,
Ernest Perkowski,
Jack Miller,
Jason Li,
Josh Peek,
Kartheik Iyer,
Tomasz Różański,
Pranav Khetarpal,
Sharaf Zaman,
David Brodrick,
Sergio J. Rodríguez Méndez,
Thang Bui,
Alyssa Goodman,
Alberto Accomazzi,
Jill Naiman,
Jesse Cranney,
Kevin Schawinski,
UniverseTBD
Abstract:
Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Our model generates more insightful and scientifically relevant text completions and embedding extraction than state-of-the-art foundation models despite having significantly fewer parameters. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
Submitted 12 September, 2023;
originally announced September 2023.
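The AstroLLaMA abstract above reports a 30% lower perplexity than the base model. As a reminder of what that metric measures, here is a minimal sketch in Python. It assumes per-token natural-log probabilities are already available from some language model; the function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


def relative_reduction(baseline_ppl, new_ppl):
    """Fractional drop in perplexity relative to the baseline (0.30 means 30% lower)."""
    return (baseline_ppl - new_ppl) / baseline_ppl


if __name__ == "__main__":
    # Hypothetical per-token log-probabilities, invented for illustration only.
    baseline = [math.log(0.10)] * 4
    adapted = [math.log(0.20)] * 4
    print(perplexity(baseline), perplexity(adapted))
    print(relative_reduction(perplexity(baseline), perplexity(adapted)))
```

Lower perplexity means the model assigns higher probability, on average, to the held-out tokens.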
-
Syntactic Complexity Identification, Measurement, and Reduction Through Controlled Syntactic Simplification
Authors:
Muhammad Salman,
Armin Haller,
Sergio J. Rodríguez Méndez
Abstract:
Text simplification is one of the domains in Natural Language Processing (NLP) that offers an opportunity to understand the text in a simplified manner for exploration. However, it is always hard to understand and retrieve knowledge from unstructured text, which is usually in the form of compound and complex sentences. There are state-of-the-art neural network-based methods to simplify the sentences for improved readability while replacing words with plain English substitutes and summarising the sentences and paragraphs. In the Knowledge Graph (KG) creation process from unstructured text, summarising long sentences and substituting words is undesirable since this may lead to information loss. However, KG creation from text requires the extraction of all possible facts (triples) with the same mentions as in the text. In this work, we propose a controlled simplification based on the factual information in a sentence, i.e., a triple. We present a classical syntactic dependency-based approach to split and rephrase a compound and complex sentence into a set of simplified sentences. This simplification process will retain the original wording with a simple structure of possible domain facts in each sentence, i.e., triples. The paper also introduces an algorithm to identify and measure a sentence's syntactic complexity (SC), followed by reduction through a controlled syntactic simplification process. Lastly, an experiment for a dataset re-annotation is also conducted through GPT3; we aim to publish this refined corpus as a resource. This work was accepted and presented at the International Workshop on Learning with Knowledge Graphs (IWLKG) at the WSDM-2023 Conference. The code and data are available at www.github.com/sallmanm/SynSim.
Submitted 16 April, 2023;
originally announced April 2023.
-
A Pipeline for Analysing Grant Applications
Authors:
Shuaiqun Pan,
Sergio J. Rodríguez Méndez,
Kerry Taylor
Abstract:
Data mining techniques can transform massive amounts of unstructured data into quantitative data that quickly reveal insights, trends, and patterns behind the original data. In this paper, a data mining model is applied to analyse the 2019 grant applications submitted to an Australian Government research funding agency to investigate whether grant schemes successfully identify innovative project proposals, as intended. The grant applications are peer-reviewed research proposals that include specific "innovation and creativity" (IC) scores assigned by reviewers. In addition to predicting the IC score for each research proposal, we are particularly interested in understanding the vocabulary of innovative proposals. In order to solve this problem, various data mining models and feature encoding algorithms are studied and explored. As a result, we propose a model with the best performance, a Random Forest (RF) classifier over documents encoded with features denoting the presence or absence of unigrams. Specifically, the unigram terms are encoded by a modified Term Frequency - Inverse Document Frequency (TF-IDF) algorithm, which only implements the IDF part of TF-IDF. Besides the proposed model, this paper also presents a rigorous experimental pipeline for analysing grant applications, and the experimental results prove its feasibility.
Submitted 30 October, 2022;
originally announced October 2022.
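The grant-application abstract above describes encoding unigrams by presence or absence, weighted only by the IDF part of TF-IDF. A minimal sketch of that idea in Python follows. It assumes whitespace tokenisation and a common smoothed IDF form, log((1 + N) / (1 + df)) + 1; both choices, and the function names, are illustrative assumptions rather than the authors' exact algorithm.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Return {term: idf} with the smoothed form log((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        # Count each term once per document (document frequency, not term frequency).
        doc_freq.update(set(doc.lower().split()))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def encode(doc, idf, vocab):
    """Presence/absence vector: the term's IDF if it occurs in doc, else 0.0."""
    present = set(doc.lower().split())
    return [idf[t] if t in present else 0.0 for t in vocab]


if __name__ == "__main__":
    # Hypothetical toy corpus, invented for illustration only.
    corpus = ["novel graph method", "novel survey"]
    idf = idf_weights(corpus)
    vocab = sorted(idf)
    print(vocab)
    print(encode("novel novel graph", idf, vocab))
```

Because term frequency is ignored, repeating a word within a document does not change its feature value. Vectors of this shape could then be passed to any classifier, such as a Random Forest.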
-
Yttrium incorporation in Cr2AlC: On the metastable phase formation and decomposition of (Cr,Y)2AlC MAX phase thin films
Authors:
Clio Azina,
Tim Bartsch,
Damian M. Holzapfel,
Martin Dahlqvist,
Johanna Rosen,
Lukas Löfler,
Alba San Jose Mendez,
Marcus Hans,
Daniel Primetzhofer,
Jochen M. Schneider
Abstract:
Herein we report on the synthesis of a metastable (Cr,Y)2AlC MAX phase solid solution by co-sputtering from a composite Cr-Al-C and elemental Y target, at room temperature, followed by annealing. While direct high-temperature synthesis resulted in multiphase films, as evidenced by X-ray diffraction analyses, room temperature depositions, followed by annealing to 760 °C led to the formation of phase-pure (Cr,Y)2AlC by diffusion. Higher annealing temperatures caused decomposition of the metastable phase into Cr2AlC, Y5Al3, and Cr-carbides. In contrast to pure Cr2AlC, the Y-containing phase crystallizes directly in the MAX phase structure instead of first forming a disordered solid solution. Furthermore, the crystallization temperature was shown to be Y-content dependent and was increased by ~200 °C for 5 at.% Y compared to Cr2AlC. Calculations predicting the metastable phase formation of (Cr,Y)2AlC and its decomposition are in excellent agreement with the experimental findings.
Submitted 15 September, 2022;
originally announced September 2022.
-
From microscopic droplets to macroscopic crowds: Crossing the scales in models of short-range respiratory disease transmission, with application to COVID-19
Authors:
Simon Mendez,
Willy Garcia,
Alexandre Nicolas
Abstract:
Short-range exposure to airborne virus-laden respiratory droplets is now acknowledged as an effective transmission route of respiratory diseases, as exemplified by COVID-19. In order to assess the risks associated with this pathway in daily-life settings involving tens to hundreds of individuals, the chasm needs to be bridged between fluid dynamical simulations of droplet propagation and population-scale epidemiological models. We achieve this by coarse-graining microscopic droplet trajectories (simulated in various ambient flows) into spatio-temporal maps of viral concentration around the emitter and coupling these maps to field data about pedestrian crowds in different scenarios (streets, train stations, markets, queues, and street cafés). At the scale of an individual pedestrian, our results highlight the paramount importance of the velocity of the ambient air flow relative to the emitter's motion. This aerodynamic effect, which disperses infectious aerosols and thus mitigates short-range transmission risks, prevails over all other environmental variables. At the crowd's scale, the method yields a ranking of the scenarios by the risks of new infections that they present, dominated by the street cafés and then the outdoor market. While the effect of light winds on the qualitative ranking is fairly marginal, even the most modest ambient air flows dramatically lower the quantitative rates of new infections. The proposed framework was here applied with SARS-CoV-2 in mind, but its generalization to other airborne pathogens and to other (real or hypothetical) crowd arrangements is straightforward.
Submitted 5 August, 2022;
originally announced August 2022.
-
Full-volume three-component intraventricular vector flow mapping by triplane color Doppler
Authors:
Florian Vixège,
Alain Berod,
Pierre-Yves Courand,
Simon Mendez,
Franck Nicoud,
Philippe Blanc-Benon,
Didier Vray,
Damien Garcia
Abstract:
Intraventricular vector flow mapping (iVFM) is a technique for retrieving 2-D velocity vector fields of blood flow in the left ventricle. This method is based on conventional color Doppler imaging, which makes iVFM compatible with the clinical setting. We have generalized the iVFM for a three-dimensional reconstruction (3D-iVFM). 3D-iVFM is able to recover three-component velocity vector fields in a full intraventricular volume by using a clinical echocardiographic triplane mode. As with the 2-D version, the method is based on mass conservation and free-slip boundary conditions on the endocardial wall. These mechanical constraints were imposed in a least-squares minimization problem that was solved through the method of Lagrange multipliers. We validated 3D-iVFM in silico in a patient-specific CFD (computational fluid dynamics) model of cardiac flow, and tested its feasibility in vivo on volunteers. In both in silico and in vivo investigations, the dynamics of the intraventricular vortex that forms during diastole was deciphered by 3D-iVFM. Our results tend to indicate that 3D-iVFM could provide full-volume echocardiographic information on left intraventricular hemodynamics from the clinical modality of triplane color Doppler.
Submitted 10 December, 2021; v1 submitted 7 December, 2021;
originally announced December 2021.
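The 3D-iVFM abstract above mentions a least-squares problem with linear constraints solved by the method of Lagrange multipliers. The following Python toy illustrates that technique on the simplest possible case: minimising ||x - b||^2 subject to a single linear constraint c . x = d. It is a sketch under stated assumptions, not the authors' discretised scheme; the function name and the sample vectors are hypothetical.

```python
def constrained_least_squares(b, c, d):
    """Minimise ||x - b||^2 subject to c . x = d using one Lagrange multiplier.

    Stationarity of L(x, lam) = ||x - b||^2 + lam * (c . x - d) gives
    x = b - (lam / 2) * c; substituting into the constraint fixes lam.
    """
    cc = sum(ci * ci for ci in c)
    if cc == 0:
        raise ValueError("constraint vector must be non-zero")
    half_lam = (sum(ci * bi for ci, bi in zip(c, b)) - d) / cc
    return [bi - half_lam * ci for bi, ci in zip(b, c)]


if __name__ == "__main__":
    # Hypothetical data: pull the point (1, 2) onto the line x0 + x1 = 0.
    x = constrained_least_squares([1.0, 2.0], [1.0, 1.0], 0.0)
    print(x, sum(x))
```

With many constraints, the same stationarity conditions lead to a linear system in the unknowns and the multipliers, which is typically handed to a sparse solver.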
-
TNNT: The Named Entity Recognition Toolkit
Authors:
Sandaru Seneviratne,
Sergio J. Rodríguez Méndez,
Xuecheng Zhang,
Pouya G. Omran,
Kerry Taylor,
Armin Haller
Abstract:
Extraction of categorised named entities from text is a complex task given the availability of a variety of Named Entity Recognition (NER) models and the unstructured information encoded in different source document formats. Processing the documents to extract text, identifying suitable NER models for a task, and obtaining statistical information is important in data analysis to make informed decisions. This paper presents TNNT, a toolkit that automates the extraction of categorised named entities from unstructured information encoded in source documents, using diverse state-of-the-art Natural Language Processing (NLP) tools and NER models. TNNT integrates 21 different NER models as part of a Knowledge Graph Construction Pipeline (KGCP) that takes a document set as input and processes it based on the defined settings, applying the selected blocks of NER models to output the results. The toolkit generates all results with an integrated summary of the extracted entities, enabling enhanced data analysis to support the KGCP, and also, to aid further NLP tasks.
Submitted 31 August, 2021;
originally announced August 2021.
-
Characterization of a Spatially Resolved Multi-Element Laser Ablation Ion Source
Authors:
K. Murray,
C. Chambers,
D. Chen,
Z. Feng,
J. Fraser,
Y. Ito,
Y. Lan,
S. Mendez,
M. Medina Peregrina,
H. Rasiwala,
L. Richez,
N. Roy,
R. Simpson,
J. Dilling,
W. Fairbank Jr.,
A. A. Kwiatkowski,
T. Brunner
Abstract:
A laser ablation ion source (LAS) is a powerful tool by which diverse species of ions can be produced for mass spectrometer calibration, or surface study applications. It is necessary to frequently shift the laser position on the target to selectively ablate materials in a controlled manner, and to mitigate degradation of the target surface caused by ablation. An alternative to mounting the target onto a rotation wheel or x-y translation stage, is to shift the laser position with a final reflection from a motorized kinematic mirror mount. Such a system has been developed, assembled and characterized with a two axis motorized mirror and various metal targets. In the system presented here, ions are ablated from the target surface and guided by a 90 degree quadrupole bender to a Faraday cup where the ion current is measured. Spatially resolved scans of the target are produced by actuating the mirror motors, thus moving the laser spot across the target, and performing synchronous measurements of the ion current to construct 2D images of a target surface which can be up to 50 mm in diameter. The spatial resolution of the system has been measured by scanning the interfaces between metals such as steel and niobium, where it was demonstrated that the LAS can selectively ablate an area of diameter ≈50 μm. This work informs the development of subsequent LAS systems, that are intended to serve as multi-element ion sources for commercial and custom-built time-of-flight mass spectrometers, or to selectively study surface specific regions of samples.
Submitted 17 November, 2021; v1 submitted 23 August, 2021;
originally announced August 2021.
-
Physics-constrained intraventricular vector flow mapping by color Doppler
Authors:
Florian Vixège,
Alain Berod,
Yunyun Sun,
Simon Mendez,
Olivier Bernard,
Nicolas Ducros,
Pierre-Yves Courand,
Franck Nicoud,
Damien Garcia
Abstract:
Color Doppler by transthoracic echocardiography creates 2-D fan-shaped maps of blood velocities in the cardiac cavities. It is a one-component velocimetric technique since it only returns the velocity components parallel to the ultrasound beams. Intraventricular vector flow mapping (iVFM) is a method to recover the blood velocity vectors from the Doppler scalar fields in an echocardiographic three-chamber view. We improved our iVFM numerical scheme by imposing physical constraints. The iVFM consisted of minimizing regularized Doppler residuals subject to the condition that two fluid-dynamics constraints were satisfied, namely planar mass conservation and free-slip boundary conditions. The optimization problem was solved by using the Lagrange multiplier method. A finite-difference discretization of the optimization problem, written in the polar coordinate system centered on the cardiac ultrasound probe, led to a sparse linear system. The single regularization parameter was determined automatically so that the method requires no supervision. The physics-constrained method was validated using realistic intracardiac flow data from a patient-specific CFD model. The numerical evaluations showed that the iVFM-derived velocity vectors were in very good agreement with the CFD-based original velocities, with relative errors ranging between 0.3% and 12%. We calculated two macroscopic measures of flow in the cardiac region of interest, the mean vorticity and mean stream function, and observed an excellent concordance between physics-constrained iVFM and CFD. The capability of physics-constrained iVFM was finally tested with in vivo color Doppler data acquired in patients routinely examined in the echocardiographic laboratory. The vortex that forms during the rapid filling was deciphered. The physics-constrained iVFM algorithm is ready for pilot clinical studies.
Submitted 7 December, 2021; v1 submitted 16 July, 2021;
originally announced July 2021.
-
Model-based assessment of the risks of viral transmission in non-confined crowds
Authors:
Willy Garcia,
Simon Mendez,
Baptiste Fray,
Alexandre Nicolas
Abstract:
This work aims to assess the risks of Covid-19 disease spread in diverse daily-life situations (referred to as scenarios) involving crowds of maskless pedestrians, mostly outdoors. More concretely, we develop a method to infer the global number of new infections from patchy observations of pedestrians. The method relies on ad hoc spatially resolved models for disease transmission via virus-laden respiratory droplets, which are fit to existing exposure studies about Covid-19. The approach is applied to the detailed field data about pedestrian trajectories and orientations that we acquired during the pandemic. This allows us to rank the investigated scenarios by the infection risks that they present; importantly, the obtained hierarchy of risks is conserved across all our transmission models (except the most pessimistic ones): Street cafés present the largest average rate of new infections caused by an attendant, followed by busy outdoor markets, and then metro and train stations, whereas the risks incurred while walking on fairly busy streets (average density around 0.1 person/m$^2$) are comparatively quite low. While none of our ad hoc models can claim accuracy, their converging predictions lend credence to these findings. In scenarios with a moving crowd, we find that density is the main factor influencing the estimated infection rate. Finally, our study explores the efficiency of street and venue redesigns in mitigating the viral spread: While the benefits of enforcing one-way foot traffic in (wide) walkways are unclear, changing the geometry of queues substantially affects disease transmission risks.
Submitted 3 November, 2021; v1 submitted 15 December, 2020;
originally announced December 2020.
-
Speech can produce jet-like transport relevant to asymptomatic spreading of virus
Authors:
Manouk Abkarian,
Simon Mendez,
Nan Xue,
Fan Yang,
Howard A. Stone
Abstract:
Many scientific reports document that asymptomatic and presymptomatic individuals contribute to the spread of COVID-19, probably during conversations in social interactions. Droplet emission occurs during speech, yet few studies document the flow to provide the transport mechanism. This lack of understanding prevents informed public health guidance for risk reduction and mitigation strategies, e.g. the "six-foot rule". Here we analyze flows during breathing and speaking, including phonetic features, using order-of-magnitude estimates, numerical simulations, and laboratory experiments. We document the spatio-temporal structure of the expelled air flow. Phonetic characteristics of plosive sounds like 'P' lead to enhanced directed transport, including jet-like flows that entrain the surrounding air. We highlight three distinct temporal scaling laws for the transport distance of exhaled material, including (i) transport over a short distance ($<$ 0.5 m) in a fraction of a second, with large angular variations due to the complexity of speech, (ii) a longer distance, approximately 1 m, where directed transport is driven by individual vortical puffs corresponding to plosive sounds, and (iii) a distance out to about 2 m, or even further, where sequential plosives in a sentence, corresponding effectively to a train of puffs, create conical, jet-like flows. The latter dictates the long-time transport in a conversation. We believe that this work will inform thinking about the role of ventilation and aerosol transport in disease transmission for humans and other animals, and yield a better understanding of linguistic aerodynamics, i.e., aerophonetics.
Submitted 25 September, 2020; v1 submitted 18 June, 2020;
originally announced June 2020.
-
Tools for Analyzing Parallel I/O
Authors:
Julian M. Kunkel,
Eugen Betke,
Matt Bryson,
Philip Carns,
Rosemary Francis,
Wolfgang Frings,
Roland Laifer,
Sandra Mendez
Abstract:
Parallel application I/O performance often does not meet user expectations. Additionally, slight access pattern modifications may lead to significant changes in performance due to complex interactions between hardware and software. These challenges call for sophisticated tools to capture, analyze, understand, and tune application I/O. In this paper, we highlight advances in monitoring tools to help address this problem. We also describe best practices, identify issues in measurement and analysis, and provide practical approaches to translate parallel I/O analysis into actionable outcomes for users, facility operators, and researchers.
Submitted 18 July, 2018; v1 submitted 13 July, 2018;
originally announced July 2018.
-
A new look at blood shear-thinning
Authors:
Luca Lanotte,
Johannes Mauer,
Simon Mendez,
Dmitry A. Fedosov,
Jean-Marc Fromental,
Viviana Clavería,
Franck Nicoud,
Gerhard Gompper,
Manouk Abkarian
Abstract:
Blood viscosity decreases with shear stress, a property essential for an efficient perfusion of the vascular tree. Shear-thinning is intimately related to the dynamics and mutual interactions of red blood cells (RBCs), the major constituents of blood. Our work explores RBC dynamics under physiologically relevant conditions of flow strength, outer fluid viscosity and volume fraction. Our results contradict the current paradigm stating that RBCs should align and elongate in the flow direction thanks to their membrane circulation around their center of mass, reducing flow-line disturbances. On the contrary, we observe, both experimentally and with simulations, rich morphological transitions that relate to global blood rheology. For increasing shear stresses, RBCs successively tumble, roll, deform into rolling stomatocytes and finally adopt highly deformed and polylobed shapes even for semi-dilute volume fractions analogous to microcirculatory values. Our study suggests that any pathological change in plasma composition, RBC cytosol viscosity or membrane mechanical properties will impact the onset of shape transitions and should play a central role in pathological blood rheology and flow behavior.
Submitted 12 August, 2016;
originally announced August 2016.
-
On the existence and analyticity of solitary waves solutions to a two-dimensional Benjamin-Ono equation
Authors:
Germán Preciado López,
Félix H. Soriano Méndez
Abstract:
We show the existence, regularity and analyticity of solitary waves associated to the following equation \begin{eqnarray*}
(u_t+u^{p}u_x+ \mathcal H\partial_x^2u+ λ\mathcal H\partial_y^2u)_x +μu_{yy}=0, \end{eqnarray*} where $\mathcal H$ is the Hilbert transform with respect to $x$ and $λ$ and $μ$ are nonnegative real numbers, not simultaneously zero.
Submitted 14 March, 2015;
originally announced March 2015.
-
On the Cauchy problem of a two-dimensional Benjamin-Ono equation
Authors:
Germán Preciado López,
Félix H. Soriano Méndez
Abstract:
In this work we shall show that the Cauchy problem \begin{equation} \left\{ \begin{aligned} &(u_t+u^pu_x+\mathcal H\partial_x^2u+ α\mathcal H\partial_y^2u )_x - γu_{yy}=0, \quad p\in\mathbb{N}, \\
&u(0;x,y)=φ(x,y)
\end{aligned}
\right. \end{equation} is locally well-posed in the Sobolev spaces $H^s(\mathbb{R}^2)$, $X^s$ and weighted spaces $X_s(w^2)$, for $s>2$.
Submitted 14 March, 2015;
originally announced March 2015.
-
On the damped oscillations of an elastic quasi-circular membrane in a two-dimensional incompressible fluid
Authors:
Marco Martins Afonso,
Simon Mendez,
Franck Nicoud
Abstract:
We propose a procedure - partly analytical and partly numerical - to find the frequency and the damping rate of the small-amplitude oscillations of a massless elastic capsule immersed in a two-dimensional viscous incompressible fluid. The unsteady Stokes equations for the stream function are decomposed onto normal modes for the angular and temporal variables, leading to a fourth-order linear ordinary differential equation in the radial variable. The forcing terms are dictated by the properties of the membrane, and result in jump conditions at the interface between the internal and external media. The equation can be solved numerically, and an excellent agreement is found with a fully-computational approach we developed in parallel. Comparisons are also shown with the results available in the scientific literature for drops, and a model based on the concept of embarked fluid is presented, which allows for a good representation of the results and a consistent interpretation of the underlying physics.
Submitted 6 March, 2014;
originally announced March 2014.
-
Elucidating the turbulence nature of the intracardiac flow: from medical images to multi-cycle Large Eddy Simulations
Authors:
Christophe Chnafa,
Simon Mendez,
Franck Nicoud
Abstract:
This brief article accompanies a fluid dynamics video presenting the results of a large-eddy simulation of the flow in a realistic left heart. The left heart geometry, from the pulmonary veins to the aortic root, is extracted from medical images and the endocardium movements are reconstructed through image registration. Large-eddy simulations are thus performed in a patient-specific heart model, where the patient-specific movements of the geometry are prescribed. The flow obtained is intermittent, showing both in the left atrium and in the left ventricle turbulent spots correlated to flow decelerations.
Submitted 11 October, 2013;
originally announced October 2013.
-
Quid Pro Quo: A Mechanism for Fair Collaboration in Networked Systems
Authors:
Agustín Santos Méndez,
Antonio Fernández Anta,
Luis López Fernández
Abstract:
Collaboration may be understood as the execution of coordinated tasks (in the most general sense) by groups of users, who cooperate to achieve a common goal. Collaboration is a fundamental assumption and requirement for the correct operation of many communication systems. The main challenge when creating collaborative systems in a decentralized manner is dealing with the fact that users may behave in selfish ways, trying to obtain the benefits of the tasks but without participating in their execution. In this context, Game Theory has been instrumental to model collaborative systems and the task allocation problem, and to design mechanisms for optimal allocation of tasks. In this paper, we revise the classical assumptions and propose a new approach to this problem. First, we establish a system model based on heterogeneous nodes (users, players), and propose a basic distributed mechanism so that, when a new task appears, it is assigned to the most suitable node. The classical technique for compensating a node that executes a task is the use of payments (which in most networks are hard or impossible to implement). Instead, we propose a distributed mechanism for the optimal allocation of tasks without payments. We prove this mechanism to be robust even in the presence of independent selfish or rationally limited players. Additionally, our model is based on very weak assumptions, which makes the proposed mechanisms suitable for implementation in networked systems (e.g., the Internet).
Submitted 25 July, 2012;
originally announced July 2012.