Search | arXiv e-print repository

MultiADE: A Multi-domain Benchmark for Adverse Drug Event Extraction

Authors: Xiang Dai, Sarvnaz Karimi, Abeed Sarker, Ben Hachey, Cecile Paris

Abstract: Objective. Active adverse event surveillance monitors Adverse Drug Events (ADE) from different data sources, such as electronic health records, medical literature, social media and search engine logs. Over the years, many datasets have been created, and shared tasks have been organised, to facilitate active adverse event surveillance. However, most, if not all, datasets or shared tasks focus on extracting ADEs from a particular type of text. Domain generalisation, the ability of a machine learning model to perform well on new, unseen domains (text types), is under-explored. Given the rapid advancements in natural language processing, one unanswered question is how far we are from having a single ADE extraction model that is effective on various types of text, such as scientific literature and social media posts. Methods. We contribute to answering this question by building a multi-domain benchmark for adverse drug event extraction, which we named MultiADE. The new benchmark comprises several existing datasets sampled from different text types and our newly created dataset, CADECv2, which is an extension of CADEC (Karimi, et al., 2015), covering online posts regarding more diverse drugs than CADEC. Our new dataset is carefully annotated by human annotators following detailed annotation guidelines. Conclusion. Our benchmark results show that the generalisation of the trained models is far from perfect, making them infeasible to deploy across different types of text. In addition, although intermediate transfer learning is a promising approach to utilising existing resources, further investigation is needed on methods of domain adaptation, particularly cost-effective methods to select useful training instances.

Submitted 28 May, 2024; originally announced May 2024.

Comments: Under review; feedback welcome

arXiv:2404.09270 [pdf, other]

The Next Generation of MeV Energy X-ray Sources for use in the Inspection of Additively Manufactured Parts for Industry

Authors: C. Thornton, S. Karimi, S. Glenn, W. D. Brown, N. Draganic, M. Skeate, M. Ferrucci, Q. Chen, R. Jacob, K. Nakamura, T. Ostermayr, J. van Tilborg, C. Armstrong, O. J. Finlay, N. Turner, S. Glanvill, H. Martz, C. Geddes

Abstract: For the first time, we demonstrate the application of an inverse Compton scattering X-ray Source, driven by a laser-plasma accelerator, to image an additively manufactured component. X-rays with a mean energy of 380 keV were produced and used to image an additively manufactured part made of an Inconel (Nickel 718) alloy. Because inverse Compton scattering driven by laser-plasma acceleration produces high-energy X-rays while maintaining a focal spot size on the order of a micron, the source can provide several benefits over conventional X-ray production methods, particularly when imaging superalloy parts, with the potential to revolutionise what can be inspected.

Submitted 14 April, 2024; originally announced April 2024.

arXiv:2403.09997 [pdf, other]

Identifying Health Risks from Family History: A Survey of Natural Language Processing Techniques

Authors: Xiang Dai, Sarvnaz Karimi, Nathan O'Callaghan

Abstract: Electronic health records include information on patients' status and medical history, which could cover the history of diseases and disorders that could be hereditary. One important use of family history information is in precision health, where the goal is to keep the population healthy with preventative measures. Natural Language Processing (NLP) and machine learning techniques can help extract information that allows health professionals to identify health risks before a condition develops in a patient's later years, saving lives and reducing healthcare costs. We survey the literature on the techniques from the NLP field that have been developed to utilise digital health records to identify risks of familial diseases. We highlight that rule-based methods are heavily investigated and are still actively used for family history extraction. Still, more recent efforts have been put into building neural models based on large-scale pre-trained language models. In addition to the areas where NLP has successfully been utilised, we also identify the areas where more research is needed to unlock the value of patients' records regarding data collection, task formulation and downstream applications.

Submitted 14 March, 2024; originally announced March 2024.

Comments: Under Review

arXiv:2312.06210 [pdf, other]

Measurement of the depth-dependent local dynamics in thin polymer films through rejuvenation of ultrastable glasses

Authors: Saba Karimi, Junjie Yin, Thomas Salez, James A Forrest

Abstract: We measure the isothermal rejuvenation of stable glass films of poly(styrene) and poly(methylmethacrylate). We demonstrate that the propagation of the front responsible for the transformation to a supercooled-liquid state can serve as a highly localized probe of the local supercooled dynamics. We use this connection to probe the depth-dependent relaxation rate with nanometric precision for a series of polystyrene films over a range of temperatures near the bulk glass transition temperature. The analysis shows the spatial extent of enhanced surface mobility and reveals the existence of an unexpected large dynamical length scale in the system. The results are compared with the cooperative-string model for glassy dynamics. The data reveals that the film-thickness dependence of whole film properties arises only from the volume fraction of the near-surface region. While the dynamics at the middle of the samples shows the expected bulk-like temperature dependence, the near-surface region shows very little dependence on temperature.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2311.13036 [pdf, other]

Favour: FAst Variance Operator for Uncertainty Rating

Authors: Thomas D. Ahle, Sahar Karimi, Peter Tak Peter Tang

Abstract: Bayesian Neural Networks (BNN) have emerged as a crucial approach for interpreting ML predictions. By sampling from the posterior distribution, data scientists may estimate the uncertainty of an inference. Unfortunately, many inference samples are often needed, the overhead of which greatly hinders BNNs' wide adoption. To mitigate this, previous work proposed propagating the first and second moments of the posterior directly through the network. However, on its own this method is even slower than sampling, so the propagated variance needs to be approximated, for example by assuming independence between neural nodes. The resulting trade-off between quality and inference time did not match even plain Monte Carlo sampling. Our contribution is a more principled variance propagation framework based on "spiked covariance matrices", which smoothly interpolates between quality and inference time. This is made possible by a new fast algorithm for updating a diagonal-plus-low-rank matrix approximation under various operations. We tested our algorithm against sampling-based MC Dropout and Variational Inference on a number of downstream uncertainty-themed tasks, such as calibration and out-of-distribution testing. We find that Favour is as fast as performing 2-3 inference samples, while matching the performance of 10-100 samples. In summary, this work enables the use of BNN in the realm of performance-critical tasks where they have previously been out of reach.

Submitted 21 November, 2023; originally announced November 2023.

arXiv:2309.04250 [pdf, other]

Provider Fairness and Beyond-Accuracy Trade-offs in Recommender Systems

Authors: Saeedeh Karimi, Hossein A. Rahmani, Mohammadmehdi Naghiaei, Leila Safari

Abstract: Recommender systems, while transformative in online user experiences, have raised concerns over potential provider-side fairness issues. These systems may inadvertently favor popular items, thereby marginalizing less popular ones and compromising provider fairness. While previous research has recognized provider-side fairness issues, the investigation into how these biases affect beyond-accuracy aspects of recommendation systems - such as diversity, novelty, coverage, and serendipity - has been less emphasized. In this paper, we address this gap by introducing a simple yet effective post-processing re-ranking model that prioritizes provider fairness, while simultaneously maintaining user relevance and recommendation quality. We then conduct an in-depth evaluation of the model's impact on various aspects of recommendation quality across multiple datasets. Specifically, we apply the post-processing algorithm to four distinct recommendation models across four varied domain datasets, assessing the improvement in each metric, encompassing both accuracy and beyond-accuracy aspects. This comprehensive analysis allows us to gauge the effectiveness of our approach in mitigating provider biases. Our findings underscore the effectiveness of the adopted method in improving provider fairness and recommendation quality. They also provide valuable insights into the trade-offs involved in achieving fairness in recommender systems, contributing to a more nuanced understanding of this complex issue.

Submitted 8 September, 2023; originally announced September 2023.

Comments: FAccTRec at RecSys 2023

arXiv:2306.09145 [pdf, other]

doi 10.1016/j.techsoc.2023.102260

Artificial intelligence adoption in the physical sciences, natural sciences, life sciences, social sciences and the arts and humanities: A bibliometric analysis of research publications from 1960-2021

Authors: Stefan Hajkowicz, Conrad Sanderson, Sarvnaz Karimi, Alexandra Bratanova, Claire Naughtin

Abstract: Analysing historical patterns of artificial intelligence (AI) adoption can inform decisions about AI capability uplift, but research to date has provided a limited view of AI adoption across various fields of research. In this study we examine worldwide adoption of AI technology within 333 fields of research during 1960-2021. We do this by using bibliometric analysis with 137 million peer-reviewed publications captured in The Lens database. We define AI using a list of 214 phrases developed by expert working groups at the Organisation for Economic Cooperation and Development (OECD). We found that 3.1 million of the 137 million peer-reviewed research publications during the entire period were AI-related, with a surge in AI adoption across practically all research fields (physical science, natural science, life science, social science and the arts and humanities) in recent years. The diffusion of AI beyond computer science was early, rapid and widespread. In 1960 14% of 333 research fields were related to AI (many in computer science), but this increased to cover over half of all research fields by 1972, over 80% by 1986 and over 98% in current times. We note AI has experienced boom-bust cycles historically: the AI "springs" and "winters". We conclude that the context of the current surge appears different, and that interdisciplinary AI application is likely to be sustained.

Submitted 15 June, 2023; originally announced June 2023.

Journal ref: Technology in Society, Vol. 74, 2023

arXiv:2305.19542 [pdf]

Shallow Depth Factoring Based on Quantum Feasibility Labeling and Variational Quantum Search

Authors: Imran Khan Tutul, Sara Karimi, Mohammadreza Soltaninia, Junpeng Zhan

Abstract: Large integer factorization is a prominent research challenge, particularly in the context of quantum computing. This holds significant importance, especially in information security that relies on public key cryptosystems. The classical computation of prime factors for an integer has exponential time complexity. Quantum computing offers the potential for significantly faster computational processes compared to classical processors. In this paper, we propose a new quantum algorithm, Shallow Depth Factoring (SDF), to factor a biprime integer. SDF consists of three steps. First, it converts a factoring problem to an optimization problem without an objective function. Then, it uses a Quantum Feasibility Labeling (QFL) method to label every possible solution according to whether it is feasible or infeasible for the optimization problem. Finally, it employs the Variational Quantum Search (VQS) to find all feasible solutions. The SDF utilizes shallow-depth quantum circuits for efficient factorization, with the circuit depth scaling linearly as the integer to be factorized increases. Through minimizing the number of gates in the circuit, the algorithm enhances feasibility and reduces vulnerability to errors.

Submitted 21 October, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

Comments: 10 pages, 3 figures

arXiv:2211.13819 [pdf, other]

Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods

Authors: Xiang Dai, Sarvnaz Karimi

Abstract: Information Extraction from scientific literature can be challenging due to the highly specialised nature of such text. We describe our entity recognition methods developed as part of the DEAL (Detecting Entities in the Astrophysics Literature) shared task. The aim of the task is to build a system that can identify Named Entities in a dataset composed of scholarly articles from astrophysics literature. We planned our participation such that it enables us to conduct an empirical comparison between word-based tagging and span-based classification methods. When evaluated on two hidden test sets provided by the organizer, our best-performing submission achieved $F_1$ scores of 0.8307 (validation phase) and 0.7990 (testing phase).

Submitted 24 November, 2022; originally announced November 2022.

Comments: AACL-IJCNLP Workshop on Information Extraction from Scientific Publications (WIESP 2022)

arXiv:2207.09834 [pdf, ps, other]

doi 10.1103/PhysRevD.106.066016

Couplings of order six in the gauge field strength and the second fundamental form on a D$_p$-brane at order $α'^2$

Authors: Mohammad R. Garousi, Saman Karimi

Abstract: Using the assumption that the independent gauge invariant couplings on the world-volume of the non-perturbative objects in the string theory are independent of the background, we find the four and the six gauge field strength and/or the second fundamental form couplings on the world volume of a D$_p$-brane in the superstring theory at order $α'^2$ in the normalization that $F$ is dimensionless. We have found them by considering the particular background which has one circle and by imposing the corresponding T-duality constraint on the independent couplings. In particular, we find that there are 12+146 independent gauge invariant couplings at this order, and the T-duality constraint can fix 150 of them. We show that these couplings are fully consistent with the partial results in the literature. This comparison also fixes the remaining 8 couplings.

Submitted 12 September, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

Comments: 27 pages, latex file, no figure; it appears in PRD

arXiv:2201.03679 [pdf]

Informal Persian Universal Dependency Treebank

Authors: Roya Kabiri, Simin Karimi, Mihai Surdeanu

Abstract: This paper presents the phonological, morphological, and syntactic distinctions between formal and informal Persian, showing that these two variants have fundamental differences that cannot be attributed solely to pronunciation discrepancies. Given that informal Persian exhibits particular characteristics, any computational model trained on formal Persian is unlikely to transfer well to informal Persian, necessitating the creation of dedicated treebanks for this variety. We thus detail the development of the open-source Informal Persian Universal Dependency Treebank, a new treebank annotated within the Universal Dependencies scheme. We then investigate the parsing of informal Persian by training two dependency parsers on existing formal treebanks and evaluating them on out-of-domain data, i.e. the development set of our informal treebank. Our results show that parsers experience a substantial performance drop when we move across the two domains, as they face more unknown tokens and structures and fail to generalize well. Furthermore, the dependency relations whose performance deteriorates the most represent the unique properties of the informal variant. The ultimate goal of this study, and its broader impact, is to provide a stepping-stone towards revealing the significance of informal variants of languages, which have been widely overlooked in natural language processing tools across languages.

Submitted 10 January, 2022; originally announced January 2022.

arXiv:2111.11613 [pdf, ps, other]

Nonlinear conjugate gradient for smooth convex functions

Authors: Sahar Karimi, Stephen Vavasis

Abstract: The method of nonlinear conjugate gradients (NCG) is widely used in practice for unconstrained optimization, but it satisfies weak complexity bounds at best when applied to smooth convex functions. In contrast, Nesterov's accelerated gradient (AG) method is optimal up to constant factors for this class. However, when specialized to quadratic functions, conjugate gradient is optimal in a strong sense among function-gradient methods. Therefore, there is seemingly a gap in the menu of available algorithms: NCG, the optimal algorithm for quadratic functions that also exhibits good practical performance for general functions, has poor complexity bounds compared to AG. We propose an NCG method called C+AG ("conjugate plus accelerated gradient") to close this gap, that is, it is optimal for quadratic functions and still satisfies the best possible complexity bound for more general smooth convex functions. It takes conjugate gradient steps until insufficient progress is made, at which time it switches to accelerated gradient steps, and later retries conjugate gradient. The proposed method has the following theoretical properties: (i) It is identical to linear conjugate gradient (and hence terminates finitely) if the objective function is quadratic; (ii) Its running-time bound is $O(\epsilon^{-1/2})$ gradient evaluations for an $L$-smooth convex function, where $\epsilon$ is the desired residual reduction; (iii) Its running-time bound is $O(\sqrt{L/\ell}\ln(1/\epsilon))$ if the function is both $L$-smooth and $\ell$-strongly convex. In computational tests, the function-gradient evaluation count for the C+AG method typically behaves as whichever is better of AG or classical NCG. In some test cases it outperforms both.

Submitted 3 January, 2024; v1 submitted 22 November, 2021; originally announced November 2021.

MSC Class: 65K10; 90C25

arXiv:2110.03015 [pdf, ps, other]

Fast and flexible preconditioners for solving multilinear systems

Authors: Eisa Khosravi Dehdezi, Saeed Karimi

Abstract: This paper investigates a type of fast and flexible preconditioner for solving the multilinear system $\mathcal{A}\textbf{x}^{m-1}=\textbf{b}$ with an $\mathcal{M}$-tensor $\mathcal{A}$ and obtains some important convergence theorems for preconditioned Jacobi, Gauss-Seidel and SOR type iterative methods. The main results theoretically prove that the preconditioners can accelerate the convergence of iterations. Numerical examples are presented to verify the efficiency of the proposed preconditioned methods.

Submitted 6 October, 2021; originally announced October 2021.

arXiv:2110.01482 [pdf, ps, other]

doi 10.1140/epjc/s10052-022-10735-w

Non-Gaussianity and Secondary Gravitational Waves from Primordial Black Holes Production in $α$-attractor Inflation

Authors: Kazem Rezazadeh, Zeinab Teimoori, Saeid Karimi, Kayoomars Karami

Abstract: We study the non-Gaussianity and secondary Gravitational Waves (GWs) in the process of the Primordial Black Holes (PBHs) production from inflation. In our work, we focus on the $α$-attractor inflation model in which a tiny bump in the inflaton potential enhances the amplitude of the curvature perturbations at some scales and consequently leads to the PBHs production with different mass scales. We implement the computational code BINGO which calculates the non-Gaussianity parameter in different triangle configurations. Our examination implies that in this setup, the non-Gaussianity gets amplified significantly in the equilateral shape around the scales in which the power spectrum of the scalar perturbations undergoes a sharp declination. The imprints of these non-Gaussianities can be probed in the scales corresponding to the BBN and $μ$-distortion events, or in smaller scales, and detection of such signatures in the future observations may confirm the idea of our model for the generation of PBHs or rule it out. Moreover, we investigate the secondary GWs in this framework and show that in our model, the peak of the present fractional energy density is obtained as $Ω_{\rm GW0} \sim 10^{-8}$ at different frequencies which depends on the model parameters. These results lie well within the sensitivity region of some GWs detectors at some frequencies, and therefore the observational compatibility of our model can be evaluated by the forthcoming data from these detectors. We further provide some estimations for the tilts of the induced GWs spectrum in the different intervals of frequency, and demonstrate that the spectrum obeys the power-law relation $Ω_{\rm GW0}\sim f^{n}$ in those frequency bands.

Submitted 30 August, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

Comments: 38 pages, 9 figures

Journal ref: Eur. Phys. J. C 82, 758 (2022)

arXiv:2109.11276 [pdf, ps, other]

A new block diagonal preconditioner for a class of $3\times 3$ block saddle point problems

Authors: Maryam Abdolmaleki, Saeed Karimi, Davod Khojasteh Salkuyeh

Abstract: We study the performance of a new block preconditioner for a class of $3\times3$ block saddle point problems which arise from finite element methods for solving time-dependent Maxwell equations and some other practical problems. We also estimate the lower and upper bounds of eigenvalues of the preconditioned matrix. Finally, we use our new preconditioner to accelerate the convergence of the GMRES method, which shows the effectiveness of the preconditioner.

Submitted 23 September, 2021; originally announced September 2021.

Comments: 14 pages, 2021, to appear in Mediterranean Journal of Mathematics

MSC Class: 65F10; 65F50; 65F08

arXiv:2109.02767 [pdf]

A Recursive Delay Estimation Algorithm for Linear Multivariable Systems with Time-varying Delays

Authors: Iman Shafikhani, Hazhar Sufi Karimi, Mohammad Mohammadian, Amin Ramezani, Hamid Reza Momeni

Abstract: Time delay estimation plays a critical role in control, stabilization and state estimation of many practical systems with time delays. In this paper, we propose a method to estimate delay for discrete time linear multiple-input multiple-output systems with time-varying input delays. This method is purposefully given for situations where only a limited amount of information is available for the system. Although this approach is primarily developed in a deterministic framework, it can also be applied to noisy data under special circumstances. In addition, switched linear autoregressive models with exogenous inputs are introduced as possible applications of the presented algorithm provided that the switching frequencies are small. Finally, the effectiveness of the algorithm is illustrated by two numerical examples.

Submitted 6 September, 2021; originally announced September 2021.

arXiv:2106.10220 [pdf, other]

Semantic navigation with domain knowledge

Authors: Rafael Gomes Braga, Sina Karimi, Ulrich Dah-Achinanon, Ivanka Iordanova, David St-Onge

Abstract: Several deployment locations of mobile robotic systems are human made (e.g. urban firefighting, building inspection, property security) and the manager may have access to domain-specific knowledge about the place, which can provide semantic contextual information allowing better reasoning and decision making. In this paper we propose a system that allows a mobile robot to operate in a location-aware and operator-friendly way, by leveraging semantic information from the deployment location and integrating it into the robot's localization and navigation systems. We integrate Building Information Models (BIM) into the Robot Operating System (ROS), to generate topological and metric maps fed to a layered path planner (global and local). A map merging algorithm integrates newly discovered obstacles into the metric map, while a UWB-based localization system detects equipment to be registered back into the semantic database. The results are validated in simulation and real-life deployments in buildings and construction sites.

Submitted 18 June, 2021; originally announced June 2021.

Comments: 12 pages, 10 figures. arXiv admin note: substantial text overlap with arXiv:2104.10296

arXiv:2104.10296 [pdf, other]

Semantic Navigation Using Building Information on Construction Sites

Authors: Sina Karimi, Rafael Gomes Braga, Ivanka Iordanova, David St-Onge

Abstract: With the growth in automated data collection of construction projects, the need for semantic navigation of mobile robots is increasing. In this paper, we propose an infrastructure to leverage building-related information for smarter, safer and more precise robot navigation during the construction phase. Our use of Building Information Models (BIM) in robot navigation is twofold: (1) the intuitive semantic information enables non-experts to deploy robots and (2) the semantic data exposed to the navigation system allows optimal path planning (not necessarily the shortest one). Our Building Information Robotic System (BIRS) uses Industry Foundation Classes (IFC) as the interoperable data format between BIM and the Robotic Operating System (ROS). BIRS generates topological and metric maps from BIM for ROS usage. An optimal path planner, integrating critical components for construction assessment, is proposed using a cascade strategy (global versus local). The results are validated through a series of experiments in construction sites.

Submitted 20 April, 2021; originally announced April 2021.

Comments: 7 pages, 7 figures, conference

arXiv:2104.10239 [pdf]

An ontology-based approach to data exchanges for robot navigation on construction sites

Authors: Sina Karimi, Ivanka Iordanova, David St-Onge

Abstract: With growth in the use of autonomous Unmanned Ground Vehicle (UGV) for automated data collection from construction projects, the problem of inter-disciplinary semantic data sharing and exchanges between construction and robotic domains has attracted construction stakeholders' attention. Cross-domain data translation requires detailed specifications especially when it comes to semantic data translation. Building Information Modeling (BIM) and Geographic Information System (GIS) are the two technologies to capture and store construction data for indoor structure and outdoor environment respectively. In the absence of a standard format for data exchanges between the construction and robotic domains, the tools of both industries are yet to be integrated in a coherent deployment infrastructure. Hence, the semantics of BIM-GIS cannot be automatically integrated by any robotic platform. To enable semantic data transfer across domains, semantic web technology has been widely used in multidisciplinary areas for interoperability. We exploit it to pave the way to a smarter, quicker and more precise robot navigation on job-sites. This paper develops a semantic web ontology integrating robot navigation and data collection to convey the meanings from BIM-GIS to the robot. The proposed Building Information Robotic System (BIRS) provides construction data that are semantically transferred to the robotic platform and can be used by the robot navigation software stack on construction sites.
To reach this objective, we first need to bridge the knowledge representation between construction and robotic domains. Then, we develop a semantic database to integrate with Robot Operating System (ROS) which can communicate with the robot and the navigation system in order to provide the robot with semantic building data at each step of data collection. Finally, the proposed system is validated through a case study.

Submitted 20 April, 2021; originally announced April 2021.

Comments: 21 pages, 12 figures, journal paper

arXiv:2103.03308 [pdf, ps, other]

doi 10.1016/j.physd.2022.133291

The occurrence of riddled basins and blowout bifurcations in a parametric nonlinear system

Authors: M. Rabiee, F. H. Ghane, M. Zaj, S. Karimi

Abstract: In this paper, a two-parameter family $F_{β_1,β_2}$ of maps of the plane leaving two different subspaces invariant is studied. We observe that our model exhibits two chaotic attractors $A_i$, $i=0,1$, lying in these invariant subspaces and identify the parameters at which $A_i$ has a locally riddled basin of attraction or becomes a chaotic saddle. Then, the occurrence of riddled basin in the global sense is investigated in an open region of $β_1β_2$-plane. We semi-conjugate our system to a random walk model and define a fractal boundary which separates the basins of attraction of the two chaotic attractors, then we describe riddled basin in detail. We show that the model undergoes a sequence of bifurcations: "a blowout bifurcation", "a bifurcation to normal repulsion" and "a bifurcation by creating a new chaotic attractor with an intermingled basin". Numerical simulations are presented graphically to confirm the validity of our results.

Submitted 4 March, 2021; originally announced March 2021.

Comments: 26 pages, 15 figures

MSC Class: 37C05; 37C40; 37C70; 37H15; 37E05; 37D35

arXiv:2012.04883 [pdf, ps, other]

On Distributed Algorithms for Minimum Dominating Set problem, from theory to application

Authors: Sharareh Alipour, Ehsan Futuhi, Shayan Karimi

Abstract: In this paper, we propose a distributed algorithm for the minimum dominating set problem. For some special networks, we prove theoretically that the achieved answer by our proposed algorithm is a constant approximation factor of the exact answer. This problem arises naturally in social networks, for example in news spreading, avoiding rumor spreading and recommendation spreading. So we implement our algorithm on massive social networks and compare our results with the state-of-the-art algorithms. Also, we extend our algorithm to solve the $k$-distance dominating set problem and experimentally study the efficiency of the proposed algorithm. Our proposed algorithm is fast and easy to implement and can be used in dynamic networks where the edges and vertices are added or deleted constantly. More importantly, based on the experimental results the proposed algorithm has reasonable solutions and running time which enables us to use it in distributed model practically.

Submitted 3 January, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

arXiv:2010.01150 [pdf, other]

Cost-effective Selection of Pretraining Data: A Case Study of Pretraining BERT on Social Media

Authors: Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris

Abstract: Recent studies on domain-specific BERT models show that effectiveness on downstream tasks can be improved when models are pretrained on in-domain data. Often, the pretraining data used in these models are selected based on their subject matter, e.g., biology or computer science. Given the range of applications using social media text, and its unique language variety, we pretrain two models on tweets and forum text respectively, and empirically demonstrate the effectiveness of these two resources. In addition, we investigate how similarity measures can be used to nominate in-domain pretraining data. We publicly release our pretrained models at https://bit.ly/35RpTf0.

Submitted 2 October, 2020; originally announced October 2020.

Comments: Findings of EMNLP 2020

arXiv:2008.10702 [pdf, other]

Optimal Scheduling of Anticipated COVID-19 Vaccination: A Case Study of New York State

Authors: Syed Irfan Ali Meerza, Seyed M. Karimi, Bert B. Little, Jacek M. Zurada, Tamer Inanc

Abstract: This study aims to determine an optimal control strategy for vaccine scheduling in COVID-19 pandemic treatment by converting the widely acknowledged infectious disease model named SEIR into an optimal control problem. The problem is augmented by adding medication and vaccine limitations to match real-world situations. Two versions of the problem are formulated to minimize the number of infected individuals and at the same time provide the optimal vaccine possible to reduce the susceptible population to a considerably lower state. Optimal control problems are solved using the RBF-Galerkin method. These problems are tested with a benchmarking dataset to determine required parameters. After this step, problems are tested with recent data for New York State, USA. The results regarding the proposed optimal control problem provide a set of evidence from which an optimal strategy for vaccine scheduling can be chosen, when the vaccine for COVID-19 will be available.

Submitted 24 August, 2020; originally announced August 2020.

Comments: Submitted to International conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)-2020

arXiv:2007.02492 [pdf, other]

Searching Scientific Literature for Answers on COVID-19 Questions

Authors: Vincent Nguyen, Maciek Rybinski, Sarvnaz Karimi, Zhenchang Xing

Abstract: Finding answers related to a pandemic of a novel disease raises new challenges for information seeking and retrieval, as the new information becomes available gradually. TREC COVID search track aims to assist in creating search tools to aid scientists, clinicians, policy makers and others with similar information needs in finding reliable answers from the scientific literature. We experiment with different ranking algorithms as part of our participation in this challenge. We propose a novel method for neural retrieval, and demonstrate its effectiveness on the TREC COVID search.

Submitted 5 July, 2020; originally announced July 2020.

Comments: 4 pages + 1 page of references, submitted to ACL COVID-19 workshop

arXiv:2005.08989 [pdf, other]

doi 10.1140/epjc/s10052-020-08503-9

Holographic complexity in general quadratic curvature theory of gravity

Authors: Ahmad Ghodsi, Saeed Qolibikloo, Saman Karimi

Abstract: In the context of CA conjecture for holographic complexity, we study the action growth rate at late time approximation for general quadratic curvature theory of gravity. We show how the Lloyd's bound saturates for charged and neutral black hole solutions. We observe that a second singular point may modify the action growth rate to a value other than the Lloyd's bound. Moreover, we find the universal terms that appear in the divergent part of complexity from computing the bulk and joint terms on a regulated WDW patch.

Submitted 26 September, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

Comments: 19 pages, 4 figures; v3. Subsection 2.5 modified and more discussions on the second singularity in new subsection 2.6 added. Accepted for publication in EPJC

arXiv:2004.13454 [pdf, other]

An Effective Transition-based Model for Discontinuous NER

Authors: Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris

Abstract: Unlike widely used Named Entity Recognition (NER) data sets in generic domains, biomedical NER data sets often contain mentions consisting of discontinuous spans. Conventional sequence tagging techniques encode Markov assumptions that are efficient but preclude recovery of these mentions. We propose a simple, effective transition-based model with generic neural encoding for discontinuous NER. Through extensive experiments on three biomedical data sets, we show that our model can effectively recognize discontinuous mentions without sacrificing the accuracy on continuous mentions.

Submitted 28 April, 2020; originally announced April 2020.

Comments: ACL 2020

arXiv:2002.11908 [pdf]

doi 10.22105/JARIE.2021.301634.1372

A variable service rate queue model for hub median problem

Authors: Zaniar Ardalan, Sajad Karimi

Abstract: Hub location problems have multiple applications in logistic systems, airways industry, supply chain network design, and telecommunication. In the hub location problem, a number of nodes should be selected as the hub nodes to act as the main distributors and other nodes are connected together by these hubs. The input flow to the hub nodes is very large so more often we face the congestion in the hub nodes that causes disturbances in the whole system. Also, we have different service rates which is another cause of disturbance and should be addressed by the models. This paper addresses these issues by providing a model that prevents congestion in the system. We incorporated queuing system in the p-median hub location problem by considering multiple server options and different service rates. CAB dataset (contains 25 US cities) was used in the implementation and our findings show the big impact of considering congestion on the hub location network design.

Submitted 26 February, 2020; originally announced February 2020.

arXiv:1912.11371 [pdf]

Comparison of the P300 detection accuracy related to the BCI speller and image recognition scenarios

Authors: S. A. Karimi, A. M. Mijani, M. T. Talebian, S. Mirzakuchaki

Abstract: There are several protocols in the Electroencephalography (EEG) recording scenarios which produce various types of event-related potentials (ERP). The P300 pattern is a well-known ERP which is produced by the auditory and visual oddball paradigm and the BCI speller system. In this study, P300 and non-P300 separability are investigated in two scenarios including the image recognition paradigm and the BCI speller. The image recognition scenario is an experiment that examines the participants' knowledge about an image that was shown to them before by analyzing the EEG signal recorded during the observing of that image as visual stimulation. To do this, three types of famous classifiers (SVM, Bayes LDA, and sparse logistic regression) were used to classify EEG recordings in a six-class problem. Filtered and down-sampled (temporal samples) EEG recordings were considered as features in classifying the P300 pattern. Also, different sets of EEG recordings including 4, 8 and 16 channels and different trial numbers were used to consider various situations in comparison. The accuracy was increased by increasing the number of trials and channels. The results prove that better accuracy is observed in the case of the image recognition scenario for the different sets of channels and by using the different number of trials. So it can be concluded that the P300 pattern which is produced in the image recognition paradigm is more separable than BCI (matrix speller).

Submitted 24 December, 2019; originally announced December 2019.

Comments: 8 pages, 3 figures, 2 tables, 24 references

arXiv:1910.12441 [pdf]

Online News Media Website Ranking Using User Generated Content

Authors: Samaneh Karimi, Azadeh Shakery, Rakesh Verma

Abstract: News media websites are important online resources that have drawn great attention of text mining researchers. The main aim of this study is to propose a framework for ranking online news websites from different viewpoints. The ranking of news websites is useful information, which can benefit many news-related tasks such as news retrieval and news recommendation. In the proposed framework, the ranking of news websites is obtained by calculating three measures introduced in the paper and based on user-generated content. Each proposed measure is concerned with the performance of news websites from a particular viewpoint including the completeness of news reports, the diversity of events being covered by the website and its speed. The use of user-generated content in this framework, as a partly-unbiased, real-time and low cost content on the web distinguishes the proposed news website ranking framework from the literature. The results obtained for three prominent news websites, BBC, CNN, NYTimes, show that BBC has the best performance in terms of news completeness and speed, and NYTimes has the best diversity in comparison with the other two websites.

Submitted 28 October, 2019; originally announced October 2019.

Comments: 35 pages, 4 Figures, 5 tables

arXiv:1907.13206 [pdf]

Toward a safe supply chain: Incorporating accident, physical, psychosocial, and mental overload risks into supply chain network

Authors: Sajad Karimi, Zaniar Ardalan

Abstract: Considering health and safety factors in supply chain network design brings a safer place for employers and helps a firm to have a better image in society. There are many health and safety factors overlooked by literature studies of supply chain. This paper takes advantage of the results of occupational safety and health in the transport sector studies and connects this field of science with the supply chain network design. This study incorporates health and safety factors such as accident, physical, psychosocial and mental overload risks as an objective function beside cost and environmental oriented objective functions. We formulated a multi-objective closed loop supply chain network as a mixed integer linear programming model and customized augmented epsilon-constraint algorithm to solve our multi-objective problem to offer multiple choices for decision makers. Eventually, we analyzed the effects of incorporating health and safety factors in supply chain and demonstrated how it will minimize the health and safety risks of supply chain employers, environmental pollution, and the total cost of the network simultaneously.

Submitted 9 September, 2019; v1 submitted 30 July, 2019; originally announced July 2019.

arXiv:1907.03154 [pdf, ps, other]

The saturation number of powers of graded ideals

Authors: Jürgen Herzog, Shokoufe Karimi, Amir Mafi

Abstract: Let $S=K[x_1,\ldots,x_n]$ be the polynomial ring in $n$ variables over a field $K$ with maximal ideal $\mathfrak{m}=(x_1,\ldots,x_n)$, and let $I$ be a graded ideal of $S$. In this paper, we define the saturation number $\operatorname{sat}(I)$ of $I$ to be the smallest non-negative integer $k$ such that $I:\mathfrak{m}^{k+1}= I:\mathfrak{m}^k$. We show that $f(k)$ is linearly bounded, and that $f(k)$ is a quasi-linear function for $k\gg 0$, if $I$ is a monomial ideal. Furthermore, we show that $\operatorname{sat}(I^k)=k$ if $I$ is a principal Borel ideal and prove that $\operatorname{sat}(I_{d,n}^k) =\max\{l \colon (kd-l)/(k-l) \leq n\}$, where $I_{d,n}$ is the squarefree Veronese ideal generated in degree $d$.

Submitted 1 September, 2019; v1 submitted 6 July, 2019; originally announced July 2019.

Comments: 8 pages, comments welcome

arXiv:1906.10607 [pdf, other]

Newswire versus Social Media for Disaster Response and Recovery

Authors: Rakesh Verma, Samaneh Karimi, Daniel Lee, Omprakash Gnawali, Azadeh Shakery

Abstract: In a disaster situation, first responders need to quickly acquire situational awareness and prioritize response based on the need, resources available and impact. Can they do this based on digital media such as Twitter alone, or newswire alone, or some combination of the two? We examine this question in the context of the 2015 Nepal Earthquakes. Because newswire articles are longer, effective summaries can be helpful in saving time yet giving key content. We evaluate the effectiveness of several unsupervised summarization techniques in capturing key content. We propose a method to link tweets written by the public and newswire articles, so that we can compare their key characteristics: timeliness, whether tweets appear earlier than their corresponding news articles, and content. A novel idea is to view relevant tweets as a summary of the matching news article and evaluate these summaries. Whenever possible, we present both quantitative and qualitative evaluations. One of our main findings is that tweets and newswire articles provide complementary perspectives that form a holistic view of the disaster situation.

Submitted 25 June, 2019; originally announced June 2019.

arXiv:1906.05468 [pdf, ps, other]

A Comparison of Word-based and Context-based Representations for Classification Problems in Health Informatics

Authors: Aditya Joshi, Sarvnaz Karimi, Ross Sparks, Cecile Paris, C Raina MacIntyre

Abstract: Distributed representations of text can be used as features when training a statistical classifier. These representations may be created as a composition of word vectors or as context-based sentence vectors. We compare the two kinds of representations (word versus context) for three classification problems: influenza infection classification, drug usage classification and personal health mention classification. For statistical classifiers trained for each of these problems, context-based representations based on ELMo, Universal Sentence Encoder, Neural-Net Language Model and FLAIR are better than Word2Vec, GloVe and the two adapted using the MESH ontology. There is an improvement of 2-4% in the accuracy when these context-based representations are used instead of word-based representations.

Submitted 12 June, 2019; originally announced June 2019.

Comments: To Appear in the 18th ACL Workshop on Biomedical Natural Language Processing (BioNLP)

arXiv:1906.05466 [pdf, other]

Figurative Usage Detection of Symptom Words to Improve Personal Health Mention Detection

Authors: Adith Iyer, Aditya Joshi, Sarvnaz Karimi, Ross Sparks, Cecile Paris

Abstract: Personal health mention detection deals with predicting whether or not a given sentence is a report of a health condition. Past work mentions errors in this prediction when symptom words, i.e. names of symptoms of interest, are used in a figurative sense. Therefore, we combine a state-of-the-art figurative usage detection with CNN-based personal health mention detection. To do so, we present two methods: a pipeline-based approach and a feature augmentation-based approach. The introduction of figurative usage detection results in an average improvement of 2.21% F-score of personal health mention detection, in the case of the feature augmentation-based approach. This paper demonstrates the promise of using figurative usage detection to improve personal health mention detection.

Submitted 3 July, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

Comments: To appear at the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019) (The second version updates the name of a cited paper. A detailed note from the cited author is here : https://github.com/commonsense/conceptnet5/wiki/Citation-complications )

arXiv:1906.01359 [pdf, other]

NNE: A Dataset for Nested Named Entity Recognition in English Newswire

Authors: Nicky Ringland, Xiang Dai, Ben Hachey, Sarvnaz Karimi, Cecile Paris, James R. Curran

Abstract: Named entity recognition (NER) is widely used in natural language processing applications and downstream tasks. However, most NER tools target flat annotation from popular datasets, eschewing the semantic information available in nested entity mentions. We describe NNE---a fine-grained, nested named entity dataset over the full Wall Street Journal portion of the Penn Treebank (PTB). Our annotation comprises 279,795 mentions of 114 entity types with up to 6 layers of nesting. We hope the public release of this large dataset for English newswire will encourage development of new techniques for nested NER.

Submitted 4 June, 2019; originally announced June 2019.

Comments: ACL 2019

arXiv:1904.00585 [pdf, other]

Using Similarity Measures to Select Pretraining Data for NER

Authors: Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris

Abstract: Word vectors and Language Models (LMs) pretrained on a large amount of unlabelled data can dramatically improve various Natural Language Processing (NLP) tasks. However, the measure and impact of similarity between pretraining data and target task data are left to intuition. We propose three cost-effective measures to quantify different aspects of similarity between source pretraining and target task data. We demonstrate that these measures are good predictors of the usefulness of pretrained models for Named Entity Recognition (NER) over 30 data pairs. Results also suggest that pretrained LMs are more effective and more predictable than pretrained word vectors, but pretrained word vectors are better when pretraining data is dissimilar.

Submitted 16 May, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

Comments: NAACL 2019

arXiv:1903.05801 [pdf, ps, other]

Survey of Text-based Epidemic Intelligence: A Computational Linguistic Perspective

Authors: Aditya Joshi, Sarvnaz Karimi, Ross Sparks, Cecile Paris, C Raina MacIntyre

Abstract: Epidemic intelligence deals with the detection of disease outbreaks using formal (such as hospital records) and informal (such as user-generated text on the web) sources of information. In this survey, we discuss approaches for epidemic intelligence that use textual datasets, referring to it as 'text-based epidemic intelligence'. We view past work in terms of two broad categories: health mention classification (selecting relevant text from a large volume) and health event detection (predicting epidemic events from a collection of relevant text). The focus of our discussion is the underlying computational linguistic techniques in the two categories. The survey also provides details of the state-of-the-art in annotation techniques, resources and evaluation strategies for epidemic intelligence.

Submitted 13 March, 2019; originally announced March 2019.

Comments: This paper is under review at ACM Computing Surveys. This version of the paper does not use the ACM Computing Surveys stylesheet. This arXiv version is to solicit feedback

arXiv:1809.10377 [pdf, ps, other]

doi 10.1016/j.nuclphysb.2019.01.006

$α'$-corrections to DBI action via T-duality constraint

Authors: Saman Karimi, Mohammad R. Garousi

Abstract: It is known that the D$_p$-brane effective action at the leading order of $α'$ in flat space-time, which is given by the DBI action, transforms to the D$_{p-1}$-brane effective action under standard T-duality transformations of the open string gauge bosons and transverse scalar fields. Extending this duality to order $α'$, one may find corrections to the DBI action which include the second fundamental form $Ω$ and the covariant derivative of gauge field strength $DF$, as well as the corrections to the T-duality transformations. Using this idea, up to two parameters, we have found all 81 covariant couplings of $DFDF$ and $ΩΩ$ with zero, two, four and six $F$'s. The four gauge field couplings that the T-duality constraint fixes are consistent with the known couplings in the literature.

Submitted 15 April, 2021; v1 submitted 27 September, 2018; originally announced September 2018.

Comments: 24 pages, no figure, LaTeX file; v2: the version that appears in NPB; v3: a mistake in footnote 4 is corrected

arXiv:1803.00730 [pdf, ps, other]

On stability properties of powers of polymatroidal ideals

Authors: Shokoufe Karimi, Amir Mafi

Abstract: Let $R=K[x_1,...,x_n]$ be the polynomial ring in $n$ variables over a field $K$ with the maximal ideal $\frak{m}=(x_1,...,x_n)$. Let $\astab(I)$ and $\dstab(I)$ be the smallest integer $n$ for which $\Ass(I^n)$ and $\depth(I^n)$ stabilize, respectively. In this paper we show that $\astab(I)=\dstab(I)$ in the following cases: (i) $I$ is a matroidal ideal and $n\leq 5$; (ii) $I$ is a polymatroidal ideal, $n=4$ and $\frak{m}\notin\Ass^{\infty}(I)$, where $\Ass^{\infty}(I)$ is the stable set of associated prime ideals of $I$; (iii) $I$ is a polymatroidal ideal of degree $2$. Moreover, we give an example of a polymatroidal ideal for which $\astab(I)\neq\dstab(I)$. This is a counterexample to the conjecture of Herzog and Qureshi, according to which these two numbers are the same for polymatroidal ideals.

Submitted 10 October, 2018; v1 submitted 2 March, 2018; originally announced March 2018.

Comments: 10 pages. To appear in Collec. Math

MSC Class: 13A15; 13A30; 13C15

arXiv:1801.09322 [pdf, ps, other]

Benchmarking Clinical Decision Support Search

Authors: Vincent Nguyen, Sarvnaz Karimi, Sara Falamaki, Cecile Paris

Abstract: Finding relevant literature underpins the practice of evidence-based medicine. From 2014 to 2016, TREC conducted a clinical decision support track, wherein participants were tasked with finding articles relevant to clinical questions posed by physicians. In total, 87 teams have participated over the past three years, generating 395 runs. During this period, each team has trialled a variety of methods. While there was significant overlap in the methods employed by different teams, the results were varied. Due to the diversity of the platforms used, the results arising from the different techniques are not directly comparable, reducing the ability to build on previous work. By using a stable platform, we have been able to compare different document and query processing techniques, allowing us to experiment with different search parameters. We have used our system to reproduce leading teams' runs, and compare the results obtained. By benchmarking our indexing and search techniques, we can statistically test a variety of hypotheses, paving the way for further research.

Submitted 28 January, 2018; originally announced January 2018.

arXiv:1712.09498 [pdf, ps, other]

A single potential governing convergence of conjugate gradient, accelerated gradient and geometric descent

Authors: Sahar Karimi, Stephen Vavasis

Abstract: Nesterov's accelerated gradient (AG) method for minimizing a smooth strongly convex function $f$ is known to reduce $f({\bf x}_k)-f({\bf x}^*)$ by a factor of $ε\in(0,1)$ after $k=O(\sqrt{L/\ell}\log(1/ε))$ iterations, where $\ell,L$ are the two parameters of smooth strong convexity. Furthermore, it is known that this is the best possible complexity in the function-gradient oracle model of computation. Modulo a line search, the geometric descent (GD) method of Bubeck, Lee and Singh has the same bound for this class of functions. The method of linear conjugate gradients (CG) also satisfies the same complexity bound in the special case of strongly convex quadratic functions, but in this special case it can be faster than the AG and GD methods. Despite similarities in the algorithms and their asymptotic convergence rates, the conventional analysis of the running time of CG is mostly disjoint from that of AG and GD. The analyses of the AG and GD methods are also rather distinct. Our main result is analyses of the three methods that share several common threads: all three analyses show a relationship to a certain "idealized algorithm", all three establish the convergence rate through the use of the Bubeck-Lee-Singh geometric lemma, and all three have the same potential that is computable at run-time and exhibits decrease by a factor of $1-\sqrt{\ell/L}$ or better per iteration. One application of these analyses is that they open the possibility of hybrid or intermediate algorithms. One such algorithm is proposed herein and is shown to perform well in computational tests.

Submitted 9 January, 2019; v1 submitted 27 December, 2017; originally announced December 2017.

arXiv:1711.06352 [pdf, ps, other]

Numerical time integration of lumped parameter systems governed by implicit constitutive relations

Authors: Saeid Karimi

Abstract: Time-integration for lumped parameter systems obeying implicit Bingham-Kelvin constitutive models is studied. The governing system of equations describing the lumped parameter system is a non-linear differential-algebraic equation and needs to be solved numerically. The response of this system is non-smooth and the kinematic variables cannot be written explicitly in terms of the dynamic variables. To gain insight into numerical time-integration of this system, a new time-integration scheme based on the trapezoidal method is derived. This method relies on two independent parameters to adjust for damping and is stable. Numerical examples showcase the performance of the proposed time-integration method and compare it to a benchmark algorithm. Under this scheme, implicit-explicit integration of the governing equations is possible. Using this new method, the limitations of the trapezoidal time-integration methods when applied to a non-smooth differential-algebraic equation are highlighted.

Submitted 16 November, 2017; originally announced November 2017.

arXiv:1706.01945 [pdf, other]

doi 10.1007/s11128-019-2213-x

Practical Integer-to-Binary Mapping for Quantum Annealers

Authors: Sahar Karimi, Pooya Ronagh

Abstract: Recent advancements in quantum annealing hardware and numerous studies in this area suggest that quantum annealers have the potential to be effective in solving unconstrained binary quadratic programming problems. Naturally, one may desire to expand the application domain of these machines to problems with general discrete variables. In this paper, we explore the possibility of employing quantum annealers to solve unconstrained quadratic programming problems over a bounded integer domain. We present an approach for encoding integer variables into binary ones, thereby representing unconstrained integer quadratic programming problems as unconstrained binary quadratic programming problems. To respect some of the limitations of the currently developed quantum annealers, we propose an integer encoding, named bounded-coefficient encoding, in which we limit the size of the coefficients that appear in the encoding. Furthermore, we propose an algorithm for finding the upper bound on the coefficients of the encoding using the precision of the machine and the coefficients of the original integer problem. Finally, we experimentally show that this approach is far more resilient to the noise of the quantum annealers compared to traditional approaches for the encoding of integers in base two.

Submitted 6 June, 2017; originally announced June 2017.

Journal ref: Quantum Information Processing, Vol. 18, No. 4, 94 (2019)

arXiv:1612.04314 [pdf, other]

doi 10.1103/PhysRevB.96.045301

Spin precession and spin waves in a chiral electron gas: beyond Larmor's theorem

Authors: Shahrzad Karimi, Florent Baboux, Florent Perez, Carsten A. Ullrich, Grzegorz Karczewski, Tomasz Wojtowicz

Abstract: Larmor's theorem holds for magnetic systems that are invariant under spin rotation. In the presence of spin-orbit coupling this invariance is lost and Larmor's theorem is broken: for systems of interacting electrons, this gives rise to a subtle interplay between the spin-orbit coupling acting on individual single-particle states and Coulomb many-body effects. We consider a quasi-two-dimensional, partially spin-polarized electron gas in a semiconductor quantum well in the presence of Rashba and Dresselhaus spin-orbit coupling. Using a linear-response approach based on time-dependent density-functional theory, we calculate the dispersions of spin-flip waves. We obtain analytic results for small wave vectors and up to second order in the Rashba and Dresselhaus coupling strengths $α$ and $β$. Comparison with experimental data from inelastic light scattering allows us to extract $α$ and $β$ as well as the spin-wave stiffness very accurately. We find significant deviations from the local density approximation for spin-dependent electron systems.

Submitted 13 December, 2016; originally announced December 2016.

Comments: 11 pages, 7 figures

Journal ref: Phys. Rev. B 96, 045301 (2017)

arXiv:1605.09462 [pdf, other]

A Subgradient Approach for Constrained Binary Optimization via Quantum Adiabatic Evolution

Authors: Sahar Karimi, Pooya Ronagh

Abstract: An earlier work [18] proposes a method for solving the Lagrangian dual of a constrained binary quadratic programming problem via quantum adiabatic evolution using an outer approximation method. This should be an efficient prescription for solving the Lagrangian dual problem in the presence of an ideally noise-free quantum adiabatic system. However, current implementations of quantum annealing systems demand methods that are efficient at handling possible sources of noise. In this paper, we consider a subgradient method for finding an optimal primal-dual pair for the Lagrangian dual of a constrained binary polynomial programming problem. We then study the quadratic stable set (QSS) problem as a case study. We see that this method applied to the QSS problem can be viewed as an instance-dependent penalty-term approach that avoids large penalty coefficients. Finally, we report our experimental results of using the D-Wave 2X quantum annealer and conclude that our approach helps this quantum processor to succeed more often in solving these problems compared to the usual penalty-term approaches.

Submitted 25 January, 2017; v1 submitted 30 May, 2016; originally announced May 2016.

Journal ref: Quantum Information Processing, Vol. 16, No. 8, 185 (2017)

arXiv:1605.00320 [pdf, ps, other]

A unified convergence bound for conjugate gradient and accelerated gradient

Authors: Sahar Karimi, Stephen A. Vavasis

Abstract: Nesterov's accelerated gradient method for minimizing a smooth strongly convex function $f$ is known to reduce $f({\bf x}_k)-f({\bf x}^*)$ by a factor of $ε\in(0,1)$ after $k\ge O(\sqrt{L/\ell}\log(1/ε))$ iterations, where $\ell,L$ are the two parameters of smooth strong convexity. Furthermore, it is known that this is the best possible complexity in the function-gradient oracle model of computation. The method of linear conjugate gradients (CG) also satisfies the same complexity bound in the special case of strongly convex quadratic functions, but in this special case it is faster than the accelerated gradient method. Despite similarities in the algorithms and their asymptotic convergence rates, the conventional analyses of the two methods are nearly disjoint. The purpose of this note is to provide a single quantity that decreases on every step at the correct rate for both algorithms. Our unified bound is based on a potential similar to the potential in Nesterov's original analysis. As a side benefit of this analysis, we provide a direct proof that conjugate gradient converges in $O(\sqrt{L/\ell}\log(1/ε))$ iterations. In contrast, the traditional indirect proof first establishes this result for the Chebyshev algorithm, and then relies on optimality of conjugate gradient to show that its iterates are at least as good as Chebyshev iterates. To the best of our knowledge, ours is the first direct proof of the convergence rate of linear conjugate gradient in the literature.

Submitted 1 May, 2016; originally announced May 2016.

MSC Class: 90C25

ACM Class: G.1.6

arXiv:1601.02708 [pdf, ps, other]

doi 10.1016/j.cma.2017.05.016

A hybrid multi-time-step framework for pore-scale and continuum-scale modeling of solute transport in porous media

Authors: S. Karimi, K. B. Nakshatrala

Abstract: In this paper, we propose a computational framework, which is based on a domain decomposition technique, to employ both the finite element method (which is a popular continuum modeling approach) and the lattice Boltzmann method (which is a popular pore-scale modeling approach) in the same computational domain. To bridge the gap across the disparate length and time-scales, we first propose a new method to enforce continuum-scale boundary conditions (i.e., Dirichlet and Neumann boundary conditions) onto the numerical solution from the lattice Boltzmann method. This method is based on maximization of entropy and preserves the non-negativity of discrete distributions under the lattice Boltzmann method. The proposed computational framework allows different grid sizes, orders of interpolation, and time-steps in different subdomains. This allows for different desired resolutions in the numerical solution in different subdomains. Through numerical experiments, the effects of grid and time-step refinement, disparity of time-steps in different subdomains, domain partitioning, and the number of iteration steps on the accuracy and rate of convergence of the proposed methodology are studied. Finally, to showcase the performance of this framework in porous media applications, we use it to simulate the dissolution of calcium carbonate in a porous structure.

Submitted 16 January, 2016; v1 submitted 11 January, 2016; originally announced January 2016.

arXiv:1504.06936 [pdf, other]

Concept Extraction to Identify Adverse Drug Reactions in Medical Forums: A Comparison of Algorithms

Authors: Alejandro Metke-Jimenez, Sarvnaz Karimi

Abstract: Social media is becoming an increasingly important source of information to complement traditional pharmacovigilance methods. In order to identify signals of potential adverse drug reactions, it is necessary to first identify medical concepts in the social media text. Most of the existing studies use dictionary-based methods which are not evaluated independently from the overall signal detection task. We compare different approaches to automatically identify and normalise medical concepts in consumer reviews in medical forums. Specifically, we implement several dictionary-based methods popular in the relevant literature, as well as a method we suggest based on a state-of-the-art machine learning method for entity recognition. MetaMap, a popular biomedical concept extraction tool, is used as a baseline. Our evaluations were performed in a controlled setting on a common corpus which is a collection of medical forum posts annotated with concepts and linked to controlled vocabularies such as MedDRA and SNOMED CT. To our knowledge, our study is the first to systematically examine the effect of popular concept extraction methods in the area of signal detection for adverse reactions. We show that the choice of algorithm or controlled vocabulary has a significant impact on concept extraction, which will impact the overall signal detection process. We also show that our proposed machine learning approach significantly outperforms all the other methods in identification of both adverse reactions and drugs, even when trained with a relatively small set of annotated text.

Submitted 27 April, 2015; originally announced April 2015.

arXiv:1503.08360 [pdf, ps, other]

doi 10.4208/cicp.181015.270416a

Do current lattice Boltzmann methods for diffusion and diffusion-type equations respect maximum principles and the non-negative constraint?

Authors: S. Karimi, K. B. Nakshatrala

Abstract: The lattice Boltzmann method (LBM) has established itself as a valid numerical method in computational fluid dynamics. Recently, multiple-relaxation-time LBM has been proposed to simulate anisotropic advection-diffusion processes. The governing differential equations of advective-diffusive systems are known to satisfy maximum principles, comparison principles, the non-negative constraint, and the decay property. In this paper, it will be shown that current single- and multiple-relaxation-time lattice Boltzmann methods fail to preserve these mathematical properties for transient diffusion-type equations. It will also be shown that the discretization of Dirichlet boundary conditions will affect the performance of lattice Boltzmann methods in meeting these mathematical principles. A new way of discretizing the Dirichlet boundary conditions is also proposed. Several benchmark problems have been solved to illustrate the performance of lattice Boltzmann methods and the effect of discretization of boundary conditions with respect to the aforementioned mathematical properties for transient diffusion and advection-diffusion equations.

Submitted 9 April, 2015; v1 submitted 28 March, 2015; originally announced March 2015.

arXiv:1410.6745 [pdf, ps, other]

doi 10.1103/PhysRevB.90.245304

Three- to two-dimensional crossover in time-dependent density-functional theory

Authors: Shahrzad Karimi, Carsten A. Ullrich

Abstract: Quasi-two-dimensional (2D) systems, such as an electron gas confined in a quantum well, are important model systems for many-body theories. Earlier studies of the crossover from 3D to 2D in ground-state density-functional theory showed that local and semilocal exchange-correlation functionals which are based on the 3D electron gas are appropriate for wide quantum wells, but eventually break down as the 2D limit is approached. We now consider the dynamical case and study the performance of various linear-response exchange kernels in time-dependent density-functional theory. We compare approximate local, semilocal and orbital-dependent exchange kernels, and analyze their performance for inter- and intrasubband plasmons as the quantum wells approach the 2D limit. 3D (semi)local exchange functionals are found to fail for quantum well widths comparable to the 2D Wigner-Seitz radius, which implies in practice that 3D local exchange remains valid in the quasi-2D dynamical regime for typical quantum well parameters, except for very low densities.

Submitted 24 October, 2014; originally announced October 2014.

Comments: 13 pages, 9 figures

Showing 1–50 of 56 results for author: Karimi, S