Showing 1–50 of 69 results for author: Greene, D

  1. arXiv:2406.04244  [pdf, other]

    cs.CL

    Benchmark Data Contamination of Large Language Models: A Survey

    Authors: Cheng Xu, Shuhao Guan, Derek Greene, M-Tahar Kechadi

    Abstract: The rapid development of Large Language Models (LLMs) like GPT-4, Claude-3, and Gemini has transformed the field of natural language processing. However, it has also resulted in a significant issue known as Benchmark Data Contamination (BDC). This occurs when language models inadvertently incorporate evaluation benchmark information from their training data, leading to inaccurate or unreliable per…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 31 pages, 7 figures, 3 tables

  2. arXiv:2406.03921  [pdf, other]

    cs.SI

    Knowledge Transfer, Knowledge Gaps, and Knowledge Silos in Citation Networks

    Authors: Eoghan Cunningham, Derek Greene

    Abstract: The advancement of science relies on the exchange of ideas across disciplines and the integration of diverse knowledge domains. However, tracking knowledge flows and interdisciplinary integration in rapidly evolving, multidisciplinary fields remains a significant challenge. This work introduces a novel network analysis framework to study the dynamics of knowledge transfer directly from citation da…

    Submitted 6 June, 2024; originally announced June 2024.

  3. arXiv:2403.19011  [pdf, other]

    q-bio.QM cs.LG

    Sequential Inference of Hospitalization Electronic Health Records Using Probabilistic Models

    Authors: Alan D. Kaplan, Priyadip Ray, John D. Greene, Vincent X. Liu

    Abstract: In the dynamic hospital setting, decision support can be a valuable tool for improving patient outcomes. Data-driven inference of future outcomes is challenging in this dynamic setting, where long sequences such as laboratory tests and medications are updated frequently. This is due in part to heterogeneity of data types and mixed-sequence types contained in variable length sequences. In this work…

    Submitted 24 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

  4. arXiv:2401.11198  [pdf, other]

    cs.IR

    A Deep Learning Approach for Selective Relevance Feedback

    Authors: Suchana Datta, Debasis Ganguly, Sean MacAvaney, Derek Greene

    Abstract: Pseudo-relevance feedback (PRF) can enhance average retrieval effectiveness over a sufficiently large number of queries. However, PRF often introduces a drift into the original information need, thus hurting the retrieval effectiveness of several queries. While a selective application of PRF can potentially alleviate this issue, previous approaches have largely relied on unsupervised or feature-ba…

    Submitted 20 January, 2024; originally announced January 2024.

  5. arXiv:2401.00743  [pdf, ps, other]

    q-bio.PE

    Walsh coefficients and circuits for several alleles

    Authors: Kristina Crona, Devin Greene

    Abstract: Walsh coefficients have been applied extensively to biallelic systems for quantifying pairwise and higher order epistasis, in particular for demonstrating the empirical importance of higher order interactions. Circuits, or minimal dependence relations, and related approaches that use triangulations of polytopes have also been applied to biallelic systems. Here we provide biological interpretations…

    Submitted 2 January, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

    Comments: 13 pages

  6. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2312.10463  [pdf, other]

    cs.IR

    RecPrompt: A Prompt Tuning Framework for News Recommendation Using Large Language Models

    Authors: Dairui Liu, Boming Yang, Honghui Du, Derek Greene, Aonghus Lawlor, Ruihai Dong, Irene Li

    Abstract: In the evolving field of personalized news recommendation, understanding the semantics of the underlying data is crucial. Large Language Models (LLMs) like GPT-4 have shown promising performance in understanding natural language. However, the extent of their applicability in news recommendation systems remains to be validated. This paper introduces RecPrompt, the first framework for news recommend…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: 8 pages, 3 figures, and 8 tables

  8. arXiv:2311.16925  [pdf, ps, other]

    q-bio.QM q-bio.PE

    Multiallelic Walsh transforms

    Authors: Devin Greene

    Abstract: A closed formula multiallelic Walsh (or Hadamard) transform is introduced. Basic results are derived, and a statistical interpretation of some of the resulting linear forms is discussed.

    Submitted 28 November, 2023; originally announced November 2023.

  9. arXiv:2311.02177  [pdf, ps, other]

    q-bio.PE

    A Primer for the Walsh Transform

    Authors: Devin Greene

    Abstract: A mathematical development of the Walsh transform, Walsh basis, and Walsh coefficients is given. The author was prompted to write this by a wish to give a unified treatment of epistatic coordinates as they are used in evolutionary biology. At the end of the article, opinions are expressed regarding the usefulness of these concepts for the practical researcher.

    Submitted 3 November, 2023; originally announced November 2023.

  10. arXiv:2309.14984  [pdf, other]

    cs.IR cs.DL

    The Role of Document Embedding in Research Paper Recommender Systems: To Breakdown or to Bolster Disciplinary Borders?

    Authors: Eoghan Cunningham, Derek Greene, Barry Smyth

    Abstract: In the extensive recommender systems literature, novelty and diversity have been identified as key properties of useful recommendations. However, these properties have received limited attention in the specific sub-field of research paper recommender systems. In this work, we argue for the importance of offering novel and diverse research paper recommendations to scientists. This approach aims to…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Under Review at Scientometrics

  11. Handwriting Analysis on the Diaries of Rosamond Jacob

    Authors: Sharmistha S. Sawant, Saloni D. Thakare, Derek Greene, Gerardine Meaney, Alan F. Smeaton

    Abstract: Handwriting is an art form that most people learn at an early age. Each person's writing style is unique with small changes as we grow older and as our mood changes. Here we analyse handwritten text in a culturally significant personal diary. We compare changes in handwriting and relate this to the sentiment of the written material and to the topic of diary entries. We identify handwritten text fr…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: International Conference on Content-based Multimedia Indexing, September 20--22, 2023, Orleans, France

  12. Curatr: A Platform for Semantic Analysis and Curation of Historical Literary Texts

    Authors: Susan Leavy, Gerardine Meaney, Karen Wade, Derek Greene

    Abstract: The increasing availability of digital collections of historical and contemporary literature presents a wealth of possibilities for new research in the humanities. The scale and diversity of such collections, however, present particular challenges in identifying and extracting relevant content. This paper presents Curatr, an online platform for the exploration and curation of literature with machi…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: 12 pages

    Journal ref: Metadata and Semantic Research (MTSR 2019), Communications in Computer and Information Science, vol 1057. Springer, Cham

  13. arXiv:2306.07506  [pdf, other]

    cs.IR

    Topic-Centric Explanations for News Recommendation

    Authors: Dairui Liu, Derek Greene, Irene Li, Xuefei Jiang, Ruihai Dong

    Abstract: News recommender systems (NRS) have been widely applied for online news websites to help users find relevant articles based on their interests. Recent methods have demonstrated considerable success in terms of recommendation performance. However, the lack of explanation for these recommendations can lead to mistrust among users and lack of acceptance of recommendations. To address this issue, we p…

    Submitted 6 October, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

    Comments: 20 pages

  14. arXiv:2304.00310  [pdf, other]

    cs.IR

    On the Feasibility and Robustness of Pointwise Evaluation of Query Performance Prediction

    Authors: Suchana Datta, Debasis Ganguly, Derek Greene, Mandar Mitra

    Abstract: Despite the retrieval effectiveness of queries being mutually independent of one another, the evaluation of query performance prediction (QPP) systems has been carried out by measuring rank correlation over an entire set of queries. Such a listwise approach has a number of disadvantages, notably that it does not support the common requirement of assessing QPP for individual queries. In this paper,…

    Submitted 1 April, 2023; originally announced April 2023.

  15. arXiv:2303.08954  [pdf, other]

    cs.CL

    PRESTO: A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs

    Authors: Rahul Goel, Waleed Ammar, Aditya Gupta, Siddharth Vashishtha, Motoki Sano, Faiz Surani, Max Chang, HyunJeong Choe, David Greene, Kyle He, Rattima Nitisaroj, Anna Trukhina, Shachi Paul, Pararth Shah, Rushin Shah, Zhou Yu

    Abstract: Research interest in task-oriented dialogs has increased as systems such as Google Assistant, Alexa and Siri have become ubiquitous in everyday life. However, the impact of academic research in this area has been limited by the lack of datasets that realistically capture the wide array of user pain points. To enable research on some of the more challenging aspects of parsing realistic conversation…

    Submitted 16 March, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: PRESTO v1 Release

  16. Graph Embedding for Mapping Interdisciplinary Research Networks

    Authors: Eoghan Cunningham, Derek Greene

    Abstract: Representation learning is the first step in automating tasks such as research paper recommendation, classification, and retrieval. Due to the accelerating rate of research publication, together with the recognised benefits of interdisciplinary research, systems that facilitate researchers in discovering and understanding relevant works from beyond their immediate school of knowledge are vital. Th…

    Submitted 20 March, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

  17. arXiv:2212.08733  [pdf, other]

    cs.LG cs.AI cs.CV

    Counterfactual Explanations for Misclassified Images: How Human and Machine Explanations Differ

    Authors: Eoin Delaney, Arjun Pakrashi, Derek Greene, Mark T. Keane

    Abstract: Counterfactual explanations have emerged as a popular solution for the eXplainable AI (XAI) problem of elucidating the predictions of black-box deep-learning systems due to their psychological validity, flexibility across problem domains and proposed legal compliance. While over 100 counterfactual methods exist, claiming to generate plausible explanations akin to those preferred by people, few hav…

    Submitted 16 December, 2022; originally announced December 2022.

  18. arXiv:2206.03159  [pdf, other]

    cs.SI

    The Structure of Interdisciplinary Science: Uncovering and Explaining Roles in Citation Graphs

    Authors: Eoghan Cunningham, Derek Greene

    Abstract: Role discovery is the task of dividing the set of nodes on a graph into classes of structurally similar roles. Modern strategies for role discovery typically rely on graph embedding techniques, which are capable of recognising complex local structures. However, when working with large, real-world networks, it is difficult to interpret or validate a set of roles identified according to these method…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: submitted to the international conference on complex networks and their applications

  19. arXiv:2204.07292  [pdf, other]

    cs.LG q-bio.QM

    Unsupervised Probabilistic Models for Sequential Electronic Health Records

    Authors: Alan D. Kaplan, John D. Greene, Vincent X. Liu, Priyadip Ray

    Abstract: We develop an unsupervised probabilistic model for heterogeneous Electronic Health Record (EHR) data. Utilizing a mixture model formulation, our approach directly models sequences of arbitrary length, such as medications and laboratory results. This allows for subgrouping and incorporation of the dynamics underlying heterogeneous data types. The model consists of a layered set of latent variables…

    Submitted 31 August, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

  20. arXiv:2203.12504  [pdf, other]

    cs.DL cs.SI

    Author Multidisciplinarity and Disciplinary Roles in Field of Study Networks

    Authors: Eoghan Cunningham, Barry Smyth, Derek Greene

    Abstract: When studying large research corpora, "distant reading" methods are vital to understand the topics and trends in the corresponding research space. In particular, given the recognised benefits of multidisciplinary research, it may be important to map schools or communities of diverse research topics, and to understand the multidisciplinary role that topics play within and between these communities.…

    Submitted 23 March, 2022; originally announced March 2022.

  21. Assessing Network Representations for Identifying Interdisciplinarity

    Authors: Eoghan Cunningham, Derek Greene

    Abstract: Many studies have sought to identify interdisciplinary research as a function of the diversity of disciplines identified in an article's references or citations. However, given the constant evolution of the scientific landscape, disciplinary boundaries are shifting and blurring, making it increasingly difficult to describe research within a strict taxonomy. In this work, we explore the potential f…

    Submitted 8 April, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

  22. A Novel Perspective to Look At Attention: Bi-level Attention-based Explainable Topic Modeling for News Classification

    Authors: Dairui Liu, Derek Greene, Ruihai Dong

    Abstract: Many recent deep learning-based solutions have widely adopted the attention-based mechanism in various tasks of the NLP discipline. However, the inherent characteristics of deep learning models and the flexibility of the attention mechanism increase the models' complexity, thus leading to challenges in model explainability. In this paper, to address this challenge, we propose a novel practical fra…

    Submitted 27 October, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: Findings of ACL2022

  23. arXiv:2202.07376  [pdf, other]

    cs.IR

    Deep-QPP: A Pairwise Interaction-based Deep Learning Model for Supervised Query Performance Prediction

    Authors: Suchana Datta, Debasis Ganguly, Derek Greene, Mandar Mitra

    Abstract: Motivated by the recent success of end-to-end deep neural models for ranking tasks, we present here a supervised end-to-end neural approach for query performance prediction (QPP). In contrast to unsupervised approaches that rely on various statistics of document score distributions, our approach is entirely data-driven. Further, in contrast to weakly supervised approaches, our method also does not…

    Submitted 15 February, 2022; originally announced February 2022.

  24. arXiv:2202.06306  [pdf, ps, other]

    cs.IR

    An Analysis of Variations in the Effectiveness of Query Performance Prediction

    Authors: Debasis Ganguly, Suchana Datta, Mandar Mitra, Derek Greene

    Abstract: A query performance predictor estimates the retrieval effectiveness of an IR system for a given query. An important characteristic of QPP evaluation is that, since the ground truth retrieval effectiveness for QPP evaluation can be measured with different metrics, the ground truth itself is not absolute, which is in contrast to other retrieval tasks, such as that of ad-hoc retrieval. Motivated by t…

    Submitted 13 February, 2022; originally announced February 2022.

  25. Collaboration in the Time of COVID: A Scientometric Analysis of Multidisciplinary SARS-CoV-2 Research

    Authors: Eoghan Cunningham, Barry Smyth, Derek Greene

    Abstract: The novel coronavirus SARS-CoV-2 and the COVID-19 illness it causes have inspired unprecedented levels of multidisciplinary research in an effort to address a generational public health challenge. In this work we conduct a scientometric analysis of COVID-19 research, paying particular attention to the nature of collaboration that this pandemic has fostered among different disciplines. Increased mu…

    Submitted 30 August, 2021; originally announced August 2021.

    Comments: Submitted to Humanities and Social Sciences Communications: accepted pending minor revisions

    Journal ref: Humanit Soc Sci Commun 8, 240 (2021)

  26. arXiv:2107.09734  [pdf, other]

    cs.LG cs.AI

    Uncertainty Estimation and Out-of-Distribution Detection for Counterfactual Explanations: Pitfalls and Solutions

    Authors: Eoin Delaney, Derek Greene, Mark T. Keane

    Abstract: Whilst an abundance of techniques have recently been proposed to generate counterfactual explanations for the predictions of opaque black-box systems, markedly less attention has been paid to exploring the uncertainty of these generated explanations. This becomes a critical issue in high-stakes scenarios, where uncertain and misleading explanations could have dire consequences (e.g., medical diagn…

    Submitted 20 July, 2021; originally announced July 2021.

    Journal ref: ICML Workshop on Algorithmic Recourse, July 2021

  27. arXiv:2104.14461  [pdf]

    cs.AI

    Twin Systems for DeepCBR: A Menagerie of Deep Learning and Case-Based Reasoning Pairings for Explanation and Data Augmentation

    Authors: Mark T Keane, Eoin M Kenny, Mohammed Temraz, Derek Greene, Barry Smyth

    Abstract: Recently, it has been proposed that fruitful synergies may exist between Deep Learning (DL) and Case Based Reasoning (CBR); that there are insights to be gained by applying CBR ideas to problems in DL (what could be called DeepCBR). In this paper, we report on a program of research that applies CBR solutions to the problem of Explainable AI (XAI) in DL. We describe a series of twin-systems pai…

    Submitted 13 June, 2021; v1 submitted 29 April, 2021; originally announced April 2021.

    Comments: 7 pages, 4 figures, 2 tables

    Journal ref: IJCAI-21 Workshop on DL-CBR-AML, July 2021

  28. arXiv:2009.13211  [pdf, other]

    cs.LG stat.ML

    Instance-based Counterfactual Explanations for Time Series Classification

    Authors: Eoin Delaney, Derek Greene, Mark T. Keane

    Abstract: In recent years, there has been a rapidly expanding focus on explaining the predictions made by black-box AI systems that handle image and tabular data. However, considerably less attention has been paid to explaining the predictions of opaque AI systems handling time series data. In this paper, we advance a novel model-agnostic, case-based technique -- Native Guide -- that generates counterfactua…

    Submitted 24 June, 2021; v1 submitted 28 September, 2020; originally announced September 2020.

  29. arXiv:2008.05223  [pdf, other]

    physics.med-ph cs.LG eess.IV

    Bone Segmentation in Contrast Enhanced Whole-Body Computed Tomography

    Authors: Patrick Leydon, Martin O'Connell, Derek Greene, Kathleen M Curran

    Abstract: Segmentation of bone regions allows for enhanced diagnostics, disease characterisation and treatment monitoring in CT imaging. In contrast enhanced whole-body scans accurate automatic segmentation is particularly difficult as low dose whole body protocols reduce image quality and make contrast enhanced regions more difficult to separate when relying on differences in pixel intensities. This paper…

    Submitted 13 August, 2020; v1 submitted 12 August, 2020; originally announced August 2020.

    Comments: 15 pages, 10 figures and 3 tables. Submitted to The Journal of Physics in Medicine and Biology for possible publication

  30. arXiv:2005.06898  [pdf, other]

    cs.CL cs.LG

    Mitigating Gender Bias in Machine Learning Data Sets

    Authors: Susan Leavy, Gerardine Meaney, Karen Wade, Derek Greene

    Abstract: Artificial Intelligence has the capacity to amplify and perpetuate societal biases and presents profound ethical implications for society. Gender bias has been identified in the context of employment advertising and recruitment tools, due to their reliance on underlying language processing and recommendation algorithms. Attempts to address such issues have involved testing learned associations, in…

    Submitted 18 May, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: 10 pages, 5 figures, 5 tables. Presented at the Bias2020 workshop (as part of the ECIR Conference) - http://bias.disim.univaq.it

  31. arXiv:1910.05851  [pdf, other]

    stat.ME stat.AP stat.ML

    Nonstationary Multivariate Gaussian Processes for Electronic Health Records

    Authors: Rui Meng, Braden Soper, Herbert Lee, Vincent X. Liu, John D. Greene, Priyadip Ray

    Abstract: We propose multivariate nonstationary Gaussian processes for jointly modeling multiple clinical variables, where the key parameters, length-scales, standard deviations and the correlations between the observed output, are all time dependent. We perform posterior inference via Hamiltonian Monte Carlo (HMC). We also provide methods for obtaining computationally efficient gradient-based maximum a pos…

    Submitted 13 October, 2019; originally announced October 2019.

  32. arXiv:1908.05192  [pdf, other]

    cs.SI cs.LG

    Temporal Analysis of Reddit Networks via Role Embeddings

    Authors: Siobhan Grayson, Derek Greene

    Abstract: Inspired by diachronic word analysis from the field of natural language processing, we propose an approach for uncovering temporal insights regarding user roles from social networks using graph embedding methods. Specifically, we apply the role embedding algorithm, struc2vec, to a collection of social networks exhibiting either "loyal" or "vagrant" characteristics derived from the popular online s…

    Submitted 14 August, 2019; originally announced August 2019.

  33. Fuel Economy Gaps Within & Across Garages: A Bivariate Random Parameters Seemingly Unrelated Regression Approach

    Authors: Behram Wali, Asad Khattak, David Greene, Jun Liu

    Abstract: The key objective of this study is to investigate the interrelationship between fuel economy gaps and to quantify the differential effects of several factors on fuel economy gaps of vehicles operated by the same garage. By using a unique fuel economy database (fueleconomy.gov), users' self-reported fuel economy estimates and government fuel economy ratings are analyzed for more than 7000 garages ac…

    Submitted 15 April, 2019; originally announced April 2019.

    Comments: Fuel economy gap, two-vehicles, garage, My MPG, On-road & test cycle estimates, Random parameters, Seemingly unrelated regression estimation

    Journal ref: International Journal of Sustainable Transportation, 1-16 (2018)

  34. arXiv:1810.05511  [pdf, other]

    cs.SI physics.soc-ph

    Semi-Supervised Overlapping Community Finding based on Label Propagation with Pairwise Constraints

    Authors: Elham Alghamdi, Derek Greene

    Abstract: Algorithms for detecting communities in complex networks are generally unsupervised, relying solely on the structure of the network. However, these methods can often fail to uncover meaningful groupings that reflect the underlying communities in the data, particularly when those structures are highly overlapping. One way to improve the usefulness of these algorithms is by incorporating additional…

    Submitted 21 November, 2018; v1 submitted 12 October, 2018; originally announced October 2018.

    Comments: Fix tables

  35. arXiv:1810.03046  [pdf, other]

    cs.SI cs.CY

    MeetupNet Dublin: Discovering Communities in Dublin's Meetup Network

    Authors: Arjun Pakrashi, Elham Alghamdi, Brian Mac Namee, Derek Greene

    Abstract: Meetup.com is a global online platform which facilitates the organisation of meetups in different parts of the world. A meetup group typically focuses on one specific topic of interest, such as sports, music, language, or technology. However, many users of this platform attend multiple meetups. On this basis, we can construct a co-membership network for a given location. This network encodes how p…

    Submitted 2 November, 2018; v1 submitted 6 October, 2018; originally announced October 2018.

  36. Analyzing within Garage Fuel Economy Gaps to Support Vehicle Purchasing Decisions - A Copula-Based Modeling & Forecasting Approach

    Authors: Behram Wali, David Greene, Asad Khattak, Jun Liu

    Abstract: A key purpose of the U.S. government fuel economy ratings is to provide precise and unbiased fuel economy estimates to assist consumers in their vehicle purchase decisions. For the official fuel economy ratings to be useful, the numbers must be relatively reliable. This study focuses on quantifying the variations of on-road fuel economy relative to official government ratings (fuel economy gap) an…

    Submitted 16 August, 2018; originally announced August 2018.

    Journal ref: Transportation Research Part D: Transport and Environment, 63, 186-208 (2018)

  37. arXiv:1801.02736  [pdf, other]

    stat.ML q-bio.QM

    Modeling sepsis progression using hidden Markov models

    Authors: Brenden K. Petersen, Michael B. Mayhew, Kalvin O. E. Ogbuefi, John D. Greene, Vincent X. Liu, Priyadip Ray

    Abstract: Characterizing a patient's progression through stages of sepsis is critical for enabling risk stratification and adaptive, personalized treatment. However, commonly used sepsis diagnostic criteria fail to account for significant underlying heterogeneity, both between patients as well as over time in a single patient. We introduce a hidden Markov model of sepsis progression that explicitly accounts…

    Submitted 8 January, 2018; originally announced January 2018.

    Comments: Accepted to NIPS ML4H 2017

  38. Computational Analysis for the Rational Design of Anti-Amyloid Beta (ABeta) Antibodies

    Authors: D'Artagnan Greene, Theodora Po, Jennifer Pan, Tanya Tabibian, Ray Luo

    Abstract: Alzheimer's Disease (AD) is a neurodegenerative disorder that lacks effective treatment options. Anti-amyloid beta (ABeta) antibodies are the leading drug candidates to treat AD, but the results of clinical trials have been disappointing. Introducing rational mutations into anti-ABeta antibodies to increase their effectiveness is a way forward, but the path to take is unclear. In this study, we de…

    Submitted 22 February, 2018; v1 submitted 4 January, 2018; originally announced January 2018.

  39. arXiv:1710.05212  [pdf]

    cs.CY

    On Supporting Digital Journalism: Case Studies in Co-Designing Journalistic Tools

    Authors: Georgiana Ifrim, Derek Greene, Mark T. Keane, Claudia Orellana-Rodriguez, Bichen Shi, Gevorg Poghosyan

    Abstract: Since 2013 researchers at University College Dublin in the Insight Centre for Data Analytics have been involved in a significant research programme in digital journalism, specifically targeting tools and social media guidelines to support the work of journalists. Most of this programme was undertaken in collaboration with The Irish Times. This collaboration involved identifying key problems curren…

    Submitted 14 October, 2017; originally announced October 2017.

    Comments: Computation + Journalism Symposium (C+J 2017), October 2017, Northwestern University, Evanston, IL USA

    Journal ref: Computation + Journalism Symposium (C+J 2017), October 2017, Northwestern University, Evanston, IL USA

  40. arXiv:1704.06685  [pdf]

    physics.bio-ph

    A Continuum Poisson-Boltzmann Model for Membrane Channel Proteins

    Authors: Li Xiao, Jianxiong Diao, D'Artagnan Greene, Junmei Wang, Ray Luo

    Abstract: Membrane proteins constitute a large portion of the human proteome and perform a variety of important functions as membrane receptors, transport proteins, enzymes, signaling proteins, and more. The computational studies of membrane proteins are usually much more complicated than those of globular proteins. Here we propose a new continuum model for Poisson-Boltzmann calculations of membrane channel…

    Submitted 21 April, 2017; originally announced April 2017.

    Comments: 40 pages, 6 figures, 5 tables

  41. arXiv:1702.07186  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    Stability of Topic Modeling via Matrix Factorization

    Authors: Mark Belford, Brian Mac Namee, Derek Greene

    Abstract: Topic models can provide us with an insight into the underlying latent structure of a large corpus of documents. A range of methods have been proposed in the literature, including probabilistic topic models and techniques based on matrix factorization. However, in both cases, standard implementations rely on stochastic elements in their initialization phase, which can potentially lead to different…

    Submitted 9 September, 2017; v1 submitted 23 February, 2017; originally announced February 2017.

  42. arXiv:1702.06891  [pdf, other]

    cs.CL

    EVE: Explainable Vector Based Embedding Technique Using Wikipedia

    Authors: M. Atif Qureshi, Derek Greene

    Abstract: We present an unsupervised explainable word embedding technique, called EVE, which is built upon the structure of Wikipedia. The proposed model defines the dimensions of a semantic vector representing a word using human-readable labels, thereby making it readily interpretable. Specifically, each vector is constructed using the Wikipedia category graph structure together with the Wikipedia article link st…

    Submitted 22 February, 2017; originally announced February 2017.

  43. arXiv:1607.03055  [pdf, other]

    cs.CL cs.CY

    Exploring the Political Agenda of the European Parliament Using a Dynamic Topic Modeling Approach

    Authors: Derek Greene, James P. Cross

    Abstract: This study analyzes the political agenda of the European Parliament (EP) plenary, how it has evolved over time, and the manner in which Members of the European Parliament (MEPs) have reacted to external and internal stimuli when making plenary speeches. To unveil the plenary agenda and detect latent themes in legislative speeches over time, MEP speech content is analyzed using a new dynamic topic…

    Submitted 11 July, 2016; originally announced July 2016.

    Comments: Long version including appendix. arXiv admin note: substantial text overlap with arXiv:1505.07302

  44. arXiv:1601.02975  [pdf, other]

    cs.CY cs.AI

    Indicators of Good Student Performance in Moodle Activity Data

    Authors: Ewa Młynarska, Derek Greene, Pádraig Cunningham

    Abstract: In this paper we conduct an analysis of Moodle activity data focused on identifying early predictors of good student performance. The analysis shows that three relevant hypotheses are largely supported by the data. These hypotheses are: early submission is a good sign, a high level of activity is predictive of good results and evening activity is even better than daytime activity. We highlight som…

    Submitted 12 January, 2016; originally announced January 2016.

    Comments: Short version

  45. arXiv:1508.01067  [pdf, other]

    cs.CL cs.IR

    Topic Stability over Noisy Sources

    Authors: Jing Su, Oisín Boydell, Derek Greene, Gerard Lynch

    Abstract: Topic modelling techniques such as LDA have recently been applied to speech transcripts and OCR output. These corpora may contain noisy or erroneous texts which may undermine topic stability. Therefore, it is important to know how well a topic modelling algorithm will perform when applied to noisy data. In this paper we show that different types of textual noise will have diverse effects on the st…

    Submitted 5 August, 2015; originally announced August 2015.

  46. Modeling Indications of Technology in Planetary Transit Light Curves -- Dark-side illumination

    Authors: Eric J. Korpela, Shauna M. Sallmen, Diana Leystra Greene

    Abstract: We analyze potential effects of an extraterrestrial civilization's use of orbiting mirrors to illuminate the dark side of a synchronously rotating planet on planetary transit light curves. Previous efforts to detect civilizations based on side effects of planetary-scale engineering have focused on structures affecting the host star output (e.g. Dyson spheres). However, younger civilizations are li…

    Submitted 6 June, 2018; v1 submitted 27 May, 2015; originally announced May 2015.

    Comments: 14 pages, 12 figures, some color, rev 3 corrects arithmetic error in cost of launching a mirror fleet

    Journal ref: ApJ 809, 139 (2015)

  47. arXiv:1505.07302  [pdf, other]

    cs.CL cs.CY

    Unveiling the Political Agenda of the European Parliament Plenary: A Topical Analysis

    Authors: Derek Greene, James P. Cross

    Abstract: This study analyzes political interactions in the European Parliament (EP) by considering how the political agenda of the plenary sessions has evolved over time and the manner in which Members of the European Parliament (MEPs) have reacted to external and internal stimuli when making Parliamentary speeches. It does so by considering the context in which speeches are made, and the content of those…

    Submitted 7 July, 2015; v1 submitted 27 May, 2015; originally announced May 2015.

    Comments: Add link to implementation code on Github

  48. arXiv:1502.04609  [pdf, other]

    cs.IR cs.HC

    TextLuas: Tracking and Visualizing Document and Term Clusters in Dynamic Text Data

    Authors: Derek Greene, Daniel Archambault, Václav Belák, Pádraig Cunningham

    Abstract: For large volumes of text data collected over time, a key knowledge discovery task is identifying and tracking clusters. These clusters may correspond to emerging themes, popular topics, or breaking news stories in a corpus. Therefore, recently there has been increased interest in the problem of clustering dynamic data. However, there exists little support for the interactive exploration of the ou…

    Submitted 3 November, 2014; originally announced February 2015.

    Comments: 21 page version

  49. arXiv:1407.7736  [pdf, ps, other]

    cs.SI cs.CL cs.CY physics.soc-ph

    A Latent Space Analysis of Editor Lifecycles in Wikipedia

    Authors: Xiangju Qin, Derek Greene, Pádraig Cunningham

    Abstract: Collaborations such as Wikipedia are a key part of the value of the modern Internet. At the same time there is concern that these collaborations are threatened by high levels of member turnover. In this paper we borrow ideas from topic analysis to map editor activity on Wikipedia over time into a latent space that offers an insight into the evolving patterns of editor behavior. This latent space repre…

    Submitted 29 July, 2014; originally announced July 2014.

    Comments: 16 pages, In Proc. of 5th International Workshop on Mining Ubiquitous and Social Environments (MUSE) at ECML/PKDD 2014

  50. arXiv:1406.1564  [pdf]

    q-bio.PE

    Rational Design of Antibiotic Treatment Plans

    Authors: Portia M. Mira, Kristina Crona, Devin Greene, Juan C. Meza, Bernd Sturmfels, Miriam Barlow

    Abstract: The development of reliable methods for restoring susceptibility after antibiotic resistance arises has proven elusive. A greater understanding of the relationship between antibiotic administration and the evolution of resistance is key to overcoming this challenge. Here we present a data-driven mathematical approach for developing antibiotic treatment plans that can reverse the evolution of antib…

    Submitted 5 June, 2014; originally announced June 2014.

    Comments: 52 pages, additional supplementary information can be requested from the authors