Showing 1–19 of 19 results for author: Jay, C

  1. arXiv:2309.12120  [pdf]

    cs.SE cs.CY

    Individual context-free online community health indicators fail to identify open source software sustainability

    Authors: Yo Yehudi, Carole Goble, Caroline Jay

    Abstract: The global value of open source software is estimated to be in the billions or trillions worldwide [1], but despite this, it is often under-resourced and subject to high-impact security vulnerabilities and stability failures [2,3]. In order to investigate factors contributing to open source community longevity, we monitored thirty-eight open source projects over the period of a year, focusing primarily,…

    Submitted 9 May, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: 99 pages, 34 tables, 19 figures

  2. arXiv:2308.13014  [pdf, other]

    cs.SI cs.CY cs.HC physics.soc-ph

    Tracking the Structure and Sentiment of Vaccination Discussions on Mumsnet

    Authors: Miguel E. P. Silva, Rigina Skeva, Thomas House, Caroline Jay

    Abstract: Vaccination is one of the most impactful healthcare interventions in terms of lives saved at a given cost, leading the anti-vaccination movement to be identified as one of the top 10 threats to global health in 2019 by the World Health Organization. This issue increased in importance during the COVID-19 pandemic where, despite good overall adherence to vaccination, specific communities still showe…

    Submitted 24 August, 2023; originally announced August 2023.

  3. arXiv:2304.10369  [pdf, other]

    cs.SE cs.HC

    Novice programmers strategies for online resource use and their impact on source code

    Authors: Omar Alghamdi, Sarah Clinch, Mohammad Alhamadi, Caroline Jay

    Abstract: Websites are frequently used by programmers to support the development process. This paper investigates programmer-Web interactions when coding, and combines observations of behaviour with assessments of the resulting source code. We report on an online observational study with ten undergraduate student programmers as they engaged in programming tasks of varying complexity. Screens were recorded o…

    Submitted 20 April, 2023; originally announced April 2023.

  4. Subjective data models in bioinformatics: Do wet-lab and computational biologists comprehend data differently?

    Authors: Yo Yehudi, Lukas Hughes-Noehrer, Carole Goble, Caroline Jay

    Abstract: Biological science produces large amounts of data in a variety of formats, which necessitates the use of computational tools to process, integrate, analyse, and glean insights from the data. Researchers who use computational biology tools range from those who use computers primarily for communication and data lookup, to those who write complex software programs in order to analyse data or make it…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: 18 pages, 1 figure, 3 tables

  5. Comparing directed networks via denoising graphlet distributions

    Authors: Miguel E. P. Silva, Robert E. Gaunt, Luis Ospina-Forero, Caroline Jay, Thomas House

    Abstract: Network comparison is a widely-used tool for analyzing complex systems, with applications in varied domains including comparison of protein interactions or highlighting changes in structure of trade networks. In recent years, a number of network comparison methodologies based on the distribution of graphlets (small connected network subgraphs) have been introduced. In particular, NetEmd has recent…

    Submitted 8 March, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

  6. arXiv:2205.12098  [pdf, other]

    cs.CY

    COVID-19: An exploration of consecutive systemic barriers to pathogen-related data sharing during a pandemic

    Authors: Yo Yehudi, Lukas Hughes-Noehrer, Carole Goble, Caroline Jay

    Abstract: In 2020, the COVID-19 pandemic resulted in a rapid response from governments and researchers worldwide. As of late 2023, millions have died as a result of COVID-19, with many COVID-19 survivors going on to experience long-term effects weeks, months, or years after their illness. Despite this staggering toll, those who work with pandemic-relevant data often face significant systemic barriers t…

    Submitted 22 December, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: 35 pages including references, three figures. To be submitted to Data and Policy

  7. arXiv:2108.01949  [pdf, other]

    cs.HC cs.IR cs.LG

    Using Interaction Data to Predict Engagement with Interactive Media

    Authors: Jonathan Carlton, Andy Brown, Caroline Jay, John Keane

    Abstract: Media is evolving from traditional linear narratives to personalised experiences, where control over information (or how it is presented) is given to individual audience members. Measuring and understanding audience engagement with this media is important in at least two ways: (1) a post-hoc understanding of how engaged audiences are with the content will help production teams learn from experienc…

    Submitted 4 August, 2021; originally announced August 2021.

    Comments: This is a pre-print of our paper published in proceedings of the 29th ACM International Conference on Multimedia (MM'21)

  8. arXiv:2104.14815  [pdf, other]

    cs.DL cs.AI

    Number and quality of diagrams in scholarly publications is associated with number of citations

    Authors: Guy Clarke Marshall, Caroline Jay, Andre Freitas

    Abstract: Diagrams are often used in scholarly communication. We analyse a corpus of diagrams found in scholarly computational linguistics conference proceedings (ACL 2017), and find inclusion of a system diagram to be correlated with higher numbers of citations after 3 years. Inclusion of over three diagrams in this 8-page limit conference was found to correlate with a lower citation count. Focusing on neu…

    Submitted 30 April, 2021; originally announced April 2021.

    Comments: 15 pages, 4 figures. arXiv admin note: text overlap with arXiv:2008.12566

  9. arXiv:2104.14811  [pdf, other]

    cs.HC cs.AI

    Why scholars are diagramming neural network models

    Authors: Guy Clarke Marshall, Caroline Jay, Andre Freitas

    Abstract: Complex models, such as neural networks (NNs), are comprised of many interrelated components. In order to represent these models, eliciting and characterising the relations between components is essential. Perhaps because of this, diagrams, as "icons of relation", are a prevalent medium for signifying complex models. Diagrams used to communicate NN architectures are currently extremely varied. The…

    Submitted 10 June, 2022; v1 submitted 30 April, 2021; originally announced April 2021.

    Comments: 16 pages, 4 figures

  10. arXiv:2104.14810  [pdf, other]

    cs.HC cs.AI

    Structuralist analysis for neural network system diagrams

    Authors: Guy Clarke Marshall, Caroline Jay, Andre Freitas

    Abstract: This short paper examines diagrams describing neural network systems in academic conference proceedings. Many aspects of scholarly communication are controlled, particularly with relation to text and formatting, but often diagrams are not centrally curated beyond a peer review. Using a corpus-based approach, we argue that the heterogeneous diagrammatic notations used for neural network systems has…

    Submitted 30 April, 2021; originally announced April 2021.

    Comments: 8 pages, 2 figures

  11. Understanding Equity, Diversity and Inclusion Challenges Within the Research Software Community

    Authors: Neil P. Chue Hong, Jeremy Cohen, Caroline Jay

    Abstract: Research software -- specialist software used to support or undertake research -- is of huge importance to researchers. It contributes to significant advances in the wider world and requires collaboration between people with diverse skills and backgrounds. Analysis of recent survey data provides evidence for a lack of diversity in the Research Software Engineer community. We identify interventions…

    Submitted 4 April, 2021; originally announced April 2021.

    Comments: 14 pages, 3 figures and tables, SE4Science21 track at 2021 International Conference on Computational Science

    Journal ref: Lecture Notes in Computer Science. Vol. 12747 (2021) pp390-403

  12. Software must be recognised as an important output of scholarly research

    Authors: Caroline Jay, Robert Haines, Daniel S. Katz

    Abstract: Software now lies at the heart of scholarly research. Here we argue that as well as being important from a methodological perspective, software should, in many instances, be recognised as an output of research, equivalent to an academic paper. The article discusses the different roles that software may play in research and highlights the relationship between software and research sustainability an…

    Submitted 15 November, 2020; originally announced November 2020.

    Comments: 6 pages. Submitted to IJDC

  13. arXiv:2008.12566  [pdf, other]

    cs.HC cs.AI

    A Framework for Improving Scholarly Neural Network Diagrams

    Authors: Guy Clarke Marshall, André Freitas, Caroline Jay

    Abstract: Neural networks are a prevalent and effective machine learning component, and their application is leading to significant scientific progress in many domains. As the field of neural network systems is fast growing, it is important to understand how advances are communicated. Diagrams are key to this, appearing in almost all papers describing novel systems. This paper reports on a study into the us…

    Submitted 21 November, 2022; v1 submitted 28 August, 2020; originally announced August 2020.

    Comments: 51 pages, 13 tables, 18 figures

  14. arXiv:2008.11785  [pdf, other]

    cs.HC cs.AI cs.CL

    Understanding scholarly Natural Language Processing system diagrams through application of the Richards-Engelhardt framework

    Authors: Guy Clarke Marshall, Caroline Jay, André Freitas

    Abstract: We utilise the Richards-Engelhardt framework as a tool for understanding Natural Language Processing systems diagrams. Through four examples from scholarly proceedings, we find that the application of the framework to this ecological and complex domain is effective for reflecting on these diagrams. We argue for vocabulary to describe multiple-codings, semiotic variability, and inconsistency or misuse…

    Submitted 26 August, 2020; originally announced August 2020.

    Comments: 16 pages, 5 figures, pre-print

  15. The Four Pillars of Research Software Engineering

    Authors: J. Cohen, D. S. Katz, M. Barker, N. Chue Hong, R. Haines, C. Jay

    Abstract: Building software that can support the huge growth in data and computation required by modern research needs individuals with increasingly specialist skill sets that take time to develop and maintain. The Research Software Engineering movement, which started in the UK and has been built up over recent years, aims to recognise and support these individuals. Why does research software matter to prof…

    Submitted 25 January, 2023; v1 submitted 3 February, 2020; originally announced February 2020.

    Journal ref: IEEE Software 38(1) (2021) 97-105

  16. arXiv:1910.09902  [pdf]

    cs.SE

    Theory-Software Translation: Research Challenges and Future Directions

    Authors: Caroline Jay, Robert Haines, Daniel S. Katz, Jeffrey Carver, James C. Phillips, Anshu Dubey, Sandra Gesing, Matthew Turk, Hui Wan, Hubertus van Dam, James Howison, Vitali Morozov, Steven R. Brandt

    Abstract: The Theory-Software Translation Workshop, held in New Orleans in February 2019, explored in depth the process of both instantiating theory in software - for example, implementing a mathematical model in code as part of a simulation - and using the outputs of software - such as the behavior of a simulation - to advance knowledge. As computation within research is now ubiquitous, the workshop provid…

    Submitted 22 October, 2019; originally announced October 2019.

  17. arXiv:1903.06772  [pdf, other]

    cs.SE cs.CY

    A Methodology for Using GitLab for Software Engineering Learning Analytics

    Authors: Julio César Cortés Ríos, Kamilla Kopec-Harding, Sukru Eraslan, Christopher Page, Robert Haines, Caroline Jay, Suzanne M. Embury

    Abstract: To bridge the digital skills gap, we need to train more people in Software Engineering techniques. This paper reports on a project exploring the way students solve tasks using collaborative development platforms and version control systems, such as GitLab, to find patterns and evaluation metrics that can be used to improve the course content and reflect on the most common issues the students are f…

    Submitted 15 March, 2019; originally announced March 2019.

  18. arXiv:1903.06039  [pdf, ps, other]

    cs.SE cs.CY

    What Makes Research Software Sustainable? An Interview Study With Research Software Engineers

    Authors: Mario Rosado de Souza, Robert Haines, Markel Vigo, Caroline Jay

    Abstract: Software is now a vital scientific instrument, providing the tools for data collection and analysis across disciplines from bioinformatics and computational physics, to the humanities. The software used in research is often home-grown and bespoke: it is constructed for a particular project, and rarely maintained beyond this, leading to rapid decay, and frequent 'reinvention of the wheel'. Understa…

    Submitted 14 March, 2019; originally announced March 2019.

  19. The State of Sustainable Research Software: Results from the Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE5.1)

    Authors: Daniel S. Katz, Stephan Druskat, Robert Haines, Caroline Jay, Alexander Struck

    Abstract: This article summarizes motivations, organization, and activities of the Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE5.1) held in Manchester, UK in September 2017. The WSSSPE series promotes sustainable research software by positively impacting principles and best practices, careers, learning, and credit. This article discusses the Code of Conduct, idea papers, po…

    Submitted 19 July, 2018; originally announced July 2018.

    Journal ref: Journal of Open Research Software, 7(1), 2019, p.11