Search | arXiv e-print repository

arXiv:2306.11946 [pdf, other]

doi 10.1109/CSCI58124.2022.00040

Winter Wheat Crop Yield Prediction on Multiple Heterogeneous Datasets using Machine Learning

Authors: Yogesh Bansal, Dr. David Lillis, Prof. Mohand Tahar Kechadi

Abstract: Winter wheat is one of the most important crops in the United Kingdom, and crop yield prediction is essential for the nation's food security. Several studies have employed machine learning (ML) techniques to predict crop yield on a county or farm-based level. The main objective of this study is to predict winter wheat crop yield using ML models on multiple heterogeneous datasets, i.e., soil and weather on a zone-based level. Experimental results demonstrated their impact when used alone and in combination. In addition, we employ numerous ML algorithms to emphasize the significance of data quality in any machine-learning strategy.

Submitted 20 June, 2023; originally announced June 2023.

Journal ref: International Conference on Computational Science and Computational Intelligence (CSCI 2022)

arXiv:2306.11942

A Deep Learning Model for Heterogeneous Dataset Analysis -- Application to Winter Wheat Crop Yield Prediction

Authors: Yogesh Bansal, David Lillis, Mohand Tahar Kechadi

Abstract: Western countries rely heavily on wheat, and yield prediction is crucial. Time-series deep learning models, such as Long Short Term Memory (LSTM), have already been explored and applied to yield prediction. Existing literature reported that they perform better than traditional Machine Learning (ML) models. However, the existing LSTM cannot handle heterogeneous datasets (a combination of data that varies over time and data that remains static). In this paper, we propose an efficient deep learning model that can deal with heterogeneous datasets. We developed the system architecture and applied it to the real-world dataset in the digital agriculture area. We showed that it outperforms the existing ML models.

Submitted 20 June, 2023; originally announced June 2023.

Comments: This version has been removed by arXiv administrators because the submitter did not have the authority to grant the license at the time of submission

arXiv:2202.13457 [pdf, other]

Enhancing Legal Argument Mining with Domain Pre-training and Neural Networks

Authors: Gechuan Zhang, Paul Nulty, David Lillis

Abstract: The contextual word embedding model, BERT, has proved its ability on downstream tasks with limited quantities of annotated data. BERT and its variants help to reduce the burden of complex annotation work in many interdisciplinary research areas, for example, legal argument mining in digital humanities. Argument mining aims to develop text analysis tools that can automatically retrieve arguments and identify relationships between argumentation clauses. Since argumentation is one of the key aspects of case law, argument mining tools for legal texts are applicable to both academic and non-academic legal research. Domain-specific BERT variants (pre-trained with corpora from a particular background) have also achieved strong performance in many tasks. To our knowledge, previous machine learning studies of argument mining on judicial case law still heavily rely on statistical models. In this paper, we provide a broad study of both classic and contextual embedding models and their performance on practical case law from the European Court of Human Rights (ECHR). During our study, we also explore a number of neural networks when being combined with different embeddings. Our experiments provide a comprehensive overview of a variety of approaches to the legal argument mining task. We conclude that domain pre-trained transformer models have great potential in this area, although traditional embeddings can also achieve strong performance when combined with additional neural network layers.

Submitted 6 April, 2022; v1 submitted 27 February, 2022; originally announced February 2022.

arXiv:2112.03737 [pdf, ps, other]

UCD-CS at TREC 2021 Incident Streams Track

Authors: Congcong Wang, David Lillis

Abstract: In recent years, the task of mining important information from social media posts during crises has become a focus of research for the purposes of assisting emergency response (ES). The TREC Incident Streams (IS) track is a research challenge organised for this purpose. The track asks participating systems to both classify a stream of crisis-related tweets into humanitarian aid related information types and estimate their importance regarding criticality. The former refers to a multi-label information type classification task and the latter refers to a priority estimation task. In this paper, we report on the participation of the University College Dublin School of Computer Science (UCD-CS) in TREC-IS 2021. We explored a variety of approaches, including simple machine learning algorithms, multi-task learning techniques, text augmentation, and ensemble approaches. The official evaluation results indicate that our runs achieve the highest scores in many metrics. To aid reproducibility, our code is publicly available at https://github.com/wangcongcong123/crisis-mtl.

Submitted 7 December, 2021; originally announced December 2021.

arXiv:2110.08015 [pdf]

Crisis Domain Adaptation Using Sequence-to-sequence Transformers

Authors: Congcong Wang, Paul Nulty, David Lillis

Abstract: User-generated content (UGC) on social media can act as a key source of information for emergency responders in crisis situations. However, due to the volume concerned, computational techniques are needed to effectively filter and prioritise this content as it arises during emerging events. In the literature, these techniques are trained using annotated content from previous crises. In this paper, we investigate how this prior knowledge can be best leveraged for new crises by examining the extent to which crisis events of a similar type are more suitable for adaptation to new events (cross-domain adaptation). Given the recent successes of transformers in various language processing tasks, we propose CAST: an approach for Crisis domain Adaptation leveraging Sequence-to-sequence Transformers. We evaluate CAST using two major crisis-related message classification datasets. Our experiments show that our CAST-based best run without using any target data achieves the state of the art performance in both in-domain and cross-domain contexts. Moreover, CAST is particularly effective in one-to-one cross-domain adaptation when trained with a larger language model. In many-to-one adaptation where multiple crises are jointly used as the source domain, CAST further improves its performance. In addition, we find that more similar events are more likely to bring better adaptation performance whereas fine-tuning using dissimilar events does not help for adaptation. To aid reproducibility, we open source our code to the community.

Submitted 15 October, 2021; originally announced October 2021.

Comments: 18th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2021)

arXiv:2110.08010 [pdf]

Transformer-based Multi-task Learning for Disaster Tweet Categorisation

Authors: Congcong Wang, Paul Nulty, David Lillis

Abstract: Social media has enabled people to circulate information in a timely fashion, thus motivating people to post messages seeking help during crisis situations. These messages can contribute to the situational awareness of emergency responders, who have a need for them to be categorised according to information types (i.e. the type of aid services the messages are requesting). We introduce a transformer-based multi-task learning (MTL) technique for classifying information types and estimating the priority of these messages. We evaluate the effectiveness of our approach with a variety of metrics by submitting runs to the TREC Incident Streams (IS) track: a research initiative specifically designed for disaster tweet classification and prioritisation. The results demonstrate that our approach achieves competitive performance in most metrics as compared to other participating runs. Subsequently, we find that an ensemble approach combining disparate transformer encoders within our approach helps to improve the overall effectiveness to a significant extent, achieving state-of-the-art performance in almost every metric. We make the code publicly available so that our work can be reproduced and used as a baseline for the community for future work in this domain.

Submitted 15 October, 2021; originally announced October 2021.

Comments: 18th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2021)

arXiv:2110.06094 [pdf]

Increasing Gender Balance Across Academic Staffing in Computer Science -- case study

Authors: Susan McKeever, Deirdre Lillis

Abstract: As at 2019, Technological University Dublin* Computer Science is the top university in Ireland in terms of gender balance of female academic staff in computer science schools. In an academic team of approximately 55 full-time equivalents, 36% of our academic staff are female, 50% of our senior academic leadership team (2 of 4) are female and 75% of our School Executive are female (3 of 4), including a female Head of School. This is as a result of our seven year SUCCESS programme which had a four strand approach: Source, Career, Environment and Support. The Source strand explicitly encouraged females to apply for each recruitment drive; Career focused on female career and skills development initiatives; Environment created a female-friendly culture and reputation within the School, across our organisation and across the third level sector in Ireland; and Support addressed practical supports for the specific difficulties experienced by female staff. As a result we have had 0% turnover in female staff in the past five years (in contrast to 10% male staff turnover). We will continue to work across these four strands to preserve our pipeline of female staff and ensure their success over the coming years in an academic and ICT sector that remains challenging for females.

Submitted 12 October, 2021; originally announced October 2021.

Comments: This paper represents the winning submission of the Informatics Europe Minerva 2019 award; 9 pages, including two pages of appendix

arXiv:2110.06090 [pdf]

Addressing the Recruitment and Retention of Female Students in Computer Science at Third Level

Authors: Susan McKeever, Deirdre Lillis

Abstract: In the School of Computing at the Dublin Institute of Technology (DIT), Ireland, we undertook our Computer Science for All (CS4All) initiative, a five year strategy to implement structural reforms at Faculty level, to address recruitment and retention issues of female undergraduate computer science (CS) students. Since 2012, under CS4All we implemented a variety of reforms to improve student retention, set up a new CS program to attract more female students, and delivered changes to promote a sense of community amongst our female students. We have made significant improvements. For example, we have achieved a dramatic improvement in retention, with first year progression rates rising from 45% to 89%. Our new hybrid CS International program has more than double the percentage of female first year enrolments in comparison to our other undergraduate programs. As at 2018, we continue to roll out the remaining parts of CS4All within our School.

Submitted 12 October, 2021; originally announced October 2021.

Comments: This paper represents the runner up submission of the Informatics Europe Minerva 2018 award. 7 pages

arXiv:2102.13395 [pdf, other]

Multi-task transfer learning for finding actionable information from crisis-related messages on social media

Authors: Congcong Wang, David Lillis

Abstract: The Incident streams (IS) track is a research challenge aimed at finding important information from social media during crises for emergency response purposes. More specifically, given a stream of crisis-related tweets, the IS challenge asks a participating system to 1) classify what types of users' concerns or needs are expressed in each tweet, known as the information type (IT) classification task and 2) estimate how critical each tweet is with regard to emergency response, known as the priority level prediction task. In this paper, we describe our multi-task transfer learning approach for this challenge. Our approach leverages state-of-the-art transformer models including both encoder-based models such as BERT and a sequence-to-sequence based T5 for joint transfer learning on the two tasks. Based on this approach, we submitted several runs to the track. The returned evaluation results show that our runs substantially outperform other participating runs in both IT classification and priority level prediction.

Submitted 26 February, 2021; originally announced February 2021.

Comments: 8 pages, TREC 2020

arXiv:2012.01179 [pdf, other]

doi 10.1109/CyberSecurity49315.2020.9138851

Assessing the Influencing Factors on the Accuracy of Underage Facial Age Estimation

Authors: Felix Anda, Brett A. Becker, David Lillis, Nhien-An Le-Khac, Mark Scanlon

Abstract: Swift response to the detection of endangered minors is an ongoing concern for law enforcement. Many child-focused investigations hinge on digital evidence discovery and analysis. Automated age estimation techniques are needed to aid in these investigations to expedite this evidence discovery process, and decrease investigator exposure to traumatic material. Automated techniques also show promise in decreasing the overflowing backlog of evidence obtained from increasing numbers of devices and online services. A lack of sufficient training data combined with natural human variance has long hindered accurate automated age estimation -- especially for underage subjects. This paper presents a comprehensive evaluation of the performance of two cloud age estimation services (Amazon Web Service's Rekognition service and Microsoft Azure's Face API) against a dataset of over 21,800 underage subjects. The objective of this work is to evaluate the influence that certain human biometric factors, facial expressions, and image quality (i.e. blur, noise, exposure and resolution) have on the outcome of automated age estimation services. A thorough evaluation allows us to identify the most influential factors to be overcome in future age estimation systems.

Submitted 2 December, 2020; originally announced December 2020.

Journal ref: The 6th IEEE International Conference on Cyber Security and Protection of Digital Services (Cyber Security), Dublin, Ireland, June 2020

arXiv:2009.10047 [pdf, other]

UCD-CS at W-NUT 2020 Shared Task-3: A Text to Text Approach for COVID-19 Event Extraction on Social Media

Authors: Congcong Wang, David Lillis

Abstract: In this paper, we describe our approach in the shared task: COVID-19 event extraction from Twitter. The objective of this task is to extract answers from COVID-related tweets to a set of predefined slot-filling questions. Our approach treats the event extraction task as a question answering task by leveraging the transformer-based T5 text-to-text model. According to the official evaluation scores returned, namely F1, our submitted run achieves competitive performance compared to other participating runs (Top 3). However, we argue that this evaluation may underestimate the actual performance of runs based on text-generation. Although some such runs may answer the slot questions well, they may not be an exact string match for the gold standard answers. To measure the extent of this underestimation, we adopt a simple exact-answer transformation method aiming at converting the well-answered predictions to exactly-matched predictions. The results show that after this transformation our run overall reaches the same level of performance as the best participating run and state-of-the-art F1 scores in three of five COVID-related events. Our code is publicly available to aid reproducibility.

Submitted 12 October, 2020; v1 submitted 21 September, 2020; originally announced September 2020.

Comments: 8 pages, 2 figures

arXiv:1907.01427 [pdf, other]

doi 10.1145/3339252.3341491

Improving Borderline Adulthood Facial Age Estimation through Ensemble Learning

Authors: Felix Anda, David Lillis, Aikaterini Kanta, Brett A. Becker, Elias Bou-Harb, Nhien-An Le-Khac, Mark Scanlon

Abstract: Achieving high performance for facial age estimation with subjects in the borderline between adulthood and non-adulthood has always been a challenge. Several studies have used different approaches from the age of a baby to an elder adult and different datasets have been employed to measure the mean absolute error (MAE) ranging from 1.47 to 8 years. The weakness of the algorithms specifically in the borderline has been a motivation for this paper. In our approach, we have developed an ensemble technique that improves the accuracy of underage estimation in conjunction with our deep learning model (DS13K) that has been fine-tuned on the Deep Expectation (DEX) model. We have achieved an accuracy of 68% for the age group 16 to 17 years old, which is 4 times better than the DEX accuracy for that age range. We also present an evaluation of existing cloud-based and offline facial age prediction services, such as Amazon Rekognition, Microsoft Azure Cognitive Services, How-Old.net and DEX.

Submitted 2 July, 2019; originally announced July 2019.

Journal ref: 14th International Conference on Availability, Reliability and Security (ARES 2019), Canterbury, UK, August 2019

arXiv:1802.04068 [pdf]

Towards an Open Science Platform for the Evaluation of Data Fusion

Authors: Weinan Huang, Junyi Chen, Lei Meng, David Lillis

Abstract: Combining the results of different search engines in order to improve upon their performance has been the subject of many research papers. This has become known as the "Data Fusion" task, and has great promise in dealing with the vast quantity of unstructured textual data that is a feature of many Big Data scenarios. However, no universally-accepted evaluation methodology has emerged in the community. This makes it difficult to make meaningful comparisons between the various proposed techniques from reading the literature alone. Variations in the datasets, metrics, and baseline results have all contributed to this difficulty. This paper argues that a more unified approach is required, and that a centralised software platform should be developed to aid researchers in making comparisons between their algorithms and others. The desirable qualities of such a system have been identified and proposed, and an early prototype has been developed. Re-implementing algorithms published by other researchers is a great burden on those proposing new techniques. The prototype system has the potential to greatly reduce this burden and thus encourage more comparable results being generated and published more easily.

Submitted 12 February, 2018; originally announced February 2018.

Comments: Proceedings of the 3rd IEEE International Conference on Cloud Computing and Big Data Analysis (ICCCBDA 2018), Chengdu, China, 2018

arXiv:1712.04544 [pdf, other]

doi 10.15394/jdfsl.2018.1489

Hierarchical Bloom Filter Trees for Approximate Matching

Authors: David Lillis, Frank Breitinger, Mark Scanlon

Abstract: Bytewise approximate matching algorithms have in recent years shown significant promise in detecting files that are similar at the byte level. This is very useful for digital forensic investigators, who are regularly faced with the problem of searching through a seized device for pertinent data. A common scenario is where an investigator is in possession of a collection of "known-illegal" files (e.g. a collection of child abuse material) and wishes to find whether copies of these are stored on the seized device. Approximate matching addresses shortcomings in traditional hashing, which can only find identical files, by also being able to deal with cases of merged files, embedded files, partial files, or if a file has been changed in any way. Most approximate matching algorithms work by comparing pairs of files, which is not a scalable approach when faced with large corpora. This paper demonstrates the effectiveness of using a "Hierarchical Bloom Filter Tree" (HBFT) data structure to reduce the running time of collection-against-collection matching, with a specific focus on the MRSH-v2 algorithm. Three experiments are discussed, which explore the effects of different configurations of HBFTs. The proposed approach dramatically reduces the number of pairwise comparisons required, and demonstrates substantial speed gains, while maintaining effectiveness.

Submitted 12 December, 2017; originally announced December 2017.

arXiv:1711.02634 [pdf, other]

Internalising Interaction Protocols as First-Class Programming Elements in Multi Agent Systems

Authors: David J. Lillis

Abstract: Since their inception, Multi Agent Systems (MASs) have been championed as a solution for the increasing problem of software complexity. Communities of distributed autonomous computing entities that are capable of collaborating, negotiating and acting to solve complex organisational and system management problems are an attractive proposition. Central to this is the requirement for agents to possess the capability of interacting with one another in a structured, consistent and organised manner. This thesis presents the Agent Conversation Reasoning Engine (ACRE), which constitutes a holistic view of communication management for MASs. ACRE is intended to facilitate the practical development, debugging and deployment of communication-heavy MASs. ACRE has been formally defined in terms of its operational semantics, and a generic architecture has been proposed to facilitate its integration with a wide variety of diverse agent development frameworks and Agent Oriented Programming (AOP) languages. A concrete implementation has also been developed that uses the Agent Factory AOP framework as its base. This allows ACRE to be used with a number of different AOP languages, while providing a reference implementation that other integrations can be modelled upon. A standard is also proposed for the modelling and sharing of agent-focused interaction protocols that is independent of the platform within which a concrete ACRE implementation is run. Finally, a user evaluation illustrates the benefits of incorporating conversation management into agent programming.

Submitted 7 November, 2017; originally announced November 2017.

Journal ref: PhD Thesis, University College Dublin, 2012

arXiv:1704.08990 [pdf]

doi 10.1016/j.diin.2017.01.010

EviPlant: An efficient digital forensic challenge creation, manipulation and distribution solution

Authors: Mark Scanlon, Xiaoyu Du, David Lillis

Abstract: Education and training in digital forensics requires a variety of suitable challenge corpora containing realistic features including regular wear-and-tear, background noise, and the actual digital traces to be discovered during investigation. Typically, the creation of these challenges requires overly arduous effort on the part of the educator to ensure their viability. Once created, the challenge image needs to be stored and distributed to a class for practical training. This storage and distribution step requires significant time and resources and may not even be possible in an online/distance learning scenario due to the data sizes involved. As part of this paper, we introduce a more capable methodology and system as an alternative to current approaches. EviPlant is a system designed for the efficient creation, manipulation, storage and distribution of challenges for digital forensics education and training. The system relies on the initial distribution of base disk images, i.e., images containing solely base operating systems. In order to create challenges for students, educators can boot the base system, emulate the desired activity and perform a "diffing" of resultant image and the base image. This diffing process extracts the modified artefacts and associated metadata and stores them in an "evidence package". Evidence packages can be created for different personae, different wear-and-tear, different emulated crimes, etc., and multiple evidence packages can be distributed to students and integrated into the base images. A number of additional applications in digital forensic challenge creation for tool testing and validation, proficiency testing, and malware analysis are also discussed as a result of using EviPlant.

Submitted 28 April, 2017; originally announced April 2017.

Comments: Digital Forensic Research Workshop Europe 2017

Journal ref: Digital Investigation, Volume 20, Supplement, March 2017, Pages S29-S36, ISSN 1742-2876

arXiv:1604.03850 [pdf, other]

Current Challenges and Future Research Areas for Digital Forensic Investigation

Authors: David Lillis, Brett Becker, Tadhg O'Sullivan, Mark Scanlon

Abstract: Given the ever-increasing prevalence of technology in modern life, there is a corresponding increase in the likelihood of digital devices being pertinent to a criminal investigation or civil litigation. As a direct consequence, the number of investigations requiring digital forensic expertise is resulting in huge digital evidence backlogs being encountered by law enforcement agencies throughout the world. It can be anticipated that the number of cases requiring digital forensic analysis will greatly increase in the future. It is also likely that each case will require the analysis of an increasing number of devices including computers, smartphones, tablets, cloud-based services, Internet of Things devices, wearables, etc. The variety of new digital evidence sources poses new and challenging problems for the digital investigator from an identification, acquisition, storage and analysis perspective. This paper explores the current challenges contributing to the backlog in digital forensics from a technical standpoint and outlines a number of future research topics that could greatly contribute to a more efficient digital forensic process.

Submitted 13 April, 2016; originally announced April 2016.

Comments: The 11th ADFSL Conference on Digital Forensics, Security and Law (CDFSL 2016), Daytona Beach, Florida, USA, May 2016

arXiv:1508.02685 [pdf, other]

doi 10.1007/978-3-642-22723-3_4

Augmenting Agent Platforms to Facilitate Conversation Reasoning

Authors: David Lillis, Rem W. Collier

Abstract: Within Multi Agent Systems, communication by means of Agent Communication Languages (ACLs) has a key role to play in the co-operation, co-ordination and knowledge-sharing between agents. Despite this, complex reasoning about agent messaging, and specifically about conversations between agents, tends not to have widespread support amongst general-purpose agent programming languages. ACRE (Agent Communication Reasoning Engine) aims to complement the existing logical reasoning capabilities of agent programming languages with the capability of reasoning about complex interaction protocols in order to facilitate conversations between agents. This paper outlines the aims of the ACRE project and gives details of the functioning of a prototype implementation within the Agent Factory multi agent framework.

Submitted 11 August, 2015; originally announced August 2015.

Journal ref: In Languages, Methodologies, and Development Tools for Multi-Agent Systems - Third International Workshop, LADS 2010, Revised Selected Papers, Lecture Notes in Computer Science vol. 6822, pp. 56--75. Springer Berlin Heidelberg, 2011

arXiv:1508.02677 [pdf, other]

doi 10.1007/978-3-642-22723-3_4

Call Graph Profiling for Multi Agent Systems

Authors: Dinh Doan Van Bien, David Lillis, Rem W. Collier

Abstract: The design, implementation and testing of Multi Agent Systems is typically a very complex task. While a number of specialist agent programming languages and toolkits have been created to aid in the development of such systems, the provision of associated development tools still lags behind those available for other programming paradigms. This includes tools such as debuggers and profilers to help analyse system behaviour, performance and efficiency. AgentSpotter is a profiling tool designed specifically to operate on the concepts of agent-oriented programming. This paper extends previous work on AgentSpotter by discussing its Call Graph View, which presents system performance information, with reference to the communication between the agents in the system. This is aimed at aiding developers in examining the effect that agent communication has on the processing requirements of the system.

Submitted 11 August, 2015; originally announced August 2015.

Journal ref: In Languages, Methodologies, and Development Tools for Multi-Agent Systems - 2nd International Workshop, LADS 2009, Revised Selected Papers, Lecture Notes in Computer Science vol. 6039, pp. 153--167. Springer Berlin Heidelberg, 2010

arXiv:1508.02674 [pdf, other]

doi 10.1007/978-3-642-14843-9_11

Space-Time Diagram Generation for Profiling Multi Agent Systems

Authors: Dinh Doan Van Bien, David Lillis, Rem W. Collier

Abstract: Advances in Agent Oriented Software Engineering have focused on the provision of frameworks and toolkits to aid in the creation of Multi Agent Systems (MASs). However, despite the need to address the inherent complexity of such systems, little progress has been made in the development of tools to allow for the debugging and understanding of their inner workings. This paper introduces a novel performance analysis system, named AgentSpotter, which facilitates such analysis. AgentSpotter was developed by mapping conventional profiling concepts to the domain of MASs. We outline its integration into the Agent Factory multi agent framework.

Submitted 11 August, 2015; originally announced August 2015.

Journal ref: In L. Braubach, J.-P. Briot, and J. Thangarajah, editors, Programming Multi-Agent Systems, volume 5919 of Lecture Notes in Computer Science, pages 170--184. Springer Berlin Heidelberg, Budapest, Hungary, May 2009

arXiv:1410.2634 [pdf, other]

doi 10.1007/978-3-540-78646-7_33

Extending Probabilistic Data Fusion Using Sliding Windows

Authors: David Lillis, Fergus Toolan, Rem W. Collier, John Dunnion

Abstract: Recent developments in the field of data fusion have seen a focus on techniques that use training queries to estimate the probability that various documents are relevant to a given query and use that information to assign scores to those documents on which they are subsequently ranked. This paper introduces SlideFuse, which builds on these techniques, introducing a sliding window in order to compensate for situations where little relevance information is available to aid in the estimation of probabilities. SlideFuse is shown to perform favourably in comparison with CombMNZ, ProbFuse and SegFuse. CombMNZ is the standard baseline technique against which data fusion algorithms are compared whereas ProbFuse and SegFuse represent the state-of-the-art for probabilistic data fusion methods.

Submitted 9 October, 2014; originally announced October 2014.

Journal ref: Advances in Information Retrieval. Proceedings of the 30th European Conference on Information Retrieval Research (ECIR 2008), volume 4956 of Lecture Notes in Computer Science, pages 358--369, Berlin, 2008. Springer Berlin Heidelberg

arXiv:1410.2632 [pdf, other]

doi 10.1007/978-3-642-38700-5_6

Evaluation of a Conversation Management Toolkit for Multi Agent Programming

Authors: David Lillis, Rem W. Collier, Howell R. Jordan

Abstract: The Agent Conversation Reasoning Engine (ACRE) is intended to aid agent developers to improve the management and reliability of agent communication. To evaluate its effectiveness, a problem scenario was created that could be used to compare code written with and without the use of ACRE by groups of test subjects. This paper describes the requirements that the evaluation scenario was intended to meet and how these motivated the design of the problem. Two experiments were conducted with two separate sets of students and their solutions were analysed using a combination of simple objective metrics and subjective analysis. The analysis suggested that ACRE by default prevents some common problems arising that would limit the reliability and extensibility of conversation-handling code. As ACRE has to date been integrated only with the Agent Factory multi agent framework, it was necessary to verify that the problems identified are not unique to that platform. Thus a comparison was made with best practice communication code written for the Jason platform, in order to demonstrate the wider applicability of a system such as ACRE.

Submitted 9 October, 2014; originally announced October 2014.

Comments: appears as Programming Multi-Agent Systems - 10th International Workshop, ProMAS 2012, Valencia, Spain, June 5, 2012, Revised Selected Papers

arXiv:1410.0176 [pdf]

doi 10.1145/1558013.1558086

An Agent-Based Approach to Component Management

Authors: David Lillis, Rem Collier, Mauro Dragone, G. M. P. O'Hare

Abstract: This paper details the implementation of a software framework that aids the development of distributed and self-configurable software systems. This framework is an instance of a novel integration strategy called SoSAA (SOcially Situated Agent Architecture), which combines Component-Based Software Engineering and Agent-Oriented Software Engineering, drawing its inspiration from hybrid agent control architectures. The framework defines a complete construction process by enhancing a simple component-based framework with reasoning and self-awareness capabilities through a standardized interface. The capabilities of the resulting framework are demonstrated through its application to a non-trivial Multi Agent System (MAS). The system in question is a pre-existing Information Retrieval (IR) system that has not previously taken advantage of CBSE principles. In this paper we contrast these two systems so as to highlight the benefits of using this new hybrid approach. We also outline how component-based elements may be integrated into the Agent Factory agent-oriented application framework.

Submitted 1 October, 2014; originally announced October 2014.

Comments: In Proceedings of the 8th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS '09), Budapest, Hungary, 2009

arXiv:1409.8518 [pdf, other]

doi 10.1145/1148170.1148197

ProbFuse: A Probabilistic Approach to Data Fusion

Authors: David Lillis, Fergus Toolan, Rem Collier, John Dunnion

Abstract: Data fusion is the combination of the results of independent searches on a document collection into one single output result set. It has been shown in the past that this can greatly improve retrieval effectiveness over that of the individual results. This paper presents probFuse, a probabilistic approach to data fusion. ProbFuse assumes that the performance of the individual input systems on a number of training queries is indicative of their future performance. The fused result set is based on probabilities of relevance calculated during this training process. Retrieval experiments using data from the TREC ad hoc collection demonstrate that probFuse achieves results superior to that of the popular CombMNZ fusion algorithm.

Submitted 30 September, 2014; originally announced September 2014.

Comments: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '06), 2006

ACM Class: H.3.3

Showing 1–24 of 24 results for author: Lillis, D