KeepOriginalAugment: Single Image-based Better Information-Preserving Data Augmentation Approach

Authors: Teerath Kumar, Alessandra Mileo, Malika Bendechache

Abstract: Advanced image data augmentation techniques play a pivotal role in enhancing the training of models for diverse computer vision tasks. Notably, SalfMix and KeepAugment have emerged as popular strategies, showcasing their efficacy in boosting model performance. However, SalfMix's reliance on duplicating salient features poses a risk of overfitting, potentially compromising the model's generalization capabilities. Conversely, KeepAugment, which selectively preserves salient regions and augments non-salient ones, introduces a domain shift that hinders the exchange of crucial contextual information, impeding overall model understanding. In response to these challenges, we introduce KeepOriginalAugment, a novel data augmentation approach. This method intelligently incorporates the most salient region within the non-salient area, allowing augmentation to be applied to either region. Striking a balance between data diversity and information preservation, KeepOriginalAugment enables models to leverage both diverse salient and non-salient regions, leading to enhanced performance. We explore three strategies for determining the placement of the salient region (minimum, maximum, or random) and investigate swapping perspective strategies to decide which part (salient or non-salient) undergoes augmentation. Our experimental evaluations, conducted on classification datasets such as CIFAR-10, CIFAR-100, and TinyImageNet, demonstrate the superior performance of KeepOriginalAugment compared to existing state-of-the-art techniques.

Submitted 10 May, 2024; originally announced May 2024.

Comments: This paper has been accepted at the 20th International Conference on Artificial Intelligence Applications and Innovations 2024

arXiv:2404.19485 [pdf, other]

IID Relaxation by Logical Expressivity: A Research Agenda for Fitting Logics to Neurosymbolic Requirements

Authors: Maarten C. Stol, Alessandra Mileo

Abstract: Neurosymbolic background knowledge and the expressivity required of its logic can break Machine Learning assumptions about data Independence and Identical Distribution. In this position paper we propose to analyze IID relaxation in a hierarchy of logics that fit different use case requirements. We discuss the benefits of exploiting known data dependencies and distribution constraints for Neurosymbolic use cases and argue that the expressivity required for this knowledge has implications for the design of underlying ML routines. This opens a new research agenda with general questions about Neurosymbolic background knowledge and the expressivity required of its logic.

Submitted 1 July, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

Comments: 13 pages, 2 figures, submitted to NeSy 2024

arXiv:2401.15448 [pdf, other]

A Systematic Review of Available Datasets in Additive Manufacturing

Authors: Xiao Liu, Alessandra Mileo, Alan F. Smeaton

Abstract: In-situ monitoring, incorporating data from visual and other sensor technologies, allows the collection of extensive datasets during the Additive Manufacturing (AM) process. These datasets have potential for determining the quality of the manufactured output and the detection of defects through the use of Machine Learning during the manufacturing process. Open and annotated datasets derived from AM processes are necessary for the machine learning community to address this opportunity, which creates difficulties in the application of computer vision-related machine learning in AM. This systematic review investigates the availability of open image-based datasets originating from AM processes that align with a number of pre-defined selection criteria. The review identifies existing gaps among the current image-based datasets in the domain of AM, and points to the need for greater availability of open datasets so that quality assessment and defect detection during additive manufacturing can develop.

Submitted 27 January, 2024; originally announced January 2024.

Comments: 24 pages

arXiv:2309.04762 [pdf, other]

AudRandAug: Random Image Augmentations for Audio Classification

Authors: Teerath Kumar, Muhammad Turab, Alessandra Mileo, Malika Bendechache, Takfarinas Saber

Abstract: Data augmentation has proven to be effective in training neural networks. Recently, a method called RandAug was proposed, randomly selecting data augmentation techniques from a predefined search space. RandAug has demonstrated significant performance improvements for image-related tasks while imposing minimal computational overhead. However, no prior research has explored the application of RandAug specifically for audio data augmentation, which converts audio into an image-like pattern. To address this gap, we introduce AudRandAug, an adaptation of RandAug for audio data. AudRandAug selects data augmentation policies from a dedicated audio search space. To evaluate the effectiveness of AudRandAug, we conducted experiments using various models and datasets. Our findings indicate that AudRandAug outperforms other existing data augmentation methods in terms of accuracy.

Submitted 9 September, 2023; originally announced September 2023.

Comments: Paper has been accepted at the 25th Irish Machine Vision and Image Processing Conference

arXiv:2308.14898

doi 10.4204/EPTCS.385

Proceedings 39th International Conference on Logic Programming

Authors: Enrico Pontelli, Stefania Costantini, Carmine Dodaro, Sarah Gaggl, Roberta Calegari, Artur D'Avila Garcez, Francesco Fabiano, Alessandra Mileo, Alessandra Russo, Francesca Toni

Abstract: This volume contains the Technical Communications presented at the 39th International Conference on Logic Programming (ICLP 2023), held at Imperial College London, UK from July 9 to July 15, 2023. Technical Communications included here concern the Main Track, the Doctoral Consortium, the Application and Systems/Demo track, the Recently Published Research Track, the Birds-of-a-Feather track, the Thematic Tracks on Logic Programming and Machine Learning, and Logic Programming and Explainability, Ethics, and Trustworthiness.

Submitted 28 August, 2023; originally announced August 2023.

Journal ref: EPTCS 385, 2023

arXiv:2307.07378 [pdf, other]

doi 10.5281/zenodo.8230600

Defect Classification in Additive Manufacturing Using CNN-Based Vision Processing

Authors: Xiao Liu, Alessandra Mileo, Alan F. Smeaton

Abstract: The development of computer vision and in-situ monitoring using visual sensors allows the collection of large datasets from the additive manufacturing (AM) process. Such datasets could be used with machine learning techniques to improve the quality of AM. This paper examines two scenarios: first, using convolutional neural networks (CNNs) to accurately classify defects in an image dataset from AM and second, applying active learning techniques to the developed classification model. This allows the construction of a human-in-the-loop mechanism to reduce the size of the data required to train and generate training data.

Submitted 14 July, 2023; originally announced July 2023.

Comments: 4 pages, accepted at the Irish Machine Vision and Image Processing Conference (IMVIP), Galway, August 2023

arXiv:2301.02830 [pdf, other]

Image Data Augmentation Approaches: A Comprehensive Survey and Future directions

Authors: Teerath Kumar, Alessandra Mileo, Rob Brennan, Malika Bendechache

Abstract: Deep learning (DL) algorithms have shown significant performance in various computer vision tasks. However, having limited labelled data leads to a network overfitting problem, where network performance is worse on unseen data than on training data. Consequently, it limits performance improvement. To cope with this problem, various techniques have been proposed, such as dropout, normalization and advanced data augmentation. Among these, data augmentation, which aims to enlarge the dataset size by including sample diversity, has been a hot topic in recent times. In this article, we focus on advanced data augmentation techniques. We provide a background of data augmentation, a novel and comprehensive taxonomy of reviewed data augmentation techniques, and the strengths and weaknesses (wherever possible) of each technique. We also provide comprehensive results of the data augmentation effect on three popular computer vision tasks: image classification, object detection and semantic segmentation. For results reproducibility, we compiled available codes of all data augmentation techniques. Finally, we discuss the challenges and difficulties, and possible future directions for the research community. We believe this survey provides several benefits: i) readers will understand the data augmentation working mechanism to fix overfitting problems; ii) results will save the searching time of the researcher for comparison purposes; iii) codes of the mentioned data augmentation techniques are available at https://github.com/kmr2017/Advanced-Data-augmentation-codes; iv) future work will spark interest in the research community.

Submitted 11 March, 2023; v1 submitted 7 January, 2023; originally announced January 2023.

Comments: We need to make a lot of changes to make its quality better

arXiv:2212.06153 [pdf, other]

An adaptive human-in-the-loop approach to emission detection of Additive Manufacturing processes and active learning with computer vision

Authors: Xiao Liu, Alan F. Smeaton, Alessandra Mileo

Abstract: Recent developments in in-situ monitoring and process control in Additive Manufacturing (AM), also known as 3D-printing, allow the collection of large amounts of emission data during the build process of the parts being manufactured. This data can be used as input into 3D and 2D representations of the 3D-printed parts. However, the analysis and use, as well as the characterization, of this data still remain a manual process. The aim of this paper is to propose an adaptive human-in-the-loop approach using Machine Learning techniques that automatically inspect and annotate the emissions data generated during the AM process. More specifically, this paper will look at two scenarios: firstly, using convolutional neural networks (CNNs) to automatically inspect and classify emission data collected by in-situ monitoring and secondly, applying Active Learning techniques to the developed classification model to construct a human-in-the-loop mechanism in order to accelerate the labeling process of the emission data. The CNN-based approach relies on transfer learning and fine-tuning, which makes the approach applicable to other industrial image patterns. The adaptive nature of the approach is enabled by an uncertainty sampling strategy for the automatic selection of samples to be presented to human experts for annotation.

Submitted 12 December, 2022; originally announced December 2022.

Comments: 7 pages, 9 figures, 1 table. Presented at The 6th IEEE Workshop on Human-in-the-Loop Methods and Future of Work in BigData (IEEE HMData 2022) December 2022

arXiv:2009.09158

doi 10.4204/EPTCS.325

Proceedings 36th International Conference on Logic Programming (Technical Communications)

Authors: Francesco Ricca, Alessandra Russo, Sergio Greco, Nicola Leone, Alexander Artikis, Gerhard Friedrich, Paul Fodor, Angelika Kimmig, Francesca Lisi, Marco Maratea, Alessandra Mileo, Fabrizio Riguzzi

Abstract: Since the first conference held in Marseille in 1982, ICLP has been the premier international event for presenting research in logic programming. Contributions are solicited in all areas of logic programming and related areas, including but not restricted to:
- Foundations: Semantics, Formalisms, Answer-Set Programming, Non-monotonic Reasoning, Knowledge Representation.
- Declarative Programming: Inference engines, Analysis, Type and mode inference, Partial evaluation, Abstract interpretation, Transformation, Validation, Verification, Debugging, Profiling, Testing, Logic-based domain-specific languages, Constraint handling rules.
- Related Paradigms and Synergies: Inductive and Co-inductive Logic Programming, Constraint Logic Programming, Interaction with SAT, SMT and CSP solvers, Logic programming techniques for type inference and theorem proving, Argumentation, Probabilistic Logic Programming, Relations to object-oriented and Functional programming, Description logics, Neural-Symbolic Machine Learning, Hybrid Deep Learning and Symbolic Reasoning.
- Implementation: Concurrency and distribution, Objects, Coordination, Mobility, Virtual machines, Compilation, Higher Order, Type systems, Modules, Constraint handling rules, Meta-programming, Foreign interfaces, User interfaces.
- Applications: Databases, Big Data, Data Integration and Federation, Software Engineering, Natural Language Processing, Web and Semantic Web, Agents, Artificial Intelligence, Bioinformatics, Education, Computational life sciences, Cybersecurity, and Robotics.

Submitted 19 September, 2020; originally announced September 2020.

Journal ref: EPTCS 325, 2020

arXiv:2007.00461 [pdf, ps, other]

Query Based Access Control for Linked Data

Authors: Sabrina Kirrane, Alessandra Mileo, Axel Polleres, Stefan Decker

Abstract: In recent years we have seen significant advances in the technology used to both publish and consume Linked Data. However, in order to support the next generation of ebusiness applications on top of interlinked machine-readable data, suitable forms of access control need to be put in place. Although a number of access control models and frameworks have been put forward, very little research has been conducted into the security implications associated with granting access to partial data or the correctness of the proposed access control mechanisms. Therefore, the contributions of this paper are twofold: we propose a query rewriting algorithm which can be used to partially restrict access to SPARQL 1.1 queries and updates; and we demonstrate how a set of criteria, which was originally used to verify that an access control policy holds over different database states, can be adapted to verify the correctness of access control via query rewriting.

Submitted 31 December, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

arXiv:1412.8531

Workshop Notes of the 6th International Workshop on Acquisition, Representation and Reasoning about Context with Logic (ARCOE-Logic 2014)

Authors: Michael Fink, Martin Homola, Alessandra Mileo

Abstract: ARCOE-Logic 2014, the 6th International Workshop on Acquisition, Representation and Reasoning about Context with Logic, was held in co-location with the 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2014) on November 25, 2014 in Linköping, Sweden. These notes contain the five papers which were accepted and presented at the workshop.

Submitted 29 December, 2014; originally announced December 2014.

Comments: ARCOE-Logic 2014, 5 papers

arXiv:1405.0720 [pdf, ps, other]

Probabilistic Inductive Logic Programming Based on Answer Set Programming

Authors: Matthias Nickles, Alessandra Mileo

Abstract: We propose a new formal language for the expressive representation of probabilistic knowledge based on Answer Set Programming (ASP). It allows for the annotation of first-order formulas as well as ASP rules and facts with probabilities and for learning of such weights from data (parameter estimation). Weighted formulas are given a semantics in terms of soft and hard constraints which determine a probability distribution over answer sets. In contrast to related approaches, we approach inference by optionally utilizing so-called streamlining XOR constraints, in order to reduce the number of computed answer sets. Our approach is prototypically implemented. Examples illustrate the introduced concepts and point at issues and topics for future research.

Submitted 4 May, 2014; originally announced May 2014.

Comments: Appears in the Proceedings of the 15th International Workshop on Non-Monotonic Reasoning (NMR 2014)

arXiv:1006.5657 [pdf, other]

Reasoning Support for Risk Prediction and Prevention in Independent Living

Authors: A. Mileo, D. Merico, R. Bisiani

Abstract: In recent years there has been growing interest in solutions for the delivery of clinical care for the elderly, due to the large increase in the aging population. Monitoring a patient in his home environment is necessary to ensure continuity of care in home settings, but, to be useful, this activity must not be too invasive for patients or a burden for caregivers. We prototyped a system called SINDI (Secure and INDependent lIving), focused on i) collecting a limited amount of data about the person and the environment through Wireless Sensor Networks (WSN), and ii) inferring from these data enough information to support caregivers in understanding patients' well-being and in predicting possible evolutions of their health. Our hierarchical logic-based model of health combines data from different sources, sensor data, test results, common-sense knowledge and patient's clinical profile at the lower level, and correlation rules between health conditions across upper levels. The logical formalization and the reasoning process are based on Answer Set Programming. The expressive power of this logic programming paradigm makes it possible to reason about health evolution even when the available information is incomplete and potentially incoherent, while declarativity simplifies rules specification by caregivers and allows automatic encoding of knowledge. This paper describes how these issues have been targeted in the application scenario of the SINDI system.

Submitted 29 June, 2010; originally announced June 2010.

Comments: 36 pages, 5 figures, 10 tables. To appear in Theory and Practice of Logic Programming (TPLP)

ACM Class: I.2.1; I.2.3; J.3

Showing 1–13 of 13 results for author: Mileo, A