-
CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification
Authors:
Korbinian Randl,
John Pavlopoulos,
Aron Henriksson,
Tony Lindgren
Abstract:
Contaminated or adulterated food poses a substantial risk to human health. Given sets of labeled web texts for training, Machine Learning and Natural Language Processing can be applied to automatically detect such risks. We publish a dataset of 7,546 short texts describing public food recall announcements. Each text is manually labeled, on two granularity levels (coarse and fine), for food products and hazards that the recall corresponds to. We describe the dataset and benchmark naive, traditional, and Transformer models. Based on our analysis, Logistic Regression based on a tf-idf representation outperforms RoBERTa and XLM-R on classes with low support. Finally, we discuss different prompting strategies and present an LLM-in-the-loop framework, based on Conformal Prediction, which boosts the performance of the base classifier while reducing energy consumption compared to normal prompting.
Submitted 30 May, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
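The abstract of CICLe mentions an LLM-in-the-loop framework based on Conformal Prediction but does not spell out the mechanics. The following is a minimal sketch of generic split conformal prediction used as a router, not the authors' implementation. It assumes the base classifier outputs per-class probabilities, and that the LLM is only consulted when the conformal set contains more than one candidate class. The function names and the routing rule are hypothetical.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal threshold computed on held-out calibration data.

    cal_probs:  list of dicts mapping class -> predicted probability
    cal_labels: list of true classes, aligned with cal_probs
    alpha:      target error rate (e.g. 0.05 for 95% coverage)
    """
    # Nonconformity score: one minus the probability of the true class.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected quantile rank.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return 1.0  # too little calibration data: every class is kept
    return scores[rank - 1]


def prediction_set(probs, qhat):
    """All classes whose nonconformity score does not exceed the threshold."""
    return sorted(c for c, p in probs.items() if 1.0 - p <= qhat)


def route(probs, qhat):
    """Hypothetical routing rule: trust the base classifier when the set is a
    singleton, otherwise hand the reduced candidate list to an LLM prompt."""
    candidates = prediction_set(probs, qhat)
    if len(candidates) == 1:
        return ("base", candidates)
    if not candidates:
        candidates = sorted(probs)  # empty set: fall back to all classes
    return ("llm", candidates)
```

Under these assumptions, confident inputs never reach the LLM and uncertain inputs reach it with a shorter list of candidate labels, which is one plausible way such a framework could reduce prompting cost.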
-
A Cost-Sensitive Transformer Model for Prognostics Under Highly Imbalanced Industrial Data
Authors:
Ali Beikmohammadi,
Mohammad Hosein Hamian,
Neda Khoeyniha,
Tony Lindgren,
Olof Steinert,
Sindri Magnússon
Abstract:
The rapid influx of data-driven models into the industrial sector has been facilitated by the proliferation of sensor technology, enabling the collection of vast quantities of data. However, leveraging these models for failure detection and prognosis poses significant challenges, including issues like missing values and class imbalances. Moreover, the cost sensitivity associated with industrial operations further complicates the application of conventional models in this context. This paper introduces a novel cost-sensitive transformer model developed as part of a systematic workflow, which also integrates a hybrid resampler and a regression-based imputer. After subjecting our approach to rigorous testing using the APS failure dataset from Scania trucks and the SECOM dataset, we observed a substantial enhancement in performance compared to state-of-the-art methods. Moreover, we conduct an ablation study to analyze the contributions of different components in our proposed method. Our findings highlight the potential of our method in addressing the unique challenges of failure prediction in industrial settings, thereby contributing to enhanced reliability and efficiency in industrial operations.
Submitted 16 January, 2024;
originally announced February 2024.
-
SCANIA Component X Dataset: A Real-World Multivariate Time Series Dataset for Predictive Maintenance
Authors:
Zahra Kharazian,
Tony Lindgren,
Sindri Magnússon,
Olof Steinert,
Oskar Andersson Reyna
Abstract:
This paper presents a description of a real-world, multivariate time series dataset collected from an anonymized engine component (called Component X) of a fleet of trucks from SCANIA, Sweden. This dataset includes diverse variables capturing detailed operational data, repair records, and specifications of trucks while maintaining confidentiality by anonymization. It is well-suited for a range of machine learning applications, such as classification, regression, survival analysis, and anomaly detection, particularly when applied to predictive maintenance scenarios. The large population size and variety of features in the format of histograms and numerical counters, along with the inclusion of temporal information, make this real-world dataset unique in the field. The objective of releasing this dataset is to give a broad range of researchers the possibility of working with real-world data from an internationally well-known company and introduce a standard benchmark to the predictive maintenance field, fostering reproducible research.
Submitted 26 January, 2024;
originally announced January 2024.
-
A Resilient Framework for 5G-Edge-Connected UAVs based on Switching Edge-MPC and Onboard-PID Control
Authors:
Gerasimos Damigos,
Achilleas Santi Seisa,
Sumeet Gajanan Satpute,
Tore Lindgren,
George Nikolakopoulos
Abstract:
In recent years, the need for resources for handling processes with high computational complexity for mobile robots has become increasingly urgent. More specifically, robots need to operate autonomously in a robust and continuous manner while keeping high performance, a need that has led to the use of edge computing to offload many computationally demanding and time-critical robotic procedures. However, safe mechanisms should be implemented to handle situations in which the offloaded procedures cannot be used, such as when the communication is challenged or the edge cluster is not available. To this end, this article presents a switching strategy for safety, redundancy, and optimized behavior through an edge computing-based Model Predictive Controller (MPC) and a low-level onboard-PID controller for edge-connected Unmanned Aerial Vehicles (UAVs). The switching strategy is based on the communication Key Performance Indicators (KPIs) over 5G to decide whether the UAV should be controlled by the edge-based controller or fall back safely to the onboard controller.
Submitted 24 October, 2023;
originally announced October 2023.
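The switching idea in the abstract above (edge-based MPC when the 5G link is healthy, onboard PID otherwise) can be illustrated with a small state machine. This is only a sketch: the abstract does not say which KPIs or thresholds are used, so the latency and packet-loss fields, the threshold values, and the recovery streak below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LinkKpis:
    """Hypothetical communication KPIs sampled once per control cycle."""
    latency_ms: float
    packet_loss: float  # fraction in [0, 1]


class ControllerSwitch:
    """Selects between an edge MPC and an onboard PID controller.

    A single unhealthy sample triggers the fallback immediately, while the
    return to the edge controller requires several consecutive healthy
    samples, so the mode does not chatter on a marginal link.
    """

    def __init__(self, max_latency_ms=50.0, max_loss=0.05, recover_after=3):
        self.max_latency_ms = max_latency_ms  # assumed threshold
        self.max_loss = max_loss              # assumed threshold
        self.recover_after = recover_after    # healthy samples needed to return
        self.mode = "edge_mpc"
        self._good_streak = 0

    def update(self, kpis):
        healthy = (kpis.latency_ms <= self.max_latency_ms
                   and kpis.packet_loss <= self.max_loss)
        if not healthy:
            # Fail safe: hand control to the onboard controller right away.
            self.mode = "onboard_pid"
            self._good_streak = 0
        elif self.mode == "onboard_pid":
            self._good_streak += 1
            if self._good_streak >= self.recover_after:
                self.mode = "edge_mpc"
                self._good_streak = 0
        return self.mode
```

The asymmetric rule (fast fallback, slow recovery) is a design choice of this sketch rather than something stated in the abstract.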
-
PACED-5G: Predictive Autonomous Control using Edge for Drones over 5G
Authors:
Viswa Narayanan Sankaranarayanan,
Gerasimos Damigos,
Achilleas Santi Seisa,
Sumeet Gajanan Satpute,
Tore Lindgren,
George Nikolakopoulos
Abstract:
With the advent of technologies such as Edge computing, the horizons of remote computational applications have broadened multidimensionally. Autonomous Unmanned Aerial Vehicle (UAV) mission is a vital application to utilize remote computation to catalyze its performance. However, offloading computational complexity to a remote system increases the latency in the system. Though technologies such as 5G networking minimize communication latency, the effects of latency on the control of UAVs are inevitable and may destabilize the system. Hence, it is essential to consider the delays in the system and compensate for them in the control design. Therefore, we propose a novel Edge-based predictive control architecture enabled by 5G networking, PACED-5G (Predictive Autonomous Control using Edge for Drones over 5G). In the proposed control architecture, we have designed a state estimator for estimating the current states based on the available knowledge of the time-varying delays, devised a Model Predictive controller (MPC) for the UAV to track the reference trajectory while avoiding obstacles, and provided an interface to offload the high-level tasks over Edge systems. The proposed architecture is validated in two experimental test cases using a quadrotor UAV.
Submitted 30 January, 2023;
originally announced January 2023.
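The delay compensation described in the PACED-5G abstract (estimating the current state from a delayed measurement before running the controller) can be sketched with a forward prediction over the commands issued during the delay. This is not the paper's estimator: the one-dimensional double-integrator model, the function names, and the fixed step size are assumptions chosen to keep the example short.

```python
def steps_for_delay(delay_s, dt):
    """Number of control steps covered by a measured delay (never negative)."""
    return max(0, round(delay_s / dt))


def predict_current_state(pos, vel, accel_history, dt):
    """Propagate a delayed (pos, vel) measurement to the present.

    accel_history holds the acceleration commands applied since the
    measurement was taken, oldest first. Each command is assumed to be held
    constant for one step of length dt (double-integrator model).
    """
    for a in accel_history:
        pos += vel * dt + 0.5 * a * dt * dt
        vel += a * dt
    return pos, vel
```

With a time-varying delay, the caller would keep a buffer of recent commands and pass the last `steps_for_delay(delay_s, dt)` entries, so the controller plans from the predicted state rather than the stale measurement.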
-
Automotive Multilingual Fault Diagnosis
Authors:
John Pavlopoulos,
Alv Romell,
Jacob Curman,
Olof Steinert,
Tony Lindgren,
Markus Borg
Abstract:
Automated fault diagnosis can facilitate diagnostics assistance, speedier troubleshooting, and better-organised logistics. Currently, AI-based prognostics and health management in the automotive industry ignore the textual descriptions of the experienced problems or symptoms. With this study, however, we show that a multilingual pre-trained Transformer can effectively classify the textual claims from a large company with vehicle fleets, despite the task's challenging nature due to the 38 languages and 1,357 classes involved. Overall, we report an accuracy of more than 80% for high-frequency classes and above 60% for above-low-frequency classes, bringing novel evidence that multilingual classification can benefit automotive troubleshooting management.
Submitted 13 October, 2022;
originally announced October 2022.
-
Hierarchical Bayesian Modelling for Knowledge Transfer Across Engineering Fleets via Multitask Learning
Authors:
L. A. Bull,
D. Di Francesco,
M. Dhada,
O. Steinert,
T. Lindgren,
A. K. Parlikad,
A. B. Duncan,
M. Girolami
Abstract:
A population-level analysis is proposed to address data sparsity when building predictive models for engineering infrastructure. Utilising an interpretable hierarchical Bayesian approach and operational fleet data, domain expertise is naturally encoded (and appropriately shared) between different sub-groups, representing (i) use-type, (ii) component, or (iii) operating condition. Specifically, domain expertise is exploited to constrain the model via assumptions (and prior distributions) allowing the methodology to automatically share information between similar assets, improving the survival analysis of a truck fleet and power prediction in a wind farm. In each asset management example, a set of correlated functions is learnt over the fleet, in a combined inference, to learn a population model. Parameter estimation is improved when sub-fleets share correlated information at different levels of the hierarchy. In turn, groups with incomplete data automatically borrow statistical strength from those that are data-rich. The statistical correlations enable knowledge transfer via Bayesian transfer learning, and the correlations can be inspected to inform which assets share information for which effect (i.e. parameter). Both case studies demonstrate the wide applicability to practical infrastructure monitoring, since the approach is naturally adapted between interpretable fleet models of different in situ examples.
Submitted 12 May, 2023; v1 submitted 26 April, 2022;
originally announced April 2022.