-
Rethinking harmless refusals when fine-tuning foundation models
Authors:
Florin Pop,
Judd Rosenblatt,
Diogo Schwerz de Lucena,
Michael Vaiana
Abstract:
In this paper, we investigate the degree to which fine-tuning in Large Language Models (LLMs) effectively mitigates versus merely conceals undesirable behavior. Through the lens of semi-realistic role-playing exercises designed to elicit such behaviors, we explore the response dynamics of LLMs post fine-tuning interventions. Our methodology involves prompting models for Chain-of-Thought (CoT) reasoning and analyzing the coherence between the reasoning traces and the resultant outputs. Notably, we identify a pervasive phenomenon we term reason-based deception, where models either stop producing reasoning traces or produce seemingly ethical reasoning traces that belie the unethical nature of their final outputs. We further examine the efficacy of response strategies (polite refusal versus explicit rebuttal) in curbing the occurrence of undesired behavior in subsequent outputs of multi-turn interactions. Our findings reveal that explicit rebuttals significantly outperform polite refusals in preventing the continuation of undesired outputs and nearly eliminate reason-based deception, challenging current practices in model fine-tuning. Accordingly, the two key contributions of this paper are (1) defining and studying reason-based deception, a new type of hidden behavior, and (2) demonstrating that rebuttals provide a more robust response model to harmful requests than refusals, thereby highlighting the need to reconsider the response strategies in fine-tuning approaches.
Submitted 27 June, 2024;
originally announced June 2024.
-
Evaluating Data Augmentation Techniques for Coffee Leaf Disease Classification
Authors:
Adrian Gheorghiu,
Iulian-Marius Tăiatu,
Dumitru-Clementin Cercel,
Iuliana Marin,
Florin Pop
Abstract:
The detection and classification of diseases in Robusta coffee leaves are essential to ensure that plants are healthy and the crop yield is kept high. However, this job requires extensive botanical knowledge and much wasted time. Therefore, this task and others similar to it have been extensively researched subjects in image classification. Regarding leaf disease classification, most approaches have used the more popular PlantVillage dataset while completely disregarding other datasets, like the Robusta Coffee Leaf (RoCoLe) dataset. As the RoCoLe dataset is imbalanced and does not have many samples, fine-tuning of pre-trained models and multiple augmentation techniques need to be used. The current paper uses the RoCoLe dataset and approaches based on deep learning for classifying coffee leaf diseases from images, incorporating the pix2pix model for segmentation and cycle-generative adversarial network (CycleGAN) for augmentation. Our study demonstrates the effectiveness of Transformer-based models, online augmentations, and CycleGAN augmentation in improving leaf disease classification. While synthetic data has limitations, it complements real data, enhancing model performance. These findings contribute to developing robust techniques for plant disease detection and classification.
Submitted 11 January, 2024;
originally announced January 2024.
-
Explainability-Driven Leaf Disease Classification Using Adversarial Training and Knowledge Distillation
Authors:
Sebastian-Vasile Echim,
Iulian-Marius Tăiatu,
Dumitru-Clementin Cercel,
Florin Pop
Abstract:
This work focuses on plant leaf disease classification and explores three crucial aspects: adversarial training, model explainability, and model compression. The models' robustness against adversarial attacks is enhanced through adversarial training, ensuring accurate classification even in the presence of threats. Leveraging explainability techniques, we gain insights into the model's decision-making process, improving trust and transparency. Additionally, we explore model compression techniques to optimize computational efficiency while maintaining classification performance. Through our experiments, we determine that on a benchmark dataset, robustness can come at the cost of classification accuracy, with performance reductions of 3%-20% for regular tests and gains of 50%-70% for adversarial attack tests. We also demonstrate that a student model can be 15-25 times more computationally efficient for a slight performance reduction, distilling the knowledge of more complex models.
Submitted 23 January, 2024; v1 submitted 30 December, 2023;
originally announced January 2024.
-
End-to-End Lip Reading in Romanian with Cross-Lingual Domain Adaptation and Lateral Inhibition
Authors:
Emilian-Claudiu Mănescu,
Răzvan-Alexandru Smădu,
Andrei-Marius Avram,
Dumitru-Clementin Cercel,
Florin Pop
Abstract:
Lip reading or visual speech recognition has gained significant attention in recent years, particularly because of hardware development and innovations in computer vision. While considerable progress has been obtained, most models have only been tested on a few large-scale datasets. This work addresses this shortcoming by analyzing several architectures and optimizations on the underrepresented, short-scale Romanian language dataset called Wild LRRo. Most notably, we compare different backend modules, demonstrating the effectiveness of adding ample regularization methods. We obtain state-of-the-art results using our proposed method, namely cross-lingual domain adaptation and unlabeled videos from English and German datasets to help the model learn language-invariant features. Lastly, we assess the performance of adding a layer inspired by the neural inhibition mechanism.
Submitted 7 October, 2023;
originally announced October 2023.
-
SkinDistilViT: Lightweight Vision Transformer for Skin Lesion Classification
Authors:
Vlad-Constantin Lungu-Stan,
Dumitru-Clementin Cercel,
Florin Pop
Abstract:
Skin cancer is a treatable disease if discovered early. We provide a production-specific solution to the skin cancer classification problem that matches human performance in melanoma identification by training a vision transformer on melanoma medical images annotated by experts. Since inference cost, both time- and memory-wise, is important in practice, we employ knowledge distillation to obtain a model that retains 98.33% of the teacher's balanced multi-class accuracy, at a fraction of the cost. Memory-wise, our model is 49.60% smaller than the teacher. Time-wise, our solution is 69.25% faster on GPU and 97.96% faster on CPU. By adding classification heads at each level of the transformer and employing a cascading distillation process, we improve the balanced multi-class accuracy of the base model by 2.1%, while creating a range of models of various sizes but comparable performance. We provide the code at https://github.com/Longman-Stan/SkinDistilVit.
Submitted 16 August, 2023;
originally announced August 2023.
-
From Fake to Hyperpartisan News Detection Using Domain Adaptation
Authors:
Răzvan-Alexandru Smădu,
Sebastian-Vasile Echim,
Dumitru-Clementin Cercel,
Iuliana Marin,
Florin Pop
Abstract:
Unsupervised Domain Adaptation (UDA) is a popular technique that aims to reduce the domain shift between two data distributions. It was successfully applied in computer vision and natural language processing. In the current work, we explore the effects of various unsupervised domain adaptation techniques between two text classification tasks: fake and hyperpartisan news detection. We investigate the knowledge transfer from fake to hyperpartisan news detection without involving target labels during training. Thus, we evaluate UDA, cluster alignment with a teacher, and cross-domain contrastive learning. Extensive experiments show that these techniques improve performance, while including data augmentation further enhances the results. In addition, we combine clustering and topic modeling algorithms with UDA, resulting in improved performances compared to the initial UDA setup.
Submitted 4 August, 2023;
originally announced August 2023.
-
Adversarial Capsule Networks for Romanian Satire Detection and Sentiment Analysis
Authors:
Sebastian-Vasile Echim,
Răzvan-Alexandru Smădu,
Andrei-Marius Avram,
Dumitru-Clementin Cercel,
Florin Pop
Abstract:
Satire detection and sentiment analysis are intensively explored natural language processing (NLP) tasks that study the identification of the satirical tone from texts and extracting sentiments in relationship with their targets. In languages with fewer research resources, an alternative is to produce artificial examples based on character-level adversarial processes to overcome dataset size limitations. Such samples are proven to act as a regularization method, thus improving the robustness of models. In this work, we improve the well-known NLP models (i.e., Convolutional Neural Networks, Long Short-Term Memory (LSTM), Bidirectional LSTM, Gated Recurrent Units (GRUs), and Bidirectional GRUs) with adversarial training and capsule networks. The fine-tuned models are used for satire detection and sentiment analysis tasks in the Romanian language. The proposed framework outperforms the existing methods for the two tasks, achieving up to 99.08% accuracy, thus confirming the improvements added by the capsule layers and the adversarial training in NLP approaches.
Submitted 13 June, 2023;
originally announced June 2023.
-
RoBERTweet: A BERT Language Model for Romanian Tweets
Authors:
Iulian-Marius Tăiatu,
Andrei-Marius Avram,
Dumitru-Clementin Cercel,
Florin Pop
Abstract:
Developing natural language processing (NLP) systems for social media analysis remains an important topic in artificial intelligence research. This article introduces RoBERTweet, the first Transformer architecture trained on Romanian tweets. Our RoBERTweet comes in two versions, following the base and large architectures of BERT. The corpus used for pre-training the models represents a novelty for the Romanian NLP community and consists of all tweets collected from 2008 to 2022. Experiments show that RoBERTweet models outperform the previous general-domain Romanian and multilingual language models on three NLP tasks with tweet inputs: emotion detection, sexist language identification, and named entity recognition. We make our models and the newly created corpus of Romanian tweets freely available.
Submitted 11 June, 2023;
originally announced June 2023.
-
TA-DA: Topic-Aware Domain Adaptation for Scientific Keyphrase Identification and Classification (Student Abstract)
Authors:
Răzvan-Alexandru Smădu,
George-Eduard Zaharia,
Andrei-Marius Avram,
Dumitru-Clementin Cercel,
Mihai Dascalu,
Florin Pop
Abstract:
Keyphrase identification and classification is a Natural Language Processing and Information Retrieval task that involves extracting relevant groups of words from a given text related to the main topic. In this work, we focus on extracting keyphrases from scientific documents. We introduce TA-DA, a Topic-Aware Domain Adaptation framework for keyphrase extraction that integrates Multi-Task Learning with Adversarial Training and Domain Adaptation. Our approach improves performance over baseline models by up to 5% in the exact match of the F1-score.
Submitted 30 December, 2022;
originally announced January 2023.
-
A Simulation Model for Evaluating Distributed Systems Dependability
Authors:
Ciprian Dobre,
Florin Pop,
Valentin Cristea
Abstract:
In this paper we present a new simulation model designed to evaluate the dependability in distributed systems. This model extends the MONARC simulation model with new capabilities for capturing reliability, safety, availability, security, and maintainability requirements. The model has been implemented as an extension of the multithreaded, process-oriented simulator MONARC, which allows the realistic simulation of a wide range of distributed system technologies, with respect to their specific components and characteristics. The extended simulation model includes the necessary components to inject various failure events, and provides the mechanisms to evaluate different strategies for replication, redundancy procedures, and security enforcement mechanisms. The results obtained in simulation experiments presented in this paper prove that the use of discrete-event simulators, such as MONARC, in the design and development of distributed systems is appealing due to their efficiency and scalability.
Submitted 12 February, 2012;
originally announced February 2012.
-
An Architectural Model for a Grid based Workflow Management Platform in Scientific Applications
Authors:
Alexandru Costan,
Florin Pop,
Corina Stratan,
Ciprian Dobre,
Catalin Leordeanu,
Valentin Cristea
Abstract:
With recent increasing computational and data requirements of scientific applications, the use of large clustered systems as well as distributed resources is inevitable. Although executing large applications in these environments brings increased performance, the automation of the process becomes more and more challenging. While the use of complex workflow management systems has been a viable solution for this automation process in business oriented environments, the open source engines available for scientific applications lack some functionalities or are too difficult to use for non-specialists. In this work we propose an architectural model for a grid based workflow management platform providing features like an intuitive way to describe workflows, efficient data handling mechanisms and flexible fault tolerance support. Our integrated solution introduces a workflow engine component based on ActiveBPEL extended with additional functionalities and a scheduling component providing efficient mapping between tasks and available resources.
Submitted 29 June, 2011;
originally announced June 2011.
-
Models and Techniques for Ensuring Reliability, Safety, Availability and Security of Large Scale Distributed Systems
Authors:
Valentin Cristea,
Ciprian Dobre,
Florin Pop,
Corina Stratan,
Alexandru Costan,
Catalin Leordeanu
Abstract:
17th International Conference on Control Systems and Computer Science (CSCS 17), Bucharest, Romania, May 26-29, 2009. Vol. 1, pp. 401-406, ISSN: 2066-4451.
Submitted 28 June, 2011;
originally announced June 2011.
-
Advance Reservation of Resources for Task Execution in Grid Environments
Authors:
Eliza Moise,
Diana Moise,
Florin Pop,
Valentin Cristea
Abstract:
The paper proposes a solution for the Grid scheduling problem, addressing in particular the requirement of high performance an efficient algorithm must fulfill. Advance Reservation engages a distributed, dynamic, fault-tolerant and efficient strategy which reserves resources for future task execution. The paper presents the main features of the strategy, the functioning mechanism the strategy is based on and the methods used for evaluating the algorithm.
Submitted 27 June, 2011;
originally announced June 2011.
-
Resource CoAllocation for Scheduling Tasks with Dependencies, in Grid
Authors:
Diana Moise,
Eliza Moise,
Florin Pop,
Valentin Cristea
Abstract:
Scheduling applications on wide-area distributed systems is useful for obtaining quick and reliable results in an efficient manner. Optimized scheduling algorithms are fundamentally important in order to achieve optimized resource utilization. The existing and potential applications include many fields of activity like satellite image processing and medicine. The paper proposes a scheduling algorithm for tasks with dependencies in Grid environments. CoAllocation represents a strategy that provides a schedule for tasks with dependencies, having as main purpose the efficiency of the schedule, in terms of load balancing and minimum time for the execution of the tasks.
Submitted 27 June, 2011;
originally announced June 2011.
-
Clasificarea distribuita a mesajelor de e-mail (Distributed classification of e-mail messages)
Authors:
Florin Pop,
Diana Petrescu,
Ştefan Trauşan-Matu
Abstract:
A basic component in Internet applications is the electronic mail and its various implications. The paper proposes a mechanism for automatically classifying emails and creating dynamic groups for these messages. The proposed mechanisms will be based on natural language processing techniques and will be designed to facilitate human-machine interaction in this direction.
Submitted 27 June, 2011;
originally announced June 2011.
-
OpenPh - Numerical Physics Library
Authors:
George Milescu,
Gabriel Noaje,
Florin Pop
Abstract:
Numerical physics has gained a lot of importance in the last decade, its efficiency being motivated and sustained by the growth of computational power. This paper presents a concept that is to be developed in the next few years: OpenPh. OpenPh is a numerical physics library that makes use of the advantages of both open source software and MATLAB programming. Its aim is to deliver the instruments for providing numerical and graphical solutions for various physics problems. It has a modular structure, allowing users to add new modules to the existing ones and to create their own modules according to their needs, which makes it virtually unlimitedly extensible. The modules of OpenPh are implemented using the MATLAB engine because it is the best solution used in engineering and science, providing a wide range of optimized methods to accomplish even the toughest jobs. The current version of OpenPh includes two modules, the first one providing tools for quantum physics and the second one for mechanics. The quantum physics module deals with the photoelectric effect, the radioactive decay of carbon-11, and the Schrödinger equation - particle in a box. The classical mechanics module includes the study of the uniform circular motion, the forced damped harmonic oscillations and the vibration of a fixed-fixed string.
Submitted 27 June, 2011;
originally announced June 2011.
-
Intelligent strategies for DAG scheduling optimization in Grid environments
Authors:
Florin Pop,
Valentin Cristea
Abstract:
The paper presents a solution to the dynamic DAG scheduling problem in Grid environments. It presents a distributed, scalable, efficient and fault-tolerant algorithm for optimizing tasks assignment. The scheduler algorithm for tasks with dependencies uses a heuristic model to optimize the total cost of tasks execution. Also, a method based on genetic algorithms is proposed to optimize the procedure of resources assignment. The experiments used the MonALISA monitoring environment and its extensions. The results demonstrate very good behavior in comparison with other scheduling approaches for this kind of DAG scheduling algorithms.
Submitted 27 June, 2011;
originally announced June 2011.
-
Towards an IO intensive Grid application instrumentation in MedioGRID
Authors:
Dacian Tudor,
Florin Pop,
Valentin Cristea,
Vladimir Cretu
Abstract:
Obtaining high performance in IO intensive applications requires systems that support reliable fast transfer, data replication, and caching. In this paper we present an architecture designed for supporting IO intensive applications in MedioGRID, a system for real-time processing of satellite images, operating in a Grid environment. The solution ensures that applications which are processing geographical data have uniform access to data and is based on continuous monitoring of the data transfers using MonALISA and its extensions. The MedioGRID architecture is also built on Globus, Condor and PBS, and based on this middleware we aim to extract information about the running systems. The results obtained in testing the MedioGRID system for large data transfers show that the monitoring system provides a very good view of system evolution.
Submitted 27 June, 2011;
originally announced June 2011.
-
DistHash: A robust P2P DHT-based system for replicated objects
Authors:
Ciprian Dobre,
Florin Pop,
Valentin Cristea
Abstract:
Over the Internet today, computing and communications environments are significantly more complex and chaotic than classical distributed systems, lacking any centralized organization or hierarchical control. There has been much interest in emerging Peer-to-Peer (P2P) network overlays because they provide a good substrate for creating large-scale data sharing, content distribution and application-level multicast applications. In this paper we present DistHash, a P2P overlay network designed to share large sets of replicated distributed objects in the context of large-scale highly dynamic infrastructures. We present original solutions to achieve optimal message routing in hop-count and throughput, provide an adequate consistency approach among replicas, as well as provide a fault-tolerant substrate.
Submitted 27 June, 2011;
originally announced June 2011.
-
Critical Analysis of Middleware Architectures for Large Scale Distributed Systems
Authors:
Florin Pop,
Ciprian Mihai Dobre,
Alexandru Costan,
Mugurel Ionut Andreica,
Eliana-Dina Tirsa,
Corina Stratan,
Valentin Cristea
Abstract:
Distributed computing is increasingly being viewed as the next phase of Large Scale Distributed Systems (LSDSs). However, the vision of large scale resource sharing is not yet a reality in many areas - Grid computing is an evolving area of computing, where standards and technology are still being developed to enable this new paradigm. Hence, in this paper we analyze the current development of middleware tools for LSDS, from multiple perspectives: architecture, applications and market research. For each perspective we are interested in relevant technologies used in ongoing projects, existing products or services and useful design issues. In the end, based on this approach, we draw some conclusions regarding the future research directions in this area.
Submitted 15 October, 2009;
originally announced October 2009.
-
Robust Failure Detection Architecture for Large Scale Distributed Systems
Authors:
Ciprian Mihai Dobre,
Florin Pop,
Alexandru Costan,
Mugurel Ionut Andreica,
Valentin Cristea
Abstract:
Failure detection is a fundamental building block for ensuring fault tolerance in large scale distributed systems. There are many approaches to, and implementations of, failure detectors. Providing flexible failure detection in off-the-shelf distributed systems is difficult. In this paper we present an innovative solution to this problem. Our approach is based on adaptive, decentralized failure detectors, capable of working asynchronously and independently of the application flow. The proposed solution considers an architecture for the failure detectors, based on clustering, the use of a gossip-based algorithm for detection at local level and the use of a hierarchical structure among clusters of detectors along which traffic is channeled. The solution can scale to a large number of nodes, considers the QoS requirements of both applications and resources, and includes fault tolerance and system orchestration mechanisms, added in order to assess the reliability and availability of distributed systems.
Submitted 5 October, 2009;
originally announced October 2009.
-
Towards a Centralized Scheduling Framework for Communication Flows in Distributed Systems
Authors:
Mugurel Ionut Andreica,
Eliana-Dina Tirsa,
Nicolae Tapus,
Florin Pop,
Ciprian Mihai Dobre
Abstract:
The overall performance of a distributed system is highly dependent on the communication efficiency of the system. Although network resources (links, bandwidth) are becoming increasingly more available, the communication performance of data transfers involving large volumes of data does not necessarily improve at the same rate. This is due to the inefficient usage of the available network resources. A solution to this problem consists of data transfer scheduling techniques, which manage and allocate the network resources in an efficient manner. In this paper we present several online and offline data transfer optimization techniques, in the context of a centrally controlled distributed system.
Submitted 1 June, 2009;
originally announced June 2009.
-
Optimization of Decentralized Scheduling for Physic Applications in Grid Environments
Authors:
Florin Pop
Abstract:
This paper presents a scheduling framework that is configured for, and used in physic systems. Our work addresses the problem of scheduling various computationally intensive and data intensive applications that are required for extracting information from satellite images. The proposed solution allows mapping of image processing applications onto available resources. The scheduling is done at the level of groups of concurrent applications. It demonstrates a very good behavior for scheduling and executing groups of applications, while also achieving a near-optimal utilization of the resources.
Submitted 11 December, 2008;
originally announced December 2008.