-
Incremental Maintenance Of Association Rules Under Support Threshold Change
Authors:
Mohamed Anis Bach Tobji,
Mohamed Salah Gouider
Abstract:
Maintaining association rules is an interesting problem. Several incremental maintenance algorithms have been proposed since the work of (Cheung et al., 1996). Most of these algorithms maintain rule bases on the assumption that the support threshold does not change. In this paper, we present an incremental maintenance algorithm that handles support threshold change. This solution allows the user to maintain a rule base under any support threshold.
Submitted 27 January, 2017;
originally announced January 2017.
-
BSTree: an Incremental Indexing Structure for Similarity Search and Real Time Monitoring of Data Streams
Authors:
Abdelwaheb Ferchichi,
Mohamed Salah Gouider
Abstract:
In this work, a new technique for indexing data streams, called BSTree, is proposed. The technique uses the SAX data discretization method [4] to reduce the dimensionality of data streams online. It draws on the B-tree to build the index and uses an LRV (Least Recently Visited) pruning technique to remove from the index structure any data whose last visit time exceeds a threshold value, thereby minimizing response time for similarity search queries.
Submitted 23 June, 2014;
originally announced June 2014.
-
A New Similarity Measure For Spatial Personalization
Authors:
Saida Aissa,
Mohamed Salah Gouider
Abstract:
Extracting relevant information from a spatial data warehouse is becoming increasingly hard. Because of the enormous amount of data stored in the spatial data warehouse, the user usually does not know which part of the cube contains the relevant information or what the next query should be. As a solution, we propose to study the similarity between the behaviors of users, in terms of the spatial MDX queries launched on the system, as a basis for recommending the next relevant MDX query to the current user. This paper introduces a new similarity measure for comparing spatial MDX queries. The proposed measure could directly support the development of spatial personalization approaches. It takes into account the basic components of similarity assessment models: topology, direction and distance.
Submitted 9 September, 2012;
originally announced September 2012.
-
Towards the Next Generation of Data Warehouse Personalization System: A Survey and a Comparative Study
Authors:
Saida Aissi,
Mohamed Salah Gouider
Abstract:
Multidimensional databases are a great asset for decision making. Their users express complex OLAP (On-Line Analytical Processing) queries, which often return huge volumes of facts that sometimes provide little or no information. Furthermore, because of the huge volume of historical data stored in data warehouses (DWs), OLAP applications may return a large amount of irrelevant information, which can make data exploration slow and inefficient. OLAP personalization systems play a major role in reducing the effort decision-makers need to find the most interesting information. Several works dealing with OLAP personalization have been presented in the last few years. This paper aims to provide a comprehensive review of the literature on OLAP personalization approaches. A benchmarking study of OLAP personalization methods is proposed. Several evaluation criteria are used to identify trends as well as potential needs for further investigation.
Submitted 1 August, 2012;
originally announced August 2012.
-
Spatial and Spatio-Temporal Multidimensional Data Modelling: A Survey
Authors:
Saida Aissi,
Mohamed Salah Gouider
Abstract:
Data warehouses store and provide access to large volumes of historical data supporting the strategic decisions of organisations. A data warehouse is based on a multidimensional model that allows users' needs to be expressed in support of the decision-making process. Since it is estimated that 80% of data used for decision making has a spatial or location component [1, 2], spatial data have been widely integrated into data warehouses and OLAP systems. Extending a multidimensional data model with spatial data provides a concise and organised representation of a spatial data warehouse. This paper aims to provide a comprehensive review of the literature on developed and suggested spatial and spatio-temporal multidimensional models. A benchmarking study of the proposed models is presented. Several evaluation criteria are used to identify trends as well as potential needs for further investigation.
Submitted 1 August, 2012;
originally announced August 2012.
-
Personalization in Geographic information systems: A survey
Authors:
Saida Aissi,
Mohamed Salah Gouider
Abstract:
Geographic Information Systems (GIS) are widely used in different application domains, such as maritime navigation, museum visits and route planning, as well as ecological, demographic and economic applications. Nowadays, organizations need sophisticated and adapted GIS-based Decision Support Systems (DSS) to get quick access to relevant information and to analyze data with respect to geographic information, represented not only as spatial objects, but also as maps.
Several research works on GIS personalization have been proposed to face the great challenge of developing both the theory and the practice needed to provide personalized GIS visualization systems. This paper aims to provide a comprehensive review of the literature on existing GIS personalization approaches. A benchmarking study of GIS personalization methods is proposed. Several evaluation criteria are used to identify trends as well as potential needs for further investigation.
Submitted 1 August, 2012;
originally announced August 2012.
-
Frequent Patterns mining in time-sensitive Data Stream
Authors:
Manel Zarrouk,
Med Salah Gouider
Abstract:
Mining frequent itemsets in static databases has been extensively studied and used, and is still considered a highly challenging task. For this reason, it is interesting to extend it to the field of data streams. In the streaming case, frequent pattern mining has much more information to track and much greater complexity to manage. Infrequent items can become frequent later on and hence cannot be ignored. The output structure needs to be dynamically updated to reflect the evolution of itemset frequencies over time. In this paper, we study this problem, and specifically the methodology of mining time-sensitive data streams. We try to improve an existing algorithm by increasing temporal accuracy and discarding out-of-date data through a new concept called the "Shaking Point". We also present experiments illustrating the time and space required.
Submitted 5 June, 2012;
originally announced June 2012.
-
Mining Multi-Level Frequent Itemsets under Constraints
Authors:
Mohamed Salah Gouider,
Amine Farhat
Abstract:
Mining association rules is a data mining task that extracts knowledge in the form of significant implication relations among useful items (objects) in a database. Mining multilevel association rules uses concept hierarchies, also called taxonomies and defined as 'is-a' relations between objects, to extract rules whose items belong to different levels of abstraction. These rules are more useful, more refined and more interpretable by the user. Several algorithms have been proposed in the literature to discover multilevel association rules. In this article, we are interested in the problem of discovering multi-level frequent itemsets under constraints, involving the user in the search process. We propose a technique for modeling and interpreting constraints in the context of concept hierarchies. Three approaches for discovering multi-level frequent itemsets under constraints are proposed and discussed: the basic approach, the "Test and Generate" approach and the pruning-based approach.
Submitted 26 December, 2010;
originally announced December 2010.
-
Towards an incremental maintenance of cyclic association rules
Authors:
Eya ben Ahmed,
Mohamed Salah Gouider
Abstract:
Recently, cyclic association rules have been introduced in order to discover rules from items characterized by their regular variation over time. In real-life situations, temporal databases are often appended to or updated. Rescanning the whole database every time is highly expensive, whereas incremental mining techniques can solve this problem efficiently. In this paper, we propose an incremental algorithm for the maintenance of cyclic association rules. The experiments carried out on our proposal underline its efficiency and performance.
Submitted 26 September, 2010;
originally announced September 2010.
-
Building a Data Warehouse for National Social Security Fund of the Republic of Tunisia
Authors:
Mohamed Salah Gouider,
Amine Farhat
Abstract:
The amounts of data available to decision makers are increasingly large, given network availability, low-cost storage and the diversity of applications. To maximize the potential of these data within the National Social Security Fund (NSSF) in Tunisia, we have built a data warehouse as a multidimensional database that is cleaned, homogenized, historicized and consolidated. We used Oracle Warehouse Builder to extract, transform and load the source data into the data warehouse, applying the KDD process. We implemented the data warehouse in Oracle OLAP. Knowledge extraction was performed using the Oracle Discoverer tool. This allowed users to take full advantage of the knowledge through regular reports or ad hoc queries. We started by implementing the main subject area for this public institution: accounting for the movements of insured persons. The great success that followed the completion of this work has encouraged the NSSF to go on to cover other subject areas of interest. In the near future, we suggest using multidimensional data mining to extract hidden knowledge that cannot be predicted by OLAP.
Submitted 4 June, 2010;
originally announced June 2010.