
Incremental Maintenance Of Association Rules Under Support Threshold Change

Authors: Mohamed Anis Bach Tobji, Mohamed Salah Gouider

Abstract: Maintenance of association rules is an interesting problem. Several incremental maintenance algorithms were proposed since the work of (Cheung et al, 1996). The majority of these algorithms maintain rule bases assuming that support threshold doesn't change. In this paper, we present incremental maintenance algorithm under support threshold change. This solution allows user to maintain its rule base under any support threshold.

Submitted 27 January, 2017; originally announced January 2017.

arXiv:1406.5917 [pdf]

BSTree: an Incremental Indexing Structure for Similarity Search and Real Time Monitoring of Data Streams

Authors: Abdelwaheb Ferchichi, Mohamed Salah Gouider

Abstract: In this work, a new indexing technique of data streams called BSTree is proposed. This technique uses the method of data discretization, SAX [4], to reduce online the dimensionality of data streams. It draws on Btree to build the index and finally uses an LRV (least Recently visited) pruning technique to rid the index structure from data whose last visit time exceeds a threshold value and thus minimizes response time for similarity search queries.

Submitted 23 June, 2014; originally announced June 2014.

Journal ref: Future Information Technology Lecture Notes in Electrical Engineering Volume 276, 2014, pp 185-190

arXiv:1209.1794 [pdf]

A New Similairty Measure For Spatial Personalization

Authors: Saida Aissa, Mohamed Salah Gouider

Abstract: Extracting the relevant information by exploiting the spatial data warehouse becomes increasingly hard. In fact, because of the enormous amount of data stored in the spatial data warehouse, the user, usually, don't know what part of the cube contain the relevant information and what the forthcoming query should be. As a solution, we propose to study the similarity between the behaviors of the users, in term of the spatial MDX queries launched on the system, as a basis to recommend the next relevant MDX query to the current user. This paper introduces a new similarity measure for comparing spatial MDX queries. The proposed similarity measure could directly support the development of spatial personalization approaches. The proposed similarity measure takes into account the basic components of the similarity assessment models: the topology, the direction and the distance.

Submitted 9 September, 2012; originally announced September 2012.

Journal ref: International Journal of Database Management Systems ( IJDMS ) Vol.4, No.4, 2012

arXiv:1208.0203 [pdf]

Towards the Next Generation of Data Warehouse Personalization System: A Survey and a Comparative Study

Authors: Saida Aissi, Mohamed Salah Gouider

Abstract: Multidimensional databases are a great asset for decision making. Their users express complex OLAP (On-Line Analytical Processing) queries, often returning huge volumes of facts, sometimes providing little or no information. Furthermore, due to the huge volume of historical data stored in DWs, the OLAP applications may return a big amount of irrelevant information that could make the data exploration process not efficient and tardy. OLAP personalization systems play a major role in reducing the effort of decision-makers to find the most interesting information. Several works dealing with OLAP personalization were presented in the last few years. This paper aims to provide a comprehensive review of literature on OLAP personalization approaches. A benchmarking study of OLAP personalization methods is proposed. Several evaluation criteria are used to identify the existence of trends as well as potential needs for further investigations.

Submitted 1 August, 2012; originally announced August 2012.

Comments: 8 pages

ACM Class: H.2.7

Journal ref: IJCSI International Journal of Computer Science Issues, Vol 9, Issue 3, No 2, May 2012, pages 561-568

arXiv:1208.0163 [pdf]

Spatial and Spatio-Temporal Multidimensional Data Modelling: A Survey

Authors: Saida Aissi, Mohamed Salah Gouider

Abstract: Data warehouse store and provide access to large volume of historical data supporting the strategic decisions of organisations. Data warehouse is based on a multidimensional model which allow to express user's needs for supporting the decision making process. Since it is estimated that 80% of data used for decision making has a spatial or location component [1, 2], spatial data have been widely integrated in Data Warehouses and in OLAP systems. Extending a multidimensional data model by the inclusion of spatial data provides a concise and organised spatial data warehouse representation. This paper aims to provide a comprehensive review of literature on developed and suggested spatial and spatio-temporal multidimensional models. A benchmarking study of the proposed models is presented. Several evaluation criteria are used to identify the existence of trends as well as potential needs for further investigations.

Submitted 1 August, 2012; originally announced August 2012.

ACM Class: H.2.7

Journal ref: International Journal of Advanced Research in Computer Science and Software (IJARCCE), Volume 1, Issue 1, March 2012

arXiv:1208.0153 [pdf]

Personalization in Geographic information systems: A survey

Authors: Saida Aissi, Mohamed Salah Gouider

Abstract: Geographic Information Systems (GIS) are widely used in different domains of applications, such as maritime navigation, museums visits and route planning, as well as ecological, demographical and economical applications. Nowadays, organizations need sophisticated and adapted GIS-based Decision Support System (DSS) to get quick access to relevant information and to analyze data with respect to geographic information, represented not only as spatial objects, but also as maps. Several research works on GIS personalization was proposed: Face the great challenge of developing both the theory and practice to provide personalization GIS visualization systems. This paper aims to provide a comprehensive review of literature on presented GIS personalization approaches. A benchmarking study of GIS personalization methods is proposed. Several evaluation criteria are used to identify the existence of trends as well as potential needs for further investigations.

Submitted 1 August, 2012; originally announced August 2012.

Journal ref: IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 4, No 3, July 2012 , pp 291-298

arXiv:1206.1032 [pdf]

Frequent Patterns mining in time-sensitive Data Stream

Authors: Manel Zarrouk, Med Salah Gouider

Abstract: Mining frequent itemsets through static Databases has been extensively studied and used and is always considered a highly challenging task. For this reason it is interesting to extend it to data streams field. In the streaming case, the frequent patterns' mining has much more information to track and much greater complexity to manage. Infrequent items can become frequent later on and hence cannot be ignored. The output structure needs to be dynamically incremented to reflect the evolution of itemset frequencies over time. In this paper, we study this problem and specifically the methodology of mining time-sensitive data streams. We tried to improve an existing algorithm by increasing the temporal accuracy and discarding the out-of-date data by adding a new concept called the "Shaking Point". We presented as well some experiments illustrating the time and space required.

Submitted 5 June, 2012; originally announced June 2012.

Comments: 8 pages

arXiv:1012.5546 [pdf]

Mining Multi-Level Frequent Itemsets under Constraints

Authors: Mohamed Salah Gouider, Amine Farhat

Abstract: Mining association rules is a task of data mining, which extracts knowledge in the form of significant implication relation of useful items (objects) from a database. Mining multilevel association rules uses concept hierarchies, also called taxonomies and defined as relations of type 'is-a' between objects, to extract rules that items belong to different levels of abstraction. These rules are more useful, more refined and more interpretable by the user. Several algorithms have been proposed in the literature to discover the multilevel association rules. In this article, we are interested in the problem of discovering multi-level frequent itemsets under constraints, involving the user in the research process. We proposed a technique for modeling and interpretation of constraints in a context of use of concept hierarchies. Three approaches for discovering multi-level frequent itemsets under constraints were proposed and discussed: Basic approach, "Test and Generate" approach and Pruning based Approach.

Submitted 26 December, 2010; originally announced December 2010.

Comments: 20 pages

MSC Class: 68P04; 68Q04; 68T04; 68U04 ACM Class: H.2.4; H.2.8; I.2.6; I.2.4; I.1.2

Journal ref: International Journal of Database Theory and Application, Vol. 3, No. 4, PP. 15-35, December, 2010

arXiv:1009.5149 [pdf, ps, other]

Towards an incremental maintenance of cyclic association rules

Authors: Eya ben Ahmed, Mohamed Salah Gouider

Abstract: Recently, the cyclic association rules have been introduced in order to discover rules from items characterized by their regular variation over time. In real life situations, temporal databases are often appended or updated. Rescanning the whole database every time is highly expensive while existing incremental mining techniques can efficiently solve such a problem. In this paper, we propose an incremental algorithm for cyclic association rules maintenance. The carried out experiments of our proposal stress on its efficiency and performance.

Submitted 26 September, 2010; originally announced September 2010.

Report number: November 2010, Volume 2, Number 4

Journal ref: International Journal of Database Management Systems (IJDMS), November 2010, Volume 2, Number 4

arXiv:1006.0876 [pdf]

doi 10.5121/ijdms.2010.2207

Building a Data Warehouse for National Social Security Fund of the Republic of Tunisia

Authors: Mohamed Salah Gouider, Amine Farhat

Abstract: The amounts of data available to decision makers are increasingly important, given the network availability, low cost storage and diversity of applications. To maximize the potential of these data within the National Social Security Fund (NSSF) in Tunisia, we have built a data warehouse as a multidimensional database, cleaned, homogenized, historicized and consolidated. We used Oracle Warehouse Builder to extract, transform and load the source data into the Data Warehouse, by applying the KDD process. We have implemented the Data Warehouse as an Oracle OLAP. The knowledge extraction has been performed using the Oracle Discoverer tool. This allowed users to take maximum advantage of knowledge as a regular report or as ad hoc queries. We started by implementing the main topic for this public institution, accounting for the movements of insured persons. The great success that has followed the completion of this work has encouraged the NSSF to complete the achievement of other topics of interest within the NSSF. We suggest in the near future to use Multidimensional Data Mining to extract hidden knowledge and that are not predictable by the OLAP.

Submitted 4 June, 2010; originally announced June 2010.

Comments: 13 pages

Journal ref: International Journal of Database Management Systems 2.2 (2010) 102-114

Showing 1–10 of 10 results for author: Gouider, M S