The rough set approach 7 to data analysis has many important advantages that provide efficient algorithms for finding hidden patterns in data, finds minimal sets of data, evaluates significance of data, generates. The authors have broken the discussion into two sections, each with a specific theme. Rough set approach, fuzzy set approachs, prediction, linear and multipleregression. Finally, some description about applications of the data mining system with rough set theory is included. The rough set theory offers a viable approach for decision rule extraction from data. Through indepth study on the existing rough set and data mining technologies, for the shortcomings of the existing data mining algorithms based on rough set, this paper presents an improved algorithm. Various topics of data mining techniques are identified and described throughout.
In this data mining clustering method, a model is hypothesized for each cluster to find the best fit of data for a given model. The approach is based on the combination of generalization distribution table gdt and the rough set methodology. A partition of u is a family of mutually disjoint nonempty subsets of u, called blocks, such that the union of all blocks is u. Rough set theory was introduced by zdzislaw pawlak in 1982. It also provides a powerful way to calculate the importance degree of vague and uncertain big data to help in decision making. Research on data mining algorithm based on rough set. A rough set approach for generation and validation of rules. Data mining, data tables, distributed data mining ddm. Various topics of data mining techniques are identified and described throughout, including clustering, association rules, rough set theory, probability theory, neural networks, classification, and fuzzy logic. There are so many approaches for handling missing attribute values. Chapter 2 rough sets and reasoning from data presents the application of rough set concept to reason from data data mining. The rough set approach 7 to data analysis has many important advantages that. And combining with probability logic, random truth degree of rough logic can be studied in the future. A rough set approach for generation and validation of rules for missing attribute values of a data set.
Additionally, the rough set approach to lower and upper approximations and certain possible rule sets concepts are introduced. Comparative analysis between rough set theory and data. Real life data are frequently imperfect, erroneous, incomplete, uncertain and vague. Relationships exist between rough set theory and dempstershafers theory of. The customer related data are categorical in nature. Performance analysis and prediction in educational data. An overview of useful business applications is provided. Data mining is a process of discovering various models, summaries, and derived values from a given collection of data.
Data mining and knowledge discovery in real life applications 36 outset, rough set theory has been a methodology of database mining or knowledge discovery in relational databases. A rough set approach to attribute generalization in data mining. Rough set approach to the analysis of the structureactivity relationship of quaternary imidazolium compounds. Rough set theory and its applications ua computer science. Many techniques have been proposed for processing, managing and mining trajectory data in the past decade, fostering a broad range of applications. Based on the rough set theory, the rough logic and its deduction theory system can be established. Recently, the rough set and fuzzy set theory have generated a great deal of interest among more and more researchers. Data mining, decision tables, rough set, rule extraction. Mining incomplete dataa rough set approach springerlink. This paper, introduces the fundamental concepts of rough set theory and other aspects of data mining, a discussion of data representation with rough set theory including pairs of attributevalue blocks, information tables reducts, indiscernibility relation and.
Puts forward a xml mining model based on rough set theory. Approximation can further be applied to data mining related task, e. Data mining technology has emerged as a means for identifying patterns and trends from large quantities o. The authors express that rough data set theory is not the only discipline in which discretization is. For the purposes of analysis and decision support in the business area in many cases data mining using rough set theory is used. The results considered in this book can be useful for researchers in machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory and logical.
Download data mining tutorial pdf version previous page print page. Consequently, the theoretical part is minimized to emphasize the practical application side of the rough set approach in the context of data analysis and modelbuilding applications. However, the clustering algorithms for categorical data are few and are unable to handle uncertainty. Chapter 1 basic concepts contains general formulation of basic ideas of rough set theory together with brief discussion of its place in classical set theory. And combining with probability logic, random truth.
The data mining technology instead of classic statistical analysis is developed to help the people to discover the knowledge inside of the data. A convenient way to present equivalence relations is through partitions. A rough set based method for updating decision rules on attribute values coarsening and refining, ieee transactions on knowledge and data engineering, 2612. Another methodology which has high relevance to data mining and plays a central role in this volume is that of. Rule induction from a decision table using rough sets theory. Analysis of imprecise data is an edited collection of research chapters on the most recent developments in rough set theory and data mining. Finally, some description about applications of the data mining system with.
The monograph can be used under the creation of courses for graduate students and for ph. Also, this method locates the clusters by clustering the density. Risk assessment is very important for safe and reliable investment. A rough set approach for generation and validation of. It is a new mathematical tool to deal with partial information. After 15 year of pursuing rough set theory and its application the theory has reached a certain degree of maturity. Parallel computing of approximations in dominancebased rough sets approach, knowledgebased systems, 87. Consequently, the theoretical part is minimized to emphasize the practical application side of the rough set approach. Rough set theory and zadehs fuzzy set theory are two independent approaches to deal with uncertainty.
Inhibitory rules in data analysis a rough set approach. The results considered in this book can be useful for researchers in machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory and logical analysis of data. The chapters in this work cover a range of topics that focus on discovering dependencies among data, and reasoning about vague, uncertain and imprecise information. The results obtained in the monograph can be useful for researchers in such areas as machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test. It also provides a powerful way to calculate the importance degree of vague and uncertain big data to. Chapter 2 presents the data mining process in more detail. Rough set theory, data mining, decision table, decision rule, data representation. The main aim is to show how rough set and rough set analysis can be effectively used to extract knowledge from large databases. Rough set theory fundamental concepts, principals, data. We can use rough set approach to discover structural relationship within imprecise and noisy data. For the rough set theory, in the process of data mining, there are still a large number of problems need to be discussed, such as large data sets, efficient reduction algorithm, parallel computing, hybrid algorithm, etc. A rough set approach to mining incomplete data is presented in this paper. Fuzzyrough data mining with weka aberystwyth university.
Rough set theory has been a methodology of database mining or knowledge discovery in relational databases. The results obtained in the monograph can be useful for researchers in such areas as machine learning, data mining and knowledge discovery, especially for those who are working in rough set theory, test theory, and logical analysis of data lad. The rough set theory, which originated in the early 1980s, provides an alternative approach to the fuzzy set theory, when dealing with uncertainty, vagueness or inconsistence often. Rough set theory provides a simple and elegant method for analyzing data. Intrusion detection a data mining approach nandita. It demonstrates this process with a typical set of data. Summarization providing a more compact representation of the data set, including visualization and report. Pdf a decisiontheoretic rough set approach for dynamic data. Rule generation from raw data is a very effective and most widely used tool of data mining. This book provides stateoftheart research results on intrusion detection using reinforcement learning, fuzzy and rough set theories, and genetic algorithm and serves wide range of applications, covering general computer security to server, network, and cloud security.
Rough set theory was proposed by pawlak 17 in 1982, which o ers a mathematical approach to data analysis and data mining 11,1821. Pdf a decisiontheoretic rough set approach for dynamic. This overview provides a description of some of the most common data mining algorithms in use today. This paper introduces a new approach for mining ifthen rules in databases with uncertainty and incompleteness. Discriminant versus rough set approach to vague data analysis.
Risk analysis technique on inconsistent interview big data. Chapter 1 gives an overview of data mining, and provides a description of the data mining process. Granular computing is an emerging computing paradigm of information processing and an approach for knowledge representation and data mining. The dominancebased rough set approach drsa is an extension of rough set theory for multicriteria decision analysis mcda, introduced by greco, matarazzo and slowinski. Pawlak as a mathematical ap proach to deal with vagueness and uncertainty in data analysis 50, 51. A gdt is a table in which the probabilistic relationships between concepts and instances over discrete domains are. The advances in locationacquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles. Introduction modern organizations use several types of decision support systems to facilitate decision support. A characteristic set, a generalization of the elementary set wellknown in rough set. Data mining is an emerging powerful tool for analysis and prediction. In recent years we witnessed a rapid grow of interest in rough set theory and its application, world wide. The concept of rough, or approximation, set s was introduced by pawlak, and is based on the single assumption that information is associated with. Relationships exist between rough set theory and dempstershafers theory of evidence. This section presents the concepts of rough set theory.
Book description practical applications of data mining emphasizes both theory and applications of data mining algorithms. A crucial concept in the rough set approach to machine learning is that of. Sets, fuzzy sets and rough sets our digital library. It is useful for dealing with indiscernibility of objects caused by incomplete or limited information. Hongmei chen, tianrui li, ieee senior member, chuan luo, shijinn horng, ieee member. This paper, introduces the fundamental concepts of rough set theory and other aspects of data mining, a. Rough set theory rst is a mathematical approach that handles uncertainty and is capable of discovering knowledge from a database. Seminal description of datamining approaches with reference. A rough set knowledge discovery framework is formulated for the analysis of. Visualization of data through data mining software is addressed. In its abstract form, it is a new area of uncertainty mathematics closely related to fuzzy theory. Mining incomplete dataa rough set approach jerzy w.
This paper discusses the basic concepts of rough set theory and point out some rough set based research directions and applications. Comparative analysis between rough set theory and data mining. New directions in rough sets, data mining, and granularsoft computing, lnai. Rough set can be used as a tool to generate rules form decision table in data mining. On rough set based approaches to induction of decision. A rough set is a formal approximation of a crisp set in terms of a pair of sets that give the lower and upper approximation of the original set learn more in. Dominancebased rough set approach for group decisions, european journal of operational research, 2511. This book provides stateoftheart research results on intrusion detection using reinforcement learning, fuzzy and rough set theories, and genetic algorithm and serves wide range of applications, covering. Rough sets theory is a new mathematical approach used in the intelligent data analysis and data mining if data is uncertain or incomplete. In this perspective, granular computing has a position of centrality in data mining. A gdt is a table in which the probabilistic relationships between concepts and instances over discrete domains are represented.
Rough set theory 7 is a new mathematical approach to data analysis and data mining. The general experimental procedure adapted to data mining problems involves the following steps. It is useful for dealing with indiscernibility of objects caused by. Decision rule induction for service sector using data. A rough set approach to data mining this paper reports our experiences with the application of the hierarchy of probabilistic decision tables to face recognition. For the rough set theory, in the process of data mining, there are still a large number. Rough set theory is relativly new to area of soft computing to handle the uncertain big data efficiently. Granular computing is an emerging computing paradigm of information processing. The approach is based on the combination of generalization distribution table gdt. Theory and application on rough set, fuzzy logic, and. Data mining, data tables, distributed data mining ddm, rough sets. A characteristic set, a generalization of the elementary set wellknown in rough set theory, may be computed using such blocks. A survey on rough set theory and its applications sciencedirect. A rough set approach for the discovery of classification rules in.
A decisiontheoretic rough set approach for dynamic data mining. Pdf this article comments on data mining and rough set theory, regarding the article myths about rough set. The advances in locationacquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles and animals. A rough set approach to attribute generalization in data. In this thesis, a rulebased rough set decision system is. Due to the xml document is a kind of semistructured data, using the traditional data mining methods for mining of xml data is not applicable. Reduce cost of mailing by targeting a set of consumers likely to buy a new cell phone product approach. Based on rough sets 4 and the concept of lower and upper boundary sets 5, we introduce a method for updating approximations by considering adding and.
1576 378 1260 1403 1062 1294 634 927 1141 1478 1582 737 1288 273 1103 619 1152 1565 785 654 612 35 824 1529 706 130 730 1407 120 878 932 1339 921 1140 1438 517 559 1221 1406 1114 329 210 1263 773 631