Rule Extraction for Infrequent Class

dc.contributor.advisor: Datta, Soma
dc.contributor.committeeMember: Sha, Kewei
dc.contributor.committeeMember: Hasan, Khondker
dc.creator: Arputharaj, Anuprabha
dc.date.accessioned: 2019-09-26T20:19:50Z
dc.date.available: 2019-09-26T20:19:50Z
dc.date.created: 2019-05
dc.date.issued: 2019-05-07
dc.date.submitted: May 2019
dc.date.updated: 2019-09-26T20:19:51Z
dc.description.abstract: This thesis describes the classification rules developed for the infrequent (minority) class to support decisions about future actions. Rules are the most expressive and most human-readable representation of the hypotheses made in prediction. With imbalanced datasets, standard classification algorithms are biased towards the majority class and therefore produce more rules for the majority class than for the infrequent class. This is because the loss functions of conventional algorithms attempt to optimize quantities such as error rate without taking the data distribution into consideration. The importance of the infrequent class becomes clear only through the rules developed from it. The thesis emphasizes undersampling, one of the naïve methods used to balance data, and applies a clustering algorithm that groups attributes with similar features and categorizes them by two distance measures, Euclidean distance and Manhattan distance. The clusters generated with Euclidean distance contribute to the majority class, and the clusters generated with Manhattan distance contribute to the minority class. This increases the minority count relative to the original dataset. The new dataset created from these clusters is passed to a conventional classification algorithm to obtain more rules for the minority class, which helps in further predictions. The proposed algorithm generates more readable and understandable rules, with increased coverage for the minority class, than previously published work.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/10657.1/1473
dc.language.iso: en
dc.subject: Imbalanced class, distance measures, rule extraction
dc.title: Rule Extraction for Infrequent Class
dc.type: Thesis
dc.type.material: text
local.embargo.lift: 2020-05-01
local.embargo.terms: 2020-05-01
thesis.degree.grantor: University of Houston-Clear Lake
thesis.degree.level: Masters
thesis.degree.name: Master of Science
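
As a rough illustration of the two distance measures and the undersampling step named in the abstract, the following Python sketch may help. It is not taken from the thesis: the function names (euclidean, manhattan, random_undersample), the toy data, and the use of plain random undersampling are assumptions made for illustration only, and the clustering step itself is not shown.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance between two equal-length numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def random_undersample(rows, labels, seed=0):
    """Hypothetical helper: randomly drop rows so that every class keeps
    as many rows as the smallest class. This is generic random
    undersampling, not necessarily the procedure used in the thesis."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (invented for this example): six majority rows, two minority rows.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2), (8, 9), (9, 8)]
labels = ["maj"] * 6 + ["min"] * 2

balanced_rows, balanced_labels = random_undersample(rows, labels)
class_counts = Counter(balanced_labels)  # each class now has two rows

d_euclid = euclidean((0, 0), (3, 4))  # 5.0
d_manhat = manhattan((0, 0), (3, 4))  # 7
```

The same pair of points is farther apart under Manhattan distance than under Euclidean distance, which is why clusters built with the two measures can differ.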

Files

Original bundle

Name: ARPUTHARAJ-MASTERSTHESIS-2019.pdf
Size: 8.53 MB
Format: Adobe Portable Document Format

License bundle

Name: LICENSE.txt
Size: 1.87 KB
Format: Plain Text

Name: PROQUEST_LICENSE.txt
Size: 4.46 KB
Format: Plain Text