Decision tree
A decision tree is a classification technique with a flowchart-like tree structure (Han, 2001). A decision tree is built from two kinds of elements: nodes and leaves. Each node represents a test on an attribute, and each branch leaving a node corresponds to one possible outcome of that test. Each leaf represents a class value (Kantardzic, 2003).
A decision tree handles two kinds of attributes:
- Numeric or continuous: the domain has infinitely many values, represented as real numbers. Examples: age, salary.
- Nominal or categorical: the domain is a finite set of values. Examples: job, status.
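The structure described above can be sketched in a few lines of Python. This is only an illustration, not code from any of the cited books: the class names (`Leaf`, `NumericNode`, `NominalNode`), the `classify` function, and the example attributes `job` and `salary` with a threshold of 3000 are all assumptions made for the sketch. A numeric node tests a value against a threshold, while a nominal node has one branch per category.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # class value stored in the leaf


@dataclass
class NumericNode:
    attribute: str      # name of the numeric attribute to test
    threshold: float    # split point (hypothetical value)
    below: "Tree"       # branch taken when value <= threshold
    above: "Tree"       # branch taken when value > threshold


@dataclass
class NominalNode:
    attribute: str              # name of the nominal attribute to test
    branches: Dict[str, "Tree"]  # one branch per category


Tree = Union[Leaf, NumericNode, NominalNode]


def classify(tree, record):
    """Follow the attribute tests from the root until a leaf is reached."""
    while not isinstance(tree, Leaf):
        value = record[tree.attribute]
        if isinstance(tree, NumericNode):
            tree = tree.below if value <= tree.threshold else tree.above
        else:
            tree = tree.branches[value]
    return tree.label


# A tiny hand-built tree: test the nominal attribute "job" first,
# then the numeric attribute "salary" for engineers.
example_tree = NominalNode(
    attribute="job",
    branches={
        "student": Leaf("no"),
        "engineer": NumericNode("salary", 3000.0, Leaf("no"), Leaf("yes")),
    },
)

print(classify(example_tree, {"job": "engineer", "salary": 5000}))
```

A real algorithm would learn the tests from training data; here the tree is written by hand only to show how nodes, branches, and leaves fit together.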
Missing Data
Missing data, or missing values, are a common problem in data processing. Collected data do not always have complete values. In a large data set, a few missing values do not noticeably influence the processing result. However, as the number of missing values grows, they can influence the result.
In general, missing values can be handled with these methods:
1. Delete all records with missing values.
2. Create a new algorithm, or modify an existing one, so that it handles missing data (Kantardzic, 2003).
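The first method is simple enough to show directly. The sketch below assumes each record is a Python dictionary and that a missing value is stored as `None`; both choices, and the function name `drop_incomplete`, are assumptions for illustration only.

```python
def drop_incomplete(records):
    """Keep only the records in which no attribute value is missing (None)."""
    return [
        record
        for record in records
        if all(value is not None for value in record.values())
    ]


records = [
    {"age": 25, "job": "clerk"},
    {"age": None, "job": "teacher"},  # missing age
    {"age": 40, "job": None},         # missing job
    {"age": 31, "job": "nurse"},
]

complete = drop_incomplete(records)
print(len(records), "records before,", len(complete), "records after")
```

In this small example half of the records are thrown away, which shows why deletion becomes costly when many values are missing.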
C4.5 Algorithm
The C4.5 algorithm is a decision tree algorithm introduced by Quinlan as a development of the ID3 algorithm. Its improvements are:
1. Its attribute selection measure produces a more accurate tree. C4.5 computes information gain or gain ratio.
2. It can handle training data with missing values. To handle this problem, C4.5 uses the gain ratio computation to select test attributes.
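As a rough illustration of the two measures named above, the sketch below computes information gain and gain ratio for one nominal attribute, using the usual entropy-based definitions. It is not Quinlan's implementation: it assumes complete records (no missing values), ignores numeric attributes, and the names `entropy`, `gain_and_ratio`, `status`, and `class` are made up for the example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_and_ratio(records, attribute, target):
    """Return (information gain, gain ratio) for splitting on a nominal attribute."""
    labels = [r[target] for r in records]

    # Group the class labels by the value of the tested attribute.
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r[target])

    total = len(records)
    # Expected entropy of the class after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder

    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy([r[attribute] for r in records])
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


data = [
    {"status": "single", "class": "yes"},
    {"status": "single", "class": "yes"},
    {"status": "married", "class": "no"},
    {"status": "married", "class": "no"},
]

gain, ratio = gain_and_ratio(data, "status", "class")
print(gain, ratio)
```

The attribute with the highest gain (as in ID3) or the highest gain ratio would be chosen as the test at a node. Dividing by the split information penalizes attributes that break the data into many small groups.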