Decision Tree Algorithm in Machine Learning
A decision tree is a flowchart-like tree structure in which each internal node represents a feature (or attribute), each branch represents a decision rule, and each leaf node represents an outcome. The topmost node in a decision tree is known as the root node.
Decision trees belong to the supervised learning category of algorithms. They are primarily used for regression and classification in machine learning models. A tree lays out every decision path and its alternatives in a single view, which makes the model transparent. Decision trees also assign specific values to problems and decisions, enabling better decision-making.
But do you know how a decision tree works? Most people use it because it is easy to apply and gives a graphical representation of the problem. In this article, we will look at the decision tree algorithm in detail.
Types of Decision Tree Algorithms
There are two types of decision tree in machine learning.
- Classification trees – In this type of decision tree, the outcome is one of a fixed set of classes. In the simplest case there are two, such as true or false for a particular record. The decision variable is categorical.
- Regression trees – In this type of decision tree, the outcome is continuous and changes based on the values of the variables in the dataset.
The root of a decision tree is at the top, and the tree grows downward. The branches split downward, and each internal node tests a particular condition.
The internal nodes hold the conditions on which the tree splits into branches. The end of a branch is the decision, or leaf, from which no further branches extend. This methodology is also known as learning a decision tree from data.
How Does a Decision Tree Work?
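CART (Classification and Regression Trees) is another decision tree algorithm. It uses the Gini index to choose split points. The Gini index of a dataset D with m classes is
Gini(D) = 1 - \sum_{i=1}^{m} p_i^2
where p_i is the probability that a tuple in D belongs to class C_i.
The Gini index considers a binary split for each attribute. For a split of D on attribute A into partitions D_1 and D_2, it is the weighted sum of the impurity of each partition:
Gini_A(D) = \frac{|D_1|}{|D|} Gini(D_1) + \frac{|D_2|}{|D|} Gini(D_2)
The split with the lowest weighted impurity is preferred.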
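The Gini calculation can be sketched in a few lines of plain Python. This is an illustrative sketch, not the CART implementation: impurity is taken as 1 minus the sum of squared class probabilities, and the helper names (gini, gini_split, best_threshold) and the small glucose sample are assumptions made up for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_i ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(left, right):
    """Weighted sum of the impurity of the two partitions of a binary split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)


def best_threshold(values, labels):
    """Try the midpoints between sorted distinct values of one numeric
    attribute and return (threshold, weighted_gini) for the best split."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = gini_split(left, right)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical sample: glucose readings and a 0/1 outcome.
glucose = [85, 90, 110, 140, 150, 160]
outcome = [0, 0, 0, 1, 1, 1]

print(gini(outcome))                     # impurity before splitting
print(best_threshold(glucose, outcome))  # best split point and its impurity
```

A split that separates the classes perfectly has a weighted impurity of 0, which is why the search keeps the lowest score.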
Entropy is a measure of disorder or uncertainty. The goal of machine learning models, and of data scientists in general, is to reduce that uncertainty.
To calculate the reduction in uncertainty about Y given an additional piece of information X, we subtract the entropy of Y given X from the entropy of Y alone: IG(Y, X) = H(Y) - H(Y | X). This is called information gain. The greater the reduction in uncertainty, the more information X provides about Y.
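The subtraction described above can be sketched as follows. This is a minimal illustration for a categorical attribute X, assuming Shannon entropy with base-2 logarithms; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy H(Y) in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(xs, ys):
    """H(Y | X): entropy of Y within each group of X, weighted by group size."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    n = len(ys)
    return sum((len(group) / n) * entropy(group) for group in groups.values())


def information_gain(xs, ys):
    """Reduction in uncertainty about Y after observing X."""
    return entropy(ys) - conditional_entropy(xs, ys)


# Hypothetical toy data: one attribute that determines the outcome,
# and one that tells us nothing about it.
outcome = [1, 1, 0, 0]
informative = ["high", "high", "low", "low"]
uninformative = ["a", "b", "a", "b"]

print(information_gain(informative, outcome))
print(information_gain(uninformative, outcome))
```

An attribute that fully determines Y removes all of the uncertainty, while an attribute that leaves every group as mixed as the whole dataset gains nothing.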
Let's look at an example that trains a model on diabetes data using the algorithm above.
Please note that we are going to use Pandas and Sklearn to train the model, with an existing diabetes dataset from Kaggle.
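A minimal sketch of that training flow with Pandas and scikit-learn is shown here. The Kaggle CSV is not reproduced in this article, so the code builds a small synthetic stand-in. The column names (Glucose, BMI, Outcome), the helper train_decision_tree, and the commented-out file name diabetes.csv are assumptions for illustration and may not match the actual dataset.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def train_decision_tree(df, target, criterion="gini", max_depth=3, seed=0):
    """Split the data, fit a decision tree, and return (model, test accuracy)."""
    X = df.drop(columns=[target])
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    # criterion can be "gini" or "entropy"; max_depth limits overfitting.
    model = DecisionTreeClassifier(
        criterion=criterion, max_depth=max_depth, random_state=seed
    )
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


# With the real data you would load the CSV instead (hypothetical file name):
# df = pd.read_csv("diabetes.csv")

# Synthetic stand-in: the outcome depends only on the glucose value.
rng = np.random.default_rng(0)
glucose = rng.integers(70, 200, size=200)
bmi = rng.uniform(18, 45, size=200)
df = pd.DataFrame(
    {"Glucose": glucose, "BMI": bmi, "Outcome": (glucose > 130).astype(int)}
)

model, accuracy = train_decision_tree(df, target="Outcome")
print(f"Test accuracy on the synthetic data: {accuracy:.2f}")
```

Passing criterion="entropy" switches the split measure from the Gini index to information gain, so both measures discussed above can be compared on the same data.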
Advantages of Decision Trees
- Decision trees are easy to interpret and visualize.
- Nonlinear patterns are captured easily.
- Columns do not need to be normalized, and little other data preprocessing is required from the user.
- Variable selection can be done more efficiently.
- Decision trees can support feature engineering tasks, such as predicting missing values.
- No assumptions are made about the data distribution, because decision trees are non-parametric.
Disadvantages of Decision Trees
- Decision trees are sensitive to noisy data and can overfit it.
- Small variations in the data can produce a very different tree. Bagging and boosting algorithms are used to reduce this problem.
- Decision trees are biased toward the majority class on imbalanced datasets, so it is advisable to balance the dataset before building the tree.
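One way to compensate for imbalance without resampling is to weight each class inversely to its frequency. The sketch below is a swapped-in illustration of that idea rather than a method from this article. The helper name and the 90/10 label split are hypothetical, and the formula is, to the best of my knowledge, the same heuristic that scikit-learn applies when class_weight="balanced" is passed to DecisionTreeClassifier.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight per class: n_samples / (n_classes * class_count).

    Rare classes get a larger weight, so mistakes on them count for more.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced labels: 90 negatives and 10 positives.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

A dictionary like this can be passed as the class_weight argument of DecisionTreeClassifier, as an alternative to oversampling or undersampling the data.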
A decision tree is very easy to understand and communicate. It provides an excellent visual illustration of the data, and the tree diagram gives a clear view of how the features relate to the outcome.