Decision Tree Classification- A Guide to Supervised Machine Learning Algorithm

One of the most popular algorithms in Machine Learning are the Decision Trees that are useful in regression and classification tasks. Decision trees are easy to understand, and implement therefore, making them ideal for beginners who want to explore the field of Machine Learning. The following blog is a guide to Decision Tree Machine Learning, focusing on how it works and the need to use it in classification tasks.

What is Decision Tree in Machine Learning?

In Supervised Learning, Decision Trees are the Machine Learning algorithms where you can split data continuously based on a specific parameter. The Decision Tree algorithm in Machine Learning are explained by two different entities, namely decision nodes and leaves. The decision nodes are where the data is split and the leaves are the final decisions or outcomes.

For instance, decision tree Machine Learning can be evaluated using a binary tree given below. Accordingly, say you want to find out whether a person is physically fit or not based on the given information like age, height, weight, eating habits, etc. The decision nodes act as questions like ‘what’s the age?’, ‘Does he/she exercise?’, ‘does he eat a lot of burgers?’, etc. On the other hand, decision leaves which are the final outcomes present like either ‘fit’ or ‘unfit. This was a binary classification problem implying that it was a yes/no type of problem that was to be solved.

Significantly, there are two types of Decision Trees including:

  • Classification Trees (Yes/No Types): the given example above is the classification tree where the outcome was a variable based on ‘fit’ or ‘unfit’ categories. Hence, the decision tree variable is categorical.
  • Regression Trees (Continuous Data Types): the decisions or outcomes in this case of variable is mainly continuous for instance like, 123. Accordingly, regression trees have target variables which takes input for continuous variables rather than class labels in leaves. It is useful for explaining decisions, identifying possible evets and predicting potential outcomes.

How Decision Tree Algorithm works?

How Decision Tree Algorithm Works

Making strategic decision splits has a heavy impact on the accuracy of a Decision Tree Machine Learning. The criteria for decisions are different in case of classification and regression trees. There are multiple algorithms that Decision Trees use for deciding to split a note into two or more sub-nodes. As the sub-nodes are created, it increases the level of homogeneity of the resultant sub-nodes. It implies that the sub-nodes’ purity increases with respect to the target variable. Consequently, the decision tree is responsible for splitting the nodes on all variables and further selects the split resulting in the creation of most homogenous sub-nodes.

Selection of the algorithm is based on the type of target variable. Some of the algorithms that decision trees use are as follows:

  • ID3- extension of D3
  • C4.5- successor of ID3
  • CART- Classification and Regression Tree
  • CHAID- Chi-square automatic interaction detection. It performs multiple splits while computing classification trees
  • MARS- multivariate adaptive regression splines

The ID3 algorithm helps in building decision trees that uses a top-down greedy search approach via the space of possible branches without any backtracking. A greedy algorithm is always used to make choices that is best suitable at the moment. If you want to know how decision tree algorithm works, following are the steps in ID3 algorithm:

Step 1: the algorithm starts with the original set S as the root node.

Step 2: at each iteration, it iterates through the unused attribute of the S set and calculates Entropy (H) and Information Gain (IG) of this attribute.

Step 3: then it selects the attribute with the smallest Entropy or Largest Information Gain.

Step 4: the set S is split by the selected attribute producing a subset of the data.

Step 5: therefore, the algorithm continues recurring on each subset considering that the attributes are never selected before.

Why use decision tree classification?

Decision tree classification can be effectively used for solving numerous classification problems. Some of the advantages of using Decision tree classification are as follows:

  • In comparison to other algorithms, decision trees require much less effort for data preparation during data pre-processing.
  • A decision tree machine learning does not require you to undertake normalisation of data.
  • Additionally, decision tree also does not require scaling of data.
  • Missing values within the data does not affect the process of building a decision tree to any considerable extent.
  • Furthermore, a decision tree model tends to be highly intuitive and therefore easy to explain to any technical team and stakeholders.
  • The simplicity of decision trees enables you to code, visalise, interpret and even manipulate simple decision trees. Even for beginners, decision tree classification is easy to understand and learn.
  • Moreover, decision trees follow a non-parametric method implying that it’s distribution free and doesn’t depend on assumptions of probability distribution.
  • Decision trees tend to perform feature section or variable screening completely. It can work both on categorical and numerical data and can handle problems with multiple outputs.
  • When using decision trees, non-linear relationships between parameters does not influence the performance of the trees unlike other classification algorithms.


From the above blog, the concept and application of Decision Tree Machine Learning is understood in detail. Considering that classification and clustering are the most popular algorithms in Machine Learning, the differences lie in terms of pre-defined labels present in classification. Decision Trees being an important algorithm of Supervised Machine Learning which splits data based on pre-defined parameters continuously.


What is decision tree algorithm for binary classification?

A binary decision tree is basically a structure that focuses on sequential decision process. It starts from the root which is a feature evaluated and one of the two branches are selected. The procedure is repeated until the final leaf is reached. It normally represents classification target variable that you are looking for.

Can the decision tree algorithm be used in both classification and regression problems?

In Machine Learning, decision trees are the algorithm that is useful for creating the models for both classification and regression tasks. Accordingly, it can be used in solving both classification and regression problems.

What is the difference between classification decision tree and regression decision tree?

Classification decision trees are used mainly when the dataset requires to be split into classes which belongs to the response variable. In contrast, regression trees are used when the response variable is supposed to be continuous.

Aishwarya Kurre

I work as a Data Science Ops at and am an avid learner. Having experience in the field of data science, I believe that I have enough knowledge of data science. I also wrote a research paper and took a great interest in writing blogs, which improved my skills in data science. My research in data science pushes me to write unique content in this field. I enjoy reading books related to data science.