A more important difference is how the Decision Tree chose to split off the Bananas. We did it with a single diagonal line. The decision tree used two lines, one horizontal and one vertical.
The decision tree works by picking a criterion and a threshold. The criterion specifies what to split on, for instance length or width, and each split uses exactly one criterion. The threshold specifies where to split, i.e. what value to split at.
But to get a diagonal line, we would need to split on two values simultaneously. The best a decision tree can do is split on one value, then on the other, and repeat that process.
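To make the criterion-and-threshold idea concrete, here is a minimal sketch in plain Python. The `split` helper and the fruit measurements are made up for illustration and are not taken from the classifier used in this article. It shows two splits applied one after the other, each on a single criterion, which is the horizontal-then-vertical pattern described above.

```python
def split(rows, criterion, threshold):
    """Partition rows using ONE criterion and ONE threshold (hypothetical helper)."""
    below = [r for r in rows if r[criterion] <= threshold]
    above = [r for r in rows if r[criterion] > threshold]
    return below, above

# Made-up measurements, for illustration only
fruit = [
    {"length": 18.0, "width": 3.5},
    {"length": 7.0, "width": 7.2},
    {"length": 20.0, "width": 3.8},
    {"length": 6.5, "width": 6.0},
    {"length": 15.0, "width": 9.0},
]

# First split: one line on the length axis
short, long_ = split(fruit, "length", 10.0)

# Second split: one line on the width axis, applied only to the long fruit
long_narrow, long_wide = split(long_, "width", 5.0)

print(len(short), len(long_narrow), len(long_wide))
```

Each call looks at only one measurement, so no single call can express a rule like "length plus width is below some value". That is why a diagonal boundary has to be approximated with several axis-aligned cuts.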
Take this example with made-up data.
It is obvious to us that the best split between the red and blue data points is a diagonal line from top left to bottom right. But the decision tree classifier would take quite a few steps to do it, and end up with a stair-step plot, as shown below.
This ends up being one way to improve the results of a random forest. If you see a relationship, like a ratio, between different criteria in the data, you can make it easier for the code by making that relationship its own value. The example above took 20 lines to split the red dots from the blue dots, but the data itself is very simple: the red dots follow the equation y = 9.5 - x, and the blue dots follow y = 10.5 - x. If I had added the X value to the Y value for this plot, the Random Forest could have made the split in a single line: anything with X + Y below 10 is red, and anything with X + Y above 10 is blue.
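The effect of adding that combined value can be sketched with a tiny greedy tree written in plain Python. This is an illustration under stated assumptions, not the classifier used for the plot above: the x values 0 through 9 are assumed, the helper names are made up, and the split rule (lowest weighted Gini impurity) is one common choice. The exact number of splits on the raw data depends on those assumptions, so it will not necessarily match the figure.

```python
def gini(labels):
    """Gini impurity of a list of 'red'/'blue' labels."""
    if not labels:
        return 0.0
    p = labels.count("red") / len(labels)
    return 2 * p * (1 - p)

def best_split(points):
    """Return the (feature index, threshold) with the lowest weighted Gini impurity."""
    best = None
    n_features = len(points[0][0])
    for f in range(n_features):
        values = sorted({feats[f] for feats, _ in points})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold halfway between neighbours
            left = [lab for feats, lab in points if feats[f] <= t]
            right = [lab for feats, lab in points if feats[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(points)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best[1], best[2]

def count_splits(points):
    """Split greedily until every leaf holds one colour; return the number of splits."""
    if len({lab for _, lab in points}) <= 1:
        return 0
    f, t = best_split(points)
    left = [p for p in points if p[0][f] <= t]
    right = [p for p in points if p[0][f] > t]
    return 1 + count_splits(left) + count_splits(right)

xs = range(10)  # assumed x values; the original data set is not given
raw = [((x, 9.5 - x), "red") for x in xs] + [((x, 10.5 - x), "blue") for x in xs]

# Engineered feature: replace (x, y) with the single value x + y
summed = [((x + y,), lab) for (x, y), lab in raw]

raw_splits = count_splits(raw)
summed_splits = count_splits(summed)
print("splits on raw x, y:", raw_splits)
print("splits on x + y:   ", summed_splits)
```

On the summed feature, every red point has the value 9.5 and every blue point has 10.5, so one threshold between them separates the colours. On the raw features the tree needs more than one split, because no single cut on x or on y alone divides the two diagonals cleanly.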