Decision Tree (CART) – Machine Learning Fun and Easy


Welcome to another fun and easy machine learning tutorial on decision trees.

A decision tree is a type of supervised learning algorithm that is mostly used in classification problems. It has many analogies in real life and, as it turns out, has influenced a wide area of machine learning, covering both classification and regression trees, otherwise known as CART. Please join our notification brigade by subscribing and clicking that bell icon.

A decision tree is a flowchart-like structure in which each internal node denotes a test on an attribute, each branch represents an outcome of that test, and each leaf or terminal node holds a class label. The topmost node in a tree is the root node. In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision-making; as the name goes, it uses a tree-like model of decisions.

The advantages of CART are that it is simple to understand, interpret, and visualize. Decision trees implicitly perform variable screening or feature selection. They can handle both numerical and categorical data, and they can also handle multi-output problems. Decision trees require relatively little effort from the user for data preparation, and nonlinear relationships between parameters do not affect tree performance.

The disadvantages of CART, however, are that decision tree learners can create over-complex trees that do not generalize the data well; this is also known as overfitting. Decision trees can become unstable, because small variations in the data might result in a completely different tree being generated; this is called variance, which needs to be lowered by methods such as bagging and boosting. Greedy algorithms cannot guarantee to return the globally optimal decision tree; this can be mitigated by training multiple trees, where features and samples are randomly sampled with replacement. Decision tree learners also create biased trees if some classes dominate, so it is recommended to balance the dataset prior to fitting the decision tree.

If you look at some applications of the decision tree: we can predict whether a customer will pay his renewal premium with an insurance company, predicting yes if he will or no if he won't. Using the Titanic statistics, we can predict, given whether a passenger is male or female as well as their age, what the chances of survival are.
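To make that first application concrete, here is a minimal sketch (my addition, not code from the lecture) of fitting a CART classifier with scikit-learn on the insurance-renewal idea. The feature names and data values are invented purely for illustration.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical features per customer: [age, annual_premium, years_as_customer]
X = [
    [25, 300.0, 1],
    [40, 450.0, 5],
    [35, 200.0, 2],
    [60, 500.0, 10],
    [30, 350.0, 3],
    [50, 400.0, 8],
]
y = [0, 1, 0, 1, 0, 1]  # 1 = renewed the premium, 0 = did not

clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(X, y)

# Predict for a new customer: 45 years old, $420 premium, 6 years with us
print(clf.predict([[45, 420.0, 6]]))  # -> [1] on this toy data (likely to renew)
```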
We may also need to determine if a person is male or female based on their height and weight, or to determine the price of a home based on how many rooms it has as well as its floor size.

A decision tree is drawn upside down, with its root at the top. Let's look at the primary differences and similarities between classification and regression trees. Regression trees are used when the dependent variable is continuous; classification trees are used when the dependent variable is categorical. In the case of regression trees, the value obtained by a terminal node in the training data is the mean or average response of the observations falling in that region; thus, if an unseen data observation falls in that region, we will make its prediction with the mean value. In the case of classification trees, the value or class obtained by a terminal node in the training data is the mode of the observations falling in that region; thus, if an unseen observation falls in that region, we will make its prediction with the mode value. (We'll illustrate this mean-versus-mode difference with a short code sketch below.)

The splitting process is continued until a user-defined stopping criterion is reached. For example, we can tell the algorithm to stop once the number of observations per node becomes less than 50. In both cases the splitting process results in fully grown trees, growing until the stopping criterion is reached. But a fully grown tree is likely to overfit the data, leading to poor accuracy on unseen data, and this brings in pruning. Pruning is one of the techniques used to tackle overfitting; we'll learn more about it in future lectures.

So how can an algorithm be represented as a tree? For this, let's consider a basic example that uses the Titanic dataset for predicting whether a passenger survived or not. The model here uses three features from the dataset, namely sex, age, and number of spouses or children aboard, which we can abbreviate as sibsp. In this case, whether the passenger died or survived is represented as red and green text respectively. Although a real dataset will have a lot more features, and this would just be a branch in a much bigger tree, you can't ignore the simplicity of the algorithm.

So what's actually going on in the background? Growing a tree involves deciding which features to choose and what conditions to use for splitting, along with knowing when to stop. And since a tree grows arbitrarily, you need to trim it down for it to look beautiful. So let's start with the common techniques used for splitting. How does a tree decide where to split? The decision of making strategic splits heavily affects a tree's accuracy.
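Before we get into the splitting criteria, here is the promised sketch (my addition, not code from the lecture) showing how the mean-versus-mode behaviour and the 50-observation stopping rule look in scikit-learn; the data here is synthetic.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(500, 1))

# Continuous target -> regression tree: each leaf stores the MEAN response
# of the training observations falling in that region.
y_cont = 2.0 * X[:, 0] + rng.normal(0, 5, size=500)
reg = DecisionTreeRegressor(min_samples_leaf=50)  # stop before leaves drop below 50 observations
reg.fit(X, y_cont)

# Categorical target -> classification tree: each leaf stores the MODE
# (majority class) of the training observations in that region.
y_cat = (X[:, 0] > 50).astype(int)
clf = DecisionTreeClassifier(min_samples_leaf=50)
clf.fit(X, y_cat)

print(reg.predict([[70.0]]))  # mean of the training responses in that region
print(clf.predict([[70.0]]))  # majority class in that region
```

Here min_samples_leaf plays the role of the user-defined stopping criterion from the lecture; scikit-learn offers related knobs such as max_depth and min_samples_split.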
The decision criteria are different for classification and regression trees. Decision trees use multiple algorithms to decide how to split a node into two or more sub-nodes. The creation of sub-nodes increases the homogeneity of the resultant sub-nodes; in other words, we can group our data into regions based on data points that have similar traits. The decision tree considers splits of the nodes on all available variables and then selects the split which results in the most homogeneous sub-nodes, as in the example shown in this lecture. The algorithm selection is also based on the type of target variable. So let's look at the four most commonly used algorithms in decision trees:

  1. Gini index
  2. Chi-square
  3. Information gain
  4. Reduction in variance

We will not go into detail on these algorithms, as some involve quite a lot of math, and most of the hard work is done within the scikit-learn library.

Let's gain an intuition of how splitting the data would work if we did it manually. Here we have arbitrarily generated data, with x1 and x2 as our independent variables. If we look at this data, we can split it into five regions. We can draw a line here at x1 equals 20, as well as one at x2 equals 50, then another one over here at x1 equals 25, and then a last one over here at x2 equals 30. So we have regions R1, R2, R3, R4, and R5, and we do this empirically; the algorithms I mentioned earlier will do this for you. Now, remember, you can split a bit further into more regions; for example, we could split R4 over here, and that would result in more sub-nodes in our tree. But for now, let's just have five regions.

So we start off over here at our root node and ask: is x1 less than 20? We go either yes or no. If yes, we ask: is x2 less than 50? Looking at our graph over there, if yes we are in R1, and if no we have R2. If we go to our other branch, we ask: is x1 less than 25? If yes, then it's R3; if no, then we ask ourselves: is x2 less than 30? If yes, we get R5, and if no, we get R4. So as you can see, it is really simple; we'll reproduce this example in code below.

So this is all the basics to get you on par with decision tree learning. Decision trees are also very useful when used with other advanced machine learning algorithms like random forests and boosting, which we shall cover in future lectures. A popular library for implementing the algorithm is scikit-learn; it is a wonderful API that can get your model up and running in just a few lines of code in Python. So thank you for watching. Please don't forget to smash that like button and click the bell icon to become a part of our notification brigade, and also support us on Patreon. See you in the next lecture.
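As promised, here is a sketch (my reconstruction, not the video's code) that reproduces the manual five-region example: we label synthetic points by the regions R1 to R5 described above and let a Gini-based CART classifier rediscover the splits near x1 = 20, x2 = 50, x1 = 25, and x2 = 30.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def region(x1, x2):
    """Label a point with the hand-drawn region from the lecture."""
    if x1 < 20:
        return "R1" if x2 < 50 else "R2"
    if x1 < 25:
        return "R3"
    return "R5" if x2 < 30 else "R4"

rng = np.random.default_rng(42)
X = rng.uniform(0, 100, size=(2000, 2))
y = np.array([region(a, b) for a, b in X])

clf = DecisionTreeClassifier(criterion="gini")  # Gini index is the default criterion
clf.fit(X, y)

# The printed tree should show thresholds close to the hand-drawn boundaries.
print(export_text(clf, feature_names=["x1", "x2"]))
```

This is exactly the empirical splitting the lecture describes: the algorithm recovers, from labelled points alone, the boundaries we drew by hand.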


Comments

  1. I do not quite get one point. When you talked about the difference between a regression tree vs. a classification tree, the second line is mean/average vs. mode/class. What does this mean? Can you please give an example?

  2. Where is this guy from? The accent is very strange. Besides that, good video for a complete beginner but leaves out crucial details.

  3. Thank you very much for your video! It's really simple. There is just one issue with the tree you're showing at 7:57. If X2 > 30 then the result would be R4 and not R5. 🙂

  4. I really like your videos, however I find the sharpie animation pretty much useless and actually distracting. I think your videos would be even better without it.

  5. The animations spark up more interest in the brain. It says, hey, here's a cool animation. Now let's get back to learning. Repeat. Excellent video. A lot of info crunched into 9 mins.

  6. Gini index, not "giri", and all of these are just metrics used by decision tree algorithms such as ID3, CART, C4.5 and MARS. If you want to explain a topic in ML, please do it based on published papers and reviews.

  7. I know how to do linear regression and decision trees, and yet these videos make it more confusing than it should be.

  8. Appreciate the free education; however, please work on improving your communication skills.

    It takes a lot of effort to understand what words you are saying, most likely due to you not fully opening your mouth and/or not fully pronouncing all syllables of words used. In fact, I'm spending so much time trying to understand what words you are saying that I'm not even paying attention to the content. Thanks

  9. Can someone give me the name of the software you use for making these beautiful videos? Thanks in advance
