An Introduction To Random Forests

Shirin Glander has a new video, currently only in German, but with an English transcript:

RF is based on decision trees. In machine learning decision trees are a technique for creating predictive models. They are called decision trees because the prediction follows several branches of “if… then…” decision splits – similar to the branches of a tree. If we imagine that we start with a sample, which we want to predict a class for, we would start at the bottom of a tree and travel up the trunk until we come to the first split-off branch. This split can be thought of as a feature in machine learning, let’s say it would be “age”; we would now make a decision about which branch to follow: “if our sample has an age bigger than 30, continue along the left branch, else continue along the right branch”. This we would do until we come to the next branch and repeat the same decision process until there are no more branches before us. This endpoint is called a leaf and in decision trees would represent the final result: a predicted class or value.
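To make the traversal concrete, here is a minimal sketch of following "if… then…" splits down to a leaf. The tree itself is hand-built and purely hypothetical: the `income` feature, the thresholds, and the class labels are illustrative assumptions, not something taken from the video or transcript.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    prediction: str  # the final predicted class


@dataclass
class Split:
    feature: str                     # feature tested at this branch, e.g. "age"
    threshold: float                 # decision threshold for that feature
    left: Union["Split", Leaf]       # followed when value > threshold
    right: Union["Split", Leaf]      # followed otherwise


def predict(node, sample):
    """Walk from the root to a leaf, making one decision per split."""
    while isinstance(node, Split):
        if sample[node.feature] > node.threshold:
            node = node.left
        else:
            node = node.right
    return node.prediction


# Hypothetical tree: first split on age, then (for age > 30) on income.
tree = Split(
    "age", 30,
    left=Split("income", 50_000, left=Leaf("buys"), right=Leaf("does not buy")),
    right=Leaf("does not buy"),
)

print(predict(tree, {"age": 45, "income": 60_000}))
print(predict(tree, {"age": 25, "income": 60_000}))
```

The left/right convention mirrors the quoted example ("age bigger than 30, continue along the left branch"); real libraries may use the opposite convention.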

At each branch, the feature threshold that best splits the (remaining) samples locally is found. The most common metrics for defining the “best split” are Gini impurity and information gain for classification tasks and variance reduction for regression.
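The split metrics named above can be sketched in a few lines. This is an illustrative, brute-force version under simple assumptions (one numeric feature, candidate thresholds taken from the observed values, samples with value <= threshold going to one side); the function names and the toy data are made up for the example, and production libraries use faster and more careful implementations.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits; its reduction after a split is the information gain."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def variance(values):
    """Population variance, used for regression targets."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def split_gain(impurity, parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


def best_threshold(xs, ys, impurity):
    """Try every observed value (except the largest) as a threshold; keep the best gain."""
    best_t, best_gain = None, 0.0
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        gain = split_gain(impurity, ys, left, right)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Toy data (hypothetical): ages with a class label and a numeric target.
ages = [22, 25, 28, 35, 40, 52]
classes = ["no", "no", "no", "yes", "yes", "yes"]
targets = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]

print(best_threshold(ages, classes, gini))
print(best_threshold(ages, classes, entropy))
print(best_threshold(ages, targets, variance))
```

All three metrics follow the same pattern: measure how mixed the parent node is, measure how mixed the children are after the split, and prefer the threshold with the largest reduction.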

Click through for more info; if you understand German, the video is good as well.

