XGBoost In R

Fisseha Berhane explains how to implement Extreme Gradient Boosting in R:

What makes it so popular is its speed and performance. It delivers some of the best results in many machine learning applications. It is an optimized gradient-boosting machine learning library. The core algorithm is parallelizable, so it can use all the processing power of your machine and of the machines in your cluster. In R, according to the package documentation, because the package automatically runs computation in parallel on a single machine, it can be more than 10 times faster than existing gradient boosting packages.

xgboost shines when we have lots of training data where the features are numeric or a mixture of numeric and categorical fields. It is also important to note that xgboost is not the best algorithm out there when all the features are categorical or when the number of rows is less than the number of fields (columns).

xgboost is a nice complement to neural networks, as they tend to be great at different things.
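
For readers who want a feel for what "gradient boosting" means in the excerpt above, here is a minimal, library-free sketch in Python. It is not xgboost's implementation and not the code from the linked post. It assumes a single numeric feature, squared-error loss, and depth-1 trees (stumps), and every function name and the toy data set are made up for illustration. Each round fits a stump to the current residuals and adds a shrunken copy of it to the running prediction.

```python
# Toy gradient boosting for squared-error regression with depth-1 trees (stumps).
# Illustrative only: xgboost adds regularization, second-order gradient
# information, sparsity handling, and parallel split finding, none of which
# appear here. All names below are hypothetical.


def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_boosted(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((predict_boosted(model, x) - y) ** 2
               for x, y in zip(xs, ys)) / len(ys)


# Hypothetical toy data: y = x^2 on a small grid.
xs = [i / 10 for i in range(40)]
ys = [x * x for x in xs]

baseline_mse = mean_squared_error(fit_boosted(xs, ys, n_rounds=0), xs, ys)
boosted_mse = mean_squared_error(fit_boosted(xs, ys, n_rounds=50), xs, ys)
print("mean-only MSE:", baseline_mse)
print("boosted MSE:  ", boosted_mse)
```

The training error of the boosted model should come out well below the mean-only baseline, because each round can only reduce the squared error on the training set when the learning rate is this small. In R, the xgboost package does this work for you, with trees deeper than stumps and many features at once.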


