Koos van Strien moves from Python to R to run an xgboost algorithm:

Note that the parameters of xgboost used here fall in three categories:

  • General parameters

    • nthread (number of threads used, here 8 = the number of cores in my laptop)
  • Booster parameters

    • max.depth (of tree)
    • eta
  • Learning task parameters

    • objective: type of learning task (softmax for multiclass classification)
    • num_class: needed for the “softmax” algorithm: how many classes to predict?
  • Command Line Parameters

    • nround: number of rounds for boosting

Read the whole thing.

Related Posts

CosmosDB Continuation Tokens

Hasan Savran walks us through the idea of a continuation token in CosmosDB: In CosmosDB, TOP option is required and its default value is 100. You can change the default value by sending a different value using the request header “x-ms-max-item-count“. If you have 40000 rows in your Orders table, and run the same query […]

Read More

Sentiment Analysis with Spark on Qubole

Jonathan Day, et al, have a tutorial on using Qubole to build a sentiment analysis model: This post covers the use of Qubole, Zeppelin, PySpark, and H2O PySparkling to develop a sentiment analysis model capable of providing real-time alerts on customer product reviews. In particular, this model allows users to monitor any natural language text […]

Read More


September 2016
« Aug Oct »