
Day: September 27, 2024

Boosting versus Bagging in Tree Models

Vinod Chugani compares two techniques for working with trees:

Ensemble learning techniques primarily fall into two categories: bagging and boosting. Bagging improves stability and accuracy by aggregating independent predictions, whereas boosting sequentially corrects the errors of prior models, improving their performance with each iteration. This post begins our deep dive into boosting, starting with the Gradient Boosting Regressor. Through its application on the Ames Housing Dataset, we will demonstrate how boosting uniquely enhances models, setting the stage for exploring various boosting techniques in upcoming posts.

Read on for more information. The neat part about the “boosting versus bagging” debate is that both techniques are quite useful. Although boosting (via algorithms like XGBoost or LightGBM) is the more popular technique, bagging (as used in random forests) is extremely powerful in its own right.
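To make the distinction concrete, here is a minimal sketch of both ideas using one-split regression "stumps" in plain Python. It is not the code from Vinod's post: the toy data (y = x²), the helper names (`fit_stump`, `fit_bagging`, `fit_boosting`, `mse`), and the learning rate are all assumptions made for illustration, and a real project would more likely reach for a library implementation.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression stump (a depth-1 tree) by least squares."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not right:  # splitting at the largest x leaves nothing on the right
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # every x is identical, so fall back to the mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_bagging(xs, ys, n_models, rng):
    """Bagging: fit independent stumps on bootstrap samples, then average them."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)


def fit_boosting(xs, ys, n_rounds, learning_rate=0.5):
    """Boosting for squared error: each stump is fit to the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    preds = [base] * len(xs)
    models = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)  # learn what the ensemble still gets wrong
        models.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(m(x) for m in models)


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (an assumption for illustration, not the Ames Housing Dataset).
xs = [i / 10 for i in range(50)]
ys = [x * x for x in xs]

bagged = fit_bagging(xs, ys, n_models=25, rng=random.Random(0))
boosted_1 = fit_boosting(xs, ys, n_rounds=1)
boosted_50 = fit_boosting(xs, ys, n_rounds=50)

print("bagged stumps, training MSE:  ", mse(bagged, xs, ys))
print("boosting, 1 round, train MSE: ", mse(boosted_1, xs, ys))
print("boosting, 50 rounds, train MSE:", mse(boosted_50, xs, ys))
```

The bagged models never see each other's mistakes, so they can be fit in any order (or in parallel), while each boosting round depends on the residuals left by the rounds before it. Note that these are training errors only; a fair comparison would use held-out data.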


Rebuilding a Transaction Log

David Fowler fixes a large-scale oopsie:

“Could you help me, we deleted the database’s transaction log file and now that database is stuck in ‘Recovery Pending’?”

This was a panicked call that I received a few weeks ago.

“Sure, no problem” said I, “we’ll have to restore back to your last backup”

And then things went silent for a while before the inevitable, “it’s only a development database, we don’t take backups”.

I can feel the face-palm from here. Read on to learn what you can do if you’re in that situation, as well as David’s important note about taking backups so that you don’t end up in this situation to begin with.


Connecting to Azure Storage from SSIS

Andy Brownsword makes a connection:

Migrating to the cloud can be disruptive to existing processes. Moving storage to Azure isn’t a simple configuration change for SSIS packages.

SSIS doesn’t have native connections for Azure. That doesn’t mean we need to completely re-engineer the process or change technology though.

How can we take the simple package below and move to using Azure storage?

Read on for the answer. Also, I am 100% on Team SAS Token. Shared access signature (SAS) tokens are easy to create and give you a lot of control over who gets access to what.
