Finding Malicious Domains

Kevin Feasel


R, Security

Rafael San Miguel Carrasco uses dimensionality reduction to figure out if a domain is malicious:

Dimensionality reduction is a common technique for visualizing the observations in a dataset: it combines all features into two, which can then be used to draw each observation in a scatter plot.

One popular algorithm that implements this technique is PCA (Principal Components Analysis), which is available in R through the prcomp() function.

The algorithm was applied to the observations in the dataset, and ggplot2’s geom_point() function was used to draw the results in a 2D chart.
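
As a rough illustration of the approach described in the excerpt, here is a minimal sketch in R. It is not the original author's code: the data is synthetic, and the feature names (name_length, digit_ratio, entropy) and the benign/malicious labels are hypothetical stand-ins for whatever features the real dataset contains. Only prcomp() and geom_point() come from the excerpt itself.

```r
library(ggplot2)

set.seed(42)

# Synthetic, hypothetical per-domain features (assumption: the real
# feature set from the original post is not reproduced here).
n <- 200
domains <- data.frame(
  name_length = c(rnorm(n / 2, 10, 2), rnorm(n / 2, 18, 4)),
  digit_ratio = c(rbeta(n / 2, 1, 20), rbeta(n / 2, 4, 10)),
  entropy     = c(rnorm(n / 2, 2.8, 0.3), rnorm(n / 2, 3.6, 0.3)),
  label       = factor(rep(c("benign", "malicious"), each = n / 2))
)

feature_cols <- c("name_length", "digit_ratio", "entropy")

# Center and scale the features so that no single one dominates
# the principal components purely because of its units.
pca <- prcomp(domains[, feature_cols], center = TRUE, scale. = TRUE)

# Keep the first two principal components as 2D plot coordinates.
projected <- data.frame(
  PC1   = pca$x[, 1],
  PC2   = pca$x[, 2],
  label = domains$label
)

# Draw each domain as a point, colored by its (synthetic) label.
pca_plot <- ggplot(projected, aes(x = PC1, y = PC2, colour = label)) +
  geom_point(alpha = 0.6) +
  labs(title = "Domains projected onto the first two principal components")

print(pca_plot)
```

Scaling is a design choice here rather than something the excerpt specifies; without it, a feature measured on a larger scale (such as name length) would tend to drive the first component on its own.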

I would want to see this done for a couple hundred thousand domains, but I do like the idea of taking advantage of statistical modeling tools to find security threats.



June 2016