Finding Malicious Domains

Kevin Feasel


R, Security

Rafael San Miguel Carrasco uses dimensionality reduction to figure out if a domain is malicious:

Dimensionality reduction is a common techique to visualize observations in a dataset, by combining all features into two, that can then be used to draw the observation in an scatter plot.

One popular algorithm that implements this technique is PCA (Principal Components Analysis), which is available in R through the prcomp() function.

The algorithm was applied to observations of sthe dataset, and ggplot2’s geom_point() function was used to draw the results in a 2D chart.

I would want to see this done for a couple hundred thousand domains, but I do like the idea of taking advantage of statistical modeling tools to find security threats.

Related Posts

Protecting Database Assets From Administrators

Louis Davidson walks through which things are granted to administrators of different levels: Own is a strange term, because really there is just one user that is listed as owner, but there are there are three users who essentially are owner level, super-powered users in a database: 1. A login using server rights, usually via […]

Read More


John Mount explains the vtreat package that he and Nina Zumel have put together: When attempting predictive modeling with real-world data you quicklyrun into difficulties beyond what is typically emphasized in machine learning coursework: Missing, invalid, or out of range values. Categorical variables with large sets of possible levels. Novel categorical levels discovered during test, cross-validation, or […]

Read More


June 2016
« May Jul »