Rafael San Miguel Carrasco uses dimensionality reduction to figure out if a domain is malicious:
Dimensionality reduction is a common technique to visualize observations in a dataset by combining all features into two, which can then be used to draw each observation in a scatter plot.
One popular algorithm that implements this technique is PCA (Principal Components Analysis), which is available in R through the prcomp() function.
The algorithm was applied to observations of the dataset, and ggplot2’s geom_point() function was used to draw the results in a 2D chart.
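To make the prcomp() and geom_point() steps concrete, here is a minimal sketch of that workflow. It is not the author's code: the data is synthetic, and the feature names (name_length, digit_ratio, entropy) and the benign/malicious labels are hypothetical stand-ins for whatever per-domain features the real dataset contains.

```r
library(ggplot2)

# Synthetic, hypothetical per-domain features (assumption: the real
# dataset and its columns are not shown in the post).
set.seed(42)
domains <- data.frame(
  name_length = c(rnorm(50, 10, 2), rnorm(50, 22, 4)),
  digit_ratio = c(rnorm(50, 0.05, 0.02), rnorm(50, 0.30, 0.08)),
  entropy     = c(rnorm(50, 2.8, 0.3), rnorm(50, 3.9, 0.3)),
  label       = rep(c("benign", "malicious"), each = 50)
)

# Run PCA on the numeric features only; centering and scaling keep a
# feature with a large range from dominating the components.
pca <- prcomp(domains[, c("name_length", "digit_ratio", "entropy")],
              center = TRUE, scale. = TRUE)

# Keep the first two principal components as plotting coordinates.
scores <- data.frame(pca$x[, 1:2], label = domains$label)

# Draw each domain as a point in the 2D component space.
p <- ggplot(scores, aes(x = PC1, y = PC2, colour = label)) +
  geom_point() +
  labs(title = "Domains projected onto the first two principal components")
print(p)
```

Calling summary(pca) afterwards shows how much of the total variance the first two components capture, which is worth checking before reading much into the plot.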
I would want to see this done for a couple hundred thousand domains, but I do like the idea of taking advantage of statistical modeling tools to find security threats.