Data Science At Stack Overflow

David Robinson discusses his role as a data scientist at Stack Overflow:

The most prominent example of where machine learning is used in our product is Providence, our system for matching users to jobs they’ll be interested in. (For example, if you visit mostly Python and JavaScript questions on Stack Overflow, you’ll end up getting Python web development jobs as advertisements.) I work with engineers on the Data team (Kevin Montrose, Jason Punyon, and Nick Larsen) to design, improve and implement these machine learning algorithms. (Here’s some more about the architecture of the system, built before I joined.) For example, we’ve worked to get the balance right between jobs that are close to a user geographically and jobs that are well-matched in terms of technology, and ensuring that users get a variety of jobs rather than seeing the same ones over and over.
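To make the geography-versus-technology trade-off concrete, here is a toy sketch of how such a balance could be expressed. It is not how Providence works; the scoring formula, the `tech_weight` and `distance_scale_km` parameters, and all of the sample data are assumptions invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse tag-weight dictionaries."""
    dot = sum(weight * b.get(tag, 0.0) for tag, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def job_score(user_tags, job_tags, distance_km,
              tech_weight=0.7, distance_scale_km=50.0):
    """Blend technology match with geographic closeness (hypothetical formula)."""
    tech = cosine(user_tags, job_tags)               # 0..1, higher is a better match
    geo = math.exp(-distance_km / distance_scale_km)  # 1 when adjacent, decays with distance
    return tech_weight * tech + (1.0 - tech_weight) * geo


# Invented example: a user who mostly visits Python and JavaScript questions.
user = {"python": 0.6, "javascript": 0.4}
jobs = {
    "python web dev, nearby": ({"python": 1.0, "javascript": 1.0}, 10.0),
    "java backend, nearby": ({"java": 1.0}, 5.0),
    "python web dev, far away": ({"python": 1.0, "javascript": 1.0}, 500.0),
}

ranked = sorted(jobs, key=lambda name: job_score(user, *jobs[name]), reverse=True)
for name in ranked:
    print(name, round(job_score(user, *jobs[name]), 3))
```

Shifting `tech_weight` toward 0 favors nearby jobs regardless of technology, and shifting it toward 1 favors a technology match regardless of distance. The sketch ignores the variety requirement mentioned above.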

A lot of this process involves designing and analyzing A/B tests, particularly about changing our targeting algorithms, ad design, and other factors to improve clickthrough rate (CTR). This process is more statistically interesting than I’d expected, in some cases letting me find new uses for methods I’d used to analyze biological experiments, and in other cases encouraging me to learn new statistical tools. In fact, much of my series on applying Bayesian methods to baseball batting statistics is actually a thinly-veiled version of methods I’ve used to analyze CTR across ad campaigns.
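For a sense of what a Bayesian look at CTR in an A/B test might involve, here is a minimal sketch using a Beta-Binomial model, in the spirit of the baseball series mentioned above. It is not Robinson's actual analysis; the uniform Beta(1, 1) prior, the function names, and the click counts are all assumptions made up for illustration.

```python
import random


def posterior_samples(clicks, impressions, n_samples, rng,
                      prior_a=1.0, prior_b=1.0):
    """Draw CTR samples from the Beta posterior for one ad variant."""
    a = prior_a + clicks                   # prior successes plus observed clicks
    b = prior_b + impressions - clicks     # prior failures plus observed non-clicks
    return [rng.betavariate(a, b) for _ in range(n_samples)]


def prob_b_beats_a(clicks_a, imps_a, clicks_b, imps_b,
                   n_samples=20000, seed=0):
    """Monte Carlo estimate of P(CTR of B > CTR of A) given the data."""
    rng = random.Random(seed)
    samples_a = posterior_samples(clicks_a, imps_a, n_samples, rng)
    samples_b = posterior_samples(clicks_b, imps_b, n_samples, rng)
    wins = sum(b > a for a, b in zip(samples_a, samples_b))
    return wins / n_samples


# Invented counts: variant A has 120 clicks and B has 150, each on 10,000 impressions.
p = prob_b_beats_a(120, 10_000, 150, 10_000)
print(f"Estimated probability that B has the higher CTR: {p:.3f}")
```

An informative prior fitted across many past campaigns (the empirical Bayes approach) would replace the flat Beta(1, 1) used here and would shrink noisy estimates from low-impression campaigns toward the overall average.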

Sounds like a fun place to be.

Related Posts

Kafka And The Differing Aims Of Data Professionals

Kai Waehner argues that there is an impedance mismatch between data engineers, data scientists, and ML production engineers: Data scientists love Python, period. Therefore, the majority of machine learning/deep learning frameworks focus on Python APIs. Both the stablest and most cutting edge APIs, as well as the majority of examples and tutorials use Python APIs. […]

Solving The Monty Hall Problem With R

Miroslav Rajter builds a Monty Hall problem simulator using R: The original and most simple scenario of the Monty Hall problem is this: You are in a prize contest and in front of you there are three doors (A, B and C). Behind one of the doors is a prize (Car), while behind others is […]