Accessing BigQuery Data From Python And R

Kevin Feasel


Python, R

Eleni Markou shows how to connect to Google’s BigQuery service using Python and then R:

Some time ago we discussed how you can access data that are stored in Amazon Redshift and PostgreSQL with Python and R. Let’s say you did find an easy way to store a pile of data in your BigQuery data warehouse and keep them in sync. Now you want to start messing with it using statistical techniques, maybe build a model of your customers’ behavior, or try to predict your churn rate.

To do that, you will need to extract your data from BigQuery and use a framework or language that is best suited for data analysis and the most popular so far are Python and R. In this small tutorial we will see how we can extract data that is stored in Google BigQuery to load it with Python or R, and then use the numerous analytic libraries and algorithms that exist for these two languages.

Read on to see how easy it is for either language.

Related Posts

Python versus R (Again)

Alex Woodie looks at whether Python is dominating R in the data science space: There is some evidence that Python’s popularity is hurting R usage. According to the TIOBE Index, Python is currently the third most popular language in the world, behind perennial heavyweights Java and C. From August 2018 to August 2019, Python usage surged […]

Read More

Local Randomness and R

Evgeni Chasnovski has a problem around generating random data: Let’s say we have a deterministic (non-random) problem for which one of the solutions involves randomness. One very common example of such problem is a function minimization on certain interval: it can be solved non-randomly (like in most methods of optim()), or randomly (the simplest approach being […]

Read More


April 2018
« Mar May »