How to run a BigQuery query in Python
This video explains
How to run a BigQuery query in Python
--
Become part of the top 3% of developers by applying to Toptal
https://topt.al/25cXVn
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Industries in Orbit Looping
--
Chapters
00:00 Question
00:48 Accepted answer (Score 15)
01:17 Answer 2 (Score 2)
01:36 Answer 3 (Score 1)
02:03 Answer 4 (Score 1)
02:45 Thank you
--
Full question
https://stackoverflow.com/questions/4500...
Accepted answer links:
[BigQuery Python client lib]: https://github.com/GoogleCloudPlatform/g...
[https://googlecloudplatform.github.io/go...]: https://googlecloudplatform.github.io/go...
[BigQuery Python client tutorial]: https://cloud.google.com/bigquery/docs/r...
Answer 4 links:
[https://googleapis.github.io/google-clou...]: https://googleapis.github.io/google-clou...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #googlebigquery
#avk47
ACCEPTED ANSWER
Score 17
You need to use the BigQuery Python client lib, then something like this should get you up and running:
from google.cloud import bigquery

client = bigquery.Client(project='PROJECT_ID')
query = "SELECT...."

# Build references to the destination dataset and table.
dataset = client.dataset('dataset')
table = dataset.table(name='table')

# run_async_query()/begin() belong to the legacy (pre-0.28) client API;
# newer releases use client.query() instead.
job = client.run_async_query('my-job', query)
job.destination = table
job.write_disposition = 'WRITE_TRUNCATE'
job.begin()
https://googlecloudplatform.github.io/google-cloud-python/stable/bigquery-usage.html
See the current BigQuery Python client tutorial.
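For reference, in current releases of google-cloud-bigquery (0.28 and later) the run_async_query() call above was replaced by client.query() plus a QueryJobConfig. A minimal sketch of the same write-to-table job, assuming the same placeholder project, dataset, and table names:

from google.cloud import bigquery

client = bigquery.Client(project='PROJECT_ID')

# Configure the query job to write its results into a destination table.
job_config = bigquery.QueryJobConfig()
job_config.destination = client.dataset('dataset').table('table')
job_config.write_disposition = 'WRITE_TRUNCATE'

# client.query() starts the job; result() blocks until it finishes.
query_job = client.query("SELECT ...", job_config=job_config)
query_job.result()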
ANSWER 2
Score 5
Here is another way using a JSON file for the service account:
>>> from google.cloud import bigquery
>>>
>>> CREDS = 'test_service_account.json'
>>> client = bigquery.Client.from_service_account_json(json_credentials_path=CREDS)
>>> job = client.query('select * from dataset1.mytable')
>>> for row in job.result():
...     print(row)
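If you would rather load the key explicitly, the same JSON file can go through google-auth instead; a minimal sketch, assuming the same test_service_account.json key:

from google.cloud import bigquery
from google.oauth2 import service_account

# Load the service-account key explicitly and hand it to the client.
credentials = service_account.Credentials.from_service_account_file('test_service_account.json')
client = bigquery.Client(credentials=credentials, project=credentials.project_id)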
ANSWER 3
Score 3
This is a good usage guide: https://googleapis.github.io/google-cloud-python/latest/bigquery/usage/index.html
To simply run a query and write the results to a table:
# from google.cloud import bigquery
# client = bigquery.Client()
# dataset_id = 'your_dataset_id'
job_config = bigquery.QueryJobConfig()
# Set the destination table
table_ref = client.dataset(dataset_id).table("your_table_id")
job_config.destination = table_ref
sql = """
    SELECT corpus
    FROM `bigquery-public-data.samples.shakespeare`
    GROUP BY corpus;
"""

# Start the query, passing in the extra configuration.
query_job = client.query(
    sql,
    # Location must match that of the dataset(s) referenced in the query
    # and of the destination table.
    location="US",
    job_config=job_config,
)  # API request - starts the query

query_job.result()  # Waits for the query to finish

print("Query results loaded to table {}".format(table_ref.path))
ANSWER 4
Score 1
I personally prefer querying using pandas:
# BQ authentication
import pandas as pd
import pydata_google_auth

SCOPES = [
    'https://www.googleapis.com/auth/cloud-platform',
    'https://www.googleapis.com/auth/drive',
]

credentials = pydata_google_auth.get_user_credentials(
    SCOPES,
    # Set auth_local_webserver to True to have a slightly more convenient
    # authorization flow. Note, this doesn't work if you're running from a
    # notebook on a remote server, such as over SSH or with Google Colab.
    auth_local_webserver=True,
)

query = "SELECT * FROM my_table"
# MY_PROJECT_ID is a placeholder for your GCP project id.
data = pd.read_gbq(query, project_id=MY_PROJECT_ID, credentials=credentials, dialect='standard')
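pd.read_gbq is a thin wrapper around the pandas-gbq package (and is deprecated in recent pandas releases in favor of calling that package directly), so the same query can also go through pandas_gbq; a minimal sketch, reusing the query and credentials from above:

import pandas_gbq

# Same query via pandas-gbq directly; also returns a DataFrame.
data = pandas_gbq.read_gbq(query, project_id=MY_PROJECT_ID, credentials=credentials, dialect='standard')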