Google BigQuery — List all partitions in a partitioned table

Subscribe to my newsletter and never miss my upcoming articles

Partitioned table

A partitioned table is a special table that is divided into segments, called partitions, that makes it easier to manage and query your data. By dividing a large table into smaller partitions, you can improve query performance, and you can control costs by reducing the number of bytes read by a query.

There are two types of table partitioning available in BigQuery:

  • Tables partitioned by ingestion time: Tables partitioned based on the data’s ingestion (load) date or arrival date.
  • Partitioned tables: Tables that are partitioned based on a TIMESTAMP or DATE column.

List all partitions:

Below query lists all partitions of a partitioned table. By using Legacy SQL Mode we can use Partition summary decorator (PARTITIONS_SUMMARY) to query and get the list of partitions in a partitioned table.

Legacy SQL

To use legacy sql, In you Big Query Editor, Go to More-> Query Setting-> SQL dialect -> enable Legacy

SELECT * FROM
   [PROJECT_ID.DATASET_ID.TABLE_ID$__PARTITIONS_SUMMARY__];

Standard SQL

For tables partitioned based ingestion time

SELECT _PARTITIONTIME as pt, FORMAT_TIMESTAMP("%Y%m%d", _PARTITIONTIME) as partition_id
FROM `PROJECT_ID.DATASET_ID.TABLE_ID`
GROUP BY _PARTITIONTIME
ORDER BY _PARTITIONTIME

--or 
SELECT _PARTITIONDATE as pt, FORMAT_TIMESTAMP("%Y%m%d", _PARTITIONDATE) as partition_id
FROM `PROJECT_ID.DATASET_ID.TABLE_ID`
GROUP BY _PARTITIONDATE
ORDER BY _PARTITIONDATE

Replace PROJECT_ID, DATASET_ID, TABLE_ID with corresponding values. PROJECT_ID is optional.

PROJECT_ID = Google Cloud Project ID
DATASET_ID = Dataset ID 
TABLE_ID = Table ID

Originally posted on medium.com

if you find this article useful, feel free to connect with me on twitter to stay updated on future articles.

No Comments Yet