Google BigQuery Resources exceeded during query execution. How to split large window frames with partition in SQL

smaica

I'm running out of memory with my query on Google BigQuery. I have to calculate multiple window functions, such as running sums, over several different time frames. My data mainly consists of an id (string), a value (number), a type ('in' or 'out', could be converted to bool if needed) and a timestamp.

I read that there is no way to increase memory per slot, so the only way to be able to execute the query is to cut it into smaller pieces that can be sent to different slots. A way to do this is to use GROUP BY or OVER (PARTITION BY ...) but I have no idea how I could rewrite my query to make use of it.

I have some calculations that need to use PARTITION BY but for others, I want to calculate the total overall, for example:

Imagine I have a large table (> 1 billion rows) where I want to calculate a rolling sum over all values for different time frames, independent of id.

WITH data AS (
  SELECT * 
  FROM UNNEST([
    STRUCT
    ('A' as id,1 as value, 'out' as type, 1 as time), 
    ('A', -1, 'in', 2),
    ('B', 2, 'out', 2),
    ('C', 1, 'out', 3),
    ('B', -1, 'in', 4),
    ('A', 2, 'out', 4),
    ('C', 5, 'out', 5),
    ('B', 3, 'out', 6),
    ('A', 1, 'out', 6),
    ('A', -4, 'in', 6),
    ('C', -3, 'in', 7)
  ])
)
SELECT 
  id
, value
, type
, time
, SUM(value) OVER (ORDER BY time RANGE UNBOUNDED PRECEDING) as total
, SUM(value) OVER (ORDER BY time RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) as total_last_day
, SUM(value) OVER (ORDER BY time RANGE BETWEEN 3 PRECEDING AND 2 PRECEDING) as total_prev_day
FROM data

How could I split this query to make use of PARTITION BY or GROUP BY in order to fit within the memory limits?

Mikhail Berlyant

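Try the approach below. It first collapses the data to one row per time with GROUP BY, so the window functions only have to process one row per distinct time instead of every row, and then joins the results back to the original rows. I think it has a good chance of resolving your issue.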

SELECT *
FROM data
JOIN (
  SELECT time
  , SUM(time_value) OVER (ORDER BY time RANGE UNBOUNDED PRECEDING) as total
  , SUM(time_value) OVER (ORDER BY time RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) as total_last_day
  , SUM(time_value) OVER (ORDER BY time RANGE BETWEEN 3 PRECEDING AND 2 PRECEDING) as total_prev_day
  FROM (
    SELECT time, SUM(value) time_value
    FROM data
    GROUP BY time
  )
)
USING (time)

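Applied to the sample data in your question, the output is (rows sorted by time here for readability):

time  id  value  type  total  total_last_day  total_prev_day
1     A    1     out   1       1              null
2     A   -1     in    2       2              null
2     B    2     out   2       2              null
3     C    1     out   3       2              1
4     B   -1     in    4       2              2
4     A    2     out   4       2              2
5     C    5     out   9       6              2
6     B    3     out   9       5              2
6     A    1     out   9       5              2
6     A   -4     in    9       5              2
7     C   -3     in    6      -3              6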
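
If you want to convince yourself that the rewrite gives the same numbers before running it on the full table, a sanity check along these lines might help. This is only a sketch on the sample rows from the question: the CTE names direct, per_time and rewritten are made up for illustration, and it relies on the fact that a RANGE frame ordered by time always includes all rows sharing the same time, so summing per time first should not change the window results.

```sql
-- Sample rows copied from the question.
WITH data AS (
  SELECT *
  FROM UNNEST([
    STRUCT
    ('A' AS id, 1 AS value, 'out' AS type, 1 AS time),
    ('A', -1, 'in', 2),
    ('B', 2, 'out', 2),
    ('C', 1, 'out', 3),
    ('B', -1, 'in', 4),
    ('A', 2, 'out', 4),
    ('C', 5, 'out', 5),
    ('B', 3, 'out', 6),
    ('A', 1, 'out', 6),
    ('A', -4, 'in', 6),
    ('C', -3, 'in', 7)
  ])
),
-- Original formulation: window functions run over every row.
direct AS (
  SELECT
    time, id, value, type,
    SUM(value) OVER (ORDER BY time RANGE UNBOUNDED PRECEDING) AS total,
    SUM(value) OVER (ORDER BY time RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) AS total_last_day,
    SUM(value) OVER (ORDER BY time RANGE BETWEEN 3 PRECEDING AND 2 PRECEDING) AS total_prev_day
  FROM data
),
-- Step 1 of the rewrite: one row per distinct time (hypothetical CTE name).
per_time AS (
  SELECT time, SUM(value) AS time_value
  FROM data
  GROUP BY time
),
-- Steps 2 and 3: window over the small per-time table, then join back.
rewritten AS (
  SELECT time, id, value, type, total, total_last_day, total_prev_day
  FROM data
  JOIN (
    SELECT
      time,
      SUM(time_value) OVER (ORDER BY time RANGE UNBOUNDED PRECEDING) AS total,
      SUM(time_value) OVER (ORDER BY time RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) AS total_last_day,
      SUM(time_value) OVER (ORDER BY time RANGE BETWEEN 3 PRECEDING AND 2 PRECEDING) AS total_prev_day
    FROM per_time
  )
  USING (time)
)
-- Count rows that appear in one result but not the other.
-- EXCEPT DISTINCT treats NULLs as equal, so the NULL total_prev_day rows compare cleanly.
SELECT
  (SELECT COUNT(*) FROM (
     SELECT * FROM direct EXCEPT DISTINCT SELECT * FROM rewritten
   )) AS only_in_direct,
  (SELECT COUNT(*) FROM (
     SELECT * FROM rewritten EXCEPT DISTINCT SELECT * FROM direct
   )) AS only_in_rewritten
```

Both counts should come back as 0 if the two formulations agree. On the real table you would probably run the check on a small time slice only, since the direct CTE is exactly the query that runs out of memory.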
