4

I have the following table with data from GA in BigQuery

userid  visitid purchase_date
GH8932  12345   2017-04-09
GH8932  12346   null
GH8932  12347   null
GH8932  12348   null
GH8932  12349   2017-05-30
GH8932  12350   null
GH8932  12351   null
GH8932  12352   2017-06-07
GH8932  12353   null
GH8932  12354   2017-06-30

And I want the resulting table to be

userid  visitid purchase_date
GH8932  12345   2017-04-09
GH8932  12346   2017-05-30
GH8932  12347   2017-05-30
GH8932  12348   2017-05-30
GH8932  12349   2017-05-30
GH8932  12350   2017-06-07
GH8932  12351   2017-06-07
GH8932  12352   2017-06-07
GH8932  12353   2017-06-30
GH8932  12354   2017-06-30

I tried the following

select
a.userid,
a.visitid,
b.purchase_date
from x
left join
( 
select 
userid,
visitid,
purchase_data 
from x 
where purchase_date is not null) as b
on x.userid = b.userid
where x.visitid <= b.visitid

But, this doesn't provide the solution I'm looking for.

Appreciate the help for a BQ rookie

1 Answer 1

15

Use IGNORE NULLS with FIRST_VALUE to find a non null date in the desired window:

SELECT
  userid,
  visitid,
  FIRST_VALUE(purchase_date IGNORE NULLS) OVER (
    PARTITION BY userid ORDER BY visitid
    ROWS BETWEEN CURRENT ROW AND
    UNBOUNDED FOLLOWING) AS purchase_date
FROM x;
Sign up to request clarification or add additional context in comments.

3 Comments

Thanks @Elliot Brossard, I just realized that there are a few more 'nulls' after the final visit, which needs to be overwritten with the final 'purchase_date'. Any way we can achieve this?
Can you maybe post a separate question? That way you'll get input on a solution from other people as well.
Sure, I'll do that. Thanks

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.