We consume a table with about million records that looks something akin to.
Dave, 20090213, Music, ProgramA
John, 20101001, Music, ProgramB
John, 20100101, Math , ProgramC
Dave, 20101112, Music, ProgramB
John, 20110901, Music, ProgramA
We need to extract tables that can answer questions like, "For each person and subject, which program is associated with the earliest date entry?" (like the table below).
Dave, Music, ProgramA
John, Math , ProgramC
John, Music, ProgramB
In the past I've always just pulled the table into SPSS or SAS to make this kind of query (aggregate and return first row based on the sorted values of one or more columns).
I understand this is possible in BigQuery using WITHIN. But even if we were allowed us to load this type of schema, it is not the type of table we're consuming.
Am I overlooking something? Any suggestions or solutions?
Joe