Support subquery in insert_all

248 views
Skip to first unread message

narr...@gmail.com

unread,
Mar 1, 2021, 10:30:09 AM3/1/21
to elixir-ecto
We have the need to use an INSERT query with the source being a subquery. I'd love to do this in ecto DSL since I need to include a few parameters dynamically. 

Right now, it's possible to insert specific fields using subqueries for each field, but that wouldn't work in our case since we want to upsert an aggregation like 

select device_id, t.key as data_key, jsonb_typeof(t.value) as data_type, count(*) as readings_count, min(r.measured_at) as oldest_reading, max(r.measured_at) as newest_reading, now() as last_update from readings r
join lateral (select * from jsonb_each(r.data)) t on true
where measured_at > now() - '1 hour'::interval
group by 1, 2, 3)


While I can express that subquery in Ecto DSL, I can't give it to Repo.insert_all. For the conflicts, it's awesome that supplying an Ecto.Query as on_conflict option is already possible. If it were possible to formulate an INSERT query using just queries and no data it would be even better. 

I imagine the call working something like this:

query = from s in StatSource, 
select: %{
  count: count(s),
  oldest: min(s.inserted_at),
foobar: s.foobar
},
group_by: s.foobar

conflict_query = from s in MyStatisticSchema,
  update: [set: [
    count: fragment("EXCLUDED.count + ?", s.count), 
   oldest: fragment("least(EXCLUDED.oldest, ?", s.oldest)
  ]]
Repo.insert_all(MyStatisticSchema, query, 
  conflict_target: [...], #fields from the given query
  on_conflict: conflict_query
)

which would get transformed to this SQL query:

INSERT INTO statistics (
SELECT COUNT(s) AS count, MIN(s.inserted_at) AS oldest, s.foobar FROM source
GROUP BY s.foobar
)
ON CONFLICT (foobar) DO UPDATE SET
count = EXCLUDED.count + statistics.count, oldest = LEAST(EXCLUDED.oldest, statistics.oldest)

I hope this is clear enough. For now I will have to use raw SQL in my code.



José Valim

unread,
Mar 1, 2021, 10:49:56 AM3/1/21
to elixi...@googlegroups.com
Thanks for the proposal! We would be definitely interested in supporting it if a PR is sent. :)

--
You received this message because you are subscribed to the Google Groups "elixir-ecto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-ecto...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-ecto/55f3f218-ab08-472e-bd63-c2e2ccd84a20n%40googlegroups.com.

Moritz Schmale

unread,
Mar 1, 2021, 12:41:05 PM3/1/21
to elixi...@googlegroups.com, José Valim
Thanks for the nudge, looks promising:
You received this message because you are subscribed to a topic in the Google Groups "elixir-ecto" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/elixir-ecto/i8AePkCIK3I/unsubscribe.
To unsubscribe from this group and all its topics, send an email to elixir-ecto...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-ecto/CAGnRm4JS4L6za%2BZrzpBU9xGBWA475g-bJuT%3Dj%3D3RAQrEX34yxg%40mail.gmail.com.

narr...@gmail.com

unread,
Mar 5, 2021, 10:34:37 AM3/5/21
to elixir-ecto
Hi again,

I have successfully implemented this for a subset of what I imagine would be the MVP for this feature:
  • Postgres support
  • select for maps and keyword lists
  • works together with on conflict seamlessly
Features that are still missing:
  • MyXQL adapter support
  • automaticall select schema
  • integration tests
However, I have a problem at a specific line of code in Ecto.Adapters.SQL, where I need to plan the query using the actual adapter (to get the params). However, the insert_all callback in Adapters.SQL doesn't receive the adapter module anywhere.
Since we needed it in production fast, I've hardcoded it to Ecto.Adapters.Postgres in my fork, but of course I want to have it dynamically use the correct one.

Then there's another question: 
If a query doesn't define a select clause, calling insert_all with it does not work. I'd like to make it so it will select all fields of the source schema, if it exists, similar to when I call Repo.all(MySchema) will internally generate a query that selects struct() and so on.

Is it ok if calling without a defined select is unsupported? If not, are there functions in ecto that amend the select clause to a query that doesn't have it?

Here are the comparison pages for my forks of ecto and ecto_sql if somebody wants to do an early review:


 I'm also happy to open PRs for each repo and continue the discussion in there.

Cheers

José Valim

unread,
Mar 5, 2021, 11:50:23 AM3/5/21
to elixi...@googlegroups.com
> However, I have a problem at a specific line of code in Ecto.Adapters.SQL

You should plan it on the Ecto side of things, before we call the adapter. That perhaps will address your problem too?

> if a query doesn't define a select clause, calling insert_all with it does not work.

There is a function in Ecto.Query.Planner that fills in the select for you, I believe it is called ensure_select.
But, in this case, because the fields need to match the table you are inserting, I am thinking it is probably
better to be explicit about it? Especially in regards to handling defaults and so on.

And we can discuss it over PRs, yeah! Thanks for the work so far!

Reply all
Reply to author
Forward
0 new messages