Performance Issues with Aggregates

25 views
Skip to first unread message

Sam Barrow

unread,
May 10, 2016, 9:05:01 AM5/10/16
to Squeryl
I'm trying to do a simple aggregate using a join. To dumb it down, I'm doing something like this:

SELECT owners.id, COUNT(properties.id)
FROM owners
LEFT JOIN properties ON properties.owner = owners.id

Or, I can use a subquery:

SELECT owners.id, (SELECT COUNT(properties.id) FROM properties WHERE properties.owner = owners.id)
FROM owners

Both are blazing fast in MySQL.

However, squeryl makes me do a subquery like so:

    val query2 = join(ownerTable, propertyTable.leftOuter)(
      (owner, property) =>
        groupBy(owner.id)
          compute (count(property.map(_.id)))
          on (owner.id === property.map(_.owner)))

Which is fast enough as a separate query, again.

However, when I compose this with the query i've written for the owner table, which selects a bunch of other stuff, it's EXTREMELY slow with my large dataset.

I've tried everything. Composed it with from(), with join(), everything is very slow with squeryl.

I can't do a groupBy on a select, which would easily solve my problem, and I can't do a subquery in my select clause, which would solve my problem as well based on my MySQL tests.

Am I missing something here? Because I am surprised that I simply can not do what I need to do (which really is a very simple aggregate query) with this type of large data set with squeryl. There seems to be absolutely no way to do this with any type of usable performance, and I know it's not because of MySQL.

Here is my actual code:

    val ownerCounts = join(propertyTable, propertyTable.leftOuter)(
      (property, ownedProperty) =>
        groupBy(property.id)
          compute (count(ownedProperty.map(_.id)))
          on (property.id === ownedProperty.flatMap(_.ownerPropertyId)))

    val query = join(propertyTable,
      ownerCounts.leftOuter,
      propertyTable.leftOuter,
      metaTable.leftOuter,
      metaTable.leftOuter,
      mailingTable.leftOuter,
      returnedMailTable.leftOuter)(
        (property, ownerCounts, ownerProperty, meta, ownerMeta, lastMailing, lastReturnedMail) =>
          where(condition(Bundle(property, ownerProperty, meta, ownerMeta, lastMailing, lastReturnedMail, Some(0))))
            select (Bundle(property, ownerProperty, meta, ownerMeta, lastMailing, lastReturnedMail, Some(0)))
            on (ownerCounts.map(_.key) === property.ownerPropertyId,
              ownerProperty.map(_.id) === property.ownerPropertyId,
              meta.map(_.propertyId) === property.id,
              ownerMeta.map(_.propertyId) === ownerProperty.map(_.id),
              lastMailing.map(_.id) === join(mailingTable, letterTable)(
                (mailing, letter) =>
                  where(mailing.userId === user.id and letter.propertyId === ownerProperty.map(_.id))
                    select (mailing.id)
                    orderBy (mailing.date desc)
                    on (mailing.id === letter.mailingId)).page(0, 1),
                //lastReturnedMail.map(_.id) === ownerMeta.flatMap(_.lastReturnedMailId)))
                lastReturnedMail.map(_.id) === from(returnedMailTable)(
                  (returnedMail) =>
                    where(returnedMail.propertyId === ownerProperty.map(_.id) and returnedMail.name === property.ownerName)
                      select (returnedMail.id)
                      orderBy (returnedMail.date desc)).page(0, 1)))
                      
                   
Without the ownerCounts subquery, it is very fast. With it, it crawls. I have all of the proper indexes etc etc.

Sam Barrow

unread,
May 10, 2016, 10:43:39 AM5/10/16
to Squeryl
This actually performs relatively well, but it's about the hackiest thing I've ever done. The NumberHack table is a table with numbers from 1-1000 as the id.


    val query = join(propertyTable,
      propertyTable.leftOuter,
      numberHackTable.leftOuter,
      metaTable.leftOuter,
      metaTable.leftOuter,
      mailingTable.leftOuter,
      returnedMailTable.leftOuter)(
        (property, ownerProperty, ownedCount, meta, ownerMeta, lastMailing, lastReturnedMail) =>
          where(condition(Bundle(property, ownerProperty, meta, ownerMeta, lastMailing, lastReturnedMail, ownedCount)))
            select (Bundle(property, ownerProperty, meta, ownerMeta, lastMailing, lastReturnedMail, ownedCount))
            on (ownerProperty.map(_.id) === property.ownerPropertyId,
              ownedCount.map(_.id) === from(propertyTable)(
                property =>
                  where(property.ownerPropertyId === ownerProperty.map(_.id))
                    compute (count)),
                meta.map(_.propertyId) === property.id,
                ownerMeta.map(_.propertyId) === ownerProperty.map(_.id),
                lastMailing.map(_.id) === join(mailingTable, letterTable)(
                  (mailing, letter) =>
                    where(mailing.userId === user.id and letter.propertyId === ownerProperty.map(_.id))
                      select (mailing.id)
                      orderBy (mailing.date desc)
                      on (mailing.id === letter.mailingId)).page(0, 1),
                  lastReturnedMail.map(_.id) === from(returnedMailTable)(
                    (returnedMail) =>
                      where(returnedMail.propertyId === ownerProperty.map(_.id) and returnedMail.name === property.ownerName)
                        select (returnedMail.id)
                        orderBy (returnedMail.date desc)).page(0, 1)))
    query
Reply all
Reply to author
Forward
0 new messages