Size increase of wheels since 3.27

Michael Fischer

May 9, 2024, 1:16:12 PM
to DataStax Python Driver for Apache Cassandra User Mailing List
Hello,

It appears that the manylinux2014_x86_64 and manylinux2014_aarch64 wheels grew from ~3.7MB in the 3.26 release to ~20.3MB in the 3.27 release and have stayed close to that size ever since, while the packages for other systems have not grown anywhere near that much.

Unfortunately, this makes it difficult to, for example, stay within the size limits of an AWS Lambda function.

Could you explain why this size increase happened (and only for those systems), and whether there is anything we can do to reduce it, preferably without sacrificing performance?

Thank you.

Bret McGuire

May 24, 2024, 6:42:55 PM
to DataStax Python Driver for Apache Cassandra User Mailing List, mfis...@grubhub.com
   Apologies, I've been working on other things and haven't made it back to this question.  But I did have a bit of time to take a look this afternoon and I'm pretty sure I know what's going on.

   The Python driver 3.27.0 release was the first one where we moved to cibuildwheel for wheel generation.  That brought with it a complete change in the wheel build ecosystem which, of course, always has the potential to introduce changes in behaviour.  We've already seen this in a different context, and it looks like something similar happened here.  In the 3.26.0 build the shared objects built by Cython were stripped as part of the build process.  Starting with the 3.27.0 build that is no longer the case, and the unstripped shared objects appear to account for nearly all of the growth in the size of the wheels:

stravinsky:/work/scratch/python-wheel-size/3.27$ cp cassandra/pool.cpython-38-aarch64-linux-gnu.so pool.cpython-38-aarch64-linux-gnu.so
stravinsky:/work/scratch/python-wheel-size/3.27$ strip pool.cpython-38-aarch64-linux-gnu.so
stravinsky:/work/scratch/python-wheel-size/3.27$ ls -al pool.cpython-38-aarch64-linux-gnu.so
-rwxr-xr-x 1 mersault mersault 481328 May 24 16:42 pool.cpython-38-aarch64-linux-gnu.so
stravinsky:/work/scratch/python-wheel-size/3.27$ ls -al cassandra/pool.cpython-38-aarch64-linux-gnu.so
-rwxr-xr-x 1 mersault mersault 3862120 May  2  2023 cassandra/pool.cpython-38-aarch64-linux-gnu.so
stravinsky:/work/scratch/python-wheel-size/3.27$ ls -al ../3.26/cassandra/pool.cpython-38-aarch64-linux-gnu.so
-rwxr-xr-x 1 mersault mersault 481328 Mar 17  2023 ../3.26/cassandra/pool.cpython-38-aarch64-linux-gnu.so
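
   The same comparison can be made across a whole wheel rather than a single file.  What follows is a minimal Python sketch, not part of the driver or its build tooling, that opens two wheels as zip archives and reports the uncompressed size of every shared object present in both.  The function names are hypothetical and chosen only for illustration.

```python
import zipfile


def shared_object_sizes(wheel_path):
    """Return {member_name: uncompressed_size} for each .so file in a wheel."""
    # A wheel is a zip archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(wheel_path) as wheel:
        return {
            info.filename: info.file_size
            for info in wheel.infolist()
            if info.filename.endswith(".so")
        }


def compare_wheels(old_wheel, new_wheel):
    """Return sorted (name, old_size, new_size) rows for .so files in both wheels."""
    old = shared_object_sizes(old_wheel)
    new = shared_object_sizes(new_wheel)
    # Only members with identical names in both wheels are compared.
    return [(name, old[name], new[name]) for name in sorted(old.keys() & new.keys())]
```

   Calling compare_wheels() with the paths of a 3.26 wheel and a 3.27 wheel built for the same Python version and platform should produce one row per Cython extension.  For pool.cpython-38-aarch64-linux-gnu.so the listing above shows 481328 bytes versus 3862120 bytes, roughly an eightfold difference.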


   I've filed PYTHON-1387 to dig into this a bit more.  I'm cautiously optimistic that this should be something we can fix pretty readily.
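
   For size-constrained deployments such as AWS Lambda, one possible interim workaround is to strip the shared objects after installing the driver into the packaging directory, mirroring the manual strip command in the transcript above.  This is only a sketch under stated assumptions: it assumes a strip binary (for example from GNU binutils) is on the PATH and understands the architecture of the .so files, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def find_shared_objects(root):
    """Return the sorted paths of all .so files below root."""
    return sorted(p for p in Path(root).rglob("*.so") if p.is_file())


def strip_shared_objects(root, strip_cmd="strip"):
    """Run strip on every .so file below root and return the bytes saved.

    strip_cmd is an assumption: it must name a strip binary that matches
    the architecture of the shared objects being processed.
    """
    saved = 0
    for path in find_shared_objects(root):
        before = path.stat().st_size
        # check=True raises CalledProcessError if strip fails on a file.
        subprocess.run([strip_cmd, str(path)], check=True)
        saved += before - path.stat().st_size
    return saved
```

   The directory passed to strip_shared_objects() would be wherever the driver was installed for packaging, for instance the target of pip install --target.  It is worth testing the stripped package before deploying it.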

   Thanks for the report!

  - Bret -

Michael Fischer

May 27, 2024, 5:11:15 PM
to DataStax Python Driver for Apache Cassandra User Mailing List, Michael Fischer
Thanks for the explanation and the update. I look forward to hearing where this goes.

Bret McGuire

Jul 31, 2024, 5:39:42 PM
to DataStax Python Driver for Apache Cassandra User Mailing List, mfis...@grubhub.com
   To follow up on this: a fix for PYTHON-1387 has been merged and should be used when building wheels for the upcoming 3.29.2 release.  There's a similar issue on the Windows side in the form of PYTHON-1386.  I'm going to try to get that one into 3.29.2 as well but I can't guarantee that as of this writing.

    - Bret -

Bret McGuire

Jul 31, 2024, 5:44:25 PM
to DataStax Python Driver for Apache Cassandra User Mailing List, Bret McGuire, mfis...@grubhub.com
   Bah, that's what I get for going too fast.  PYTHON-1386 is the Windows version of PYTHON-1378, not PYTHON-1387.  Apologies for the confusion, all.

Jody Lent

Aug 23, 2024, 11:52:33 AM
to DataStax Python Driver for Apache Cassandra User Mailing List, Bret McGuire, mfis...@grubhub.com
Gentle nudge here -- any update on when 3.29.2 will be released to PyPI?

It's been 3 weeks since datastax/python-driver-wheels#10 merged.

Bret McGuire

Aug 27, 2024, 12:07:16 PM
to DataStax Python Driver for Apache Cassandra User Mailing List, Jody Lent, Bret McGuire, mfis...@grubhub.com
   Excellent question, Jody!  I'll try to answer by providing a general update.

   While working on 3.29.2 we uncovered a few outstanding issues with the wheels being built by cibuildwheel.  These included PYTHON-1387 (the unexpected growth in the size of wheels for recent releases, i.e. the subject of this thread) as well as PYTHON-1396 (failure of wheel builds after the deprecation of CentOS 7).  There was also a great deal of discussion around the need to add libev support for Windows wheels, something which has become more important with the lack of reliable asyncio support on Python 3.12.  This represents a new feature rather than a regression, but it was judged to be significant enough to warrant inclusion in this release.  You can find all the details of that conversation in PYTHON-1386.

   All of these issues have now been resolved.  As of this writing the only outstanding work remaining for 3.29.2 is completion of our enhanced support for the new vector type.  The governing ticket for that work is PYTHON-1369.  The implementation was completed a little while ago but a recent review identified a few additional test cases that should be included before we wrap this ticket up.  I'm working on finishing those test cases now; that work has delayed the release a bit but it's also identified at least one issue we'll need to address in a future release.  I believe this work is nearly complete so with any luck the 3.29.2 release should be out quite soon.

   I appreciate your patience!

   - Bret -