Mercurial startup time is already 45.8x slower than Git, and that's with Mercurial running on Python 2.7.12. Now try to sell Python 3 to Mercurial developers, with a startup time 2x - 3x slower still...
So please continue the efforts to make Python startup even faster, to beat all other programming languages, and finally convince Mercurial to upgrade ;-)
https://www.mercurial-scm.org/wiki/Python3

Nevertheless, I can't really be annoyed or upset at them moving slowly to adopt Python 3, as Matt's objections were entirely legitimate.
If it could save a person’s life, could you find a way to save ten seconds off the boot time? If there were five million people using the Mac, and it took ten seconds extra to turn it on every day, that added up to three hundred million or so hours per year people would save, which was the equivalent of at least one hundred lifetimes saved per year.
Steve Jobs.
It really does depend on how/what users are using Python for. In general, Python has been moving more and more toward a "systems development language" from a "scripting language". Which may make us think "scripting" issues like startup time don't matter -- but, of course, they matter a lot to those use cases.

And about a fifth of the time they spent standing in lines waiting to
buy the latest unnecessary iGadget...
But seriously, that calculation is completely bogus. Not only is Steve
Jobs's arithmetic *completely* wrong, but the whole premise is nonsense.
Do the maths yourself: ten seconds per day is 3650 seconds in a year,
which is slightly over an hour (3600 seconds). Multiply by five million
users, that's about five million hours, not 300 million. So Jobs
exaggerates the time saved by a factor of sixty.
(Or maybe Jobs was warning that Macs crash sixty times a day...)
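For anyone who wants to replay the sums, a quick check in Python (these are just the figures from above, not new data):

# Back-of-envelope check of the figures quoted above.
seconds_per_user_per_year = 10 * 365       # 3650 s, slightly over an hour
users = 5_000_000
total_hours = users * seconds_per_user_per_year / 3600
print(total_hours)                         # ~5.07 million hours, not 300 million
print(300_000_000 / total_hours)           # Jobs' figure is ~59x too high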
But the premise is wrong too. Those hypothetical people don't turn their
Macs on in sequence, each person turning their computer on only after
the previous person's Mac had finished booting. They effectively boot
them up in parallel but offset, spread out over a 24 hour period, so
about 3472 people booting up at the same time each minute of the day.
Time savings for parallel processes don't add up the way Jobs adds them:
if we treat this as 1440 parallel processes (one per minute of the day),
we save 1440 hours a year.
But really, the only meaningful calculation is that each person saves 10
seconds per day. We can't even meaningfully say they save one hour a
year: it doesn't come nicely packaged up for you all at once, so you can
actually do something useful with it, nor can you save those ten seconds
from one day to the next. You only get one shot at using them. What can
you do with ten seconds per day? By the time you decide what to do with
the extra time, it's already gone.
There are good reasons for speeding up boot time, but this sort of
calculation is not one of them. I think it is in particularly bad taste
to exaggerate the significance of it by putting it in terms of saving
lives. You want to save real lives? How about fixing the conditions in
the sweatshops that make Apple phones? And installing suicide nets
around the building doesn't count.
--
Steve
2017-07-20 19:09 GMT+02:00 Cesare Di Mauro <cesare....@gmail.com>:
> I assume that Python loads compiled (.pyc and/or .pyo) from the stdlib. That's something that also influences the startup time (compiling source vs loading pre-compiled modules).
My benchmark was "python3 -m perf command -- python3 -c pass": I don't
explicitly remove .pyc files; I expect that Python uses the prebuilt
.pyc files from __pycache__.
Victor
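For anyone wanting to rule compilation out of such a benchmark entirely, the stdlib can pre-compile everything up front. A minimal sketch (the sysconfig lookup is just one way to locate the stdlib, and writing to it may need elevated permissions):

# Pre-populate __pycache__ so the benchmark measures .pyc loading,
# not source-to-bytecode compilation.
import compileall
import sysconfig

stdlib_dir = sysconfig.get_path("stdlib")
compileall.compile_dir(stdlib_dir, quiet=1)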
That is what Emacs does, and it causes them a lot of trouble. They're
trying to move away from it at the moment, but the direction is not yet
clear. The keyword is "unexec", and it wreaks havoc with malloc.
Best,
-Nikolaus
--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«
Emacs has been unexec'ing for as long as I can remember (which is longer than
I can remember Python :). I know that it's been problematic and there have
been many efforts over the years to replace it, but I think it's been a fairly
successful technique in practice, at least on platforms that support it.
I believe the trend is due to languages like Python and Node.js, most of which aggressively discourage threading (more from the broader community than the core languages, but I see a lot of apps using these now), and also the higher reliability afforded by out-of-process tasks (that is, one crash doesn't kill the entire app, e.g. browser tabs).
Optimizing startup time is incredibly valuable, and having tried it a few times I believe that the import system (in essence, stat calls) is the biggest culprit. The tens of ms prior to the first user import can't really go anywhere.
Cheers,
Steve
Top-posted from my Windows phone
“Stat calls in the import system were optimized in importlib a while back”
Yes, I’m aware of that, which is why I don’t have any specific suggestions off-hand. But given the differences in file systems between Windows and other OSs, it wouldn’t surprise me if there were a better approach for NTFS that amortizes those calls. Perhaps not, but it is still the most expensive part of startup that we have any ability to change, so it’s worth investigating.
Cheers,
Steve
Top-posted from my Windows phone
Can you expand on it being "the most expensive part of startup that we
have any ability to change"?
For example, how do Nick's benchmarks above fare on Windows?
Regards
Antoine.
> Optimizing startup time is incredibly valuable,
I've been reading that from the beginning of this thread, but I've been
using Python since 2.4 and I never felt the burden of the startup time.
I'm guessing a lot of people are like me; they just don't express
themselves because "better startup time can't be bad, so let's not put a
barrier on this".
I'm not against it, but since the necessity of a faster Python in
general has been debated for years and is only finally catching up with
the work of Victor Stinner, can somebody explain to me the deal with
startup time?
I understand where it can improve your lives. I just don't get why it's
suddenly such an explosion of expectations and needs.
On 23/07/2017 at 19:36, Brett Cannon wrote:
> It's actually always been something we have tried to improve, it just
> comes in waves. For instance we occasionally re-examine what modules get
> pulled in during startup. Importlib was optimized to help with startup.
> This just happens to be the latest round of trying to improve the situation.
>
> As for why we care, every command-line app wants to at least appear
> faster if not be faster because just getting to the point of being able
> to e.g. print a version number is dominated by Python and app start-up.
Fair enough.
> And this is not guessing; I work with a team that puts out a command
> line app and one of the biggest complaints they get is the startup time.
This I don't get. When I run any command line utility in Python (grin,
ffind, pyped, django-admin.py...), they execute in a split second.
I can't even SEE the difference between:
python3 -c "import os; [print(x) for x in os.listdir('.')]"
and
ls .
I'm having a hard time understanding how the Python VM startup time can
be perceived as a barrier here. I can understand if you have an
application firing Python 1000 times a second, like a CGI service or
some kind of code exec service. But scripting?
Now I can imagine that a given Python program can be slow to start up,
because it imports a lot of things. But not the VM itself.
> Now I can imagine that a given Python program can be slow to start up,
> because it imports a lot of things. But not the VM itself.
That does remind me of a capability we haven't played with a lot recently:
$ python3 -m site
sys.path = [
'/home/ncoghlan',
'/usr/lib64/python36.zip',
'/usr/lib64/python3.6',
'/usr/lib64/python3.6/lib-dynload',
'/home/ncoghlan/.local/lib/python3.6/site-packages',
'/usr/lib64/python3.6/site-packages',
'/usr/lib/python3.6/site-packages',
]
USER_BASE: '/home/ncoghlan/.local' (exists)
USER_SITE: '/home/ncoghlan/.local/lib/python3.6/site-packages' (exists)
ENABLE_USER_SITE: True
The interpreter puts a zip file ahead of the regular unpacked standard
library on sys.path because at one point in time that was a useful
optimisation technique for reducing import costs on application
startup. It was a potentially big win with the old "multiple stat
calls" import implementation, but I'm not aware of any more recent
benchmarks relative to the current listdir-caching based import
implementation.
So I think some interesting experiments to try measuring might be:
- pushing the "always imported" modules into a dedicated zip archive
- having the interpreter pre-seed sys.modules with the contents of
that dedicated archive
- freezing those modules and building them into the interpreter that way
- compiling the standalone top-level modules with Cython, and loading
them as extension modules
- compiling in the Cython generated modules as builtins (not currently
an option for packages & submodules due to [1])
The nice thing about those kinds of approaches is that they're all
fairly general purpose, and relate primarily to how the Python
interpreter is put together, rather than how the individual modules
are written in the first place.
(I'm not volunteering to run those experiments, though - just pointing
out some of the technical options we have available to us that don't
involve adding more handcrafted C extension modules to CPython)
[1] https://bugs.python.org/issue1644818
Cheers,
Nick.
P.S. Checking the current list of source modules implicitly loaded at
startup, I get:
>>> import sys
>>> sorted(k for k, m in sys.modules.items() if m.__spec__ is not None and type(m.__spec__.loader).__name__ == "SourceFileLoader")
['_collections_abc', '_sitebuiltins', '_weakrefset', 'abc', 'codecs',
'encodings', 'encodings.aliases', 'encodings.latin_1',
'encodings.utf_8', 'genericpath', 'io', 'os', 'os.path', 'posixpath',
'rlcompleter', 'site', 'stat']
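As a rough sketch of the first two experiments above (bundling the always-imported modules into a dedicated zip archive), something like the following could serve as a starting point. The archive name and the choice to ship plain .py sources are assumptions, not a tested design:

# Bundle the "always imported" stdlib modules into one zip archive that
# zipimport can serve, instead of scattering them across the filesystem.
import importlib.util
import zipfile

ALWAYS_IMPORTED = [
    "_collections_abc", "_sitebuiltins", "_weakrefset", "abc", "codecs",
    "genericpath", "io", "os", "posixpath", "site", "stat",
]

with zipfile.ZipFile("startup_modules.zip", "w") as zf:
    for name in ALWAYS_IMPORTED:
        spec = importlib.util.find_spec(name)
        if spec and spec.origin and spec.origin.endswith(".py"):
            zf.write(spec.origin, name + ".py")

# A subsequent interpreter could then be launched with this archive at the
# front of the module search path, e.g.:
#   PYTHONPATH=startup_modules.zip python3 -c pass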
--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia
I just now found this thread when searching the archive for
threads about startup time. And I was searching for threads about
startup time because Mercurial's startup time has been getting slower
over the past few months and this is causing substantial pain.
As I posted back in 2014 [1], CPython's startup overhead was >10% of the
total CPU time in Mercurial's test suite. And when you factor in the
time to import modules that get Mercurial to a point where it can run
commands, it was more like 30%!
Mercurial's full test suite currently runs `hg` ~25,000 times. Using
Victor's startup time numbers of 6.4ms for 2.7 and 14.5ms for
3.7/master, Python startup overhead contributes ~160s on 2.7 and ~360s
on 3.7/master. Even if you divide this by the number of available CPU
cores, we're talking dozens of seconds of wall time just waiting for
CPython to get to a place where Mercurial's first bytecode can execute.
And the problem is worse when you factor in the time it takes to import
Mercurial's own modules.
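(Replaying that estimate in code, with the numbers quoted above:)

# Startup overhead across the test suite, using Victor's timings.
invocations = 25_000
per_start_27 = 0.0064    # seconds per startup, Python 2.7
per_start_37 = 0.0145    # seconds per startup, Python 3.7/master
print(invocations * per_start_27)    # ~160 s of CPU time
print(invocations * per_start_37)    # ~362 s of CPU time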
As a concrete example, I recently landed a Mercurial patch [2] that
stubs out zope.interface to prevent the import of 9 modules on every
`hg` invocation. This "only" saved ~6.94ms for a typical `hg`
invocation. But this decreased the CPU time required to run the test
suite on my i7-6700K from ~4450s to ~3980s (~89.5% of original) - a
reduction of almost 8 minutes of CPU time (and over 1 minute of wall time)!
By the time CPython gets Mercurial to a point where we can run useful
code, we've already used up most of (or blown past) the time budget in
which humans perceive an action/command as instantaneous. If you ignore startup
overhead, Mercurial's performance compares quite well to Git's for many
operations. But the reality is that CPython startup overhead makes it
look like Mercurial is non-instantaneous before Mercurial even has the
opportunity to execute meaningful code!
Mercurial provides a `chg` program that essentially spins up a daemon
`hg` process running a "command server" so the `chg` program [written in
C - no startup overhead] can dispatch commands to an already-running
Python/`hg` process and avoid paying the startup overhead cost. When you
run Mercurial's test suite using `chg`, it completes *minutes* faster.
`chg` exists mainly as a workaround for slow startup overhead.
Changing gears, my day job is maintaining Firefox's build system. We use
Python heavily in the build system. And again, Python startup overhead
is problematic. I don't have numbers offhand, but we likely invoke a few
hundred Python processes as part of building Firefox. It should be
several thousand, but we've had to "hack" parts of the build system to
"batch" certain build actions into single process invocations in order to
avoid Python startup overhead. This undermines the ability of some build
tools to formulate a reasonable understanding of the DAG, causes a bit of
pain for build system developers, and makes it difficult to achieve
"no-op" and fast incremental builds, because we're always invoking
certain Python processes now that we've had to move DAG awareness out of
the build backend and into Python. At some point, we'll
likely replace Python code with Rust so the build system is more "pure"
and easier to maintain and reason about.
I've seen posts in this thread and elsewhere in the CPython development
universe that challenge whether milliseconds in startup time matter.
Speaking as a Mercurial and Firefox build system developer,
*milliseconds absolutely matter*. Going further, *fractions of
milliseconds matter*. For Mercurial's test suite with its ~25,000 Python
process invocations, 1ms translates to ~25s of CPU time. With 2.7,
Mercurial can dispatch commands in ~50ms. When you load common
extensions, it isn't uncommon to see process startup overhead of
100-150ms! A millisecond here. A millisecond there. Before you know it,
we're talking *minutes* of CPU (and potentially wall) time in order to
run Mercurial's test suite (or build Firefox, or ...).
From my perspective, Python process startup and module import overhead
is a severe problem for Python. I don't say this lightly, but in my mind
the problem causes me to question the viability of Python for popular
use cases, such as CLI applications. When choosing a programming
language, I want one that will scale as a project grows. Vanilla process
overhead has Python starting off significantly slower than compiled code
(or even Perl) and adding module import overhead into the mix makes
Python slower and slower as projects grow. As someone who has to deal
with this slowness on a daily basis, I can tell you that it is extremely
frustrating and it does matter. I hope that the importance of the
problem will be acknowledged (milliseconds *do* matter) and that
creative minds will band together to address it. Since I am
disproportionately impacted by this issue, if there's anything I can do
to help, let me know.
Gregory
[1] https://mail.python.org/pipermail/python-dev/2014-May/134528.html
[2] https://www.mercurial-scm.org/repo/hg/rev/856f381ad74b
What do you propose to make Python startup faster?
As I wrote in my previous emails, many Python core developers care about
the startup time, and we are working on making it faster.
INADA Naoki added -X importtime to identify slow imports and
understand where Python spent its startup time.
Recent example: Barry Warsaw identified that pkg_resources is slow and
added importlib.resources to Python 3.7:
https://docs.python.org/dev/library/importlib.html#module-importlib.resources
Brett Cannon has also been working on a standard solution for lazy
imports for many years:
https://pypi.org/project/modutil/
https://snarky.ca/lazy-importing-in-python-3-7/
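For reference, the stdlib already exposes a building block for this: importlib.util.LazyLoader. A minimal sketch, essentially the recipe from the importlib documentation:

import importlib.util
import sys

def lazy_import(name):
    # The module is registered immediately, but executing its code is
    # deferred until the first attribute access.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

json = lazy_import("json")    # cheap now; the real import runs on first use
# json.dumps({}) would trigger the actual module execution here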
Nick Coghlan is working on the C API to configure Python startup: PEP
432. Once it's ready, maybe Mercurial could use a custom Python
optimized for its use case.
Does Mercurial need all directories of sys.path?
What's the status of the "system python" project? :-)
I also would prefer Python without the site module. Can we rewrite
this module in C maybe? Until recently, the site module was needed on
Python to create the "mbcs" encoding alias. Fortunately, that feature
has now been moved into Lib/encodings/__init__.py (a new private
_alias_mbcs() function).
Correct me if I'm wrong, but aren't there downsides with regard to C extension compatibility when there is no shared libpython? Or does all the packaging tooling "just work" without a libpython? (It's possible I have my wires crossed up with something else regarding a statically linked Python.)
FWIW, Google has a patched glibc that implements dlopen_with_offset().
It allows you to do things like memory map the current binary and then
dlopen() a shared library embedded in an ELF section.
I've seen the code in the branch at
https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/google/grte/v4-2.19/master.
It likely exists elsewhere. An attempt to upstream it occurred at
https://sourceware.org/bugzilla/show_bug.cgi?id=11767. It is probably
well worth someone's time to pick up the torch and get this landed in
glibc so everyone can be a massive step closer to self-contained, single
binary applications. Of course, it will take years before you can rely
on a glibc version with this API being deployed universally. But the
sooner this lands...
>
> I’ll plug shiv and importlib.resources (and the standalone importlib_resources) again here. :)
>
>> If you go this route, please don't require the use of zlib for file compression, as zlib is painfully slow compared to alternatives like lz4 and zstandard.
>
> shiv works in a similar manner to pex, although it’s a completely new implementation that doesn’t suffer from huge sys.paths or the use of pkg_resources. shiv + importlib.resources saves us 25-50% of warm cache startup time. That makes things better but still not ideal. Ultimately though that means we don’t suffer from the slowness of zlib since we don’t count cold cache times (i.e. before the initial pyz unpacking operation).
>
> Cheers,
> -Barry
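(As an aside, the stdlib's own zipapp module builds a self-contained .pyz in the same spirit as shiv/pex; a minimal sketch, where the "myapp" directory is hypothetical:)

import zipapp

# Bundle a directory containing __main__.py into one runnable archive.
zipapp.create_archive(
    "myapp",                             # hypothetical application directory
    target="myapp.pyz",
    interpreter="/usr/bin/env python3",  # writes a shebang line
)
# Run it with: python3 myapp.pyz (or ./myapp.pyz once marked executable)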
On Wed, May 2, 2018, at 09:42, Gregory Szorc wrote:
> The direction Mercurial is going in is that `hg` will likely become a Rust
> binary (instead of a #!python script) that will use an embedded Python
> interpreter. So we will have low-level control over the interpreter via the
> C API. I'd also like to see us distribute a copy of Python in our official
> builds. This will allow us to take various shortcuts, such as not having to
> probe various sys.path entries since certain packages can only exist in one
> place. I'd love to get to the state Google is at where they have
> self-contained binaries with ELF sections containing Python modules. But
> that requires a bit of very low-level hacking. We'll likely have a Rust
> binary (that possibly static links libpython) and a separate JAR/zip-like
> file containing resources.
I'm curious about the rust binary. I can see that would give you startup time benefits similar to the ones you could get hacking the interpreter directly; e.g., you can use a zipfile for everything and not have site.py. But it seems like the Python-side wins would stop there. Is this all a prelude to incrementally rewriting hg in rust? (Mercuric oxide?)
Nobody in the project is seriously talking about a complete rewrite in Rust. Contributors to the project have varying opinions on how aggressively Rust should be utilized. People who contribute to the C code, low-level primitives (like storage, deltas, etc), and those who care about performance tend to want more Rust. One thing we almost universally agree on is that we want to rewrite all of Mercurial's C code in Rust. I anticipate that figuring out the balance between Rust and Python in Mercurial will be an ongoing conversation/process for the next few years.
--
Ryan (ライアン)
Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
https://refi64.com/
> ----------
> _______________________________________________
> Python-Dev mailing list
> Pytho...@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com
On 5/2/2018 8:56 PM, Gregory Szorc wrote:
> Nobody in the project is seriously talking about a complete rewrite in
> Rust. [...]
Have you considered simply rewriting CPython in Rust?
And yes, the 4th word in that question was intended to produce peals of shocked laughter. But why Rust? Why not Go?
On 3 May 2018 at 15:56, Glenn Linderman <v+py...@g.nevcal.com> wrote:
> Have you considered simply rewriting CPython in Rust?
FWIW, I'd actually like to see Rust approved as a language for writing
stdlib extension modules, but actually ever making that change in policy
would require a concrete motivating use case.
> And yes, the 4th word in that question was intended to produce peals of
> shocked laughter. But why Rust? Why not Go?
Trying to get two different garbage collection engines to play nice with
each other is a recipe for significant pain, since you can easily end up
with uncollectable cycles that neither GC system has complete visibility
into (all it needs is a loop from PyObject A -> Go Object B -> back to
PyObject A). Combining Python and Rust can still get into that kind of
trouble when using reference counting on the Rust side, but it's a lot
easier to avoid than it is in runtimes with mandatory GC.
Recently, I reported how stdlib slows down `import requests`.
https://github.com/requests/requests/issues/4315#issuecomment-385584974
* Add a faster and simpler http.parser (maybe based on h11 [1]) and avoid
using the email module in the http module.
A few of us spent some time at last year’s core Python dev talking about other things we could do to improve Python’s start up time, not just with the interpreter itself, but within the larger context of the Python ecosystem. Many ideas seem promising until you dive into the details, so it’s definitely hard to imagine maintaining all of Python’s dynamic semantics and still making it an order of magnitude faster to start up. But that’s not an excuse to give up, and I’m hoping we can continue to attack the problem, both in the micro and the macro, for 3.8 and beyond, because the alternative is that Python becomes less popular as an implementation language for CLIs. That would be sad, and definitely has a long term impact on Python’s popularity.
Cheers,
-Barry
> On May 2, 2018, at 8:57 PM, INADA Naoki <songof...@gmail.com> wrote:
>
> Recently, I reported how stdlib slows down `import requests`.
> https://github.com/requests/requests/issues/4315#issuecomment-385584974
>
> For Python 3.8, my ideas for faster startup time are:
>
> * Add lazy compiling API or flag in `re` module. The pattern is compiled
> when first used.
How about going the other way and allowing compilation at Python *compile* time? That would actually make things faster instead of just moving the time spent around.
I do see value in being less eager in Python, but I think the real wins are hiding behind ahead-of-time compilation.
- Ł
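(INADA's lazy `re` idea can be approximated in user code today; a hypothetical sketch, not an existing API:)

import re

class LazyPattern:
    """Defer re.compile() until the pattern is first actually used."""

    def __init__(self, pattern, flags=0):
        self._pattern = pattern
        self._flags = flags
        self._compiled = None

    def __getattr__(self, name):
        # Only called for attributes not set in __init__ (match, search, ...).
        if self._compiled is None:
            self._compiled = re.compile(self._pattern, self._flags)
        return getattr(self._compiled, name)

DATE = LazyPattern(r"\d{4}-\d{2}-\d{2}")    # no compile cost at import time
print(DATE.match("2018-05-02"))             # first use triggers the compile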
Could one make a little startup utility that, when invoked the first
time, starts up a raw Python interpreter, keeps it running somewhere,
and then forks it to run the actual Python code?
Then every invocation after that would make a new fork. I presume
forking is a LOT faster than re-invoking the entire startup.
I suspect that many of the cases where startup time really matters is
when a command line utility is likely to be invoked many times — often
in the same shell instance.
So having a “pre-built” warm interpreter ready to go could really help.
This is way past my technical expertise to know if it’s possible, or
to try to prototype it, but I’m sure many of you would know.
-CHB
Sent from my iPhone
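(This is essentially what Mercurial's chg and tools like uprocd do. A minimal POSIX-only sketch of the idea, with a hypothetical socket path and stand-in imports:)

import os
import socket

SOCK_PATH = "/tmp/py-warm.sock"    # hypothetical rendezvous point

def serve():
    import json, argparse, re      # stand-ins for an app's expensive imports

    if os.path.exists(SOCK_PATH):
        os.unlink(SOCK_PATH)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCK_PATH)
    server.listen()
    while True:
        conn, _ = server.accept()
        if os.fork() == 0:         # child inherits the warm interpreter
            request = conn.recv(4096).decode()
            # ... dispatch `request` using the already-imported modules ...
            conn.sendall(b"done")
            os._exit(0)
        conn.close()               # parent keeps accepting new clients

if __name__ == "__main__":
    serve()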
On May 11, 2018 9:39:28 AM Chris Barker - NOAA Federal via Python-Dev
<pytho...@python.org> wrote:
<plug> https://refi64.com/uprocd/ </plug>
It will break hash randomization.
See also: https://www.cvedetails.com/cve/CVE-2017-11499/
This discussion subthread is not about having a memory image dumped on
disk, but a daemon utility that preloads a new Python process when you
first start up your CLI application. Each time a new process is
preloaded, it will by construction use a new hash seed.
(by contrast, the Node.js CVE issue you linked to is about having the
same hash seed accross a Node.js version; that's disastrous)
Also, you can add a reuse limit to ensure that the hash seed is rotated
(e.g. every 100 invocations).
Regards
Antoine.
On 14/05/2018 at 19:12, INADA Naoki wrote:
> I'm sorry, the word *will* may be stronger than I thought.
>
> I meant that if a memory image dumped on disk is used casually,
> it may make it easier to open a security hole.
>
> For example, if the `hg` memory image is reused, and it can be leaked
> in some way, hg serve will be vulnerable to hash-DoS.
We were debugging abysmally slow execution of Mercurial's test harness
on macOS and we discovered a new wrinkle to the startup time problem.
It appears that APFS acquires some shared locks/mutexes in the kernel
when executing readdir() and other filesystem system calls. When you
have several Python processes all starting at the same time, I/O
attached to module importing (import.c:case_ok() by the looks of it for
Python 2.7) becomes a stress test of sorts for this lock acquisition. On
my 6+6 core MacBook Pro, ~75% of overall system CPU is spent in the
kernel when executing the test harness with 12 parallel tests.
If we run the test harness with the persistent `chg` command server
(which eliminates Python process startup overhead), wall execution time
drops from ~37:43s to ~9:06s.
This problem of shared locks on read-only operations appears to be
similar to that of AUFS, which I've blogged about [1].
It is pretty common for non-compiled languages (like Python, Ruby, PHP,
Perl, etc) to stat() the world as part of looking for modules to load.
Typically, the filesystem's stat cache will save you and the overhead
from hundreds or thousands of lookups is trivial (after first load). But
it appears APFS is quite sensitive to it. Any work to reduce the number
of filesystem API calls during Python startup will likely have a
profound impact on APFS when multiple Python processes are starting. A
"frozen" application where modules are in a shared container file is
probably ideal.
Python 3.7 doesn't exhibit as much of a problem. But it is still there.
A brief audit of the importer code and call stacks confirms it is the
same problem - just less prevalent. Wall time execution of the test
harness from Python 2.7 to Python 3.7 drops from ~37:43s to ~20:39.
Overall kernel CPU time drops from ~75% to ~19%. And that wall time
improvement is despite Python 3's slower process startup. So locking in
the kernel is really a killer on Python 2.7.
While we're here, CPython might want to look into getdirentriesattr() as
a replacement for readdir(). We switched to it in Mercurial several
years ago to make `hg status` operations significantly faster [2]. I'm
not sure if it will yield a speedup on APFS though. But it's worth a
try. (If it does, you could probably make
os.listdir()/os.scandir()/os.walk() significantly faster on macOS.)
I hope someone finds this information useful to further improving
[startup] performance. (And given that Python 3.7 is substantially
faster by avoiding excessive readdir(), I wouldn't be surprised if this
problem is already known!)
[1] https://gregoryszorc.com/blog/2017/12/08/good-riddance-to-aufs/
[2] https://www.mercurial-scm.org/repo/hg/rev/05ccfe6763f1
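(Relatedly, on the stat-reduction front, os.scandir() already reuses the data a directory read returns instead of issuing a separate stat() per entry; a small illustration:)

import os

def count_py_files(path="."):
    total = 0
    with os.scandir(path) as it:
        for entry in it:
            # is_file() usually answers from the DirEntry's cached data,
            # avoiding an extra stat() call per entry.
            if entry.is_file() and entry.name.endswith(".py"):
                total += 1
    return total

print(count_py_files())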