William Pietri <wil...@scissor.com> wrote:
> My background has some closer-to-the-metal work. Some Apple ][
> assembler as a kid; a bunch of C; lots of Java (including happily
> using Prevayler); tons of performance tuning. But most work these
> last several years has been in startup-land, where speed of
> iteration has been the most important criterion, so I've mainly been
> working in Ruby and Python.
I mainly work in Perl5 or Ruby (and C if required), but will
often use GNU make, shell + awk as necessary, too.
Having worked with the ruby-core team the past few years, I
know we're certainly aware of mechanical sympathy; but there
are a lot of trade-offs in how far we can go, given how
dynamic Ruby is.
I'm certain the core developers of other scripting
implementations are similarly knowledgeable about performance;
but there's a huge gap between those developers and most users
of the language :)
> 95% of the time, those languages are fine. But sometimes when I hit
> a performance issue, I'm at a bit of a loss. I know my technical
> options, of course, but there's a... philosophical issue between me
> and most people who work in those languages (which use global
> interpreter locks
> <https://en.wikipedia.org/wiki/Global_interpreter_lock>). When
> something is taking too long, my instinct is to say, "Hey, let's
> keep the data right where it is in RAM and use the box's 7 idle
> cores." Theirs is to look brightly at me and say, "Oh, we'll just
> run more processes and communicate over sockets. And maybe use more
> boxes. And memcache must work in here somehow, right?"
Yeah, it sucks. We haven't figured out how to remove the GVL
in Ruby without hurting single-threaded performance (which we're
already bad at! :<) or breaking compatibility (we aren't great
there, either).
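To make that trade-off concrete, here's a minimal sketch (my own
illustration, with a made-up `parallel_map` helper, not ruby-core
code) of the process-based workaround: CPU-bound work is forked
into children and results come back over pipes, so each child
gets its own GVL and its own core.

```ruby
# Hypothetical sketch: sidestep the GVL by forking a process per
# item and reading each result back over a pipe.
# (fork(2) is POSIX-only; Marshal handles simple serialization.)
def parallel_map(items)
  pipes = items.map do |item|
    r, w = IO.pipe
    fork do                              # child: compute and write
      r.close
      w.write(Marshal.dump(yield(item)))
      w.close
    end
    w.close                              # parent keeps the read end
    r
  end
  results = pipes.map { |r| Marshal.load(r.read) }
  Process.waitall                        # reap the children
  results
end

# Each child burns CPU on its own core, unhindered by the
# parent's GVL:
sums = parallel_map([100_000, 200_000]) { |n| (1..n).sum }
```

Of course, this is exactly the serialization + extra-runtime cost
you're accounting for in your head; it only wins when the compute
dwarfs the copying.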
matz is still thinking and experimenting with better concurrency
APIs, too. Personally, I find the declarative style of Makefiles
to be an excellent way to express concurrency.
> What happens in my head is an accounting of the serialization and
> deserialization costs, plus the RAM soaked up by many runtimes, plus
> a lot of unnecessary trips through the kernel, plus maybe crossing
> the network to other boxes, plus a lot of programming shenanigans to
> break apart and reunify data, plus the assorted ops issues. I try
> not to look pained or sigh audibly, at least until they're out of
> sight.
I certainly think of the data first: reducing transfers/copies,
reducing round trips, etc. For those reasons, I could never
stand how bloated the web is with graphics/JS/CSS, either.
> Then I think about my grandmother's basement, which had
> shelves full of coffee cans and paper bags and egg cartons, saved
> because she had grown up in the Great Depression, and her value
> metrics were hopelessly out of date. I wonder if that's me now, and
> I sigh again.
I see top-posting and HTML mail on mailing lists wasting my
bandwidth + storage and sigh, too :)
> After one of these architectural conversations, I'm sure one of us
> is missing something important, but I'm not always sure who that is.
>
> So I guess I'm wondering three things:
>
> * How do others with mechanical sympathy in scripting-heavy
> environments deal with thinking about and collaborating around
> system design?
I consider data optimization heavily: ways to reduce I/O,
optimizing DB design, data structures, deduplication,
deltafication, cache/memoizability, compression etc.
Since my time is limited, I think it's better spent optimizing
data before code, as important data tends to outlive code
(code is just plumbing for data).
But data optimizations apply to code, too: reducing allocations,
smaller data structures, etc...
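Two of those (deduplication and memoization) are cheap to show in
Ruby; this is just a toy illustration, not code from any real
project:

```ruby
# Hypothetical sketch of two data-side optimizations.

# 1) Deduplication: String#-@ interns equal strings into one
#    frozen object, so a million repeated keys can share a
#    single allocation.
a = -("user:" + "42")
b = -("user:" + "42")
a.equal?(b)   # same object, not merely equal contents

# 2) Memoization: cache derived values keyed by their inputs.
#    The default-proc fills the cache on first lookup.
FIB = Hash.new { |cache, n| cache[n] = n < 2 ? n : cache[n - 1] + cache[n - 2] }
FIB[90]       # linear after the cache warms, instead of exponential
```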
> * My normal heuristic for deciding when to put on my mechanical
> sympathy goggles is "user-visible performance degradation". Before
> that, I mainly favor other things. This has pluses and minuses. What
> other approaches do people use?
There are generic optimizations to apply throughout a codebase,
as long as they don't decrease readability.
Things like: limiting object lifetimes, favoring the fast
features of a particular language while avoiding the slow ones,
and streaming data to parsers instead of slurping.
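The streaming-vs-slurping point in Ruby terms (a toy example with
made-up word-counting helpers; the streaming version keeps memory
flat no matter how big the file is):

```ruby
# Hypothetical sketch: feed the "parser" one line at a time
# instead of slurping the whole file into memory first.

def count_words_slurp(path)
  File.read(path).split.size            # whole file in RAM at once
end

def count_words_stream(path)
  total = 0
  File.foreach(path) { |line| total += line.split.size }
  total                                 # only one line in RAM at a time
end
```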
I suppose some of this knowledge comes from studying
interpreter/VM internals and contributing patches to the VM,
etc.
But often it's golfing and writing terser code :)
Many programmers I've seen seem to believe they're
paid by the quantity of code they produce :<
I'm also cheap and anti-consumerist, so I use older/slower
hardware, which forces me to notice problems sooner rather
than later.
> * Do folks here see rising language options that provide the
> flexibility of languages like Python and Ruby but still let those of
> us with mechanical sympathy make good use of the resources at hand?
I've actually been using Perl5 more in recent years and
appreciate the stability and longevity of the language.
If needed, I can (v)fork off and use pipes/sockets or use
Inline::C (I find XS too ugly). But I'll still use
shell/make/awk/sed in some places, too. I like using what's
already bundled on a typical GNU/Linux system so I don't have to
wait for new stuff to download or build.
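Even from Ruby, leaning on what's already installed is one line;
a small sketch (my own example) of piping data through sort(1)
with IO.popen:

```ruby
# Hypothetical sketch: reuse a tool bundled on the system
# (sort(1)) by piping data through it instead of reimplementing.
sorted = IO.popen(%w[sort -n], "r+") do |io|
  io.puts [3, 1, 2]           # write unsorted lines to sort's stdin
  io.close_write              # send EOF so sort can emit output
  io.read.split.map(&:to_i)
end
```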