Message from discussion
Split unicodeobject.c into subfiles
Message-ID: <50895731.7040...@hastings.org>
Date: Thu, 25 Oct 2012 08:13:53 -0700
From: Larry Hastings <la...@hastings.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
rv:16.0) Gecko/20121011 Thunderbird/16.0.1
MIME-Version: 1.0
To: Nick Coghlan <ncogh...@gmail.com>
References: <CAMpsgwbQRZMugSprcM9CQGnraFG39AKpDAz-b1YM2j-ag=1...@mail.gmail.com>
<CAPZV6o_0hANb7QbnJwF-j7oRak8oCBpvjsp0uvi+3XHWyOA...@mail.gmail.com>
<k66glu$r1...@ger.gmane.org> <50881178.4010...@hastings.org>
<CADiSq7dvFyjMBBs3nBJbJTrShkMhDFbrjPXZnJ+b=PHS64z...@mail.gmail.com>
In-Reply-To: <CADiSq7dvFyjMBBs3nBJbJTrShkMhDFbrjPXZnJ+b=PHS64z...@mail.gmail.com>
Cc: python-...@python.org
Subject: Re: [Python-Dev] Split unicodeobject.c into subfiles
X-BeenThere: python-...@python.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Python core developers <python-dev.python.org>
Content-Type: multipart/mixed; boundary="===============1849858665=="
Errors-To: python-dev-bounces+dev-python+garchive-30976=googlegroups....@python.org
Sender: "Python-Dev"
<python-dev-bounces+dev-python+garchive-30976=googlegroups....@python.org>
This is a multi-part message in MIME format.
--===============1849858665==
Content-Type: multipart/alternative;
boundary="------------090902030607000000000603"
This is a multi-part message in MIME format.
--------------090902030607000000000603
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
On 10/24/2012 03:15 PM, Nick Coghlan wrote:
> Breaking such files up into separately compiled modules serves two
> purposes:
>
> 1. It proves that the code *isn't* a tangled monolithic mess;
> 2. It enlists the compilation toolchain's assistance in ensuring that
> remains the case in the future.
>
Either the code is a "tangled monolithic mess" or it isn't. If it is,
then let's fix that, regardless of the size of the file. If it isn't, I
don't see breaking up the code among multiple files as providing any
benefit. And I see no need for the toolchain's assistance to help us do
something without benefit. The line count of the file is essentially
unrelated to its inherent quality / maintainability.

> We are not special snow flakes - good software engineering practice is
> advisable for us as well, so a big +1 from me for breaking up the
> monstrosity that is unicodeobject.c and lowering the barrier to entry
> for hacking on the individual pieces. This should come with a large
> block comment in unicodeobject.c explaining how the pieces are put
> back together again.
>
I'm all for good software engineering practice. But can you cite
objective reasons why large source files are provably bad? Not "tangled
monolithic messes", not poorly-factored code. I agree that those are
bad--but so far nobody has proposed that either of those is true about
unicodeobject.c (unless you are implicitly doing so above), nor have
they proposed credible remedies. All I've seen is that unicodeobject.c
is a large file, and some people want to break it up into smaller
files. I have yet to see anything but handwaving as justification. For
example, what is this barrier to entry you suggest exists to hacking on
the str object, that will apparently be dispelled simply by splitting
one file into multiple files?

Someone proposed breaking up unicodeobject.c into three distinct
subsystems and putting those in separate files. I still don't agree.
It seems natural to me to have everything associated with the str object
in one file, just as we do with every other object I can think of. If
this were a genuinely good idea, we should consider doing it with every
similar object. But nobody is proposing that. My guess is that this is
because the other files in CPython are "small enough". At which point
we're right back to the primary motivation simply being the line count of
unicodeobject.c, as a purely aesthetic and subjective judgment.

//arry/
--------------090902030607000000000603--
--===============1849858665==--