set.mm update

Mario Carneiro

Apr 16, 2015, 5:10:03 AM
to meta...@googlegroups.com
Hello all,

I'm writing to talk about my latest submission to set.mm, which includes a large number of changes and additions that I've been working on over the past few months. This is also a test of sorts, to see how best to give submission notes in a publicly accessible fashion to interested parties.

There are a few main themes, in topology, real analysis, measure theory and graph theory, which are more or less unrelated to each other and are all packaged together in this update. (Reminder: to see any of these theorems, go to http://us2.metamath.org:88/mpegif/<xxx>.html where <xxx> is the theorem name - many of the comments assume that you have the theorem in front of you and may not make sense otherwise. If you want the gory details I would recommend looking directly at the diff from my personal set.mm git repo at https://github.com/digama0/set.mm/commit/0f51761 .)

==Topology==

There are several new topological structures available, and I also made a concerted effort to fill out the table of properties, i.e. "space X has property Y" if the conditions are relatively simple. Let me know if I missed anything obvious; what I needed most during this work was theorem statements, since the proofs are usually either obvious or can be looked up given the statement.

imasval, etc.: The image structure and quotient structure functions "s and /s are now online, although most of the "structure preservation" theorems are not yet done; imasmnd is available as an example of what such a theorem will look like. Unfortunately this construction comes too late to assist in the real number construction - df-nr is essentially this exact construction. But I guess it's not bad to have at least one "elementary" example before abstracting everything.

isclo: A curious test to show that a set is "clopen" (closed and open). Useful for showing that a connected set has a property everywhere given that it preserves the property locally.

Since they are very commonly used, users should take note that stoig2 and stoig3 were renamed to subspuni and subsptop. (I still don't know what "stoig" stands for, maybe FL does.) On this topic, I'd also like to propose a change in the interface for subSp, from ( subSp ` <. A , J >. ) to ( J |`t A ), for notational agreement with |` and |`s, and also to avoid the verbose df-fv + df-op combination in favor of the "newer" (by which I mean 10 years old) df-ov.

tgsubsp, subspsubbas: A subspace is generated by the restricted elements of a basis or subbase for the original topology. Simplifies the proof of txsubsp.

subspdis: A subspace of a discrete space is discrete.

The metric limit predicate ~~>m was replaced a while ago with ~~>t for topological limits of sequences, but it was still in the metric spaces section; in this update I moved it to the same section as Cn and CnP (so now the section is about "Limits and Continuity" instead of just continuity).

ssidcn: The identity function is continuous from J to K iff K C_ J; this is a characterization of fineness of topologies.

cnss1, cnss2: Making the domain finer or the codomain coarser increases the number of continuous functions.

cndis, cnindis: Every function is continuous if the domain is discrete or the codomain is indiscrete. (FYI, I use the term "codomain" in preference to "range" because for me "range" refers to "ran F", i.e. the largest set on which F is surjective, while "codomain" refers to the B in "F : A --> B", which is to say some interesting superset of the range. In the category of sets it is usually easier to work with the range, but in more general categories, for example topological spaces, the codomain is a more natural notion.)

subsphaus: A subspace of a Hausdorff space is Hausdorff
sshaus: A space finer than a Hausdorff space is Hausdorff
dishaus: A discrete space is Hausdorff

rncmp, imacmp: the continuous image of a compact set is compact

discmp: A discrete space is compact iff it is finite. This shows that compactness can be understood as a topological generalization of finiteness.

fiuncmp: The finite union of compact spaces is compact. We don't have a disjoint union constructor yet, so instead this theorem is stated assuming that all spaces are subspaces of some larger space X. I'm still not sure how to approach disjoint unions in this context, so I will wait until I see it used in a theorem before proceeding.

sscmp: A space coarser than a compact space is compact.

I wasn't very happy with the existing set of connectedness theorems, so I added a bunch of "convenience theorems".

conclo: A nonempty clopen subset of a connected space is the whole space.

conndisj: Two open disjoint subsets of a connected space do not cover the space. (I'm not happy with this formulation, borrowed from dfcon2, because usually the fact that they cover the space is an assumption. Short of assuming everything and deriving a contradiction, perhaps it might be better to prove that one of the sets is empty?)

nconnsub: Same as conndisj with a subspace, only this time we assume everything except connectedness and prove that the subspace is not connected.

concn: A continuous function from a connected space whose image meets a clopen subset of the codomain has its image contained in that subset.

clscon: The closure of a connected set is connected (the closure is performed w.r.t some ambient space).

concompid, etc.: Defines an expression to be called the connected component of X containing x. We can show that this set is a connected subset of X which contains x as well as any other connected subset containing x, and it is also closed by clscon. At first I thought I could prove that connected components are also open, but this is not true for, e.g., the rational numbers, whose connected components are singletons (which are not open as QQ is not discrete). They are open if there are only finitely many components in the space, though.

df-1stc, etc.: I didn't write this section, credit goes to Jeff Hankins, but I imported it and expanded it a lot.

1stcfb: A point x in a first countable topology J induces a function f : NN --> J of open neighborhoods which is decreasing (forms an inclusion chain) and constitutes a local base.

2ndcredom: A second-countable space has at most |RR| open sets.

1stcsubsp, 2ndcsubsp: First- and second-countability are hereditary (they pass to subspaces)

2ndcdisj, 2ndcdisj2: A disjoint collection of open sets in a second-countable topology is countable.

2ndcomap: Open maps (and in particular homeomorphisms) map second-countable topologies to second-countable topologies.

2ndcsep: A second-countable space is separable (uses ax-cc)

dis2ndc: A discrete space is second-countable iff it is countable

df-lly, df-nlly: This is a new topological property constructor, which I am quite fond of due to its easy readability and diverse applications. A topology is "Locally A", where A is a class parameter which is intended to be another topological property like "Comp" or "Con", if every point has an open neighborhood that is "A". Similarly, a topology is "N-Locally A" if every point has a neighborhood that is "A". From this you immediately get the notions locally connected (Locally Con), locally compact (N-Locally Comp), locally Hausdorff (Locally Haus), even locally metrizable (Locally ran MetOpen), although I haven't tried that one yet.

subsplly, subspnlly: If "A" is a hereditary property, then an "A" space is also "Locally A", and "N-Locally A" and "Locally A" coincide.

llysubsp, nllysubsp: "Locally A", "N-Locally A" are hereditary properties.

llyidm, nllyidm: There is a stronger property of topological properties than being hereditary, namely being "local". The "Locally A" predicate allows one to write the property of being local as "Locally A = A"; in this case it is also true that "N-Locally A = A". A property is local if it is true globally iff it is true in a neighborhood around each point. These theorems show that "Locally A" and "N-Locally A" are themselves local properties; the only other nontrivial local property we have is first-countability (lly1stc). Unfortunately we don't have any concise way to write hereditarity, but it is implied by locality.

hausllycmp: A compact Hausdorff space is locally compact. (Note that there are several competing definitions for "locally compact" that coincide on Hausdorff spaces; the one abbreviated here is "N-Locally Comp" which means any point has a local base of compact neighborhoods. The alternative, "every point has a compact neighborhood", has no abbreviation but is written out in full in llycmpkgen2.)

cldllycmp: A closed subset of a locally compact space is locally compact.

dislly: A discrete space is "Locally A" iff singleton spaces are "A".

dis1stc: A neat proof that discrete spaces are first-countable using the idea of local properties. By dislly, ~P X is locally first-countable iff singletons are first-countable, which is true since they are finite and hence second-countable. But since first-countability is local, "locally first-countable" is equivalent to "first-countable" and thus ~P X e. 1stc

1stcelcls: A long time ago I mentioned in the comment that metelcls could be generalized to first-countable spaces; this is that generalization. A point is in the closure of a subset of a first-countable space iff there is a sequence in the subset converging to it.

hausmapdom, hauspwdom: The property in 1stcelcls implies a certain "limitation of size" theorem on Hausdorff spaces. Since points in the closure are determined by sequences in the set, this implies that the cardinality of the space is upper-bounded by |A ^ NN|, where A is a dense subset. In particular, a separable first-countable Hausdorff space has at most cardinality continuum.

df-kgen, etc.: A new topological generator, the k-ification operation. Takes as input a topology and outputs a finer topology, such that a set is open in the new topology iff it is open in all compact subspaces under the original topology. Since the operator is idempotent, the set of fixed points of the operation is also its range; these are called compactly generated spaces. (There appear to be some differences in whether or not to require that the subspaces be Hausdorff, and I almost rewrote this entire section at one point under that definition. Anyone have any ideas about the "right" approach here?)

llycmpkgen2, cmpkgen, llycmpkgen: A locally compact (in the weak sense) space is compactly generated. This yields proofs for compact spaces and locally compact spaces as well, since a compact space is locally compact in the weak sense (every point has a compact neighborhood) but not necessarily in the strong sense (every neighborhood has a compact subneighborhood).

df-xko: Define the compact-open topology on the set of continuous functions, which is generated by the subbase U(K,V) = { f : (f " K) C_ V } for each compact K and open V. (I was conflicted here between exponential and "arrow" notation for the set of continuous functions. The current notation for the compact-open topology of functions from J to K is ( K ^ko J ). This agrees with the mapping notation ( Y ^m X ) of the set of all functions from X to Y, but is reversed from the set of continuous functions ( J Cn K ) or C(J,K) in usual math prose.)

txhaus, tx1stc, tx2ndc, txcon, txscon: The topological product of two spaces is Hausdorff / first-countable / second-countable / connected / simply connected if the factors are.

txkgen: The topological product is compactly generated if one factor is locally compact and the other is compactly generated Hausdorff (or any of a number of more complicated conditions)

xkohaus: The compact-open topology is Hausdorff if the codomain space is.

xkoptsub, xkopt: The compact-open topology is not the only available topology on function spaces; we also have the product topology on equal factors (which topologizes the set of all functions, not just continuous ones). It turns out that the compact open topology is finer than the product topology (which can be thought of as the "finite-open topology" here, so that since there are more compact sets than finite sets the topology has more open sets). In general, the product topology acts like the compact-open topology with discrete domain.

xkoccn: the function x |-> (y |-> x) is continuous
xkopjcn: the function f |-> (f`x) is continuous
xkoco1cn, xkoco2cn: the functions g |-> (g o. F), g |-> (F o. g) are continuous if F is
xkococn: the function f , g |-> (f o. g) is continuous if g : J --> K, f : K --> L and K is exponentiable, where "exponentiable" means basically that it satisfies the conclusion of xkococn. Less trivially, one can show that K is exponentiable iff it is core-compact (this has a complicated definition - look it up if you are curious), and it is sufficient to assume that K is locally compact (in the strong sense) to show that it is exponentiable; this is the assumption I used in xkococn and it propagates to all related "exponential rule" theorems.

xkofvcn: the function f , x |-> (f ` x) is continuous if the domain of f is exponentiable
xkoinjcn: the function x |-> y |-> <. y , x >. is continuous

cnmpt1k, etc.: the continuity theorems induce new cnmpt* theorems which say basically the same thing but are easier to use. I use "k" to abbreviate the "curried" format "x |-> y |->", similar to the current "1" for "x |->" and "2" for "x, y |->".

df-qtop: Another new topological structure, the quotient topology. It works given a topology J and a function F, which we usually assume is a function from the underlying set of J onto some Y, and defines a topology on Y as the finest topology making F continuous. The statement "F : J --> K is a quotient map", where J and K are topologies, can be stated as "K = ( J qTop F )".

qtopcmplem, qtopcmp, qtopcon, qtopkgen: Since a quotient is the continuous image of a topology, any property which is preserved under continuous image is transferred to a quotient topology, in particular connectedness and compactness. Compactly generated spaces are also preserved under quotients for more complicated reasons.

qtopsubsp: The restriction of a quotient map to a saturated open or closed set is a quotient map.

qtopomap, qtopcmap: Surjective open or closed maps are quotient maps.

txswaphmeo: ( J tX K ) and ( K tX J ) are homeomorphic (useful for swapping the arguments in asymmetric theorems like txtube or txkgen)

xkohmeo: ( L ^ko ( J tX K ) ) and ( ( L ^ko K ) ^ko J ) are homeomorphic, if J and K are exponentiable. (It seems strange to me that both K and J are required to be exponentiable, although it is clearly needed by the current proof, for both directions of continuity.)

0nelfil, etc.: A lot of name changes in the filter and filter limits section were done for standardization reasons.

elflim2: I'm unreasonably happy with myself for discovering that you can do reverse closure of both left and right arguments in this case, because topologies and filters are disjoint from each other (one must contain the empty set and one must not). This allows you to eliminate practically all hypotheses from the theorems using fLim. Similar considerations apply to fClus, in JH's mathbox.

hauspwpwdom: A weaker version of hausmapdom above. Without first-countability, you can still bound the size of a closure, because points in the closure of A are determined by filters in A, which are subsets of the double powerset of A.

met1stc: a metric space is first-countable
met2ndc: a metric space is second-countable iff it is separable (forward implication requires ax-cc)
re2ndc: The real numbers are second-countable (Stefan, I think I sniped this from your to-do list)
zcld, recld: the integers and real numbers are closed sets (could perhaps be used for isumrecl and the like)

df-htpy: Define a homotopy, which is like a path homotopy but with the base topology being generalized to any topology. That is, a homotopy from f : J --> K to g : J --> K is a continuous map h : ( J X. II ) --> K such that h(x,0)=f(x) and h(x,1)=g(x). The extra conditions h(0,y)=f(0) and h(1,y)=f(1) no longer make sense, so they are kept in the definition of path homotopy.

df-om1: Currently just a few teaser definitions, I intend to eventually define the fundamental group as the structure quotient of this loop space, and in the same manner define the n-th homotopy group as the quotient of the n-th iterated loop space.

cnllycmp, rellycmp: The reals and complexes are locally compact.

cnpcon, qtoppcon: Continuous images and quotients of path-connected spaces are path-connected.

cnllyscon: Since the complexes are locally balls and balls are simply connected, the complexes are locally simply connected.

rescon: A subset of the reals is simply connected iff it is connected.

iccllyscon: A closed interval is locally simply connected. (At first I thought that any subset of the reals should be locally connected, but QQ isn't and I think that any such set must actually look locally like an interval or interval endpoint everywhere, so it's not a trivial property.)

cvmliftmo: Separate out the proof of uniqueness of lifts in a covering map. The necessary assumption seems to be that the domain space of the lifted function be connected and n-locally connected, so since [0,1] is both we can eliminate some of the lemmas to cvmlift.

==Measure Theory==

The major achievement in this section is a new choice-free proof that an open set is measurable, although currently it doesn't make it very far before being overcome by ax-cc-dependent theorems, and certainly falls far short of my goal of proving ftc1 and ftc2 without choice. But [Fremlin5] claims that there is quite a lot you can do without choice, and we'll have to see how much further this can go.

df-ovol: I changed the definition here and all dependents to use ( <_ i^i ( RR X. RR ) ) instead of ( < i^i ( RR X. RR ) ). In other words, the open intervals in a covering are now allowed to be empty. Since you can expand all of the intervals slightly to make them strictly positive, it doesn't change the value of the overall infimum, but expanding the intervals is an extra annoying and unnecessary step, so this simplifies the proofs.

ovollb2: In another "expand the intervals a little" argument, you can show that it is sufficient to cover the set with *closed* intervals, which again abstracts this common technique. Using this, you can prove for example ovolctb by covering the countable set by the closed intervals [x, x] = {x}, which each have length zero and so give you the result exactly without any need for fudge factors.

ovolsca: I guess I forgot this the first time around, but scaling a set by multiplying everything in it by a fixed positive number multiplies the measure of the set by that same number. This still leaves open the case of reflecting a set about some number, but that will need a separate proof anyway.

nulmbl2: A nullset is measurable, where "nullset" is meant in a weaker sense as a set which is a subset of sets of arbitrarily small measure. Using ax-cc you can show that such a set is actually a nullset and hence measurable, but without that you can still show measurability. (Actually, now that I think about it, I'm not convinced that there is no choice-free proof that such a set actually satisfies vol*(A)=0 - maybe Fremlin was lying to me.)

uniioombl, uniiccmbl: A rather involved argument, adapted from Fremlin, that the disjoint union of countably many open intervals is measurable. The case of closed intervals is a corollary since the difference is a nullset.

dyadmbl: The set of intervals of the form [m/2^n, (m+1)/2^n] has the property that any two intervals are either in a subset relation or are almost disjoint, and any nonempty set has a maximal element, so given any set A of such intervals, the subset G of maximal elements is countable and disjoint, and has the same union as the union of A; thus uniiccmbl shows that the union of A is measurable.
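
(Not part of the database, but here is a quick Python sketch of the idea for a single open interval, everything illustrative only: scanning dyadic intervals from coarse to fine and keeping just the maximal ones contained in (a,b) gives a countable, almost-disjoint family whose total length approaches b - a.)

  from fractions import Fraction
  from math import floor, ceil

  def maximal_dyadic_intervals(a, b, max_depth=12):
      """Maximal dyadic intervals [m/2^n, (m+1)/2^n], 0 <= n <= max_depth,
      contained in the open interval (a, b).  Coarser levels are scanned
      first, so an interval is kept only if nothing already kept contains it."""
      kept = []
      for n in range(max_depth + 1):
          w = Fraction(1, 2 ** n)
          for m in range(floor(a / w), ceil(b / w) + 1):
              lo, hi = m * w, (m + 1) * w
              if a < lo and hi < b and not any(c <= lo and hi <= d for c, d in kept):
                  kept.append((lo, hi))
      return kept

  ivs = maximal_dyadic_intervals(Fraction(1, 3), Fraction(22, 7))
  # pairwise almost disjoint; total length approaches 22/7 - 1/3 as max_depth grows
  print(len(ivs), float(sum(hi - lo for lo, hi in ivs)))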

opnmbl: An open set is the union of all dyadic rational intervals contained in it, because dyadic rational numbers are dense in RR; thus open sets are measurable.

==Real Analysis==

I would say the major achievement here is continuity of log, which was hampering progress for a while. Now that it is complete, several other theorems fall out, in particular differentiability of log, the Taylor series for log, and finally the Taylor series for arctan and Leibniz' series for pi (which is a metamath 100 theorem).

expmulnbnd, geomulcvg: Convergence of the series k x^k, which is needed for the derivative of a power series.

isercoll, iserodd: A series of the form sum_ n e. NN a_n such that a_n is zero outside the range of an increasing function G:NN-->NN can equivalently be written as sum_ k e. NN a_G(k), in essence skipping the zeroes and reindexing the remaining terms. The most useful application of this is when G(n)=2n+1, in which case it says that sum_ n e. NN a_n = sum_ n e. NN a_(2n+1) if a_2n=0 for each n. This is useful for showing that the power series of arctangent, arctan(x) = sum_ n e. NN0 (-1)^n/(2n+1) x^(2n+1), is actually a power series (can be put in the form a_n x^n). Note that absolute convergence is not necessary, as partial sums are not reordered here; we would need absolute convergence if we allowed G to be a general injective function rather than an increasing function.

iseralt: A surprising omission from the library, the alternating series test. (Thought about generalizing to Dirichlet's test, but it's too complicated to state and I doubt we will need the extra generality.)

tanval2, tanval3: alternative expressions for tangent

sinhval, etc.: I didn't define the sinh (hyperbolic sine) function, as I think that would be unnecessary. Rather, I intend to use the idiom sin(iz)/i to mean sinh(z), and similarly for cosh(z)=cos(iz) and tanh(z)=tan(iz)/i. Apart from the obvious theorems that come from their expressions in terms of trig, the important theorems are x e. RR -> sinh(x),cosh(x),tanh(x) e. RR (resinhcl, recoshcl, retanhcl), the expressions in terms of e^x, like sinh(x)=(e^x-e^-x)/2 (sinhval, coshval), and |tanh(x)|<1 (tanhbnd). It may also be useful to show that e^x=sinh(x)+cosh(x) and e^-x=cosh(x)-sinh(x) at some point.
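
(A quick numeric sanity check of these idioms in Python, outside the database; the sample points and tolerances are arbitrary:)

  import cmath, math

  z = 0.7 - 0.3j
  assert abs(cmath.sin(1j * z) / 1j - cmath.sinh(z)) < 1e-12   # sinh(z) = sin(iz)/i
  assert abs(cmath.cos(1j * z) - cmath.cosh(z)) < 1e-12        # cosh(z) = cos(iz)
  assert abs(cmath.tan(1j * z) / 1j - cmath.tanh(z)) < 1e-12   # tanh(z) = tan(iz)/i

  x = 0.9                                                      # a real argument
  assert abs(math.sinh(x) - (math.exp(x) - math.exp(-x)) / 2) < 1e-12
  assert abs(math.exp(x) - (math.sinh(x) + math.cosh(x))) < 1e-12
  assert abs(math.exp(-x) - (math.cosh(x) - math.sinh(x))) < 1e-12
  assert abs(math.tanh(x)) < 1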

tanadd: Addition formula for tangent.

dvbsss: Since this theorem is unconditionally true, it allows for an elimination of X C_ RR hypotheses whenever we also assume dom ( S _D F ) = X, which simplified all of the dvmpt* theorems.

dvrec: The derivative of the reciprocal function. From this you can get the quotient rule if you combine it with dvmptmul.

dvcnv: The derivative of an inverse function. This is the "cop-out" version of the theorem, because we assume that f^-1 is already fairly nice - we know that f is bijective onto an open set Y, and the inverse is known to be continuous. In these circumstances, it is not hard to show that (f^-1)'(y) = 1/f'(f^-1(y)). The reason for this theorem is that it doesn't need any strong assumptions on the domain S; the usual statement which avoids things like "f^-1 is continuous" has rather different proofs for S = RR and S = CC.
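
(As a numeric illustration of the conclusion, outside the database: take f = exp, a bijection from RR onto the open set (0,+oo) with continuous inverse log; the finite-difference derivatives below are just a sanity check with arbitrary step size and sample point.)

  import math

  def num_deriv(f, x, h=1e-6):
      """Symmetric finite-difference approximation to f'(x)."""
      return (f(x + h) - f(x - h)) / (2 * h)

  y = 2.5
  lhs = num_deriv(math.log, y)                  # (f^-1)'(y)
  rhs = 1 / num_deriv(math.exp, math.log(y))    # 1 / f'(f^-1(y))
  print(lhs, rhs)                               # both approximately 1/y = 0.4
  assert abs(lhs - rhs) < 1e-5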

dvferm1, dvferm2, dvferm: By separating the proof of rolle into "left" and "right" neighborhoods, you can get one-sided results like "if f(u)>=f(x) in some right neighborhood x e. (u, y), then f'(u)<=0". The two-sided version just says that a local maximum has derivative zero, which was already the "hard part" of the proof of rolle.

dvlip, dvlipcn, dvlip2: Adapted from Stefan's c1lip1, we show that a function with derivative bounded by M also has |f(x)-f(y)|<=M|x-y|. The same is true for the complexes, although care is needed to ensure the region is connected (we just assume it's a ball).

dv11cn: Two functions on a ball that have the same derivative and are equal somewhere in the ball are equal everywhere (seems more useful than the "differ by a constant" version).

dvgt0, dvlt0: a function with positive derivative is increasing, and a function with negative derivative is decreasing

dvivth: Darboux' theorem, or the intermediate value theorem for derivatives

dvne0, dvne0f1: Combination of dvgt0, dvlt0 - a function with nonzero derivative is strictly monotone (and hence injective)

dvcnvre: The less "cop-out" version of dvcnv. If f is a differentiable real function whose derivative is never zero, then f is injective (by dvne0f1) and the inverse is continuous on an interval, so by dvcnv (f^-1)'(y) = 1/f'(f^-1(y)).

df-ulm: Uniform convergence of a sequence. This definition is most similar to ~~>, and does not attempt any of the various possibilities for generalization to metric spaces or uniformities or the like. I suppose time will tell if we need the generalization, but for now it is just intended for real and complex analysis. We have F ( ~~>u ` S ) G if F is a sequence of functions F(n) : S --> CC and G : S --> CC and F(n) converges uniformly to G on S.

ulmcau: A sequence converges uniformly (to something) iff it is uniformly Cauchy
ulmbdd: A uniform limit of bounded functions is bounded
ulmcn: A uniform limit of continuous functions is continuous
ulmdv: A limit of differentiable functions whose derivatives converge uniformly to some H is differentiable with derivative H
mtest: Weierstrass M-test: A series of functions bounded uniformly by a convergent series of real numbers converges uniformly.
iblulm, itgulm: A uniform limit of integrable functions is integrable, and its integral is the limit of the integrals.

radcnvcl, etc.: Define an expression R which takes as input a coefficient function A(n) and outputs the radius of convergence of sum_ n e. NN0 A(n) x^n. This is a nonnegative extended real, with the property that the series converges absolutely when |x|<R (radcnvlt1) and diverges when |x|>R (or more usefully, if the series converges at x then R<=|x| - radcnvle).

dvradcnv: The formal derivative of the power series is convergent within the radius of convergence
pserulm: The power series converges uniformly on {z:|z|<=M} if M<R
psercn2: The power series is continuous on {z:|z|<=M} if M<R
psercn: The power series is continuous on {z:|z|<R}
pserdv: The power series is differentiable with the termwise derivative on {z:|z|<R}

abelth: Abel's theorem: If the power series is convergent at 1, then it is continuous in a teardrop-shaped subset of the unit disk near 1, or less generally on the interval [0,1], so that the limit of sum_ n e. NN0 A(n) x^n as x->1 from below is sum_ n e. NN0 A(n).

cosne0: There are no zeroes of the cosine function in the region |Re[z]| < pi/2
tanregt0: The tangent function has positive real part in the region 0 < Re[z] < pi/2
logcj: log(z*)=log(z)* if Im[z]=/=0
argregt0: If Re[z]>0 then -pi/2<arg[z]<pi/2
argrege0: If Re[z]>=0 then -pi/2<=arg[z]<=pi/2
argimgt0: If Im[z]>0 then 0<arg[z]<pi
argimlt0: If Im[z]<0 then -pi<arg[z]<0
logimul: If Re[z]>=0 then log(iz)=log(z)+i pi/2, and in particular arg[iz]=arg[z]+pi/2
logneg2: If Im[z]>0 then log(-z)=log(z)-i pi, and in particular arg[-z]=arg[z]-pi
tanarg: If Re[z]=/=0 then tan(arg[z])=Im[z]/Re[z]. In combination with logimul, logneg2, you can find the argument of any complex number this way.
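
(Here is a small Python sanity check of a few of these arg[z] facts, using cmath.phase for the principal argument; the sample points are arbitrary:)

  import cmath, math

  z = 3 - 2j                                               # Re[z] > 0, Im[z] < 0
  assert -math.pi / 2 < cmath.phase(z) < math.pi / 2       # argregt0
  assert -math.pi < cmath.phase(z) < 0                     # argimlt0
  assert abs(math.tan(cmath.phase(z)) - z.imag / z.real) < 1e-12           # tanarg

  w = -1 + 4j                                              # Im[w] > 0
  assert abs(cmath.log(-w) - (cmath.log(w) - 1j * math.pi)) < 1e-12        # logneg2
  assert abs(cmath.phase(-w) - (cmath.phase(w) - math.pi)) < 1e-12

  u = 2 + 5j                                               # Re[u] >= 0
  assert abs(cmath.log(1j * u) - (cmath.log(u) + 1j * math.pi / 2)) < 1e-12  # logimul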

dvrelog: The derivative of log |` RR+ (easy consequence of dvcnvre)
relogcn: Continuity of log |` RR+
logcn: Continuity of log on its domain of continuity, namely D = ( CC \ ( -oo , 0 ] ). The proof is direct using the above theorems on arg[z].
dvlog: Differentiability of log on D

logtayl: the series sum_ n e. NN z^n/n converges to -log(1-z)
dvcxp1: the derivative of x^a is a x^(a-1) (for positive real x and complex a)
dvcxp2: the derivative of a^x is log(a)a^x (for complex x and positive real a)

df-asin, df-acos, df-atan: Define the inverse trigonometric functions. Since the other three are easily obtained from these as arccot(z)=arctan(1/z) and so on, I didn't bother to define them. For some time I had considered using "`' sin" in place of arcsin, but there is of course the obvious problem that sin is not injective. We could solve this the same way we solved it for log, but the problem is that the regions of injectivity for the desired branch choices have much more complicated shapes than for log. Instead, I opted to follow Mathematica conventions again and use the definition in terms of log. This has the obvious advantage of making closure proofs easy, but makes theorems like asinsin more complicated.

sinasin, asinsin: The basic "inverse" relations for arcsine, sin(arcsin(z))=z (true generally) and arcsin(sin(z))=z (true when |Re[z]|<pi/2).
cosacos, acoscos: arccos has a fairly trivial definition in terms of arcsin, so most of the theorems are just corollaries of the arcsin theorems.
asinbnd: Together with asinsin this implies that the region D such that arcsin = `' ( sin |` D ) is somewhere between {z:|Re[z]|<pi/2} and {z:|Re[z]|<=pi/2}.
cosasin, cosatan: Theorems like this one, cos(arcsin(z))=sqrt(1-z^2), are not always simple to prove due to the possibility of branch cuts not lining up right. In this case it is more or less by definition, but getting from efiatan to efiatan2 and hence to cosatan requires detailed analysis of the parameters and some strange theorems like atanlogadd and atanlogsub.

atantan: Unlike the tangent, the arctangent is not a simple combination of arcsin and arccos, and all the theorems are separate. Since there are two points which are not in the range of tangent, namely i and -i, we give (CC\{i,-i}) the name "dom arctan" and use that in lieu of the ( A e. CC /\ A =/= 0 ) sort of thing in division theorems. The most interesting theorem here is atantan, showing that arctan(tan(z))=z when |Re[z]|<pi/2.

atans: Like log, arctan has a "domain of continuity"; it is the region of the complex plane with the imaginary axis excluded, except for the open interval from -i to i. It is more naturally written in terms of log as the set {z:1+z^2 e. D} where D is the domain of continuity of log. In particular, it is an open region (atansopn) containing the reals (ressatans), so that the arctan function is differentiable in a neighborhood of RR.

dvatan, atancn: The derivative of arctan, arctan'(z)=1/(1+z^2), follows naturally from its definition in terms of log, and continuity is a corollary.

atantayl, atantayl2, atantayl3: The Taylor series for log yields a Taylor series for arctan. If you just combine series in the obvious way you get sum_n ((-i)^n-i^n) i/2 z^n/n, but the i's cancel in alternate terms and this can be rewritten as sum_n if(2|n, 0, (-1)^((n-1)/2) z^n/n), and using iserodd above we can write it in the conventional form as sum_n (-1)^n z^(2n+1)/(2n+1).
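
(A numeric check of this rearrangement, outside the database: the series written via the log expansion and the reindexed odd-only form agree with arctan at a few points inside the unit disk; everything here is illustrative only.)

  import cmath

  def atan_log_form(z, terms=120):
      """sum_{n>=1} ((-i)^n - i^n) * (i/2) * z^n / n"""
      return sum(((-1j) ** n - 1j ** n) * 0.5j * z ** n / n for n in range(1, terms))

  def atan_odd_form(z, terms=60):
      """sum_{k>=0} (-1)^k z^(2k+1)/(2k+1), the same series with the zero terms dropped"""
      return sum((-1) ** k * z ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

  for z in (0.3, -0.5, 0.2 + 0.4j):
      assert abs(atan_log_form(z) - atan_odd_form(z)) < 1e-9
      assert abs(atan_odd_form(z) - cmath.atan(z)) < 1e-9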

leibpi: Using the Taylor series for arctan together with Abel's theorem abelth, the alternating series sum_n (-1)^n/(2n+1) converges to arctan(1)=pi/4. This is one of the metamath 100 theorems.
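
(A quick numeric sanity check, outside the database; the alternating series bound from iseralt says the error of the N-th partial sum is at most the first omitted term:)

  import math

  N = 10000
  partial = sum((-1) ** n / (2 * n + 1) for n in range(N + 1))
  assert abs(partial - math.pi / 4) <= 1 / (2 * N + 3)   # first omitted term
  print(4 * partial)                                     # approximately pi, converging slowly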

rlimcnp: A limit at infinity is equivalent to continuity at 0 from above, by composing with the map x |-> 1/x.

efrlim, dfef2: By using rlimcnp, the limit of the function (1+z/k)^k as k -> +oo is the same as the limit of (1+kz)^(1/k)=exp(log(1+kz)/k) as k -> 0, and the argument to exp there is just the difference quotient for z log'(1)=z, so since exp is continuous we get (1+z/k)^k ~~> exp(z), which is a nice alternate definition of exp (although the equivalence is quite nontrivial!)
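
(Numerically, outside the database, the convergence is easy to watch; the error shrinks roughly like 1/k. The sample point is arbitrary.)

  import cmath

  z = 1.5 - 2j
  for k in (10, 100, 1000, 10000):
      print(k, abs((1 + z / k) ** k - cmath.exp(z)))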

==Graph Theory==

df-umgra: One thing that worries me about graph theory more than any other field is the fact that they have no uniform foundation to build on, and we don't have the nice "an X is a Y"-type theorems that the structure builders are giving us in abstract algebra. The goal in this section was to prove the Konigsberg Bridge problem, which involves an undirected, non-simple graph. So I defined <. V , E >. e. UMGrph if E is a function from some set to the collection of 1- and 2-element subsets of V.
Note that if this were a directed graph, it would probably be a function into ordered pairs instead, so G e. DMGrph -> G e. UMGrph would be a non-theorem; if I had a simple graph, E might be a subset of V X. V instead of a function, and then G e. SGrph -> G e. UMGrph would be a non-theorem. I'm still not sure how best to address this problem, but for the moment my thoughts are to just ignore the problem and work with each foundation separately, using isomorphism theorems to translate as necessary. That's part of the reason why I made no attempt to work df-umgra into the extensible structure framework, because there is no obvious uniform way to do it.

eupath: The main theorem of Eulerian paths, which says that the number of vertices of odd degree in an Eulerian graph is either 0 or 2, depending on whether the path returns to the starting point.

konigsberg: The Konigsberg Bridge problem, which says that a certain specific graph has no Eulerian paths. In order to concisely state the vertex and edge information I used a similar approach to the decimal arithmetic theorems, with a small collection of "structural induction" theorems on the edge set, stored as an ordered list. For larger graphs a different storage mechanism may be more appropriate. This is also a metamath 100 theorem.
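
(The combinatorial core of the argument is just a degree count; here is a quick Python sanity check on the usual seven-bridge multigraph, with made-up vertex labels, against the eupath criterion:)

  from collections import Counter

  # the seven bridges on the four land masses {A, B, C, D} (labels illustrative)
  edges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

  degree = Counter()
  for u, v in edges:
      degree[u] += 1
      degree[v] += 1

  odd = [v for v, d in degree.items() if d % 2 == 1]
  print(dict(degree), odd)          # degrees 5, 3, 3, 3 - all four vertices are odd
  # eupath forces 0 or 2 odd-degree vertices in a graph with an Eulerian path,
  # so this graph has none
  assert len(odd) not in (0, 2)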

==Miscellaneous interesting stuff==

dffi3: Extracted from the proof of fictb, this shows that you can iterate binary intersection omega-many times to get all finite intersections.

1lt9, etc.: Addresses a white lie in the "Arithmetic in Metamath" paper, which claims that m<n is a database theorem for all 0<=m<n<=10 - this fills the gaps. Also decnncl2 and decsucc2 were referenced even though they didn't exist - now they do.

seqof: an interesting interaction of sequences with the function operation operator oF.

hasheni: This doesn't exhaust the possibilities, but with the hash function extended to +oo at infinite sets, there are a number of hash function theorems that can be extended to infinite sets as well as finite, such as this one which shows that the forward direction of hashen is true even for infinite sets.

rlim2, etc: Not sure about this change, since it differs from the usual convention, but in order to avoid the $d z ph requirement I wrote the first hypothesis as "( ph -> A. z e. A ..." instead of "( ( ph /\ z e. A ) -> ..."; it could equivalently be written as a conjunction now.

snmlff: A teaser, for a possible future proof of the normal number theorem (the set of normal numbers has full measure).


I also have two proposals. Other than the |`t proposal above I would also like to implement Stefan's earlier suggestion of "fixing" the peculiar behavior of opprc2 via the definition:

df-op $a |- <. A , B >. = if ( ( A e. _V /\ B e. _V ) , { { A } , { A , B } } , (/) ) $.

(This differs from Stefan's original proposal in replacing 1o with (/); experience with "reverse closure" theorems like elfvdm has shown me that it is useful to have undefined values be (/) specifically.) Of course we probably won't use A e. <. B , C >. very often, but all that matters for opelxp1 is that the out-of-domain value not be an ordered pair, which is equally true of (/) and 1o. This also eliminates the counter-intuitive asymmetric handling of opprc1 vs. opprc2.

Mario

Stefan O'Rear

Apr 17, 2015, 3:35:31 AM
to meta...@googlegroups.com
On Thursday, April 16, 2015 at 2:10:03 AM UTC-7, Mario Carneiro wrote:
df-umgra: One thing that worries me about graph theory more than any other field is the fact that they have no uniform foundation to build on, and we don't have the nice "an X is a Y"-type theorems that the structure builders are giving us in abstract algebra. The goal in this section was to prove the Konigsberg Bridge problem, which involves an undirected, non-simple graph. So I defined <. V , E >. e. UMGrph if E is a function from some set to the collection of 1- and 2-element subsets of V.
Note that if this were a directed graph, it would probably be a function into ordered pairs instead, so G e. DMGrph -> G e. UMGrph would be a non-theorem; if I had a simple graph, E might be a subset of V X. V instead of a function, and then G e. SGrph -> G e. UMGrph would be a non-theorem. I'm still not sure how best to address this problem, but for the moment my thoughts are to just ignore the problem and work with each foundation separately, using isomorphism theorems to translate as necessary. That's part of the reason why I made no attempt to work df-umgra into the extensible structure framework, because there is no obvious uniform way to do it.

eupath: The main theorem of Eulerian paths, which says that the number of vertices of odd degree in an Eulerian graph is either 0 or 2, depending on whether the path returns to the starting point.

konigsberg: The Konigsberg Bridge problem, which says that a certain specific graph has no Eulerian paths. In order to concisely state the vertex and edge information I used a similar approach to the decimal arithmetic theorems, with a small collection of "structural induction" theorems on the edge set, stored as an ordered list. For larger graphs a different storage mechanism may be more appropriate. This is also a metamath 100 theorem.

I've dabbled with that a bit in the past, and tended towards a factoring more like this:

inci = ( f e. _V |-> ( f ` ( 10 + 4 ) ) )
$( either ran ( inci ` s ) or dom ( inci ` s ) has a claim to the word "edge" $)
edgeDir = ( f e. _V |-> ( f ` ( 10 + 5 ) ) )
$( An incidence structure (a multi-hypergraph) is a base set equipped with a family of subsets. $)
InciSy = { s e. _V | ( inci ` s ) : dom ( inci ` s ) --> ~P ( Base ` s ) }
$( A multigraph is an incidence structure where each edge is a pair. $)
Multigraph = { s e. InciSy | A. e e. ran ( inci ` s ) ( e ~~ 1o \/ e ~~ 2o ) }
$( A quiver/multidigraph is an incidence structure with an additional structure of an ordering for each edge. $)
Quiver = { s e. Multigraph | ( ( edgeDir ` s ) : dom ( inci ` s ) --> ( ( Base ` s ) X. ( Base ` s ) ) /\ A. x e. dom ( inci ` s ) ( ( inci ` s ) ` x ) = U. ( ( edgeDir ` s ) ` x ) ) }
$( A hypergraph is an incidence structure with no empty edges and no repeated edges. $)
Hypergraph = { s e. InciSy | ( (/) e/ ran ( inci ` s ) /\ Fun `' ( inci ` s ) ) }
$( A simple graph is a multigraph with no repeated edges or loops $)
SimpleGraph = { s e. InciSy | ( Fun `' ( inci ` s ) /\ A. e e. ran ( inci ` s ) e ~~ 2o ) }
$( there are several natural candidates for "directed graph"; ( SimpleGraph i^i Quiver ) is one $)
$( Undirected connectivity $)
USTCON = ( g e. InciSy |-> [_ ( Base ` g ) / b ]_ |^| { r e. ~P ( b X. b ) | ( ( _I |` b ) C_ r /\ ( r o. r ) C_ r /\ A. x e. b A. y e. b ( E. z e. ran ( inci ` g ) { x , y } C_ z -> x r y ) } )
$( Directed connectivity; see also df-trcl $)
STCON = ( g e. Quiver |-> [_ ( Base ` g ) / b ]_ |^| { r e. ~P ( b X. b ) | ( ( _I |` b ) C_ r /\ ran ( edgeDir ` g ) C_ r /\ ( r o. r ) C_ r ) } )
$( Connected multigraphs $)
ConInci = { g e. InciSy | A. s e. ( Base ` g ) A. t e. ( Base ` g ) s ( USTCON ` g ) t }
$( Convert a graph to a topological space; this creates one arc per vertex/edge pair.  Wrong for graphs with loops since it doesn't account for "incidence multiplicity" / fails to distinguish a loop from a "dangling edge".  For multidigraphs, an obvious alternative is to use one edge per vertex.  Shinichi Mochizuki prefers "semigraphs" with dangling edges to be topologically non-closed... $)
graph2top = ( g e. InciSy |-> [_ ( Base ` g ) / b ]_ [_ ( inci ` g ) / i ]_ ( ( ~P U_ x e. dom i ( { x } X. ( i ` x ) ) tX II ) qTop [_ ( U_ x e. dom i ( { x } X. ( i ` x ) ) X. ( 0 [,] 1 ) ) / s ]_ ( y e. s |-> [ y ] { <. p , q >. | ( ( p e. s /\ q e. s ) /\ ( p = q \/ ( ( 2nd ` p ) = 1 /\ ( 2nd ` q ) = 1 /\ ( 1st ` ( 1st ` p ) ) = ( 1st ` ( 1st ` q ) ) ) \/ ( ( 2nd ` p ) = 0 /\ ( 2nd ` q ) = 0 /\ ( 2nd ` ( 1st ` p ) ) = ( 2nd ` ( 1st ` q ) ) ) ) ) } ) ) ) 
$( ... you get the picture ... $)

I'm not sure how well this addresses your problems.  It probably introduces a few of its own.

I can't think of any natural example of something that is both an algebraic structure and a graph (not counting the graph structures that can be drawn on some algebraic structures, like Hasse diagrams and Cayley graphs); however graph theory has some natural notions of extensibility within itself, like weightings and constraint decorations.

Random thought: if we changed df-ndx to "ndx = ( _I |` ( 1 ... ( 10 ^ 2 ) ) )" then it would be a finite set and it would be theoretically possible for much of the structure stuff to work in the subsystem that is interpreted in PA (delete ax-inf, ax-inf2, ax-pre-sup, ax-groth, and ax-cnex if/when it comes back).

-sorear

Mario Carneiro

Apr 17, 2015, 4:43:56 AM
to meta...@googlegroups.com
Hmm, I kind of like that approach. One nice thing is that your approach is compatible with mine in the sense that the edge function is stored the same way, so that we have the theorem:

( <. V , E >. e. UMGra <-> { <. ( Base ` ndx ) , V >. , <. ( inci ` ndx ) , E >. } e. Multigraph )

Comparing to other graph definitions in set.mm:

( <. D , C , U >. e. DGra <-> { <. ( Base ` ndx ) , U >. ,
   <. ( inci ` ndx ) , ( x e. dom D |-> { ( D ` x ) , ( C ` x ) } ) >. ,
   <. ( edgeDir ` ndx ) , ( D oF _I C ) >. } e. Quiver )

( E : D -1-1-> ( _V \ A ) -> ( <. A , ran E >. e. HypGrph <->
 { <. ( Base ` ndx ) , A >. , <. ( inci ` ndx ) , E >. } e. Hypergraph ) )

PsGrph doesn't have an equivalent, although it's a very strange structure - the edges can either have one end or they can be directed with two ends (a one-vertex edge is distinct from a self-loop).

( E : D -1-1-> ( _V \ A ) -> ( <. A , ran E >. e. SmpGrph <->
 { <. ( Base ` ndx ) , A >. , <. ( inci ` ndx ) , E >. } e. SimpleGraph ) )
(assuming df-sgra is fixed according to the comment)

(By the way, a nice recently discovered idiom: Assuming F : A --> B and G : A --> C, ( F oF _I G ) : A --> ( B X. C ) is the function sometimes called (F x G) in math prose which satisfies ( ( F oF _I G ) ` x ) = <. ( F ` x ) , ( G ` x ) >. .)

The counting of VDeg is a bit difficult to generalize, though. It is designed so that a self-loop contributes 2 to the degree of a vertex, which makes sense for Multigraph but not for Hypergraph, when it would probably make more sense for an edge to contribute 1 to each vertex it is incident with, so that a self-loop (or is it a dangling edge?) only adds 1 to the vertex degree.

Regarding actual changes, though, I'd rather hold off on all that machinery until we need it for something concrete like the Szemerédi regularity lemma or something.


David A. Wheeler

Apr 17, 2015, 6:18:55 PM
to metamath
On Thu, 16 Apr 2015 05:10:02 -0400, Mario Carneiro <di....@gmail.com> wrote:
> I'm writing to talk about my latest submission to set.mm, which includes a
> large number of changes and additions that I've been working on over the
> past few months. This is also a test of sorts, to see how best to give
> submission notes in a publicly accessible fashion to interested parties.

Umm... wow. That's an extraordinary amount of improvement. I'm impressed.

If I may, I do have a general comment about a few points.


> The major achievement in this section is a new choice-free proof that an
> open set is measurable...

Great! I think it's great that many proofs are being made without choice, since
that in some sense is a more controversial axiom.
That said, if the proofs are much easier/simpler/clearer to do with axioms like
choice, I hope that these easier proofs will be retained (e.g., as "-alt" proofs).
You may already be doing that, I just didn't see that mentioned in your description.


> I would say the major achievement here is continuity of log, which was
> hampering progress for a while. Now that it is complete, several other
> theorems fall out, in particular differentiability of log, the taylor
> series for log, and finally the taylor series for arctan and Leibniz'
> series for pi (which is a metamath 100 theorem).

Excellent, glad to hear it!

The "metamath 100" page appears to have disappeared from:
http://wiki.planetmath.org/cgi-bin/wiki.pl/metamath_100
The site appears to have crashed and moved to get back up.
If they don't recover it, we may need to restore the page :-(.


> sinhval, etc.: I didn't define the sinh (hyperbolic sine) function, as I
> think that would be unnecessary. Rather, I intend to use the idiom
> sin(iz)/i to mean sinh(z), and similarly for cosh(z)=cos(iz) and
> tanh(z)=tan(iz)/i.
...
> df-asin, df-acos, df-atan: Define the inverse trigonometric functions.
> Since the other three are easily obtained from these as
> arccot(z)=arctan(1/z) and so on, I didn't bother to define them.

I disagree with this approach. I think it's important to define common functions.
When using standard widely-known names it is easier to
understand the results, easier to find the results, and the whole thing
is less error-prone. If a large percentage of mathematicians know of the function,
then I think there should be a definition of it in set.mm. In particular, I think
using *only* idioms like this makes the results *less* approachable to later readers.

--- David A. Wheeler

Mario Carneiro

Apr 17, 2015, 6:44:33 PM
to meta...@googlegroups.com
On Thu, Apr 16, 2015 at 5:10 PM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
> The major achievement in this section is a new choice-free proof that an
> open set is measurable...

Great!  I think it's great that many proofs are being made without choice, since
that in some sense is a more controversial axiom.
That said, if the proofs are much easier/simpler/clearer to do with axioms like
choice, I hope that these easier proofs will be retained (e.g., as "-alt" proofs).
You may already be doing that, I just didn't see that mentioned in your description.

As a matter of fact, in this particular case I did keep the original short proof:

http://us2.metamath.org:88/mpegif/opnmblALT.html
 
> I would say the major achievement here is continuity of log, which was
> hampering progress for a while. Now that it is complete, several other
> theorems fall out, in particular differentiability of log, the taylor
> series for log, and finally the taylor series for arctan and Leibniz'
> series for pi (which is a metamath 100 theorem).

Excellent, glad to hear it!

The "metamath 100" page appears to have disappeared from:
  http://wiki.planetmath.org/cgi-bin/wiki.pl/metamath_100
The site appears to have crashed and moved to get back up.
If they don't recover it, we may need to restore the page :-(.

I emailed the administrators and they appear to be working on it. TBH I think that the metamath 100 page is more appropriate on metamath.org rather than on the wiki since it is easier to keep it up-to-date with set.mm that way (and it's also under more direct control). It's not like we are really using the wiki features (whatever those may be) that much; I tried once to pull up a historical view of the page only to find out that it only keeps 30 days of edits, which is tiny given the edit rate.
 
> sinhval, etc.: I didn't define the sinh (hyperbolic sine) function, as I
> think that would be unnecessary. Rather, I intend to use the idiom
> sin(iz)/i to mean sinh(z), and similarly for cosh(z)=cos(iz) and
> tanh(z)=tan(iz)/i.
...
> df-asin, df-acos, df-atan: Define the inverse trigonometric functions.
> Since the other three are easily obtained from these as
> arccot(z)=arctan(1/z) and so on, I didn't bother to define them.

I disagree with this approach.  I think it's important to define common functions.
When using standard widely-known names it is easier to
understand the results, easier to find the results, and the whole thing
is less error-prone. If a large percentage of mathematicians know of the function,
then I think there should be a definition of it in set.mm. In particular, I think
using *only* idioms like this makes the results *less* approachable to later readers.

I figured you would, since you are the one who introduced cot, sec, etc. :) One principle which I try to honor in my work is to not define things unless they are really needed. In trig especially, there are no less than 24 functions you can obtain via sin/cos/tan + reciprocal + inverse + hyperbolic, and this alone is a source of confusion for students learning trigonometry at first. The fact that there are so many names for things yields a huge number of identities which are all essentially the same if the language is normalized. It reminds me a little of how Metamath has no greater-than sign. By eliminating a meaningless left-right writing symmetry, you can halve the number of order theorems, and also be sure that new theorems will always have their inequalities pointing in the appropriate direction to work well with the library and other theorems.

If we acquire a significant number of theorems which use sinh or one of its brothers, I could be convinced to add them on a case-by-case basis, but having all of them would throw the definition/theorem proportion much higher than I'd like.

P.S.: I just had to nominally make use of arctanh, in order to prove log 2 = sum_ n e. NN0 2 / (3 (2n+1) (9^n)) and ultimately prove our first high-precision numerical estimate, log 2 < 253/365 (accurate to 5 decimals), for the birthday problem. But it was just as well to plug x = i/3 into the Taylor series of arctan, so I didn't need to make a new definition.
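
(For the curious, a quick rational-arithmetic sanity check of those numbers in Python, outside the database: twelve terms already pin the partial sum down to well within 1e-11 of log 2, and it still sits about 3.5e-6 below 253/365.)

  from fractions import Fraction
  import math

  s = sum(Fraction(2, 3 * (2 * n + 1) * 9 ** n) for n in range(12))
  print(float(s), math.log(2))            # both 0.6931471...
  assert abs(float(s) - math.log(2)) < 1e-11
  print(float(Fraction(253, 365) - s))    # about 3.5e-6, consistent with log 2 < 253/365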

Mario

Stefan O'Rear

Apr 18, 2015, 1:32:32 AM
to meta...@googlegroups.com
On Friday, April 17, 2015 at 3:44:33 PM UTC-7, Mario Carneiro wrote:
P.S.: I just had to nominally make use of arctanh, in order to prove log 2 = sum_ n e. NN0 2 / (3 (2n+1) (9^n)) and ultimately prove our first high-precision numerical estimate, log 2 < 253/365 (accurate to 5 decimals), for the birthday problem. But it was just as well to plug x = i/3 into the Taylor series of arctan, so I didn't need to make a new definition.

Birthday problem?  Does that mean we're doing probability theory now?  If so I'm quite curious how that works; I've considered how to formalize probability theory in the past, and had far less luck than with graphs.  Does it involve general (not RR) measure theory?

-sorear 

Mario Carneiro

Apr 18, 2015, 1:58:21 AM
to meta...@googlegroups.com
No, just plain old combinatorics - it's not too different from the subfactorial project. I just finished (one more metamath 100 down!), here is the statement:

  ${
    birthday.s $e |- S = { f | f : ( 1 ... K ) --> ( 1 ... N ) } $.
    birthday.t $e |- T = { f | f : ( 1 ... K ) -1-1-> ( 1 ... N ) } $.
    birthday.k $e |- K = ; 5 3 $.
    birthday.n $e |- N = ; ; ; 5 2 3 1 $.
    $( The Birthday Problem.  There is a more than even chance that out of 23
       people in a room, at least two of them have the same birthday.
       Mathematically, this is asserting that for ` K = 2 3 ` and
       ` N = 3 6 5 ` , fewer than half of the set of all functions from
       ` 1 ... K ` to ` 1 ... N ` are injective. (Note that in extended base 4,
       23 is written ` 5 3 ` and 365 is ` 5 2 3 1 ` .)
       (Contributed by Mario Carneiro, 17-Apr-2015.) $)
    birthday $p |- ( ( # ` T ) / ( # ` S ) ) < ( 1 / 2 ) $=
      ... $.
      $( [17-Apr-2015] $)
  $}
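
(Not a proof, but an easy numeric sanity check of the inequality in Python:)

  from fractions import Fraction

  K, N = 23, 365
  ratio = Fraction(1)            # |T| / |S| = product over i < K of (N - i) / N
  for i in range(K):
      ratio *= Fraction(N - i, N)

  print(float(ratio))            # about 0.4927
  assert ratio < Fraction(1, 2)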

As you can see in the notations for K and N, I'm also trying out a new system for numerical calculation. It's been almost a year since bpos, and the original arithmetic engine wasn't that great; I made it more efficient and complete for this project, and took the opportunity to fix some suboptimal decisions in the original dec* series of theorems. (I have the .nb file if anyone wants the source; it is based on the "Arithmetic in Metamath" paper and is much more detailed than section 3.) The new theorems have the same name except with num* instead, and they use an actual syntax construction, instead of the verbose ( 4 x. A ) + B approach in the dec* theorems:

  cnu $a class ; A B $.

  $( Define the "numeral constructor", which is used to build up "decimal
     integers" or "numeric terms" in base 4. $)
  df-num $a |- ; A B = ( ( 4 x. A ) + B ) $.

I'm not sure how best to display the digit strings, though. I want to just completely hide the semicolon in HTML display, so that instead of ";;;5231" you just see "5231". This does have a possibility of visual ambiguity if someone nests these in weird ways like "; A ; B C" vs "; ; A B C" (the first should never occur if they are using the numeral system as it is intended), in addition to confusion between "; 10 1" (=41) and "; ; 1 0 1" (=17). This case shouldn't happen either if the numeral system is used consistently, although the legal one is not what you might think; since I use the convention of using digits larger than 3 in the high place (which I call "extended base 4"), the first one is written correctly and the second should be "; 4 1".
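
(Here is one possible encoding in Python that is consistent with the examples above - a high digit drawn from the named numerals up to 10 and low digits in 0..3 - though the corner cases of the actual convention in the update may differ:)

  def ext_base4_digits(n):
      """One reading of the 'extended base 4' convention: peel off base-4 digits
      until what remains is a single named numeral (at most 10), which becomes
      the (possibly larger than 3) high digit."""
      digits = []
      while n > 10:
          digits.append(n % 4)
          n //= 4
      digits.append(n)
      return digits[::-1]

  def from_digits(digits):
      """Evaluate ; A B = ( ( 4 x. A ) + B ) left to right."""
      value = 0
      for d in digits:
          value = 4 * value + d
      return value

  assert ext_base4_digits(23) == [5, 3]           # ; 5 3
  assert ext_base4_digits(365) == [5, 2, 3, 1]    # ; ; ; 5 2 3 1
  assert from_digits([10, 1]) == 41               # ; 10 1
  assert from_digits([4, 1]) == 17                # ; 4 1, i.e. ; ; 1 0 1 rewritten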

After playing with the ( 4 x. 3 ) + 1 and ( 3 F 1 ) systems for notation somewhat, I realized that brevity is very important here, because the whole point of the system is to represent large numbers, and especially since the proof is largely computer-generated the visual clutter stacks up quickly, which is why I thought a syntax construction would be the best way to go. (This also recreates a system Norm had in set.mm a long time ago, but scrapped for being too fancy or something like that. Needless to say I think the database is ready for it now.)

Mario

Stefan O'Rear

Apr 18, 2015, 2:22:56 AM
to meta...@googlegroups.com

On Friday, April 17, 2015 at 10:58:21 PM UTC-7, Mario Carneiro wrote:
fix some suboptimal decisions in the original dec* series of theorems. (I have the .nb file if anyone wants the source; it is based on the "Arithmetic in Metamath" paper and is much more detailed than section 3.) The new theorems have the same name except with num* instead, and they use an actual syntax construction, instead of the verbose ( 4 x. A ) + B approach in the dec* theorems:

  cnu $a class ; A B $.

  $( Define the "numeral constructor", which is used to build up "decimal
     integers" or "numeric terms" in base 4. $)
  df-num $a |- ; A B = ( ( 4 x. A ) + B ) $.

[...]
After playing with the ( 4 x. 3 ) + 1 and ( 3 F 1 ) systems for notation somewhat, I realized that brevity is very important here, because the whole point of the system is to represent large numbers, and especially since the proof is largely computer-generated the visual clutter stacks up quickly, which is why I thought a syntax construction would be the best way to go. (This also recreates a system Norm had in set.mm a long time ago, but scrapped for being too fancy or something like that. Needless to say I think the database is ready for it now.)

Not just a matter of the "database being ready", your definition is _far_ simpler.   Compare your version (which has two axioms, zero new syntactic categories, and has probably already been mechanically verified by mmj2) to the original: https://github.com/sorear/set.mm-history/blob/39b013/Complex_numbers/Decimal_representation_of_numbers#L59-L106

(Speaking of which, I have no versions newer than set.mm.2015-02-28-so .  I just pulled a file off us2 with a header date of 2015-04-15; is this the immediate successor of 2015-02-28-so, and if so what name shall I give it in the archive?)

-sorear

fl

Apr 18, 2015, 11:18:41 AM
to meta...@googlegroups.com
 
Since they are very commonly used, users should take note that stoig2 and stoig3 were renamed to subspuni and subsptop. (I still don't know what "stoig" stands for, maybe FL does.) On this topic, I'd also like to propose a change in the interface for subSp, from ( subSp ` <. A , J >. ) to ( J |`t A ), for notational agreement with |` and |`s, and also to avoid the verbose df-fv + df-op combination in favor of the "newer" (by which I mean 10 years old) df-ov.
 
I just don't understand why you attach such importance to the meaning of theorem names.
 
Could you please make an acknowledgment when you copy the proofs or theorems of somebody else?
 
--
FL

Norman Megill

Apr 18, 2015, 2:27:07 PM
to meta...@googlegroups.com


On Saturday, April 18, 2015 at 2:22:56 AM UTC-4, Stefan O'Rear wrote:

Not just a matter of the "database being ready", your definition is _far_ simpler.   Compare your version (which has two axioms, zero new syntactic categories, and has probably already been mechanically verified by mmj2) to the original: https://github.com/sorear/set.mm-history/blob/39b013/Complex_numbers/Decimal_representation_of_numbers#L59-L106

Which I soon after abandoned. :)

While I didn't save the set.mm for it, I have some notes from May 1999
on an intermediate version I played with after abandoning the above.  It
in turn was abandoned too, mainly because I wanted to postpone
introducing complexity when it wasn't needed (no numbers over 10 were
needed then) and avoid being locked into a notation that might be hard
to change later.

Essentially, I defined new operation class constants "#" and "." for
integer part and fractional part.  The decimal number 1234.5678 would be
represented as

   ( ( ( ( 1 # 2 ) # 3 ) # 4 ) . ( 5 . ( 6 . ( 7 . 8 ) ) ) )

How these can be defined should be relatively obvious.  I won't bother
to repeat the definitions I used (unless you really want to see them);
they were rather ugly in order to force the result to be zero with
non-numeric arguments, in an attempt to make it easier to prove closure.
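(One plausible reading of the two operators, purely my own guess since the actual 1999 definitions are not repeated here, is that both of them shift by one decimal place. A quick Python check that this reading reproduces 1234.5678:)

  # Guessed semantics, for illustration only: "#" appends a digit to the integer
  # part, "." places its second argument one position to the right of the first.
  from fractions import Fraction

  def ip(a, b):                     # a # b
      return 10 * a + b

  def fp(a, b):                     # a . b
      return Fraction(a) + Fraction(b) / 10

  # ( ( ( ( 1 # 2 ) # 3 ) # 4 ) . ( 5 . ( 6 . ( 7 . 8 ) ) ) )
  val = fp(ip(ip(ip(1, 2), 3), 4), fp(5, fp(6, fp(7, 8))))
  assert val == Fraction(12345678, 10000)     # 1234.5678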

My notes also indicate I played with a two-place prefix operation
similar to Mario's "; A B", with "# # # 1 2 3 4" for the integer part
above.  With the fractional part it was uglier, ". # # # 1 2 3 4 . 5 . 6 . 7 8"
since we can't have unambiguous suffix operations without bracketing.

BTW I'm not advocating that Mario add a fractional part. :)

(Speaking of which, I have no versions newer than set.mm.2015-02-28-so .  I just pulled a file off us2 with a header date of 2015-04-15; is this the immediate successor of 2015-02-28-so, and if so what name shall I give it in the archive?)

The following is the complete set since 2015-02-28, which are temporarily in the directory
http://us2.metamath.org:88/metamath/

set.mm.2015-02-28-so.bz2 - the last one you archived
set.mm.2015-03-29.bz2
set.mm.2015-04-14-as.bz2
set.mm.2015-04-15-mc.bz2 - the one on us2 now

-Norm
 


David A. Wheeler

unread,
Apr 19, 2015, 3:20:26 AM4/19/15
to metamath, di.gama, nm
On Sat, 18 Apr 2015 01:58:18 -0400, Mario Carneiro wrote:
> After playing with the ( 4 x. 3 ) + 1 and ( 3 F 1 ) systems for notation
> somewhat, I realized that brevity is very important here, ...

I object to making base 4 a specially-sanctioned base in set.mm.

The world normally uses base 10, not base 4. Supporting other bases than 10 would be great (especially 2, 8, and 16), but using base 4 as the "usual" base for larger numbers is extremely confusing, and is not helpful. I think that set.mm is well on its way to being a "universal" document to explain current mathematics; using base 4 is a regression.

The rationale given here is unconvincing:
$( Closure for a decimal integer (no units place). NB: We have chosen to
use the quaternary system (base 4) rather than the decimal system as a
semi-arbitrary choice, but it means that addition and multiplication
tables only need 16 entries instead of 100 (and actually much less than
that due to special cases for 0 and 1, etc.), while still being
relatively compact as a notational system. (Contributed by Mario Carneiro, 18-Feb-2014.) $)

Metamath has no problems with 100 entries. Sure, the 100 entries have to be created and proved, but that's not a big deal.

--- David A. Wheeler

David A. Wheeler

unread,
Apr 19, 2015, 3:20:26 AM4/19/15
to metamath, metamath
David A. Wheeler said:
> > The "metamath 100" page appears to have disappeared...

Mario Carneiro <di....@gmail.com> wrote:
> I emailed the administrators and they appear to be working on it. TBH I
> think that the metamath 100 page is more appropriate on metamath.org ...

Sure, sounds sensible. I originally put the page on Planet Math to get it started.
Freek wants named anchors inside the HTML, so he can link to specific places.
A simple HTML page would work just fine.

Mario Carneiro wrote:

> > > sinhval, etc.: I didn't define the sinh (hyperbolic sine) function, as I
> > > think that would be unnecessary. Rather, I intend to use the idiom
> > > sin(iz)/i to mean sinh(z), and similarly for cosh(z)=cos(iz) and
> > > tanh(z)=tan(iz)/i.
> > ...


David A. Wheeler

unread,
Apr 19, 2015, 3:20:26 AM4/19/15
to metamath
(Sorry for the previous email, it got accidentally sent while I was typing it.)

Mario Carneiro wrote:
> > > sinhval, etc.: I didn't define the sinh (hyperbolic sine) function, as I
> > > think that would be unnecessary. Rather, I intend to use the idiom
> > > sin(iz)/i to mean sinh(z), and similarly for cosh(z)=cos(iz) and
> > > tanh(z)=tan(iz)/i ...

David A. Wheeler said:
> > I disagree with this approach. I think it's important to define common
> > functions.

Mario Carneiro wrote:
> I figured you would, since you are the one who introduced cot, sec, etc. :)
> One principle which I try to honor in my work is to not define things
> unless they are really needed.

Well, I certainly don't want to disappoint you :-).

I'd like to try to talk you into a slightly different view.

I think that set.mm is truly growing into a modern version of "Principia Mathematica",
showing how math (as widely understood) can be developed from first principles
and then showing some of the amazing results that can be derived from them.
I think it's amazing, and I've learned a lot from reading it about things that other books just quietly
skim over. However, to fully succeed in that role, users should be able to say,
"I know about function xyz, let me see how that's defined and derived from more basic principles",
and then find out.

A "Principia Mathematica" that fails to clearly define common widely-used functions is,
by that view, a poorer document. I'm not saying you have to *use* sinh everywhere.
But it should *define* functions like sinh and greater-than, as part of its role
of clearly defining "math as commonly understood, built up from basic principles".
Naive rule: If it's often defined in a high school math textbook, then it should be defined in set.mm.
You don't need to *use* it later, but the lack of such things seems glaring to me.

If you find that stylistically using sin(iz)/i is more convenient than sinh(z), well,
I find that awkward but defensible. But if set.mm *lacks* a definition of sinh(z),
then there's no way to argue that sin(iz)/i *is* the same as sinh(z) using only the
information in set.mm, since sinh(z) is nowhere defined.

--- David A. Wheeler

Mario Carneiro

unread,
Apr 19, 2015, 4:07:37 AM4/19/15
to David A. Wheeler, metamath, nm
On Sat, Apr 18, 2015 at 11:18 AM, fl <freder...@laposte.net> wrote:
I just don't understand why you attach such importance to the meaning of theorem names.

In one sentence, the short answer is to make them easier to remember.
 
 Could you please make an acknowledgment when you copy the proofs or theorems of somebody else?

Of course. subspuni and subsptop are appropriately attributed; did I miss one?


On Sat, Apr 18, 2015 at 11:48 AM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
(This still isn't posted on the metamath mailing list, so I'm sending it also to you directly.)

(included the entire text in case this message still isn't on the group)

I can get behind this idea, in a certain sense. Roughly speaking, I treat mathboxes as a place for math that is not yet ready for integration into main set.mm, mainly because there aren't enough theorems for it. Unused definitions fall into this category. If I add a definition to main set.mm, I will feel obligated to prove all the "obvious" theorems about it, even if they don't have immediate applications, because that's what library-building is about. For example, if I added sinh to main set.mm, I would also need to prove that it is odd, even though it's just a simple corollary of the fact that sin is odd. It is this sort of extra "baggage" around imported notions that makes me reluctant to just import everything off the bat.

If the goal is *exclusively* the actual exposition of the concept itself, with no attempt to make theorems around it, then that baggage disappears, and so it is fine to keep such a definition as long as it stays in a mathbox. In other words, if you want to define sinh and its 23 cousins in your mathbox, then be my guest. I will still interpret those definitions as "something to get to eventually", and perhaps if it gets to the point that it really will shorten some theorems, then it will probably get pulled into service, but other than that I don't see any problem with such "expository definitions".




On Sat, Apr 18, 2015 at 2:27 PM, Norman Megill <n...@alum.mit.edu> wrote:
BTW I'm not advocating that Mario add a fractional part. :)

In the proof of

    log2ub $p |- ( log ` 2 ) < ( ; ; ; 3 3 3 1 / ; ; ; 5 2 3 1 ) $= ... $.

I handle this issue using lemmas which look like this:

    $( Lemma for ~ log2ub .  In decimal, this is a proof that the sum of the
       first four terms of the series for ` log 2 ` is at most
       ` 5 3 0 5 6 / 7 6 5 4 5 ` . $)
    log2ublem3 $p |- ( ( ( 3 ^ 7 ) x. ( 5 x. 7 ) ) x. sum_ n e. ( 0 ... 3 )
      ( 2 / ( ( 3 x. ( ( 2 x. n ) + 1 ) ) x. ( 9 ^ n ) ) ) ) <_
       ; ; ; ; ; ; ; 3 0 3 3 1 0 0 0 $= ... $.

I had to add a bunch of fractions together to prove the claim, and rather than do it exactly, I found a sufficiently large denominator that all the fractions, with directed rounding, still maintain the ordering relationship I need for the claim. It turns out that 76545 = 3^7 x 5 x 7 is a good choice for this because it also clears most of the fractions completely.

In other words, I didn't use "decimals" at all, but chose a more convenient denominator than a power of 4.
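(The bound is easy to double-check with exact rational arithmetic; the following is just such a sanity check, not anything in the database. The helper names are mine. It uses log 2 = sum_{n>=0} 2/((3*(2n+1))*9^n) and the fact that 30331000 is the base-4 numeral for 53056.)

  from fractions import Fraction

  denom = 3**7 * 5 * 7                  # 76545, the chosen common denominator
  partial = sum(Fraction(2, (3 * (2*n + 1)) * 9**n) for n in range(4))

  assert denom == 76545
  assert int("30331000", 4) == 53056    # the base-4 numeral in log2ublem3
  assert denom * partial <= 53056       # the inequality in the lemma
  assert denom * partial == 53056       # in fact it holds with equality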

On Sat, Apr 18, 2015 at 5:21 PM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
On Sat, 18 Apr 2015 01:58:18 -0400, Mario Carneiro wrote:
> After playing with the ( 4 x. 3 ) + 1 and ( 3 F 1 ) systems for notation
> somewhat, I realized that brevity is very important here, ...

I object to making base 4 a specially-sanctioned base in set.mm.

The world normally uses base 10, not base 4.  Supporting other bases than 10 would be great (especially 2, 8, and 16), but using base 4 as the "usual" base for larger numbers is extremely confusing, and is not helpful.  I think that set.mm is well on its way to being a "universal" document to explain current mathematics; using base 4 is a regression.

Your objection comes at an opportune moment; I am currently in the process of revising these "decimal" theorems for the new notation, and especially considering the new notation's worrisome visual conflation of ; 1 0 and 10, using base 10 would definitely improve the readability. I'd like to open this question up to others - should I revise it all for base 10? The times table is likely to be a fairly large calculation, so I'm thinking of using base 4 to bootstrap it. To be precise, I propose the following:

* Rename the dec* theorems to num*. These will not use the new notation, and the current base 4 will be replaced by an extra parameter. For example, decsuc will become:

  ${
    numsuc.1 $e |- T e. NN0 $.
    numsuc.2 $e |- A e. NN0 $.
    numsuc.3 $e |- B e. NN0 $.
    numsuc.4 $e |- ( B + 1 ) = C $.
    numsuc.5 $e |- N = ( ( T x. A ) + B ) $.
    $( The successor of a numeral (no carry).  (Contributed by Mario
       Carneiro, 18-Feb-2014.) $)
    numsuc $p |- ( N + 1 ) = ( ( T x. A ) + C ) $=
      ... $.
  $}

* Prove the complete addition and times tables for base 10 by using base 4 internally, using the num* theorems with T = 4.

* Rename the num* theorems to dec* (this hasn't been released yet so it should not cause confusion), with the decimal constructor as before:

  $( Define the "decimal constructor", which is used to build up "decimal
     integers" or "numeric terms" in base 10. $)
  df-dec $a |- ; A B = ( ( 10 x. A ) + B ) $.

* Define dec* analogues of the num* theorems, achieved simply by setting T = 10.

Should I carry this out? I'd like another opinion since it will change many things going forward.
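(As a concrete numeric illustration of the T-parameterized numsuc pattern above, in Python rather than Metamath, and only a model of the arithmetic rather than of the proofs; the same instance works for the base-4 bootstrap and for the final decimal notation.)

  def numsuc(T, A, B):
      # Given N = (T x. A) + B and B + 1 = C (no carry), conclude N + 1 = (T x. A) + C.
      N = T * A + B
      C = B + 1
      assert N + 1 == T * A + C
      return N, C

  assert numsuc(4, 3, 1) == (13, 2)     # T = 4:  "; 3 1" + 1 = "; 3 2" in base 4
  assert numsuc(10, 3, 1) == (31, 2)    # T = 10: "; 3 1" + 1 = "; 3 2" in base 10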

Mario

David A. Wheeler

unread,
Apr 19, 2015, 11:39:32 PM4/19/15
to di.gama, nm, metamath
David A. Wheeler:
> > I think that set.mm is truly growing into a modern version of "Principia Mathematica",
> > showing how how math (as widely understood) can be developed from first
> > principles...to fully succeed in that role, users should be able
> > say, "I know about function xyz, let me see how that's defined and derived from
> > more basic principles"...
> > it should *define* functions like sinh and greater-than...
> > Naive rule: If it's often defined in a high school math textbook, then it
> > should be defined in set.mm...


Mario Carneiro:
> I can get behind this idea, in a certain sense. Roughly speaking, I treat
> mathboxes as a place for math that is not yet ready for integration into
> main set.mm, mainly because there aren't enough theorems for it. Unused
> definitions fall into this category. If I add a definition to main set.mm,
> I will feel obligated to prove all the "obvious" theorems about it, even if
> they don't have immediate applications, because that's what
> library-building is about. For example, if I added sinh to main set.mm, I
> would also need to prove that it is odd, even though it's just a simple
> corollary of the fact that sin is odd. It is this sort of extra "baggage"
> around imported notions that make me reticent to just import everything off
> the bat.
>
> If the goal is *exclusively* the actual exposition of the concept itself,
> with no attempt to make theorems around it, then that baggage disappears,
> and so it is fine to keep such a definition as long as it stays in a
> mathbox. In other words, if you want to define sinh and it's 23 cousins in
> your mathbox, then be my guest. I will still interpret those definitions as
> "something to get to eventually", and perhaps if it gets to the point that
> it really will shorten some theorems, then it will probably get pulled into
> service, but other than that I don't see any problem with such "expository
> definitions".

To me, this makes complete sense.

I think set.mm should eventually include
definitions of stuff like sinh() and theorems about it in the main body.
Then someone who says "I want to understand the hyperbolic functions" can
jump to some starting point and see whatever they want to see.
But... that's a lot to do, and it cannot be done all at once.

Mathboxes are ideal spots to create basic definitions, and then start creating
"obvious" theorems. After a certain point, they should grow to the point where there's
enough supporting material to move them into the main body.

That said, I'd like set.mm (and friends) to grow to the point so that if it was the *only*
math text that survived some horrible apocalypse, you'd have a summary of they key
parts of mathematics, rigorously defined. Metamath's ability to show every step in detail,
with a hypertext link on each step to each supporting theorem, is really impressive
and to my knowledge unique. I wish I'd been able to see something like this in high school.

--- David A. Wheeler

David A. Wheeler

unread,
Apr 19, 2015, 11:39:36 PM4/19/15
to Mario Carneiro, metamath, nm
> * Prove the complete addition and times tables for base 10 by using base 4 internally, using the num* theorems with T = 4.

Using base 4 to bootstrap base 10 seems reasonable enough. Odd, but if it's convenient, why not. I just would like to see base 10 as the main base once things get going.

--- David A. Wheeler

fl

unread,
Apr 20, 2015, 5:27:20 AM4/20/15
to meta...@googlegroups.com, n...@alum.mit.edu, dwhe...@dwheeler.com

Of course. subspuni and subsptop are appropriately attributed; did I miss one?


You didn't miss one; it is your usual behavior. Graphs, homotopy, group sum etc. were developed by others
and you haven't given them the slightest chance of being made official. You replace their work with yours
without even a tribute. Before their work is dismissed and destroyed, you might at least make
an acknowledgment. If nothing else, they have contributed to the illusion that metamath was a collaborative
effort. That deserves a reward.

(I don't hold it against you. If you were not permitted to, you would not be able to do it.)

-- 
FL

Mario Carneiro

unread,
Apr 20, 2015, 7:37:03 AM4/20/15
to metamath
On Mon, Apr 20, 2015 at 5:27 AM, fl <freder...@laposte.net> wrote:
Of course. subspuni and subsptop are appropriately attributed; did I miss one?

You didn't miss one; it is your usual behavior. Graphs, homotopy, group sum etc. were developed by others 
and you haven't given them the slightest chance of being made official.

The fastest way to make a section "official" is by not letting it sit, and develop the "obvious" theorems to the point that it can be used easily in others' work. This involves lots of active participation and possibly debate over appropriate definitions and style.

I interpret the "Contributed by" comment in a fairly narrow sense; it refers to the person who first penned the theorem statement and produced something that compiled in Metamath. It does not refer to those who wrote similar theorem statements; these theorems have their own contribution comments. This line becomes blurred somewhat after a revision, because now the theorem doesn't look the same as it did when it was originally contributed. Say there are similar theorems A by author X and B by author Y, and I revise A into C which supercedes both A and B. Then C will have "Contributed by X" and "Revised by Mario" comments, and Y does not get a mention. This is often simply because I did not notice theorem B; if I later merge B into C, I try to credit the author with the earlier contribution date.

To address your specific examples: As this thread has revealed, graphs have several distinct expositions in set.mm, and there has not yet been an attempt to unite them. I wrote my own section on undirected graphs because I was not happy with the existing alternative definitions, and to my knowledge there are *no* theorems in any of the alternate definitions - they are just definitions. As such, I wrote a definition I thought would be easy to use and actually wrote up some theorems and proved something nontrivial. As such, I see this as my own work and did not significantly borrow material from any third parties.

Homotopy theory was originally developed in set.mm by Jeff Madsen, in the form of path homotopies and the fundamental group. Other developments in topology are starting to ask more of this section, so I am slowly starting to build it up into something more. Jeff Madsen's contribution comments still exist on all the theorems we wrote, although the whole section has now been revised a few times. Theorems such as ishtpy are new developments (general homotopies, not path homotopies), and as such I believe that crediting myself for such theorems is not unreasonable since I did in fact write the theorems.

The idea of a group sum was first set out by you, in the form of df-prod. I indicated a desire to extend your definition to arbitrary finite sums and placed a commented-out proposed redefinition in your mathbox, because the version you had was not too much more powerful than df-seq to begin with, and it was also based on the old structure definitions. After some time without much progress from you in such an extension, I decided to work on it myself. Initially I attempted to import your theorems and revise them all for the new definition, but you expressed some discontent with my having done so, so I returned your theorems to your mathbox and created a new df-gsum from scratch with the behaviors I needed. There is no contribution comment from you on these theorems because they are not actually based on df-prod, although the idea is similar - df-prod and its theorems, including the "Contributed by FL" comments, continue to exist in your mathbox.

In set.mm, you aren't contributing *ideas*, for the most part - most of the formalization work is of "dead math" which is already written up in a thousand textbooks. What you are contributing is an actual *formal proof* of said ideas, so that contribution comments follow actual proofs and not just definitions of a concept. If two people formalize the same subject independently, I see no reason for either author to have any claim on the theorems of the other, because the formalizations themselves are independent.

Mario

Mario Carneiro

unread,
Apr 20, 2015, 8:10:20 AM4/20/15
to David A. Wheeler, metamath, nm
On Sun, Apr 19, 2015 at 4:07 AM, Mario Carneiro <di....@gmail.com> wrote:
On Sat, Apr 18, 2015 at 5:21 PM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
On Sat, 18 Apr 2015 01:58:18 -0400, Mario Carneiro wrote:
> After playing with the ( 4 x. 3 ) + 1 and ( 3 F 1 ) systems for notation
> somewhat, I realized that brevity is very important here, ...

I object to making base 4 a specially-sanctioned base in set.mm.

The world normally uses base 10, not base 4.  Supporting other bases than 10 would be great (especially 2, 8, and 16), but using base 4 as the "usual" base for larger numbers is extremely confusing, and is not helpful.  I think that set.mm is well on its way to being a "universal" document to explain current mathematics; using base 4 is a regression.

Your objection comes at an opportune moment; I am currently in the process of revising these "decimal" theorems for the new notation, and especially considering the new notation's worrisome visual conflation of ; 1 0 and 10, using base 10 would definitely improve the readability. I'd like to open this question up to others - should I revise it all for base 10? The times table is likely to be a fairly large calculation, so I'm thinking of using base 4 to bootstrap it. To be precise, I propose the following...

As much as it pains me to admit it, I agree with you about base 10 over base 4. The initial choice was in order to quickly get something working so that I could get to bpos; but now that is done, and in revisiting it I'd prefer to do it the "right" way.

I've implemented the proposal I mentioned. The hundred facts were not as bad as I thought they would be, and together with specialized lemmas like

$e |- A e. CC
$e |- B e. CC
$e |- ( A + B ) = C
addcomli $p |- ( B + A ) = C

for turning 50 facts into 100, the arithmetic algorithm is getting more efficient. Together with fewer syntax steps (since "; A B" is "cA cB cdc" while "( ( 4 x. A ) + B )" is "c4 cmul cA co caddc cB co") and expressions for numbers that are (log 10)/(log 4) = 1.66 times shorter, this produces an overall reduction in proof length: numerical theorems saw an almost 50% decrease in size after the change, including the largest single improvement I've ever seen in one theorem, a decrease of over 200 lines of compressed proof length in 2503lem2.
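(Roughly, the way addcomli halves the table work can be pictured as follows; this is a sketch of the bookkeeping only, with names of my own choosing, not of how the formal proofs are generated.)

  # Only the facts with a <= b are "proved" directly; the rest follow by
  # commuting the arguments, which is what addcomli does formally.
  direct = {(a, b): a + b for a in range(10) for b in range(a, 10)}
  full = dict(direct)
  full.update({(b, a): c for (a, b), c in direct.items()})

  assert len(direct) == 55              # includes the diagonal a = b
  assert len(full) == 100
  assert all(full[(a, b)] == a + b for a in range(10) for b in range(10))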

Which brings me to another proposal: Now that numbers are expressed in base 10 and the expressions are not much longer than a defined symbol, I propose eliminating df-10 and all higher numbers (numbers up to 14 are defined in FL's mathbox). The replacement is simply "10" -> "; 1 0" and so on. This would slightly simplify the decimal algorithm (there is some special-casing for 10 because I am still doing "extended base 10" in the higher digits, for example in using "; 10 0" to mean 100), and eliminate the visual ambiguity of "10" vs "; 1 0" (even though these are the same). But mostly I think it would make the system easier to understand for a learner.

Mario

fl

unread,
Apr 20, 2015, 8:27:12 AM4/20/15
to meta...@googlegroups.com

The fastest way to make a section "official" is by not letting it sit, and develop the "obvious" theorems to the point that it can be used easily in others' work. This involves lots of active participation and possibly debate over appropriate definitions and style.


And I suppose you will be the judge of what is obvious and what is not.

But Mario, I am not asking you for an exercise in communication. I know that you are very good at that. No,
I am asking you to think about your reputation. Imagine. It is Sunday, you are at the church or at the temple,
and suddenly in the stalls the old ladies whisper.

"-- Look there! Mario the fittest is coming.

-- Mario the fittest !

-- Yes himself.

-- But you know he is not only Mario the fittest, he is also a humanist. Before dismissing the
contributors and sending them back to their obscurity (from which they should never have emerged), he condescends (excuse me), he agrees
to make an acknowledgment of their ridiculous attempts.

-- He has always been such a very good boy.

-- What a man! But then he is not only the fittest?

-- No not only. He is also a good person. He is the Gandhi of mathematics, the Martin Luther King
of integrals.

-- I would have liked to be his mother.

-- Me too!"

Can you imagine, Mario? How proud you would be?

--
FL

fl

unread,
Apr 20, 2015, 8:52:07 AM4/20/15
to meta...@googlegroups.com

In set.mm, you aren't contributing *ideas*, for the most part - most of the formalization work is of "dead math" which is already written up in a thousand textbooks.

Yes. And then? I am so conscious of that that I have even given references to the textbooks
where I found the material. You dismiss the references as well, by the way. For you, dismissal is
a system of thought, apparently.
--
FL 

Norman Megill

unread,
Apr 20, 2015, 9:59:52 AM4/20/15
to meta...@googlegroups.com
Hi Fred,

Ad-hominem attacks have no place in this group.  As moderator, my responsibility is to prevent that.  Your posts accomplish nothing other than embarrassing yourself and adding distracting noise.  Please discontinue.

If you want to make a positive contribution, please identify specific theorems (not just a general area) that you think are not properly acknowledged, and specific theorems that you can provide textbook theorem and page number references for.  I will be happy to review them and where appropriate add the acknowledgments and references to set.mm.

Norm


On Monday, April 20, 2015 at 8:27:12 AM UTC-4, fl wrote:

The fastest way to make a section "official" is by not letting it sit, and develop the "obvious" theorems to the point that it can be used easily in others' work. This involves lots of active participation and possibly debate over appropriate definitions and style.


And I suppose you will be the judge of what is obvious and what is not.


(rest deleted) 

fl

unread,
Apr 20, 2015, 12:29:56 PM4/20/15
to meta...@googlegroups.com

> Ad-hominem attacks have no place in this group.

I am happy to see that you are able to come to the defense of some contributors against the attacks of others. You
should generalize this attitude.



If you want to make a positive contribution, please identify specific theorems (not just a general area) that you think are not properly acknowledged, and specific theorems that you can provide textbook theorem and page number references for.  I will be happy to review them and where appropriate add the acknowledgments and references to set.mm.



Graphs, homotopy, group sum, as mentioned in a post above. And unfortunately there are
other occurrences of the same behavior.

--
FL

fl

unread,
Apr 20, 2015, 12:41:00 PM4/20/15
to meta...@googlegroups.com

(rest deleted) 


Our proofs are deleted. Our apologues are deleted.

So much deletion.

Do we deserve that?

--
FL

Norman Megill

unread,
Apr 20, 2015, 2:04:43 PM4/20/15
to meta...@googlegroups.com, dwhe...@dwheeler.com
I retrieved an old version of the missing "Metamath 100"
page from archive.org and updated it to what I think is the current
state:  http://us2.metamath.org:88/mm_100.html .  Anyone who wants to
edit it can email updates to this page to me.

Norman Megill

unread,
Apr 20, 2015, 2:13:05 PM4/20/15
to meta...@googlegroups.com


On Monday, April 20, 2015 at 12:29:56 PM UTC-4, fl wrote:
If you want to make a positive contribution, please identify specific theorems (not just a general area) that you think are not properly acknowledged, and specific theorems that you can provide textbook theorem and page number references for.  I will be happy to review them and where appropriate add the acknowledgments and references to set.mm.



Graphs, homotopy, group sum, as mentioned in a post above. And unfortunately there are
other occurrences of the same behavior.

This does not help me.  Provide me with the specific theorem names in set.mm and to whom they should be acknowledged.
And for textbook references, provide the textbook title, theorem number, and page number.

Norm 

Stefan O'Rear

unread,
Apr 20, 2015, 3:57:32 PM4/20/15
to meta...@googlegroups.com
If I may, I think this is a reference to the ongoing situation with Grp versus GrpOp and the fact that there is a loss of continuity when sections are migrated, since Metamath only tracks history at the level of theorems, which are frequently replaced outright.  If we had a less granular attribution/history system that worked on TOC segments rather than proofs, things might be better here.

I.e., when Mario took FL's GrpOp-based prod_, prod2_, and prod3_ to create gsum for Grp, the attribution history contained in §16.9.13 is lost, because none of the *individual theorems* survive the migration.  Our attribution system records that the proofs in the current version are mostly due to Mario, but it can't record the fact that the original design comes from FL's prod_, so this is "lost".

-sorear

Mario Carneiro

unread,
Apr 20, 2015, 4:14:44 PM4/20/15
to metamath
On Mon, Apr 20, 2015 at 3:57 PM, Stefan O'Rear <stef...@cox.net> wrote:
If I may, I think this is a reference to the ongoing situation with Grp versus GrpOp and the fact that there is a loss of continuity when sections are migrated, since Metamath only tracks history at the level of theorems which frequently are replaced outright.  If we had a less granular attribution/history system that worked on TOC segments rather than proofs things might be better here.

i.e. when Mario took FL's GrpOp-based prod_, prod2_, and prod3_ to create gsum for Grp, the attribution history contained in §16.9.13 is lost, because none of the *individual theorems* survive the migration.  Our attribution system cares that proofs in the current version are mostly due to Mario, but it can't record fact that the original design comes from FL's prod_ so this is "lost".

Although I don't have anything like a fully satisfactory solution to this problem, I take solace that if all else fails at least your set.mm history project will maintain evidence of old versions of the theorems, so that we can refer back to the exact origin of a theorem in case of an attribution dispute.

Mario

Mario Carneiro

unread,
Apr 21, 2015, 2:48:02 AM4/21/15
to David A. Wheeler, nm, metamath


On Mon, Apr 20, 2015 at 11:31 AM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
Is there a way for this to *look* nicer in the generated HTML, without being confusing?

I think the cleanest way to handle this is simply

htmldef ";" as ""; althtmldef ";" as ""; latexdef ";" as "";

This will cause large numbers to be formatted the usual way, i.e. "123" for 123, because there is no space around numbers in the htmldef either. For those who don't care about the details of arithmetic on large numbers, this is understandable without needing to look it up, so I strongly recommend it.

That said, we do lose the property we have maintained so far of being "totally visually unambiguous": if 10 is abolished we don't have the 10 vs. ; 1 0 ambiguity anymore, but there will still be the ambiguity of "; ; A B C" vs. "; A ; B C". However, I reiterate that the second of these should never appear in a proof (if I could make it invalid syntax I would), so I think it is safe enough. (And of course it is not *actually* ambiguous in any case because the semicolons are retained in the ASCII version.)

Mario

fl

unread,
Apr 21, 2015, 7:22:04 AM4/21/15
to meta...@googlegroups.com

This does not help me. 

The work has already been done and is recorded in set.mm. I can't do more for you.  
But I suppose it is what you wanted with this flood of requirements. I simply regret that
you use that sort of stratagem. It disappoints me.

-- 
FL

Norman Megill

unread,
Apr 21, 2015, 9:15:56 AM4/21/15
to meta...@googlegroups.com
I will add appropriate acknowledgments if I am provided with a list in the following format, with the labels of theorems in set.mm.

original   revised

abc        xyz

Without such a list, your complaints are pointless since I have no way to correlate your original contribution with its revision or derivative.

Norm

fl

unread,
Apr 21, 2015, 9:24:27 AM4/21/15
to meta...@googlegroups.com

Without such a list, your complaints are pointless since I have no way to correlate your original contribution with its revision or derivative.

It's your project; it's you who decides the rules of "ethics" that you want to apply.

We will simply know that you don't like us.

--
FL

David A. Wheeler

unread,
Apr 21, 2015, 8:47:32 PM4/21/15
to di.gama, nm, metamath
David A. Wheeler:
> >> I object to making base 4 a specially-sanctioned base in set.mm.
> >> The world normally uses base 10, not base 4....

Mario Carneiro:
> As much as it pains me to admit it, I agree with you about base 10 over
> base 4.

When you leave a monkey (me) at the keyboard, and let the monkey keep typing,
he's bound to say something sensible eventually :-).

> I've implemented the proposal I mentioned. The hundred facts was not as bad
> as I thought it would be... numerical theorems saw an almost 50%
> decrease in size after the change

Excellent! I think base 10 is the right answer even if it was more complicated.
The reduction is just icing on the cake... but it *is* tasty icing.


> Which brings me to another proposal: Now that numbers are expressed in base
> 10 and the expressions are not much longer than a defined symbol, I propose
> eliminating df-10 and all higher numbers...

Removing the numbers greater than 10 makes sense. I'd suggest waiting before
removing 10, though; 10 is special, and having a defined symbol is sensible for it.

Is there a way for this to *look* nicer in the generated HTML, without being confusing?

--- David A. Wheeler

David A. Wheeler

unread,
Apr 21, 2015, 8:47:32 PM4/21/15
to Mario Carneiro, nm, metamath
> I think the cleanest way to handle this is simply
> htmldef ";" as ""; althtmldef ";" as ""; latexdef ";" as "";

That will make the numbers print cleanly, but the semicolon will be referenced everywhere. What does the definition of semicolon look like? I expect that would be very confusing.

I do worry about the hidden nature of the semicolon if this is done. In all other cases you can see everything if you want to.

Is there an old historical operator that does this that could be resurrected instead?
--- David A. Wheeler

Mario Carneiro

unread,
Apr 21, 2015, 10:23:53 PM4/21/15
to metamath
Oops, forgot to copy the group on these messages.

---------- Forwarded message ----------
From: David A. Wheeler <dwhe...@dwheeler.com>
Date: Tue, Apr 21, 2015 at 12:25 PM
Subject: Re: [Metamath] set.mm update
To: Mario Carneiro <di....@gmail.com>


On April 21, 2015 7:06:21 AM PDT, Mario Carneiro <di....@gmail.com> wrote:
I'm not sure I understand the nature of your concern. The semicolon is not itself an operator but rather one part of a syntax construction:

  cdc $a class ; A B $.

If the ; is hidden this will instead appear as if adjacency "AB" is valid syntax, similar to how three-term adjacency "ABC" is actually valid, as df-br, and three-term adjacency with parentheses "(ABC)" is df-ov. Of course we can't actually make two-term adjacency valid syntax due to parsing ambiguities, but I don't see a problem with making it appear as if we are doing so in these limited circumstances. df-dec would look something like this in the unicode HTML:

AB = ((10 · A) + B)

and I don't think that is too difficult to read/understand.

Regarding historical precedents, I believe that the old "way too complex" notation mentioned before for decimals had a _10 suffix operator for converting a digit-string to a class, and this operator was suppressed in output. (You can actually see a remnant of this definition in the current set.mm, commented out just after the htmldef for "RR".)

Mario

I see. I guess my concern was that in general, metamath does an amazing job of making everything very precise. But I agree, if it looks like that, it is still clear and precise.

This does suggest that when writing multi-digit numbers, the ASCII representation has additional text that is not evident in the HTML. That said, it doesn't seem very complicated.

I suggest giving it a try, and if it looks good, go for it!


--- David A. Wheeler

Norman Megill

unread,
Apr 23, 2015, 5:36:24 PM4/23/15
to meta...@googlegroups.com


On Thursday, April 16, 2015 at 5:10:03 AM UTC-4, Mario Carneiro wrote:

I also have two proposals. Other than the |`t proposal above I would also like to implement Stefan's earlier suggestion of "fixing" the peculiar behavior of opprc2 via the definition:

df-op $a |- <. A , B >. = if ( ( A e. _V /\ B e. _V ) , { { A } , { A , B } } , (/) ) $.

(This differs from Stefan's original proposal in replacing 1o with (/); experience with "reverse closure" theorems like elfvdm has shown me that it is useful to have undefined values be (/) specifically.) Of course we probably won't use A e. <. B , C >. very often, but all that matters for opelxp1 is that the out-of-domain value not be an ordered pair, which is equally true of (/) and 1o. This also eliminates the counter-intuitive asymmetric handling of opprc1 vs. opprc2.
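(As a toy picture of the intent, with "sethood" simulated by boolean flags since a proper class cannot be modeled directly in a program, and with helper names of my own; the only point is that both out-of-domain cases collapse to the same non-pair value.)

  def kuratowski(a, b):
      return frozenset({frozenset({a}), frozenset({a, b})})

  def op(a, b, a_is_set=True, b_is_set=True):
      # Proposed df-op: the Kuratowski pair when both arguments are sets,
      # and (/) otherwise, symmetrically in A and B.
      if a_is_set and b_is_set:
          return kuratowski(a, b)
      return frozenset()                          # (/), which is not an ordered pair

  assert op(1, 2) == kuratowski(1, 2)
  assert op(1, 2, b_is_set=False) == frozenset()  # same value either way, unlike the
  assert op(1, 2, a_is_set=False) == frozenset()  # current asymmetric opprc1/opprc2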

Mario

One of my goals with set.mm has been to be as faithful to the literature
as possible or practical.  Our current df-op is identical to Quine's,
including its behavior at proper classes.  If you recall a past
conversation, I also defined subspace sum df-lsm with the more complex
standard definition to keep closer to the literature, rather than the
practically (but not exactly) identical span of union which would have
been much simpler.  An exception is df-fv, which is different from all
literature definitions I know of, but in that case almost all literature
definitions are different from each other anyway in terms of
out-of-domain behavior (if that behavior is even defined, which it often
is not).

df-op is near the very beginnings of set theory, and for a beginner, I
think simple direct definitions identical to those in the literature are
preferable.  Adding the "if" operator (which is not standard in set
theory texts, or indeed any non-CS math book at all that I'm aware of)
introduces an unfamiliar layer of complexity.  In fact, its first use in
a definition is in df-rdg, which is well advanced, and even there I was
torn when I used it to simplify an earlier definition that was closer
to the literature theorems it was based on.

Anyway, I'd want to be convinced that there really is a clear advantage,
say in terms of dramatically shortening many theorems, with a modified
df-op.  Just making its proper class behavior less "awkward" isn't to me
in itself a sufficient reason to significantly modify a standard
literature definition.

Norm

Norman Megill

unread,
Apr 23, 2015, 5:54:01 PM4/23/15
to meta...@googlegroups.com
Overall, I'm not too worried about experimenting with making the ";"
invisible or not invisible, since it is a trivial change to switch from
one to the other.

My initial inclination is to make it visible, just because I don't like things
to be hidden and cryptic.  Making symbols invisible introduces a new
paradigm of hiding symbols that in its own way is extra complexity, and
the reader has to know how to interpret the screen display since
everything is no longer explicit.  The verifier can't enforce the use of
only digits as ";" arguments, and in principle we could even prove
displayed contradictions such as "1 + 0 =/= 1" (i.e.  "; ; 1 + 0 =/= 1")
that become valid when we add back the implicit symbols.

Of course, seeing weird ";" strings before each number is odd for people
used to ordinary number representations.  We don't have to make the ";"
token display as a semicolon; maybe there is some other symbol that
would seem less intrusive.

It is true that I suppressed the _10 in my abandoned digit-string experiment
a long time ago.  I probably wouldn't do that now.  By the way, in that experiment,
nothing other than 0-9 could appear in place of a digit, and a digit 0-9 could not be
substituted for a class variable since it was a different variable type.  Nowadays I
definitely want to avoid introducing a new variable type (in addition to wff,
class, set), and I see that experiment as unnecessary complication.

Norm

Mario Carneiro

unread,
Apr 23, 2015, 6:35:53 PM4/23/15
to metamath
On Thu, Apr 23, 2015 at 5:36 PM, Norman Megill <n...@alum.mit.edu> wrote:
One of my goals with set.mm has been to be as faithful to the literature
as possible or practical.  Our current df-op is identical to Quine's,
including its behavior at proper classes.  If you recall a past
conversation, I also defined subspace sum df-lsm with the more complex
standard definition to keep closer to the literature, rather than the
practically (but not exactly) identical span of union which would have
been much simpler.  An exception is df-fv, which is different from all
literature definitions I know of, but in that case almost all literature
definitions are different from each other anyway in terms of
out-of-domain behavior (if that behavior is even defined, which it often
is not).

The issue is that the literature rarely uses the "standard" definition out of range, in much the same manner as df-fv. I have yet to see an author explicitly discuss the proper class behavior of df-op, and Quine is no exception - although he defines it for classes, he always uses it on set variables or classes which are obviously sets. Additionally, although I should probably get more actual data on this, I get the feeling that if I were to show your average mathematician opprc2, they would find it a strange and unintuitive theorem that in some way "abuses" the definition. The way I see it, as soon as one agrees that it should be meaningful on proper classes, the definition should be adjusted to address this use case. Usually, authors do this by switching to a proper-class ordered pair like the disjoint union operator, but for us there is a good reason to keep the unconditional sethood that the current df-op has. Consulting the Wikipedia page on ordered pairs shows no specific discussion of the proper class behavior of df-op, but does indicate that it is unsuitable for proper classes and discusses the standard ({0} X. A) u. ({1} X. B) alternative (which is itself based on Kuratowski pairs).

df-op is near the very beginnings of set theory, and for a beginner, I
think simple direct definitions identical to those in the literature are
preferable.  Adding the "if" operator (which is not standard in set
theory texts, or indeed any non-CS math book at all that I'm aware of)
introduces an unfamiliar layer of complexity.  In fact, its first use in
a definition is in df-rdg, which is well advanced, and even there I was
torn when I used it to simplify an earlier definition that was closer
to the literature theorems it was based on.

If you really wanted to avoid df-if, you could write it

|- <. A , B >. = { x | ( A e. _V /\ B e. _V /\ x e. { { A } , { A , B } } ) }

but I think that it is easier to understand with it. In any case the very first theorem will of course get rid of the if statement so that it looks like the standard definition, with a suitable discussion of how the Kuratowski pair is not designed for use on proper classes (which is true). And of course the comment can explain why the if is there. (The usage of the literal "if" may be rare in mathematical literature, but it is an extremely common construction, namely piecewise definition, and there is no reason to be afraid of it on that account. We are simply restricted to using "if" due to the one-dimensional nature of Metamath input and the restriction on multiple-part definitions as elaborated in df-rdg.)
 
Anyway, I'd want to be convinced that there really is a clear advantage,
say in terms of dramatically shortening many theorems, with a modified
df-op.  Just making its proper class behavior less "awkward" isn't to me
in itself a sufficient reason to significantly modify a standard
literature definition.

If you are looking for advantages, that is certainly the case - in fact that's the whole reason I'm suggesting it. Currently we have "reverse closure" for two-argument operations and binary relations only in the left argument, not the right, and this leads to many many avoidable extra sethood hypotheses (we're talking hundreds of theorems here). To take a random sample of examples:

ensymg
elioo3g
fznuz
climcl
ghmid
islmhm
lmbr
isphtpc
eldv
ispgp
cvmlift

Sometimes these are one-step eliminable, sometimes not. Sometimes it means an expensive post-substitution step, an application of dedth, or a revision for generalization, but it is definitely a fact that in terms of simplifying proofs this modification comes out well in the positive.

Mario

Mario Carneiro

unread,
Apr 23, 2015, 7:05:03 PM4/23/15
to metamath
On Thu, Apr 23, 2015 at 5:54 PM, Norman Megill <n...@alum.mit.edu> wrote:
Overall, I'm not too worried about experimenting with making the ";"
invisible or not invisible, since it is a trivial change to switch from
one to the other.

My initial inclination is to make it visible, just because I don't like things
to be hidden and cryptic.  Making symbols invisible introduces a new
paradigm of hiding symbols that in its own way is extra complexity, and
the reader has to know how to interpret the screen display since
everything is no longer explicit.  The verifier can't enforce the use of
only digits as ";" arguments, and in principle we could even prove
displayed contradictions such as "1 + 0 =/= 1" (i.e.  "; ; 1 + 0 =/= 1")
that become valid when we add back the implicit symbols.

Here's how I reason out the existence of "displayed contradictions": *In principle*, we could define 0-9 to be of type digit, and replace "; class class" with "class digit" together with digit -> class conversions, to create exactly the same behavior as the current one but without the semicolon. It would still be an unambiguous grammar, and there are no "displayed contradictions" possible (it would fail in your example because + is not a digit). Now the HTML view is only available for published theorems, so it follows that provided no *published* theorem uses a non-digit in the second argument to df-dec, there will be no displayed contradictions. And since published theorems are subject to at least a minimal review process, it is not unreasonable to expect published theorems to not "misuse" the definition. In short, we are talking about an inconsequential ambiguity (it does not create actual contradictions) which could only show up in published proofs, yet will never actually appear in a published proof.
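(A toy version of that argument, for illustration only and with hypothetical helper names: if the second slot of the constructor only accepts genuine digits, the example cannot even be formed.)

  def dec(a, b):
      # "; a b" with the second argument restricted to a single digit 0-9
      if b not in range(10):
          raise TypeError("second argument of ; must be a digit")
      return 10 * a + b

  PLUS = object()                   # stand-in for a non-digit class such as +
  assert dec(dec(1, 2), 3) == 123   # "; ; 1 2 3" is well-formed
  try:
      dec(dec(1, PLUS), 0)          # "; ; 1 + 0" is rejected outright
      assert False
  except TypeError:
      pass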
 

Of course, seeing weird ";" strings before each number is odd for people
used to ordinary number representations.  We don't have to make the ";"
token display as a semicolon; maybe there is some other symbol that
would seem less intrusive.

Even using a space would look odd - I don't think there are any alternatives to the empty string that avoid the "what is that weird stuff on the left" reaction.
 
Mario

Mario Carneiro

unread,
Apr 23, 2015, 7:34:27 PM4/23/15
to metamath
P.S.: The "; ; 1 + 0 =/= 1" proof would not work without -. (/) e. CC; a simpler displayed contradiction would be "; ; 1 0 0 =/= ; 1 ; 0 0" (the left side is 100, the right is 10).

Norman Megill

unread,
Apr 23, 2015, 10:55:00 PM4/23/15
to meta...@googlegroups.com
On Thursday, April 23, 2015 at 6:35:53 PM UTC-4, Mario Carneiro wrote:

[...]

> The issue is that the literature rarely uses the "standard" definition
> out of range, in much the same manner as df-fv.  I have yet to see an
> author explicitly discuss the proper class behavior of df-op, and Quine
> is no exception - although he defines it for classes, he always uses it
> on set variables or classes which are obviously sets.

Quine is happy with { A } = (/) for proper classes, so I think he would
be fine with opprc2 and friends.  I'd be surprised if he wasn't already
aware of them, maybe even assigning them as exercises for his students
to help them understand proper classes.  The reason he doesn't derive
them is likely just that he doesn't need them for his book.  In set.mm
we have room to play with things books don't do, and I intended opprc2
to be as much of a curiosity done for fun as anything practical.


>  Additionally,
> although I should probably get more actual data on this, I get the
> feeling that if I were to show your average mathematician opprc2, they
> would find it a strange and unintuitive theorem that in some way
> "abuses" the definition.

The average mathematician rarely uses proper classes, and many set
theory books avoid them altogether, so behavior at proper classes would
be meaningless for them.  Others might find the behavior outside of the
intended domain interesting, showing limitations of the Kuratowski
(/Quine, for proper classes) definition they may not have thought of
before.

I don't think your proposal necessarily makes it more intuitive, since
the proper class behavior still has nothing to do with ordered pairs.
As I see it, it is basically a trick to allow brrelex to work for both
sides.


> If you really wanted to avoid df-if, you could write it
>
> |- <. A , B >. = { x | ( A e. _V /\ B e. _V /\ x e. { { A } , { A , B } } ) }
>
> but I think that it is easier to understand with it.

If we have to go with it, I'd much prefer this over the one with "if".
Every symbol is immediately understandable to a mathematician who knows
nothing about Metamath.  To me, it is immediately apparent that it
collapses to the empty set for proper classes.

Still not as satisfying to me as |- <. A , B >. = { { A } , { A , B } }.


> In any case the
> very first theorem will of course get rid of the if statement so that it
> looks like the standard definition, with a suitable discussion of how
> the Kuratowski pair is not designed for use on proper classes (which is
> true).  And of course the comment can explain why the if is there.  (The
> usage of the literal "if" may be rare in mathematical literature, but it
> is an extremely common construction, namely piecewise definition, and
> there is no reason to be afraid of it for this reason.

My concern is the first impression it makes, because "if" is unfamiliar
and looks like arcane "computer-like" stuff that people complain about
with other proof languages, turning them away before they even start.
People have enough difficulty "getting" Metamath as it is, and they
might not grasp immediately that "if" is merely another defined
mathematical symbol and not some mysterious internal directive to the
Metamath engine.

I think there can be a certain comfort seeing the existing, familiar
df-op displayed exactly like it is in a book.  Unlike a lot of more
advanced definitions where I don't care that much, df-op is a very
visible definition people encounter early in their exploration of
set.mm.


> If you are looking for advantages, that is certainly the case - in fact
> that's the whole reason I'm suggesting it.  Currently we have "reverse
> closure" for two-argument operations and binary relations only in the
> left argument, not the right, and this leads to many many avoidable
> extra sethood hypotheses (we're talking hundreds of theorems here).  To
> take a random sample of examples:
>
> ensymg
> elioo3g
> fznuz
> climcl
> ghmid
> islmhm
> lmbr
> isphtpc
> eldv
> ispgp
> cvmlift
>
> Sometimes these are one-step eliminable, sometimes not.  Sometimes it
> means an expensive post-substitution step, an application of dedth, or a
> revision for generalization, but it is definitely a fact that in terms
> of simplifying proofs this modification comes out well in the positive.

Well, a purist might argue that by all rights these should have sethood
hypotheses for *both* sides since for proper classes the relations
become meaningless, and we're just exploiting a cheap trick provided by
brrelex to get rid of one of them.  :)

I think the fraction of theorems with this "problem" is relatively small
in the big picture, although I agree that some savings could be
accomplished.  The issues are what is the best df-op definition to make
a good initial impression, what is the best one for beginners studying
set.mm vs. their textbook, and even what the philosophical goal of
set.mm should be - to formalize existing math faithfully, or pursue a
separate vision of how mathematics should be done?

I'd like to hear other opinions.

Norm

David A. Wheeler

unread,
Apr 26, 2015, 9:28:31 AM4/26/15
to metamath
On Thu, 23 Apr 2015 19:55:00 -0700 (PDT), Norman Megill <n...@alum.mit.edu> wrote:
> I'd like to hear other opinions.

I'm not sure it's an *informed* opinion, but that never stopped me before :-).

Please take my comments with a big grain of salt. I understand the basics of
proper classes and sets, but I do not have a deep understanding of classes.

On the other hand, I am probably a good representative of people who
are learning more about math by reading set.mm. I've learned a *ton*
of concepts from set.mm that were never explained clearly in my math classes
and separate reading. Hypertext links everywhere, combined with a simple
substitution rule that you only need to learn once, are a really powerful
combination for independent study and learning. I would like to reach the point
where, if civilization collapsed, people could receive *ONLY* these files and from them
recover all of the essentials of mathematics.


Norman Megill:
> I think the fraction of theorems with this "problem" is relatively small
> in the big picture, although I agree that some savings could be
> accomplished. The issues are what is the best df-op definition to make
> a good initial impression, what is the best one for beginners studying
> set.mm vs. their textbook, and even what the philosophical goal of
> set.mm should be - to formalize existing math faithfully, or pursue a
> separate vision of how mathematics should be done?

I think set.mm should generally side with "good initial impression" and
"formalize existing math faithfully". That is why, for example, I also believe set.mm
should define "greater-than" and "sinh" somewhere. These are part of mathematics
that are in common use, and thus should be formalized.


This comment makes me wonder, though (Norman Megill):
> The average mathematician rarely uses proper classes, and many set
> theory books avoid them altogether, so behavior at proper classes would
> be meaningless for them.
> Others might find the behavior outside of the
> intended domain interesting, showing limitations of the Kuratowski
> (/Quine, for proper classes) definition they may not have thought of before.


I understand that the current definition, which mimics what is widely used, is this:
> |- <. A , B >. = { { A } , { A , B } }.


I have a simple question, which may be very hard to answer:
"Is it typically understood in widely accepted mathematical sources
that A and B must be sets in <. A , B >. ?"

If that *is* the common understanding, then perhaps the current formalization
is *not* an accurate representation of mathematics as typically intended.
I think mathematical texts are sometimes sloppy about what would be types
in the computing world; this might be another example.

If the mathematical world often presumes that A and B are sets in <. A , B >.
then the rule "formalize existing math faithfully" would imply that that the
current definition, while common, is incorrect and not *actually* what most
mathematicians typically *mean*. I realize that asking "what do most mathematicians
mean" is a tricky slope, but presumably they intend to mean *something*.

On Thursday, April 23, 2015 at 6:35:53 PM UTC-4, Mario Carneiro wrote:
> > The issue is that the literature rarely uses the "standard" definition
> > out of range, in much the same manner as df-fv. I have yet to see an
> > author explicitly discuss the proper class behavior of df-op, and Quine
> > is no exception - although he defines it for classes, he always uses it
> > on set variables or classes which are obviously sets.

I guess this continues my question. Do most mathematicians
seriously intend for <. A , B >. to be used on proper classes as-is? Or is Quine
a special case (or not really serious about it)?


> Others might find ... showing limitations of the Kuratowski
> (/Quine, for proper classes) definition they may not have thought of before.
...
> Quine is happy with { A } = (/) for proper classes, so I think he would
> be fine with opprc2 and friends. I'd be surprised if he wasn't already
> aware of them, maybe even assigning them as exercises for his students
> to help them understand proper classes. The reason he doesn't derive
> them is likely just that he doesn't need them for his book. In set.mm
> we have room to play with things books don't do, and I intended opprc2
> to be as much of a curiosity done for fun as anything practical.

If most mathematicians expect that only *sets* are used in <. A , B >.
then perhaps the definition of *that* operator should limit it to sets.

An alternative, if appropriate, would be to *also* define a separate "Quine" operator
without a "set-only" limitation for those few proofs where you want to show
what the naive operators would do. Perhaps there could be a
<.q A , B >.q for the "Quine-like" (naive?) versions of the operators.
Then it'd be easy to contrast the operators. I don't have a problem with
defining different symbols for similar-yet-different operations;
that makes it easier to understand (and contrast) their differences.

I *personally* don't mind "if", because I come at metamath from
a *computing* background. Whenever I read out piecewise function definitions,
I read them out with "if" as well. However, if mathematicians typically mean
for <. A , B >. to only apply to sets, and some people might be uncomfortable
with the "if" version, I think this version is also clear:

> > |- <. A , B >. = { x | ( A e. _V /\ B e. _V /\ x e. { { A } , { A , B } } ) }
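
(For contrast, I would guess the "if" version mentioned above would look
something like the following - I am only sketching it from the if-notation
I have seen in set.mm, so treat the exact form as my assumption:)

    |- <. A , B >. = if ( ( A e. _V /\ B e. _V ) , { { A } , { A , B } } , (/) )

Either way the intent is the same: the usual Kuratowski pair when A and B
are sets, and the empty set otherwise.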

Norm said:
> To me, it is immediately apparent that it
> collapses to the empty set for proper classes.

I'm not a professional mathematician, but I also found this immediately apparent.
That definition is a little odd because of its indirectness, but
compared to some *other* definitions it's not too hard :-).
I think with some brief text commentary it would work just fine; indeed,
the text commentary might be interesting in its own right :-).

> I think there can be a certain comfort seeing the existing, familiar
> df-op displayed exactly like it is in a book. Unlike a lot of more
> advanced definitions where I don't care that much, df-op is a very
> visible definition people encounter early in their exploration of set.mm.

That does make sense, of course.


> > If you are looking for advantages, that is certainly the case - in fact
> > that's the whole reason I'm suggesting it. Currently we have "reverse
> > closure" for two-argument operations and binary relations only in the
> > left argument, not the right, and this leads to many many avoidable
> > extra sethood hypotheses...

> Well, a purist might argue that by all rights these should have sethood
> hypotheses for *both* sides since for proper classes the relations
> become meaningless, and we're just exploiting a cheap trick provided by
> brrelex to get rid of one of them. :)

If other definitions are intended to only work on sets, then
they should somewhere say so.

Anyway, my two cents. I don't know if I helped, but I hope so.

--- David A. Wheeler

David A. Wheeler

unread,
Apr 26, 2015, 9:28:31 AM4/26/15
to metamath, nm, di.gama
(This wasn't posted on the metamath group promptly; this is a resend.)
----- End Forwarded Message -----

Norman Megill

unread,
Apr 26, 2015, 9:47:21 AM4/26/15
to meta...@googlegroups.com, dwhe...@dwheeler.com, n...@alum.mit.edu, di....@gmail.com
Hi David,

(I'm not sure why your messages are ending up in the moderator approval
queue, flagged as spam.  I have always released them using "Post and
always allow future messages from author", and you are a member with
unrestricted posting privileges.)

You bring up some good points.  AFAIK Quine is the only author defining
ordered pairs for class variables; all others e.g.  Takeuti/Zaring
define <a,b> with their lower-case set variables.

So, I am OK with adopting Mario's proposal.  (Gérard also told me he
likes it.)  I do want to use the non-if version for the actual
definition, because all of its symbols are standard and familiar.  Of
course we can derive the 'if' version as a theorem, as well as the
current Kuratowski version but with sethood hypotheses.

In the past, I have been resistant to deviating from Quine for some
early definitions related to classes.  His book was the first place I
saw them explained clearly, and also he is the only author to generalize
their usage in definitions which other authors define only for sets.  He
provided the only "standard" available for such generalizations.

OTOH set.mm deviates from Quine later on to match more typical
literature; e.g. Quine reverses x and y in the ordered pairs for
functions compared to us and most other literature.

df-sbc is another case where only recently have I loosened my rule requiring
reference only to dfsbcq.  (As the comments explain, this rule prevented
deviating from Quine in the behavior of proper classes.)  This is
another area where a large savings will result by eliminating the
sethood requirement on sbc8g etc.  Indeed, the whole hb*d series e.g.
hbbrd exists primarily to service this requirement.

Norm

David A. Wheeler

unread,
Apr 26, 2015, 5:57:47 PM4/26/15
to metamath, di.gama, nm
Well, let me further demonstrate my lack of knowledge.

The current definition is:
> |- <. A , B >. = { { A } , { A , B } }.

If the new plan is to limit <. ... >. to sets, why not just use
set variables in the definition, like this?:
> |- <. x , y >. = { { x } , { x , y } }.

I'm sure there's a great reason, but I'm also sure I don't understand it :-).
After all, many other definitions use set variables instead of class variables.

--- David A. Wheeler

Mario Carneiro

unread,
Apr 26, 2015, 6:16:41 PM4/26/15
to David A. Wheeler, metamath, nm
There is indeed a good reason why class variables are used here instead of set variables, and it has very little to do with the actual applicability of the definition to non-sets. The key thing to understand is that set variables are used whenever you need a *variable* explicitly, as opposed to any object. Basically, we only use set variables when we need to do something variable-like, like substituting the variable for something else or quantifying over the variable. We can't do these things with class variables, which act more like constants. But what you gain in exchange is much more powerful: you are able to substitute class variables with *expressions* which can be arbitrarily large. Set variables, by definition, cannot be directly substituted for anything other than other set variables (although there are theorems like vtocl for manually performing substitutions on set variables).

One example of a set variable in an ordered pair is df-opab. In this case we are interested in using the variables x,y there as bound variables, so we have to use set variables instead of class variables. However, this means that something like { <. (/) , ( x + 2 ) >. | ph } is not valid syntax, because we can't substitute class expressions into the left side of the abstraction. If df-op used set variables, we would only ever be allowed to write ordered pairs of the form <. x , y >. or <. a , b >. or <. z , w >. , and could not use expressions *even if we know the expressions are sets*. In particular, the very useful construction <. <. x , y >. , z >. would not be valid. Note how in order to support this extension to df-opab, we actually need a completely separate definition, df-oprab, and even then we can't do, say, { <. x , <. y , z >. >. | ph } because we would need another definition for that. So the reason to use class variables and not set variables has much more to do with the substitution mechanism than the actual interpretation of classes as elements of the universe.
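
To make this concrete, here are a few pair terms and what each would require of df-op (the terms are the ones from the paragraph above; the annotations are just my shorthand):

    <. x , y >.              covered under either form of the definition
    <. (/) , ( x + 2 ) >.    requires the class-variable df-op (both components are class expressions)
    <. <. x , y >. , z >.    requires the class-variable df-op (the nested form used for two-argument operations)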

Mario

Norman Megill

unread,
Apr 26, 2015, 6:29:43 PM4/26/15
to meta...@googlegroups.com, dwhe...@dwheeler.com, n...@alum.mit.edu, di....@gmail.com
On 4/26/15 12:51 PM, David A. Wheeler wrote:
> Well, let me further demonstrate my lack of knowledge.
>
> The current definition is:
>> |- <. A , B >. = { { A } , { A , B } }.
>
> If the new plan is to limit <. ... >. to sets, why not just use
> set variables in the definition, like this?:
>> |- <. x , y >. = { { x } , { x , y } }.
>
> I'm sure there's a great reason, but I'm also sure I don't understand it :-).

This is a good question.  It is important to understand set vs. class
variables to properly grasp set.mm as well as the literature.

We can substitute set variables for class variables by virtue of syntax
axiom cv, but there is no reverse syntax axiom allowing substitution of
a class variable for a set variable.  BTW cv is "justified" with the
metalogical argument in the comment of cvjust.

So, from your proposed version with set variables, we cannot conclude
the class-variable version.

The reason there is no reverse syntax axiom for elevating set variables
to class variables is, roughly, to ensure soundness.  For example,
substituting A for y in a9e would let us assert that proper classes such as
the Russell class exist, leading to Russell's paradox.  It also wouldn't
make sense to substitute a constant class term, such as _V or (/), for x
in A. x ph.
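
To sketch the problem: a9e is

    |- E. x x = y

and if we could elevate y to an arbitrary class, we could write down something like

    E. x x = { y | -. y e. y }

which would assert that the Russell class is a set.  (The second display is of
course not a theorem; it is just what such an illegal substitution would look
like.)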

The rigorous justification for all of this is metalogical reasoning
outside of the scope of Metamath, given for example in Ch. 4 of Takeuti
and Zaring.  We implement the final result of this reasoning in set.mm
in the form of "definitions" df-clel and df-cleq, which are really more
like axioms for class variables, together with cv and df-clab.

In principle, class variables and terms are not necessary to do set
theory/mathematics.  They can always be eliminated from theorems into
equivalent theorems in the primitive language.  See comment in abeq2.
But they allow us to define familiar class constants such as (/) or 3,
which are much more intuitive than complex expressions filled with
quantifiers and set variables.  Compare ax-inf2 vs. omex.


> After all, many other definitions use set variables instead of class variables.

Once class variables are introduced, set variables are used only
as bound variables in definitions.  We use class variables for all free variables
in order to make the definition as general as possible.

Norm

Norman Megill

unread,
Apr 26, 2015, 7:02:39 PM4/26/15
to meta...@googlegroups.com, n...@alum.mit.edu, di....@gmail.com, dwhe...@dwheeler.com
I might add that we can convert set variables to class variables in
cases where the class variable can be shown to be a set, by using
theorems such as vtocl ("[set] variable to class").

However, this can be done only for set variables occurring where the
syntax already allows class variables.

Suppose we keep syntax axiom cop as it is now, "class <. A , B >.", so that
the syntax itself allows class variables.  If we defined df-op as


|- <. x , y >. = { { x } , { x , y } }

then using vtocl2g, we could prove the theorem

|- ( ( A e. _V /\ B e. _V ) -> <. A , B >. = { { A } , { A , B } } )

However, the behavior would remain undefined for proper classes; for
example we couldn't collapse it the empty set.  Instead, we would have
to accompany every use of ordered pairs with A e. _V and B e. _V
hypotheses, which would be much more annoying than even now where at
least we have the limited properties provided by opprc1, opprc2.  The
situation would be very similar to the old prohibition on using df-sbc
directly.

Norm



Mario Carneiro

unread,
Apr 26, 2015, 7:15:25 PM4/26/15
to Norman Megill, metamath, David A. Wheeler
On Sun, Apr 26, 2015 at 7:02 PM, Norman Megill <n...@alum.mit.edu> wrote:
Suppose we keep syntax axiom cop as it is now, "class <. A , B >.", so that
the syntax itself allows class variables.  If we defined df-op as

|- <. x , y >. = { { x } , { x , y } }

then using vtocl2g, we could prove the theorem

|- ( ( A e. _V /\ B e. _V ) -> <. A , B >. = { { A } , { A , B } } )

Actually, I don't think this is sufficient - you would also need opeq12 or an equivalent as axiomatic, or else you would never be able to get class variables into the notation, even if the syntax itself supports it.
 
However, the behavior would remain undefined for proper classes; for
example we couldn't collapse it to the empty set.  Instead, we would have
to accompany every use of ordered pairs with A e. _V and B e. _V
hypotheses, which would be much more annoying than even now where at
least we have the limited properties provided by opprc1, opprc2.  The
situation would be very similar to the old prohibition on using df-sbc
directly.

Indeed, any convention is better than no convention.

Mario Carneiro

unread,
Apr 26, 2015, 7:21:25 PM4/26/15
to Norman Megill, metamath, David A. Wheeler
On Sun, Apr 26, 2015 at 7:15 PM, Mario Carneiro <di....@gmail.com> wrote:


On Sun, Apr 26, 2015 at 7:02 PM, Norman Megill <n...@alum.mit.edu> wrote:
Suppose we keep syntax axiom cop as it is now, "class <. A , B >.", so that
the syntax itself allows class variables.  If we defined df-op as

|- <. x , y >. = { { x } , { x , y } }

then using vtocl2g, we could prove the theorem

|- ( ( A e. _V /\ B e. _V ) -> <. A , B >. = { { A } , { A , B } } )

Actually, I don't think this is sufficient - you would also need opeq12 or an equivalent as axiomatic, or else you would never be able to get class variables into the notation, even if the syntax itself supports it.
 
However, the behavior would remain undefined for proper classes; for
example we couldn't collapse it to the empty set.  Instead, we would have
to accompany every use of ordered pairs with A e. _V and B e. _V
hypotheses, which would be much more annoying than even now where at
least we have the limited properties provided by opprc1, opprc2.  The
situation would be very similar to the old prohibition on using df-sbc
directly.

Indeed, any convention is better than no convention.

Also, note that this definition would not be acceptable to the definition checker - one of the requirements is that the syntax be defined for all possible inputs, which means that if you have "class <. A , B >." then the definitional axiom better start with "<. A , B >. = ...". Without this it becomes harder to make sense of the metalogical conservativity argument that works by replacing the definiendum with the definiens everywhere, since <. A , B >. might occur in a proof even though only <. x , y >. is defined.
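
Schematically, the checker wants something of this shape (this is just a sketch of shapes, not actual set.mm statements):

    accepted:      |- <. A , B >. = ( ...some class expression in A and B... )
    not accepted:  |- <. x , y >. = { { x } , { x , y } }

Since the syntax axiom cop provides "class <. A , B >.", the definition has to say what <. A , B >. means for arbitrary classes A and B.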

Norman Megill

unread,
Apr 26, 2015, 7:40:14 PM4/26/15
to meta...@googlegroups.com, n...@alum.mit.edu, dwhe...@dwheeler.com
On Sunday, April 26, 2015 at 7:15:25 PM UTC-4, Mario Carneiro wrote:


On Sun, Apr 26, 2015 at 7:02 PM, Norman Megill <n...@alum.mit.edu> wrote:
Suppose we keep syntax axiom cop as it is now, "class <. A , B >.", so that
the syntax itself allows class variables.  If we defined df-op as

|- <. x , y >. = { { x } , { x , y } }

then using vtocl2g, we could prove the theorem

|- ( ( A e. _V /\ B e. _V ) -> <. A , B >. = { { A } , { A , B } } )

Actually, I don't think this is sufficient - you would also need opeq12 or an equivalent as axiomatic, or else you would never be able to get class variables into the notation, even if the syntax itself supports it.
 
You are right, I overlooked that.

Norm

Cris Perdue

unread,
Apr 27, 2015, 2:36:54 AM4/27/15
to meta...@googlegroups.com
As another non-mathematician who has only begun to scratch the surface of set theory as it appears in set.mm, I would just like to add my very non-professional point of view to the others here, that the definition of ordered pair 

 |- <. A , B >. = { x | ( A e. _V /\ B e. _V /\ x e. { { A } , { A , B } } ) }

looks more helpful to me than the current df-op. I like the way that it makes obvious that there are issues with making assumptions about A and B. I like the clarity that the result may be the empty set, depending on A and B. This also does not look very complicated compared with the shorter classic definition. Anyway the classic definition (to me) is associated with versions of set theory that do not have classes in any form.

Best regards,
Cris



David A. Wheeler

unread,
Apr 27, 2015, 2:36:54 AM4/27/15
to Norman Megill, meta...@googlegroups.com, di....@gmail.com
> This is a good question. It is important to understand set vs. class
> variables to properly grasp set.mm as well as the literature.

Ah, got it!! That makes sense.

Is this explained (in summary form) within set.mm comments? If not, I think it should be.


--- David A. Wheeler

Mario Carneiro

unread,
Apr 28, 2015, 10:24:17 AM4/28/15
to Norman Megill, metamath, David A. Wheeler
On Sun, Apr 26, 2015 at 9:47 AM, Norman Megill <n...@alum.mit.edu> wrote:
So, I am OK with adopting Mario's proposal.  (Gérard also told me he
likes it.)  I do want to use the non-if version for the actual
definition, because all of its symbols are standard and familiar.  Of
course we can derive the 'if' version as a theorem, as well as the
current Kuratowski version but with sethood hypotheses.

I'm currently working on implementing this proposal, so expect it in the next update. The biggest changes so far have been eliminating sethood hypotheses from most of the equinumerosity and dominance theorems (there were a lot of unnecessary hypotheses here even under the old definition) and the real number construction (theorems like addcomnq don't need sethood hypotheses anymore).

I have one more question, while I'm at it, regarding the definition of < . We've discussed the inclusion of -. 0 e. CC as an axiom before, in order to eliminate some hypotheses, and the proposal was rejected. But how do you feel about eliminating unnecessary hypotheses when it is possible to do so *without adding any axioms*? In particular, under the current version "<" is actually a relation, unlike its axiomatic counterpart <RR, so that with the new df-op we have

( A < B -> ( A e. RR* /\ B e. RR* ) )

which as you can imagine can eliminate a whole lot of hypotheses. Similar remarks apply to "<_".

This discussion also extends to theorems involving + and x. , if one is permitted to use ax-addopr and ax-mulopr. As it stands currently, we are avoiding these theorems when possible, but practically speaking we have to use them for theorems like addcn, and I find this noncommittal stance somewhat troubling. It is true that we could, if we wanted, use some axiomatic +CC thing which is not necessarily a function, but then we would almost certainly want to define an actual function + = ( x e. CC , y e. CC |-> ( x +CC y ) ) because it is so much easier to work with actual functions here.

In short, there are a lot of unnecessary hypotheses in the real and complex numbers section, and every time I do an algebraic proof (expect the quadratic, cubic, quartic formulas in the next update!) I wish that we took more advantage of the axioms we are already using.

Mario

Norman Megill

unread,
Apr 29, 2015, 8:26:36 PM4/29/15
to meta...@googlegroups.com, dwhe...@dwheeler.com, n...@alum.mit.edu


On Tuesday, April 28, 2015 at 10:24:17 AM UTC-4, Mario Carneiro wrote:

I have one more question, while I'm at it, regarding the definition of < . We've discussed the inclusion of -. 0 e. CC as an axiom before, in order to eliminate some hypotheses, and the proposal was rejected. But how do you feel about eliminating unnecessary hypotheses when it is possible to do so *without adding any axioms*? In particular, under the current version "<" is actually a relation, unlike its axiomatic counterpart <RR, so that with the new df-op we have

( A < B -> ( A e. RR* /\ B e. RR* ) )

which as you can imagine can eliminate a whole lot of hypotheses. Similar remarks apply to "<_".

This discussion also extends to theorems involving + and x. , if one is permitted to use ax-addopr and ax-mulopr. As it stands currently, we are avoiding these theorems when possible, but practically speaking we have to use them for theorems like addcn, and I find this noncommittal stance somewhat troubling. It is true that we could, if we wanted, use some axiomatic +CC thing which is not necessarily a function, but then we would almost certainly want to define an actual function + = ( x e. CC , y e. CC |-> ( x +CC y ) ) because it is so much easier to work with actual functions here.

In short, there are a lot of unnecessary hypotheses in the real and complex numbers section, and every time I do an algebraic proof (expect the quadratic, cubic, quartic formulas in the next update!) I wish that we took more advantage of the axioms we are already using.

Mario

I've been thinking this over and have mixed feelings.  There is part of me that likes the shorter theorems and proofs that will result.  But I try to look outside of just what might appeal to me personally and consider how it might be received by the intended audience.

It seems weird to have theorems about reals without hypotheses, exploiting tricks that depend on arbitrary properties of how our particular definitions behave outside of their intended domains.  Someone else's definitions might give different results.

This also seems a little different from exploiting the proper class behavior of df-op to eliminate sethood hypotheses.  A lot of literature implicitly assumes variables are sets, so we actually get closer to matching the exact form of the theorems in such literature, just slightly generalized to work (if meaninglessly) with proper classes, when we are able to omit sethood hypotheses.

I know we do these tricks a lot in the construction of the reals, but that is something which once done is hidden away and few people study.  Its main purpose isn't expository but is to get the reals as efficiently as possible so we can start working with them.  Anyone going through it will probably have a relatively sophisticated level of set theoretical knowledge.

The development of RR and CC, though, is expository, at least near the beginning, and we need to take into account the people who will be studying it.

I'd like to hear other opinions on this.  How do people feel about A + B = B + A, and A < B /\ B < C -> A < C without hypotheses, even though the operations and relations are intended only for CC and RR?

The initial RR/CC development is likely used by people interested in learning how the properties of the numbers they are familiar with emerge from the number axioms.  At that level they may not be ready to study the set-theoretical underpinnings in depth.  Is it going to be confusing to sometimes require and sometimes omit hypotheses that the variables are numbers?  How will they know what the rules are for omitting or not omitting hypotheses?

Norm

David A. Wheeler

unread,
Apr 29, 2015, 10:38:20 PM4/29/15
to meta...@googlegroups.com, Norman Megill, n...@alum.mit.edu
On April 29, 2015 8:26:36 PM EDT, Norman Megill <n...@alum.mit.edu> wrote:
> I'd like to hear other opinions on this.  How do people feel about
> A + B = B + A, and A < B /\ B < C -> A < C without hypotheses, even though
> the operations and relations are intended only for CC and RR?

I find it deeply troubling to accept A + B equals B + A without any check of the parameter types. If + is supposed to only apply to CC, then that should be explicit somewhere. Some systems predefine the type of certain symbols, e.g., to make some symbols mean a complex number, but the type is still explicit for each operation.


--- David A. Wheeler

Mario Carneiro

unread,
Apr 29, 2015, 10:42:27 PM4/29/15
to metamath
Note that you don't have to 'accept' that A + B = B + A without hypotheses - we're talking about *proofs* here. The axiom ax-addcom would be unchanged, and in particular still indicates that the validity is limited to CC, but it is possible to prove from that axiom the more general version with no hypotheses on A and B.

David A. Wheeler

unread,
Apr 30, 2015, 12:24:38 AM4/30/15
to metamath
On Wed, 29 Apr 2015 22:42:27 -0400, Mario Carneiro <di....@gmail.com> wrote:
> Note that you don't have to 'accept' that A + B = B + A without hypotheses
> - we're talking about *proofs* here. The axiom ax-addcom would be
> unchanged, and in particular still indicates that the validity is limited
> to CC, but it is possible to prove from that axiom the more general version
> with no hypotheses on A and B.

I don't see an "ax-addcom"; I do see "addcom", and I presume that's what you're referring to:
|- ( ( A e. CC /\ B e. CC ) -> ( A + B ) = ( B + A ) )

So, let me restate things to see if I understand. Addition is defined using df-plus:
df-plus $a |- + = { <. <. x , y >. , z >. | ( ( x e. CC /\ y e. CC ) /\
E. w E. v E. u E. f ( ( x = <. w , v >. /\ y = <. u , f >. )
/\ z = <. ( w +R u ) , ( v +R f ) >. ) ) } $.
Since the input values "x" and "y" must be in CC to return a value in the set, "+"
returns the empty set if either x or y is not in CC. Since the empty set is equal
to itself, "A + B" is always equal to "B + A" if either A or B are not in CC, since both
will return the empty set. That's true if A or B are non-CC sets, or are proper classes...
and thus, the hypotheses are not actually needed. Do I have it basically correct?

Hmmm. I don't like that, but I don't see any easy solutions.
In IEEE floating point you can return a "NaN"
which is not even equal to itself, which attempts to address this sort of thing...
but that sort of thing creates a lot of complications.

--- David A. Wheeler

Mario Carneiro

unread,
Apr 30, 2015, 7:23:31 AM4/30/15
to metamath

On Thu, Apr 30, 2015 at 12:24 AM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
On Wed, 29 Apr 2015 22:42:27 -0400, Mario Carneiro <di....@gmail.com> wrote:
> Note that you don't have to 'accept' that A + B = B + A without hypotheses
> - we're talking about *proofs* here. The axiom ax-addcom would be
> unchanged, and in particular still indicates that the validity is limited
> to CC, but it is possible to prove from that axiom the more general version
> with no hypotheses on A and B.

I don't see an "ax-addcom"; I do see "addcom", and I presume that's what you're referring to:
 |- ( ( A e. CC /\ B e. CC ) -> ( A + B ) = ( B + A ) )

So, let me restate things to see if I understand.  Addition is defined using df-plus:
    df-plus $a |- + = { <. <. x , y >. , z >. | ( ( x e. CC /\ y e. CC ) /\
                  E. w E. v E. u E. f ( ( x = <. w , v >. /\ y = <. u , f >. )
                  /\ z = <. ( w +R u ) , ( v +R f ) >. ) ) } $.
Since the input values "x" and "y" must be in CC to return a value in the set, "+"
returns the empty set if either x or y is not in CC.  Since the empty set is equal
to itself, "A + B" is always equal to "B + A" if either A or B are not in CC, since both
will return the empty set.  That's true if A or B are non-CC sets, or are proper classes...
and thus, the hypotheses are not actually needed.  Do I have it basically correct?

Hm, perhaps commutativity of addition was not the best example, as it is no longer an axiom thanks to Eric Schmidt's work. The same remarks can be applied to ax-mulcom instead, if you like. But it is neither necessary nor desirable to start from df-plus; that is part of the construction of the reals and would not be permitted to be used in a planned no-hypothesis proof of |- ( A + B ) = ( B + A ). Rather, the starting point is ax-addopr, which is just as good, since it also says that + C_ ( ( CC X. CC ) X. CC ), so that ( A + B ) = (/) if -. ( A e. CC /\ B e. CC ) (by ndmov), and the rest of your analysis is correct. This whole argument is encoded in the already existing theorem ndmovcom, and you can see it in use in addcompi. The effect of the df-op change has been to extend the applicability of these arguments to proper classes as well (under the old df-op something like ( _V + 0 ) = (/) but ( 0 + _V ) = ( 0 + 0 ) = 0 by ovprc2 would give a counterexample).
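
Spelled out, the chain of reasoning is roughly this (I am paraphrasing the statements rather than quoting set.mm verbatim, so take the exact forms with a grain of salt):

    ax-addopr:                |- + : ( CC X. CC ) --> CC
    in particular                + C_ ( ( CC X. CC ) X. CC )
    so, ndmov-style:          |- ( -. ( A e. CC /\ B e. CC ) -> ( A + B ) = (/) )
    likewise for ( B + A ):   |- ( -. ( A e. CC /\ B e. CC ) -> ( A + B ) = ( B + A ) )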

The idea of the real number axioms is to abstract away the specific definitions of df-r, df-plus and so on, and get the essential properties that we expect to be satisfied by these objects. Thus the only thing you have to "accept" are these axioms themselves, and by no means am I suggesting that these should be generalized to areas where they don't obviously apply. But to take addcom as an example, we used to have this as an axiom, until it was discovered that it is already provable even if we don't know it to start with. This is the power of proof - we now have one fewer thing to "accept" while still retaining the same theorems. Now what I am saying is that from these axioms the current *theorems* addcom, mulcom etc. don't go as far as they could in generalizing the statement, and in fact ( A + B ) = ( B + A ) is true generally.

As a reader, when I see a theorem with fewer hypotheses than I expect, it is a happy accident, but doesn't significantly affect my interpretation of the theorem - much more important to me are the hypotheses that *are* there. This happens occasionally even after the real number section, for example expneg, which doesn't require A =/= 0 even though it's almost certainly meaningless for A = 0, or elfzuz3, which doesn't require M e. ZZ and N e. ZZ even though the theorem is always used in this case (because it is implied by K e. ( M ... N ) ). Generally, I try to eliminate as many hypotheses as I can in every theorem, even using "weird tricks" when necessary, because it is one less proof obligation for users of the theorem. climge0 is another example of this (if you look at the proof you will note that brprc is used to eliminate the assumption A e. _V). If I stuck to the usual approach, this theorem would include A e. RR and F : Z --> RR and a bunch of other stronger assumptions that are not needed but are provable from the given hypotheses. The reason I bring this up for discussion is because addcom and lttr are conspicuously un-simplified in this sense, and I am currently treating these theorems as a special case even though the distinction is questionable.

PS: ax-addopr and ax-mulopr seem badly named. Can we change these to ax-addf and ax-mulf (in keeping with absf, gcdf, metf, etc.)?

Mario

Norman Megill

unread,
Apr 30, 2015, 10:49:46 AM4/30/15
to meta...@googlegroups.com
On 4/30/15 7:23 AM, Mario Carneiro wrote:
> As a reader, when I see a theorem with fewer hypotheses than I expect,
> it is a happy accident, but doesn't significantly affect my
> interpretation of the theorem

My reaction is the same, but you and I are "advanced" readers who know the set-theoretical and definition-specific reasons for it.  We need to take into account the intended audience.

TBH I don't really know what that audience is, but it could include a student struggling to understand how things are proved from axioms (in this case, the axioms of complex numbers plus some elementary logic but not too much set theory), as well as a mathematician who might get turned off by the seemingly random requirement of hypotheses in some places and not in others.  They don't match anything in a book because our definitions e.g. df-fv are different from essentially all books in their out-of-domain behavior.

Let's say a student writes down a list of useful elementary theorems obtained from studying the early development of complex numbers in set.mm, struggling from a very elementary point of view to understand how proofs work.  The list will have hypotheses for some, no hypothesis for others, and a seemingly random partial set of hypotheses for still others.  Not knowing the underlying tricks and not having the background to dig into the set-theoretical reasons, wouldn't it seem like some kind of black art, confusing and frustrating?  No textbook would ever publish a list of theorems with such randomly required hypotheses; instead it would say that _all_ variables are assumed to be complex or real numbers unconditionally.  Imagine if it said something like "a must be a complex number while b and c can be any sets in the following theorem..."


> - much more important to me are the
> hypotheses that *are* there. This happens occasionally even after the
> real number section, for example expneg, which doesn't require A =/= 0
> even though it's almost certainly meaningless for A = 0, or elfzuz3,
> which doesn't require M e. ZZ and N e. ZZ even though the theorem is always
> used in this case (because it is implied by K e. ( M ... N ) ).

You are right that later on, we start to use such tricks.  It doesn't bother me as much because the material is more advanced, and anyone who has gotten that far is hopefully in a position to understand the tricks.

Perhaps a compromise is to partition the complex number development into "elementary" and "advanced" material.  The former would always include hypotheses, even if unnecessary based on our definitions.  The emphasis would be on simple proofs from complex number axioms, preferring chains of equalities where possible and attempting to keep quantification at a minimum, perhaps with some of it rewritten to use your "ph->" deductions when it makes reading the proofs easier.

Not being a teacher nor a new user of set.mm, I don't know if I'm creating straw men to claim problems that don't exist.  It would be nice to hear from an educator (and new users).

BTW I wonder if for the purpose of an "elementary" section we should restore ax-addcom and some others you can see removed from the Real and Complex Numbers page, that use somewhat advanced first-order logic for their derivations.  Of course these results (most due to Eric Schmidt) are excellent from a theoretical point of view, and I'm very glad that we have them.  But for teaching purposes, it seems like quite a hurdle to overcome before the student can prove that 2+3=3+2.  Again, it would be nice to hear an educator's take on this.  Perhaps a compromise is to get the difficult axiomatics like ax-addcom out of the way near the beginning, then collect again all of the "normal" axioms including addcom in one place that would become the starting point for a beginner.  I think we already have most of the axioms repeated without "ax-" for naming consistency e.g. mulcom.



> Generally, I try to eliminate as many hypotheses as I can in every
> theorem, even using "weird tricks" when necessary, because it is one
> less proof obligation for users of the theorem. climge0 is another
> example of this (if you look at the proof you will note that brprc is
> used to eliminate the assumption A e. _V). If I stuck to the usual
> approach, this theorem would include A e. RR and F : Z --> RR and a
> bunch of other stronger assumptions that are not needed but are provable
> from the given hypotheses. The reason I bring this up for discussion is
> because addcom and lttr are conspicuously un-simplified in this sense,
> and I am currently treating these theorems as a special case even though
> the distinction is questionable.

(BTW an issue we haven't discussed is that using tricks to eliminate hypotheses makes set.mm more "brittle" in the sense that a change to a definition's out of domain behavior can require a major change in the database.  In software, "brittle" code is generally considered undesirable since small changes risk introducing bugs.  With Metamath the issue is mainly the difficulty of making a change since bugs in principle can't exist.  I guess this isn't a real problem as long as you are willing to do the work...)


>
> PS: ax-addopr and ax-mulopr seem badly named. Can we change these to
> ax-addf and ax-mulf (in keeping with absf, gcdf, metf, etc.)?

I agree, please rename them.

Norm

David A. Wheeler

unread,
Apr 30, 2015, 12:31:19 PM4/30/15
to metamath
Mario Carneiro wrote:
> ... Rather, the starting point is ax-addopr,
> which is just as good, since it also says that + C_ ( ( CC X. CC ) X. CC ),
> so that ( A + B ) = (/) if -. ( A e. CC /\ B e. CC ) (by ndmov), and the
> rest of your analysis is correct...

Okay. Here's my understanding. Under the current definition for many
operators intended for complex and real numbers, they return the empty set
if they are not given solely complex/real numbers. Of course,
there's only 1 empty set (and it's always equal to itself).
This means that, in many cases, results are equal that may be surprising.
E.G., forall A, B: -. ( A e. CC /\ B e. CC ) -> ( A + B ) = ( B + A )

I think many people will find it... well... odd that one can prove that
A + B = B + A if at least one is a proper class. Indeed, if either A or B
is a proper class, it's always true that A + B = A x. B = A / B = A - B .

> The idea of the real number axioms is to abstract away the specific
> definitions of df-r, df-plus and so on, and get the essential properties
> that we expect to be satisfied by these objects. Thus the only thing you
> have to "accept" are these axioms themselves, and by no means am I
> suggesting that these should be generalized to areas where they don't
> obviously apply. But to take addcom as an example, we used to have this as
> an axiom, until it was discovered that it is already provable even if we
> don't know it to start with. This is the power of proof - we now have one
> fewer thing to "accept" while still retaining the same theorems. Now what I
> am saying is that from these axioms the current *theorems* addcom, mulcom
> etc. don't go as far as they could in generalizing the statement, and in
> fact ( A + B ) = ( B + A ) is true generally.
>
> As a reader, when I see a theorem with fewer hypotheses than I expect, it
> is a happy accident, but doesn't significantly affect my interpretation of
> the theorem - much more important to me are the hypotheses that *are*
> there.

I understand, but if you can prove something unexpected there are several possibilities:
1. Cool new result (output is okay!)
2. Bug in verifier (process was wrong)
3. Incorrectly-stated assumptions (input was wrong)

My worry is #3. Some mathematicians seem to be sloppy about what I'd call "types".
They may present a formal-looking statement, but merely imply with words
that (for example) they assume that all values have certain types... and do NOT
formally define what happens when you give them non-complex numbers
(or even worse, proper classes).

That's what I mean by "accept". If something can be proven, it may be true... or
it may be that the assumptions as stated are not actually what was meant.
If there are fewer hypotheses than expected, it may be
that there's a hidden assumption that needs un-hiding.

> ... Generally, I try to eliminate as many
> hypotheses as I can in every theorem, even using "weird tricks" when
> necessary, because it is one less proof obligation for users of the theorem.

Generalization is great!! But sometimes, if a hypothesis is not needed, that may
indicate a bug (i.e., that there's an assumption that failed to be stated).

The current approach - simply returning the empty set if there are no valid results -
is easy to understand. If it was decided that this was *NOT* an adequate formalism, "fixing" it
might be very hard. Creating new types for numbers (beyond wff, set, class) is one approach.
Another is having the "results" be the original expression if it's out-of-bounds, e.g.,
if P and Q are proper classes, then P + Q is P + Q. Clearly the proofs follow the formalism,
but does the formalism capture what mathematicians actually mean, and if not, is there a simple "fix"
to capture what they actually mean? (I suspect the answer to that last question is "no").

--- David A. Wheeler

Mario Carneiro

unread,
Apr 30, 2015, 1:10:36 PM4/30/15
to metamath
On Thu, Apr 30, 2015 at 12:31 PM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
Okay. Here's my understanding.  Under the current definition for many
operators intended for complex and real numbers, they return the empty set
if they are not given solely complex/real numbers.  Of course,
there's only 1 empty set (and it's always equal to itself).
This means that, in many cases, results are equal that may be surprising.
  E.G., forall A, B:  -. ( A e. CC /\ B e. CC ) -> ( A + B ) = ( B + A )

Yes, this is correct.
 
Actually, the way I see it it's the other way around. When verifying that a theorem proves what you want it to prove, you only need check that every hypothesis of the formalization is a hypothesis of the intended result. If there are hypotheses of the intended result that are not in the formalization, well that just means the formalization is more general, even if the added generality is not in a meaningful direction. In other words, given a "too general" result, you can easily make it less general by adding hypotheses and ignoring these extra hypotheses. If a mathematician tells me that a + b = b + a when a and b are real numbers and I prove that a + b = b + a for any a,b at all, then as long as my system includes something that can be interpreted as a real number then I can claim to have proven the same theorem (with some extra generality). "Hidden assumptions" are the opposite issue, when the definitions include more or different assumptions than are normally understood for the concepts, for example if I define a prime number to be an odd number and then prove Bertrand's postulate under that assumption.

> ... Generally, I try to eliminate as many
> hypotheses as I can in every theorem, even using "weird tricks" when
> necessary, because it is one less proof obligation for users of the theorem.

Generalization is great!! But sometimes, if a hypothesis is not needed, that may
indicate a bug (i.e., that there's an assumption that failed to be stated).

Generally, if you try to prove a theorem without enough assumptions, you will notice this halfway into the proof and will not be able to complete the proof. The whole point of a verifier is to make sure that if you make it all the way to the finish line, you have indeed made all the necessary assumptions.
 
The current approach - simply returning the empty set if there are no valid results -
is easy to understand.  If it was decided that this was *NOT* an adequate formalism, "fixing" it
might be very hard.  Creating new types for numbers (beyond wff, set, class) is one approach.
Another is having the "results" be the original expression if it's out-of-bounds, e.g.,
if P and Q are proper classes, then P + Q is P + Q.  Clearly the proofs follow the formalism,
but does the formalism capture what mathematicians actually mean, and if not, is there a simple "fix"
to capture what they actually mean?  (I suspect the answer to that last question is "no").

I'm not sure I like this characterization. There is a good analogy between terms in a formal system and evaluable expressions in a programming language, but it's not a perfect one and terms in Metamath don't exactly "evaluate". We have a set of rules for transforming expressions into other expressions, but they need not be used or used in any particular order in the manner of evaluating a function in a programming language. Especially when it comes to axiomatic behavior, a given expression may not be evaluable at all. For example "if ( 0 = (/) , 1 , 0 )" is known to be either 1 or 0, but we don't know which because the axioms of the reals don't say anything about whether 0 is the empty set or not. Another good example is Inacc (df-ina). We know that om e. Inacc but there are no other known elements and ZFC cannot prove or disprove the existence of any other elements - it could be { om } or a proper class or anything in between.

So if P and Q are proper classes, of course P + Q is P + Q. This is true for non-proper classes too. But the term "P + Q" itself is an expression in the metamath language, which is interpreted as (but separate from) an element of the universe. Note that the typecodes "set", "wff", "class" refer to types of terms in the metamath language, not types of elements of the universe, which is more accurately represented by "A e. CC", "A e. ZZ", etc. Norm has mentioned an alternative definition of df-fv where the out-of-domain behavior sets it to _V instead of (/) and this is also a possibility (although not a very convenient one for a few reasons). But there is another choice, to have it equal nothing at all, and this is where one departs from the computer language analogy. We did this for a long time with df-sbc, where there was no defined proper-class behavior, because we didn't use the definition and instead derived everything through dfsbcq instead. One can argue that the most common "mathematician's definition" of df-fv is exactly this, to simply not define it out of domain, and then you are forced to include in-domain assumptions in any theorem using it.
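
Side by side, the two out-of-domain conventions for df-fv mentioned above look like this (schematically):

    current convention:       ( F ` A ) = (/)   when A is outside the domain of F
    mentioned alternative:    ( F ` A ) = _V    when A is outside the domain of F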

Mario


David A. Wheeler

unread,
Apr 30, 2015, 2:20:31 PM4/30/15
to metamath
On 4/30/15 7:23 AM, Mario Carneiro wrote:
> > As a reader, when I see a theorem with fewer hypotheses than I expect,
> > it is a happy accident, but doesn't significantly affect my
> > interpretation of the theorem

On Thu, 30 Apr 2015 07:49:46 -0700 (PDT), Norman Megill <n...@alum.mit.edu> wrote:
> My reaction is the same, but you and I are "advanced" readers who know the
> set-theoretical and definition-specific reasons for it. We need to take
> into account the intended audience.
>
> TBH I don't really know what that audience is, but it could include a
> student struggling to understand how things are proved from axioms (in this
> case, the axioms of complex numbers plus some elementary logic but not too
> much set theory), as well as a mathematician who might get turned off by
> the seemingly random requirement of hypotheses in some places and not in
> others. They don't match anything in a book because our definitions e.g.
> df-fv are different from essentially all books in their out-of-domain
> behavior.

I think the audience includes, at the very least, both students interested in math and mathematicians.
In fact, I think in the long run you'll have more student-readers than anyone else.
I have learned more from set.mm than several other books ever taught me.
HOL Light is a fine tool, but you can't *read* the proofs the same way as you can with set.mm.

Set.mm could be the gateway drug for future mathematicians :-).

> ... No textbook would ever publish a list of
> theorems with such randomly required hypotheses; instead it would say that
> _all_ variables are assumed to be complex or real numbers unconditionally.
> Imagine if it said something like "a must be a complex number while b and c
> can be any sets in the following theorem..."

> Not being a teacher nor a new user of set.mm, I don't know if I'm creating
> straw men to claim problems that don't exist. It would be nice to hear
> from an educator (and new users).

I'm not exactly the right example, but I do teach, and I'm a newer user.

Currently you can prove A + B = B + A = B x. A = B - A ,
for all A and B, if A or B aren't complex numbers. I find that, well, surprising.

That doesn't make it *wrong* - it clearly DOES follow from the current definitions.
If those definitions really are what mathematicians normally mean, then so be it.
But perhaps that means that the current formalizations don't precisely capture
what most mathematicians *mean*, and thus it may be wise to change the
definitions. That said, there must be a sensible alternative.

> (BTW an issue we haven't discussed is that using tricks to eliminate
> hypotheses makes set.mm more "brittle" in the sense that a change to a
> definition's out of domain behavior can require a major change in the
> database. In software, "brittle" code is generally considered undesirable
> since small changes risk introducing bugs. With Metamath the issue is
> mainly the difficulty of making a change since bugs in principle can't
> exist. I guess this isn't a real problem as long as you are willing to do
> the work...)

Yes. Especially if there's a possibility that the out-of-domain results may change.

Mario:
> Norm has mentioned an alternative definition of df-fv where the
> out-of-domain behavior sets it to _V instead of (/) and this is also a
> possibility (although not a very convenient one for a few reasons). But
> there is another choice, to have it equal nothing at all, and this is where
> one departs from the computer language analogy. We did this for a long time
> with df-sbc, where there was no defined proper-class behavior, because we
> didn't use the definition and instead derived everything through dfsbcq
> instead. One can argue that the most common "mathematician's definition" of
> df-fv is exactly this, to simply not define it out of domain, and then you
> are forced to include in-domain assumptions in any theorem using it.

I *like* considering this "other choice" for +, x., and so on.... simply don't define
these operations' results outside the domain at all. Since they would then not produce
"the empty set", the results would no longer be forced to be "equal".
I do think there is a very good argument that this is
the common "mathematician's definition", and that was my point earlier.

What would that look like? What would the impact be on the current material?
Are there serious drawbacks to that approach? Is there an even better approach?
I'm guessing this would be a major effort. But maybe not;
if most theorems require the hypothesis anyway, maybe it's no big deal.

--- David A. Wheeler

fl

unread,
May 1, 2015, 6:12:34 AM5/1/15
to meta...@googlegroups.com

Let's say a student writes down a list of useful elementary theorems obtained from studying the early development of complex numbers in set.mm, struggling from a very elementary point of view to understand how proofs work.  The list will have hypotheses for some, no hypothesis for others, and a seemingly random partial set of hypotheses for still others.  Not knowing the underlying tricks and not having the background to dig into the set-theoretical reasons, wouldn't it seem like some kind of black art, confusing and frustrating? 


1) Metamath has the peculiarity of never being frustrating because everything is said. You may need time but it is never frustrating. 
Other systems like Mizar can be very frustrating.

2) The best way to help the student is to explain Metamath's tricks clearly. The list includes:

   a) A definition of structures which is far from being standard (in particular a ring is a group).
   b) Propositional variables are included erratically in theorems (it never happens in a textbook).
   c) There are classes and set variables (it never happens in a textbook).
   d) The proper classes are taken into account (a textbook only uses sets).
   e) We use A e. B to mean that something is a set.
   f) Sometimes we use om to mean the natural numbers, sometimes we use NN.
   g) Mario Carneiro "borrows" definitions and theorems from his friends without even an acknowledgement. (Real authors like Knuth always mention what they borrow.)
   h) The definition of numbers is far from being standard. Normal textbooks define NN before CC.
   i) Recursive functions are defined in an odd manner.
   j) In functions implemented as class builders, sometimes conditions about the variables are given, sometimes they are not.
   k) iota never appears in any textbook.
   l) Partial functions return many things and in particular an Undefined value.
   m) Most constants are shorthand for a class builder, but not always. At least explain that this is to make it easy to check that the system is sound.
   n) The venerable ( ph -> A. x ph ) deserves an explanation.
   o) In Metamath we substitute variables explicitly. (In textbooks they do it discreetly.)
   p) The deduction theorem is not implemented but we have tricks to deal with the most usual cases.

3) For the student I recommend that you add a star to the theorems that are particularly useful. None of us for example knows
every propositional or predicate calculus theorem. In fact we only need a handle of them. The student may need
to know that.

4) If I may give my opinion, I don't think the definition of a pair is a problem. It is the classical one. Textbooks simply
forget to mention that there are issues with proper classes, and sweep the dust under the rug. In this regard set.mm is peculiarly
useful to the "student" because it shows him/her what the textbooks don't say. It shouldn't be hidden just to avoid hypothetical
student headaches (in my opinion their fees are more able to give them night-time sweating than the
definition of the pair, and with better reason).

-- 
FL

fl

unread,
May 1, 2015, 7:40:38 AM5/1/15
to meta...@googlegroups.com

 
 [...]
  n) The venerable ( ph -> A. x ph ) deserves an explanation.
   o) In metamath we substitute variables explicitely. (In textbooks they do it discreetly.)
   p) The deduction theorem is not implemented but we have tricks to deal with the most usual cases.


  q) the use of implicit substitutions.
 
3) For the student I recommend that you add a star to the theorems that are particularly useful. None of us for example knows every propositional or predicate calculus theorem. In fact we only need a handle of them. The student may need to know that.


Not a "handle" a "handful". The foreign student that attends an English course will understand me.

--
FL

Mario Carneiro

unread,
May 1, 2015, 9:29:10 AM5/1/15
to metamath
On Thu, Apr 30, 2015 at 10:49 AM, Norman Megill <n...@alum.mit.edu> wrote:
Perhaps a compromise is to partition the complex number development into "elementary" and "advanced" material.  The former would always include hypotheses, even if unnecessary based on our definitions.  The emphasis would be on simple proofs from complex number axioms, preferring chains of equalities where possible and attempting to keep quantification at a minimum, perhaps with some of it rewritten to use your "ph->" deductions when it makes reading the proofs easier.

Okay, so here's a revised proposal to hopefully address the "pedagogical issues" with the original. Currently, the *i inference theorems have seen a systematic decrease in use due to the added emphasis on deduction-style proofs. Thus I propose to remove all the *i theorems from the introductory real/complex numbers section and move them to a new section, say just before 5.4 "Integer sets", entitled 5.3.11 "Hypothesis-free field axioms and properties" or similar, and then remove as many hypotheses as possible from these theorems. Thus section 5.2 will be only Hilbert-style proofs, while the new section derives theorems making maximal use of "set-theoretic tricks" to get rid of the hypotheses. Thus we will have

addcom $p |- ( ( A e. CC /\ B e. CC ) -> ( A + B ) = ( B + A ) )

in an early section, and

addcomi $p |- ( A + B ) = ( B + A )

later on. The "pedagogical" theorems like 2p2e4 will be mostly in section 2.2 and 2.3 and use entirely Hilbert-style proofs with all appropriate assumptions stated in an antecedent.

Also, should this version be acceptable (to be done in the update after the df-op update), I plan to add deduction versions of the most useful algebra theorems, especially closure theorems.
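
For example, a deduction version of addcom would presumably look something like this (the name and exact layout are hypothetical at this point):

    addcomd.1 $e |- ( ph -> A e. CC ) $.
    addcomd.2 $e |- ( ph -> B e. CC ) $.
    addcomd   $p |- ( ph -> ( A + B ) = ( B + A ) )

so that closure facts can be carried along in the antecedent ph instead of being re-proved at each use.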

Mario

David A. Wheeler

unread,
May 1, 2015, 11:11:26 AM5/1/15
to metamath
(First, a long quote for context.)

> On Thu, Apr 30, 2015 at 10:49 AM, Norman Megill <n...@alum.mit.edu> wrote:
> > Perhaps a compromise is to partition the complex number development into
> > "elementary" and "advanced" material. The former would always include
> > hypotheses, even if unnecessary based on our definitions. The emphasis
> > would be on simple proofs from complex number axioms, preferring chains of
> > equalities where possible and attempting to keep quantification at a
> > minimum, perhaps with some of it rewritten to use your "ph->" deductions
> > when it makes reading the proofs easier.


On Fri, 1 May 2015 09:29:10 -0400, Mario Carneiro <di....@gmail.com> replied:
> Okay, so here's a revised proposal to hopefully address the "pedagogical
> issues" with the original. Currently, the *i inference theorems have seen a
> systematic decrease in use due to the added emphasis on deduction-style
> proofs. Thus I propose to remove all the *i theorems from the introductory
> real/complex numbers section and move them to a new section, say just
> before 5.4 "Integer sets", entitled 5.3.11 "Hypothesis-free field axioms
> and properties" or similar, and then remove as many hypotheses as possible
> from these theorems. Thus section 5.2 will be only Hilbert-style proofs,
> while the new section derives theorems making maximal use of "set-theoretic
> tricks" to get rid of the hypotheses. Thus we will have
>
> addcom $p |- ( ( A e. CC /\ B e. CC ) -> ( A + B ) = ( B + A ) )
>
> in an early section, and
>
> addcomi $p |- ( A + B ) = ( B + A )
>
> later on. The "pedagogical" theorems like 2p2e4 will be mostly in section
> 2.2 and 2.3 and use entirely Hilbert-style proofs with all appropriate
> assumptions stated in an antecedent.

*Personally* I like the idea of limiting "+" so it cannot be used with non-CC arguments;
I think that's consistent with how some sources define them.

But if this is the direction set.mm goes, I think it's important that this be discussed
in comments somewhere in set.mm, and then have many other theorems point to that.
Then this no longer looks haphazard, but is a clearly-documented approach.

It should note, for example, that these elementary theorems with hypotheses would
still hold even if those functions were only defined for complex values.
It should also note somewhere that if these functions are given non-CC arguments the
result is the empty set, and thus (for example)
A + B = A x. B for all A and B if either is non-CC - including proper classes!

--- David A. Wheeler

David A. Wheeler

May 1, 2015, 1:19:42 PM
to metamath
On Thu, 30 Apr 2015 07:49:46 -0700 (PDT), Norman Megill <n...@alum.mit.edu> wrote:
> (BTW an issue we haven't discussed is that using tricks to eliminate
> hypotheses makes set.mm more "brittle" in the sense that a change to a
> definition's out of domain behavior can require a major change in the
> database. In software, "brittle" code is generally considered undesirable
> since small changes risk introducing bugs. With Metamath the issue is
> mainly the difficulty of making a change since bugs in principle can't
> exist. I guess this isn't a real problem as long as you are willing to do
> the work...)

This is exactly the sort of thing that I think should be documented
(e.g., somewhere near the beginning the "elementary" section of set.mm).
Something like this:

=================================

Many sources in the literature are silent, or inconsistent, about what happens
when functions that expect numbers are given out-of-domain values.
Set.mm includes specific definitions of out-of-domain behavior for functions that
expect numbers in CC, ones that we find convenient and justifiable according to [Quine].

In this "elementary" section we include hypotheses that
require that the functions receive values within their usual domain,
even though technically these hypotheses are not required by our definition.
That means that if the definition of these functions were changed
to forbid out-of-domain values, these theorems would typically
not need to change (or would not change much),
because these theorems already explicitly require that their parameters stay within the domain.

Later on, e.g., in ~ ...., we show that by exploiting the full capabilities of
our specific definitions, we actually do not need these hypotheses.
As a result, we can prove statements such as A + B = B + A for any A and B.
That is because, given our actual definition,
if A or B is not in CC (e.g., because it is a proper class),
then A + B = B + A = (/). However, at that point we are exploiting
the full properties of our specific definition, which has its advantages
but is different from (e.g., more general than) some of the definitions used in the literature.

--- David A. Wheeler

Mario Carneiro

May 1, 2015, 3:11:05 PM
to metamath
On Fri, May 1, 2015 at 11:11 AM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
*Personally* I like the idea of limiting "+" so it cannot be used with non-CC arguments;
I think that's consistent with how some sources define them.

You asked earlier if there was a drawback to a "partial definition", and the answer is yes, in a certain precise sense it is the *objectively worst* choice for theorem proving. Think of it this way: we have a certain concept we want to define, and it is universally understood to have whatever meaning in its "intended domain". Outside that domain, we are free to give it any definition we like. Under *any* definition that matches the agreed one on the intended domain, any theorem with hypotheses that limit all arguments to the intended domain will be provable. For any specific definition, there will also be some extra theorems that extend beyond the intended domain (at the very least, the definition itself). These theorems vary from one alternative definition to the next, but almost any definition will allow some theorems to be stated with fewer assumptions than what would be needed to put all arguments in the intended domain.

In contrast, a partial definition is the "worst of both worlds": *none* of these extra theorems are provable, so all theorems must come with intended domain hypotheses. Although this situation is easy to understand, one should note that it is *not a good thing* for proof length. For example, consider suprema, which have nontrivial closure proofs. If you have proven sup ( A , RR , < ) <_ 1 by showing every element of A is less than 1, in order to "use" this inequality somewhere, say to show it is less than 2, you will need to prove also that it is real, which involves exhibiting an upper bound for A (even though you already did that earlier). This adds extra steps to the proof, *even though at no point are we using parameters out of the intended domain*.
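As a loose software analogy (Python, not Metamath; a finite max stands in for the supremum), the difference is whether every use site must re-establish the side condition before it can even mention the value:

def sup_partial(A):
    # "partial" supremum: refuses to produce a value outside its intended domain
    if not A:
        raise ValueError("sup is undefined for the empty set")
    return max(A)

def sup_total(A):
    # "total" supremum: outside the intended domain it returns an arbitrary junk value
    return max(A) if A else 0

# sup_total(A) may appear in any comparison with no preliminary check,
# whereas every caller of sup_partial must first re-verify the side condition.
assert sup_total(set()) <= 1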

Finally, a partial definition is not acceptable to the mmj2 definition checker because you can't freely apply the definition during reduction of a proof as part of the conservativity argument. The best you can do is use an axiom, as with df-addcl, or use a "gentleman's agreement" to only use the definition filtered through some theorem, as we did with dfsbcq.

To give another example, you recently introduced a definition of ">". Your original definition was (a): "( A > B <-> -. A <_ B )", and I suggested the modification (b): "( A > B <-> B < A )". Let's contrast these two with (c): "( ( A e. RR* /\ B e. RR* ) -> ( A > B <-> B < A ) )"

    Provable?
    (a)  (b)  (c)   Theorem
 1. Yes  Yes  Yes   |- ( ( A e. RR* /\ B e. RR* ) -> ( A > B <-> B < A ) )
 2. Yes  Yes  Yes   |- ( ( A e. RR* /\ B e. RR* ) -> ( A > B <-> -. A <_ B ) )
 3. Yes  Yes  Yes   |- ( ( A e. RR* /\ B e. RR* ) -> ( A > B \/ A <_ B ) )
 4. Yes  Yes  Yes   |- ( ( A e. RR* /\ B e. RR* ) -> ( ( A < B /\ A > B ) -> A = B ) )
 5. No   Yes  No    |- ( A > B <-> B < A )
 6. Yes  No   No    |- ( A > B <-> -. A <_ B )
 7. Yes  No   No    |- ( A > B \/ A <_ B )
 8. No   Yes  No    |- ( ( A < B /\ A > B ) -> A = B )

Note that theorems 1,2,3,4 that properly constrain their arguments are provable under all three definitions, while none of the nonconstrained theorems are provable by (c). Each of the total definitions improves on this baseline with a few nonconstrained theorems, which leads to fewer proof obligations for uses of these theorems. The two definitions yield different simplifications, but either one yields all the theorems that (c) gives plus some extras. I mentioned earlier that "any convention is better than no convention"; this is what I am referring to.

But if this is the direction set.mm goes, I think it's important that this be discussed
in comments somewhere in set.mm, and then have many other theorems point to that.
Then this no longer looks haphazard, but is a clearly-documented approach.

I will point out that this is already used in several places if you look elsewhere in the database from the real numbers, which are in many ways treated a little specially. For example, "F e. ( A -cn-> B )" is used to talk about continuity of complex functions. The definition makes sense when A and B are subsets of CC, but rather than require that everywhere we use cncfrss which says that if F e. ( A -cn-> B ) then A is a subset of CC, so we don't need to specify one if we already know the other. This is just the contrapositive of the fact that -cn-> is a function on ( ~P CC X. ~P CC ), so ( A -cn-> B ) = (/) when A and B are not both subsets of CC. Similarly, ZZ>= is a function from ZZ to subsets of ZZ, so "M e. ( ZZ>= ` N )" asserts the ZZ-closure of both M (as you would expect) and N (because it is empty out of domain).
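Here is a toy model of that "reverse closure" effect (Python, purely illustrative; uz is a finite stand-in for ( ZZ>= ` N ), not anything in set.mm):

def uz(n, bound=100):
    # mapping-style definition: only integers are in the domain;
    # every other argument is sent to the empty set
    if isinstance(n, int):
        return frozenset(range(n, bound + 1))
    return frozenset()

assert 7 in uz(3)            # the intended use
assert 7 not in uz("junk")   # membership can never hold out of domain, so
                             # "M e. ( ZZ>= ` N )" already certifies that N is an integer

So a single membership hypothesis quietly carries both closure facts.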
 

It should note, for example, that these elementary definitions with hypotheses would
work even if those functions only had a definition when provided complex values.
It should also note somewhere that if these functions are given non-CC they are
equal to the empty set, and thus  (for example)
A + B = A x. B for all A, B if either are non-CC - including proper classes!

To be honest I think that this theorem is more confusing than anything else, and doesn't need to be brought specifically to attention. The empty set is equal to a lot of things: ( 9 + om ), ( 2 ^ pi ), ( Poly ` +oo ), ( iota x x e. _V ), sum_ n e. ZZ 0 and a bunch of other nonsense. I think that the convention of "make nonsense equal the empty set" is fairly easy to understand in addition to being useful in many ways, although it does lead to many unusual equalities if you think too hard about specific choices of nonsense.
 
On Fri, May 1, 2015 at 1:19 PM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
This is exactly the sort of thing that I think should be documented
(e.g., somewhere near the beginning the "elementary" section of set.mm).
Something like this:

--snip--

I think it is more consistent with our usual convention to have such discussion on the first theorem that uses the "tricks", which would be the first theorem in the proposed new section, rather than before it where it seems a bit unmotivated. Other than that I think this would be a good discussion for the comment.

Mario

David A. Wheeler

May 1, 2015, 6:01:51 PM
to metamath
David A. Wheeler wrote:
> > *Personally* I like the idea of limiting "+" so it cannot be used with
> > non-CC arguments; I think that's consistent with how some sources define them.

On Fri, 1 May 2015 15:11:04 -0400, Mario Carneiro wrote:
> You asked earlier if there was a drawback to a "partial definition", and
> the answer is yes, in a certain precise sense it is the *objectively worst*
> choice for theorem proving. Think of it this way: we have a certain concept
> we want to define, and it is universally understood to have whatever
> meaning in its "intended domain". Outside that domain, we are free to give
> it any definition we like. Under *any* definition that matches the agreed
> one on the intended domain, any theorem with hypotheses that limit all
> arguments to the intended domain will be provable. For any specific
> definition, there will also be some extra theorems that extend beyond the
> intended domain (at the very least, the definition itself). These theorems
> vary from one alternative definition to the next, but almost any definition
> will allow some theorems to be stated with fewer assumptions than what
> would be needed to put all arguments in the intended domain.
>
> In contrast, a partial definition is the "worst of both worlds": *none* of
> these extra theorems are provable, so all theorems must come with intended

Fair enough, and clearly stated (thank you!).

I'd like to see a summary of this somewhere in set.mm.
A lot of books limit functions like + to CC, while set.mm intentionally defines
the out-of-domain result as the empty set. Thus, set.mm is different
in this respect from some other works. That's not a bad thing.
However, it'd be good to have documentation
inside set.mm for the rationale on WHY that decision was made,
with enough hyperlinks so that people are likely to actually *find* that rationale.
What you just posted looks like a good start :-).

More generally, I'd like to see more rationales written down
in comments on WHY things are done in a certain way.
I can see the definitions, but the motivations for them aren't always obvious.


> The empty set is equal to a lot of things: ( 9 + om ), ( 2 ^ pi ), ( Poly ` +oo ), (
> iota x x e. _V ), sum_ n e. ZZ 0 and a bunch of other nonsense. I think
> that the convention of "make nonsense equal the empty set" is fairly easy
> to understand in addition to being useful in many ways, although it does
> lead to many unusual equalities if you think too hard about specific
> choices of nonsense.

My point is just that it should be specifically *documented* that,
in many definitions, "nonsense" produces the empty set... and that therefore,
under these definitions, a lot of unusual equalities result.
A few examples of what is "equal" under these definitions would help the reader.

BTW, ( 2 ^ pi ) is a gotcha. In many systems that is NOT nonsense.
I don't have a solution for the gotcha.


> I think it is more consistent with our usual convention to have such
> discussion on the first theorem that uses the "tricks", which would be the
> first theorem in the proposed new section, rather than before it where it
> seems a bit unmotivated. Other than that I think this would be a good
> discussion for the comment.

Okay, that location makes sense.

That said, I think there should be forward references; specifically,
from the various places where people MIGHT care, to the comment
which explains what's going on.
E.G., I think it'd be useful to include a link inside the comments for
(1) the initial definition and (2) the first theorems that use unnecessary hypotheses.

--- David A. Wheeler

Cris Perdue

May 1, 2015, 9:01:10 PM
to meta...@googlegroups.com
On the virtues of "partial definitions"  and the assertions that in some sense they are the "objectively worst" choice.

On Fri, May 1, 2015 at 12:11 PM, Mario Carneiro <di....@gmail.com> wrote:


On Fri, May 1, 2015 at 11:11 AM, David A. Wheeler <dwhe...@dwheeler.com> wrote:
*Personally* I like the idea of limiting "+" so it cannot be used with non-CC arguments;
I think that's consistent with how some sources define them.

You asked earlier if there was a drawback to a "partial definition", and the answer is yes, in a certain precise sense it is the *objectively worst* choice for theorem proving. Think of it this way: we have a certain concept we want to define, and it is universally understood to have whatever meaning in its "intended domain". Outside that domain, we are free to give it any definition we like. Under *any* definition that matches the agreed one on the intended domain, any theorem with hypotheses that limit all arguments to the intended domain will be provable. For any specific definition, there will also be some extra theorems that extend beyond the intended domain (at the very least, the definition itself). These theorems vary from one alternative definition to the next, but almost any definition will allow some theorems to be stated with fewer assumptions than what would be needed to put all arguments in the intended domain.

All of this seems to suggest that there is value in the extra theorems that are true outside the intended domain of a definition. Definitions with greater generality often result in theorems of greater generality. For example theorems about fields apply to the real numbers. However defining (say) addition to apply to all sets does not necessarily mean that there are useful or interesting theorems about addition of sets that do not represent numbers. Commutativity has been mentioned, A + B = B + A. If the values are the empty set when given a non-number input, this is not a useful or interesting theorem.

What if the definition results in less trivial theorems when applied to domains extended beyond the usual? There must be cases where this occurs, and I would think that if the definition is understandable in the "usual" case, the generalization might be quite interesting.

I do think that one significant value of Metamath is in helping self-study of the foundations of mathematics. For learners, I would much favor definitions that explicitly indicate the intended domain -- in other words "partial definitions", because the conditions guide the user's expectations. Without the extra conditions, the learner may imagine there will be interesting results for all sets, when really the results are only interesting on the intended domain. Even nonstandard results I think can be confusing, especially when the result depends on peculiarities of the definition used. In such a case Metamath might give results that actually contradict other sources. It seems to me that it is generally more to the benefit of users if results in Metamath agree with results from other sources.

Also for learners, I believe that theorems with explicit hypotheses about the intended domains help guide their expectations regarding where to apply them, and I think help with browsing of the material on the site.  There is some tedium including the hypotheses explicitly, but I think it also has a payoff.

Let me give one trivial example. With the von Neumann definition of the ordinals, the empty set is zero. If A + B is the empty set given a non-number input, then A + B = 0 in this sense, probably not a fact intended as a benefit of the definition of addition.
 

Mario Carneiro

May 1, 2015, 11:41:29 PM
to metamath
On Fri, May 1, 2015 at 9:01 PM, Cris Perdue <cr...@perdues.com> wrote:
All of this seems to suggest that there is value in the extra theorems that are true outside the intended domain of a definition. Definitions with greater generality often result in theorems of greater generality. For example theorems about fields apply to the real numbers. However defining (say) addition to apply to all sets does not necessarily mean that there are useful or interesting theorems about addition of sets that do not represent numbers. Commutativity has been mentioned, A + B = B + A. If the values are the empty set when given a non-number input, this is not a useful or interesting theorem.

I may not have been clear enough on this point, but let me emphasize that I have no interest in proving interesting facts about addition of sets outside the complexes. It is a bit subtle, but it turns out that by generalizing theorems beyond the intended domain, you can prove theorems within the intended domain faster and more easily. I'm having some difficulty coming up with examples, but this is a very common technique in both standard mathematics and Metamath. Here's one: Suppose I want to prove that |sin x| >= 0 for real x, and suppose that sin x is defined as ( e^ix - e^-ix ) / ( 2i ). If the absolute value function were defined as if(x>0,x,-x), as is normal in real analysis, then the proof would require me to first show that sin x is real, which is not immediately obvious given the way it has been written. But if I generalize the definition of absolute value to |z| = sqrt(z z*), then |z| >= 0 for any complex z, and so now I don't even need to know that sin x is real (even though it *is* real in this case), and complex closure of sin x is clear from the expression.
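The same point in executable form (Python; this is only a numerical illustration of the argument above, not a proof):

import cmath, math

def csin(z):
    # sin z = ( e^iz - e^-iz ) / ( 2i ), which makes sense for any complex z
    return (cmath.exp(1j * z) - cmath.exp(-1j * z)) / 2j

def cabs(z):
    # generalized |z| = sqrt( z z* ): defined and nonnegative for every complex z
    return math.sqrt((z * z.conjugate()).real)

for x in [0.0, 1.0, -2.5, 3.0]:
    assert cabs(csin(x)) >= 0.0   # no separate "sin x is real" step is needed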
 
What if the definition results in less trivial theorems when applied to domains extended beyond the usual? There must be cases where this occurs, and I would think that if the definition is understandable in the "usual" case, the generalization might be quite interesting.

Actually, when reasonable I try to state a definition in as much generality as I can without significantly complicating theorems - usually the correct domain is clear by just letting it be the largest domain such that all the terms in the definition make sense. One recent definition which I am rather happy with for its unexpected generality is the "Locally A" predicate (df-lly). The "A" is intended to be some collection of subspaces of the topology, but after writing it down I realized that there is no need for "A" to have any particular properties; it need not be a set or a subclass of Top or anything else, and since all the proofs leverage properties of subspace topologies you actually get quite a lot of nontrivial theorems out with no restrictions on "A".
 
I do think that one significant value of Metamath is in helping self-study of the foundations of mathematics. For learners, I would much favor definitions that explicitly indicate the intended domain -- in other words "partial definitions", because the conditions guide the user's expectations.

You don't need partial definitions to learn the intended domain - it is almost always indicated in the definition itself as well as most theorems about the concept in one way or another. For functions, this is the domain of the function, which is written right in the mapping. After it is applied, either the domain restriction will be part of the hypothesis, or there will be a "reverse closure" theorem asserting that the parameters are in the intended domain when the function is applicable (cncfrss and lmfpm are examples of this).
 
Without the extra conditions, the learner may imagine there will be interesting results for all sets, when really the results are only interesting on the intended domain. Even nonstandard results I think can be confusing, especially when the result depends on peculiarities of the definition used. In such a case Metamath might give results that actually contradict other sources. It seems to me that it is generally more to the benefit of users if results in Metamath agree with results from other sources.

This should not happen if the definition agrees on the intended domain and all parameters are in the intended domain as well (which you can trivially assure by restricting the theorem to add any missing domain closure hypotheses). In general, though, you should not expect that definitions match in all the fine points. For example I happen to know that in HOL Light the set of natural numbers is defined such that n is the set of all n-element sets (in a certain enclosing type). This definition is provably different from our definition of om (their naturals contain an infinite set, ours does not), and there are good reasons for both our choice and theirs (this Frege-Russell definition is natural in type theories, but produces a proper class in ZFC).

I am also reminded of a conversation I had with some of the folks on the Mizar team last year. I asked them what they did about division by zero, and they told me that they define division such that x/0 = 0. At first I thought that this was necessary because of the typing, to make division a total function, but they said no, it would have been possible to define it as a partial function; they made the definition this way simply because it is easier to work with, and I couldn't disagree - there are several theorems that lose the =/= 0 constraint under this definition, like x/-y = -x/y, 1/(1/x) = x, x*(1/y) = x/y and many others.
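For what it's worth, those identities are easy to sanity-check in a toy total-division model (Python with exact rationals; divz is hypothetical and only mirrors the x/0 = 0 convention described above):

from fractions import Fraction as F

def divz(x, y):
    # total division with the convention x / 0 = 0
    return F(0) if y == 0 else F(x) / F(y)

vals = [F(-2), F(0), F(1), F(3, 7)]
for x in vals:
    for y in vals:
        assert divz(x, -y) == -divz(x, y)     # x / -y = -( x / y ), no y =/= 0 needed
        assert divz(1, divz(1, x)) == x       # 1 / ( 1 / x ) = x, even for x = 0
        assert x * divz(1, y) == divz(x, y)   # x ( 1 / y ) = x / y, even for y = 0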

Also for learners, I believe that theorems with explicit hypotheses about the intended domains help guide their expectations regarding where to apply them, and I think help with browsing of the material on the site.  There is some tedium including the hypotheses explicitly, but I think it also has a payoff.

The primary usefulness of this technique is in "utility theorems" which are optimized for efficient usage over presentation because they are used so much. It is a bit of an unfortunate coincidence that the real numbers section is optimized for pedagogy while algebraic theorems have such a utilitarian aspect to them. The propositional calculus section has also been mercilessly optimized for usage, but it doesn't get the same bad rap possibly because there isn't really anything that can be classified as a "weird trick" in that section.
 
Let me give one trivial example. With the von Neumann definition of the ordinals, the empty set is zero. If A + B is the empty set given a non-number input, then A + B = 0 in this sense, probably not a fact intended as a benefit of the definition of addition.

Of course if that was the case you would have A - A = 0 with no assumptions on A. Or if we knew that 0 =/= (/), then we could prove 1 / A =/= 0 with no assumptions. Every piece of information you get about out-of-domain behavior simplifies some theorem. Of course you could also prove -. A e. CC -> A + B = 0 but since this theorem is not applicable to complex numbers it is not as practically useful.

Mario

Norman Megill

May 2, 2015, 12:47:58 AM
to meta...@googlegroups.com
While I'm still unsettled about this, let me mention a couple of things
we may want to think about related to "brittleness" (for lack of a
better word, or maybe that is the right word).  Essentially, we should
keep in mind the possibility of generalizations that may happen in the
future and what impact they may have.


On Tuesday, April 28, 2015 at 10:24:17 AM UTC-4, Mario Carneiro wrote:
>    ( A < B -> ( A e. RR* /\ B e. RR* ) )

One reason this vaguely bothers me is that we are now forever committed
to having RR* be the domain of < if we eliminate the domain requirement
with this trick.  At one point in the past, I extended < from RR to RR*,
and it was painless because all theorems using < had antecedents
specifying RR, so they automatically continued to work without
modification.  This wouldn't have been the case had we eliminated their
hypotheses with ( A < B -> ( A e. RR /\ B e. RR ) ).

Now I don't expect that we will extend < further; this is just an
example of what could have happened.  There may be other things we might
eventually extend, for example adding +oo, -oo to one or more of the
arithmetic operations, and ideally existing theorems should continue to
work.

(BTW I would not want to add +oo, -oo to the arithmetic operations until a
definite advantage arises, which I don't see happening anytime soon.  As usual,
I don't like to add things unless they are clearly needed.  The extension
to RR* was very clearly needed by the work I wanted to do, and it is
also a very common literature extension compared to extending the
operations.)


Another example, if you recall from an old email:

> By the way, on a slightly unrelated note, I just noticed that you can set
> it up so that "W e. H" implies "K e. HL", thus simplifying that conjunct
> you've been trying to clean up in all your theorems. You have to change the
> definition df-lhyp to
>
> |- LHyp = ( k e. HL |-> { x e. ( Base ` k ) | x ( <o ` k ) ( 1. ` k ) } )

(vs. the current:)
 |- LHyp = ( k e. _V |-> { x e. ( Base ` k ) | x ( <o ` k ) ( 1. ` k ) } )

>
> which is to say, restrict the domain to HL, but this isn't a problem since
> all theorems that use it eventually make this assumption. One nice property
> of getting elements of a mapping operation, in a construction like W e. (
> LHyp ` K ), is that since it is the empty set outside its domain (as well
> as when K is not a set), this gives closure of both W and K here.

Thus we can use merely 'W e. ( LHyp ` K )' instead of
'( K e. HL /\ W e. ( LHyp ` K ) )' as an antecedent to many of the
theorems in my mathbox.  While this is very clever, it restricts the use
of LHyp to HLs (Hilbert lattices).  There are many kinds of lattices
where hyperplanes are important, and it is possible that I may want to
use it for other kinds in the future.  It also obscures the fact that
K e. HL, i.e. that the theorem is specifically about Hilbert lattices
(an important fact for the reader), unless the reader goes back to the definition
of LHyp.


Finally, if we ever want to translate set.mm to another proof language,
I think it would be harder not having the intended domain specified,
because their out-of-domain behavior (if even defined) might be
different.  OTOH maybe we want to encourage a unidirectional translation
to Metamath as the universal language.  :)


P.S.  I assume that your proposal still prohibits the use of (/) e/ CC,
and that it doesn't sneak in somehow.  :)  We want to keep that
prohibition to keep the development independent of the CC construction.

Norm

Mario Carneiro

May 2, 2015, 1:33:34 AM
to metamath
On Sat, May 2, 2015 at 12:47 AM, Norman Megill <n...@alum.mit.edu> wrote:
While I'm still unsettled about this, let me mention a couple of things
we may want to think about related to "brittleness" (for lack of a
better word, or maybe that is the right word).  Essentially, we should
keep in mind the possibility of generalizations that may happen in the
future and what impact they may have.

You are absolutely right about brittleness. Using the example of the two definitions of ">", if we wanted to switch from definition (a) to (b), we would have to revise theorems 5-8 to add the hypothesis to the theorems that are no longer true generally, and also presumably generalize the theorems that are now true generally as a result of the new definition. If all theorems used in-domain hypotheses, the theorem proofs may need to change but the statements would not, so the change would be much more painless. It's basically a tradeoff between proof length and the extra maintenance needed to keep it that way. (The other solution is not to change the definition, of course.)
 
On Tuesday, April 28, 2015 at 10:24:17 AM UTC-4, Mario Carneiro wrote:
>    ( A < B -> ( A e. RR* /\ B e. RR* ) )

One reason this vaguely bothers me is that we are now forever committed
to having RR* be the domain of < if we eliminate the domain requirement
with this trick.  At one point in the past, I extended < from RR to RR*,
and it was painless because all theorems using < had antecedents
specifying RR, so they automatically continued to work without
modification.  This wouldn't have been the case had we eliminated their
hypotheses with ( A < B -> ( A e. RR /\ B e. RR ) ).

I don't think this is exactly correct. It's hard to conceive of yet more generalization beyond the current "<", so consider the RR -> RR* change. It is true that you would have gone from ( A < B -> A e. RR ) to the weaker ( A < B -> A e. RR* ), but since the extended real less-than is also a total order many (all?) of the ordering theorems that apply to RR* would have their forms unchanged from the original, for example ( ( A < B /\ B < C ) -> A < C ), where the same theorem that applied to RR now works on RR*, but with no change to users of RR inequality. However, you would see a change for those theorems that need RR specifically, say ( ( A < B /\ C e. RR ) -> ( A + C ) < ( B + C ) ). But I'm sure you won't be surprised that I prefer aggressive maintenance and short proofs because it keeps the database nimble and efficient. At some point, set.mm may grow to the point that I can't just revise the whole database every time I get a good idea for a definition improvement, and then a more conservative strategy will be needed, but for now I would just say "if the definition changes I'll just revise the database to fix what needs to be fixed".
 
(BTW I would not want to add +oo, -oo to the arithmetic operations until a
definite advantage arises, which I don't see happening anytime soon.  As usual,
I don't like to add things unless they are clearly needed.  The extension
to RR* was very clearly needed by the work I wanted to do, and it is
also a very common literature extension compared to extending the
operations.)

I've thought about this very thing in the past. One issue is that ( CC u. RR* ) is not a very nice set, and so you would lose the elegance of theorems like addcn if the domain was not CC, nothing more, nothing less. FL actually has a definition for this, df-adde, although I would revise it so that ( +oo + -oo ) = ( -oo + +oo ) = 0 instead to get unconditional closure. Something like extended real addition is often needed when dealing with measures, and several of the lebesgue measure theorems could possibly be stated a bit more simply using this. But I agree with FL's approach of simply using an entirely separate operator tailored for this use-case, since RR* and CC extend RR in largely incompatible directions. Also needed is a topology on RR*, so that the limits don't need as many special cases - my thought is to define an order topology generator and then use ( < ordTop RR* ) or similar.
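A rough sketch of such a total extended addition (Python; the choice ( +oo + -oo ) = 0 below is just the arbitrary-but-total convention described above, not FL's actual df-adde or anything in set.mm):

import math

def xadd(a, b):
    # extended-real addition, made total by sending +oo + -oo (and -oo + +oo) to 0
    if {a, b} == {math.inf, -math.inf}:
        return 0.0
    return a + b

xs = [-math.inf, -1.0, 0.0, 2.5, math.inf]
for a in xs:
    for b in xs:
        assert not math.isnan(xadd(a, b))   # unconditional closure
        assert xadd(a, b) == xadd(b, a)     # commutativity survives the convention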

Mario

Norman Megill

May 2, 2015, 11:25:09 AM
to meta...@googlegroups.com
On Saturday, May 2, 2015 at 1:33:34 AM UTC-4, Mario Carneiro wrote:
> But I'm sure you won't be surprised that I prefer aggressive
> maintenance and short proofs because it keeps the database nimble and
> efficient.

As I've said before, part of me agrees with this.  If I were doing
set.mm to show off to a limited audience of formal math hackers (if
there were such a thing) I would have no hesitation.  It might be
analogous to writing the shortest perl program to solve Rubik's cube,
highly valued in a certain community but unreadable for most.  It comes
down to the subjective issue of the purpose of set.mm.

Here is some history.

For ordinals, set.mm uses membership and subset directly for strong and
weak ordering, without redefining them as custom relations on ordinals
(as many texts do).  Since the intended domains of e. and C_ are the
universe, I don't use "A e. On" as an antecedent whenever the theorem
also holds for non-ordinal classes.  For example, I don't restate a pure
set theory theorem such as sstr with a redundant ordinal antecedent when
I use it for ordinals; instead I just use sstr directly without an
antecedent.  This makes a lot of proofs shorter and occasionally gives
us a useful general-purpose theorem of set theory for free.  I have no
hesitation doing this, since no out-of-domain behavior is involved for
the e. and C_ symbols themselves.  (LMod is another place where I take
advantage of the fact that subspaces are ordered by subset to shorten
proofs.)

Once I added ordinal operations, I was wary about out-of-domain behavior
because it depends on arbitrary choices we made in the definitions that
aren't specified in the literature or may even contradict the
literature.  I felt that abusing those choices could be misleading.  I
particularly had the student in mind, although I also thought
mathematicians could find such abuse distasteful, turning them off to
the then-immature set.mm.  Note the comment for om0x, added in 1996:

  MM> sh st om0x/c
  "Ordinal multiplication with zero.  Definition 8.15 of [TakeutiZaring]
       p. 62.  Unlike ~ om0 , this version works whether or not ` A ` is an
       ordinal.  However, since it is an artifact of our particular function
       value definition outside the domain, we will not use it in order to be
       conventional and present it only as a curiosity."
  20567 om0x $p |- ( A .o (/) ) = (/) $= ... $.
  MM> sh st om0/c
  "Ordinal multiplication with zero.  Definition 8.15 of [TakeutiZaring]
         p. 62."
  20564 om0 $p |- ( A e. On -> ( A .o (/) ) = (/) ) $= ... $.

My caution extended even to their proofs; om0 is derived from rdg0
(like Takeuti and Zaring do implicitly) rather than just using a1i to
tack on an antecedent to om0x.

As for A+B=B+A or (A<B /\ B<C) -> A<C without hypotheses, it troubles me
a little that, in isolation, the reader has no idea what the theorem is
about.  Instead, the reader has to dig back to the definitions and
theorems like A<B -> (A e. RR* /\ B e. RR*).  Of course the comment can
alleviate some of that, but in an ideal world one might hope that the
math itself could be self-explanatory in a user-friendly way.
Unfortunately, this tends to be at the expense of longer proofs in some
cases.

You've brought up eluzle etc.  I agree I'm abusing the out-of-domain
behavior there, although at least I have an antecedent that hints at the
domain of the theorem.  I'm not rigidly single-minded about this highly
subjective issue, and I can change my opinions over time.  To me,
eluzle etc. are a compromise that slightly abuses definitions without
outright removal of hypotheses.  From a purist point of view, perhaps
they were a mistake.

I still would like to hear some more input from others.  I don't think
there is any urgency in making a decision about your proposal (I think
it was brought up as a side remark in the df-op discussion), unless
there is something particular that is holding you up.

Norm

Mario Carneiro

May 2, 2015, 11:42:48 AM
to metamath
On Sat, May 2, 2015 at 11:25 AM, Norman Megill <n...@alum.mit.edu> wrote:
I still would like to hear some more input from others.  I don't think
there is any urgency in making a decision about your proposal (I think
it was brought up as a side remark in the df-op discussion), unless
there is something particular that is holding you up.

No, this is not essential for me or the current df-op update, nor do I intend to force the issue, seeing as for now most have chimed in against the proposal. We can always revisit it later should something change to make this a more attractive option. I think it might be good to derive ( A < B -> ( A e. RR* /\ B e. RR* ) ) anyway (and not use it), so that some of the discussion here about the merits and drawbacks can be recorded in its comment, analogous to om0x and fin2inf.

Mario

Mario Carneiro

May 6, 2015, 7:05:16 AM
to metamath
On Wed, Apr 29, 2015 at 8:26 PM, Norman Megill <n...@alum.mit.edu> wrote:
I'd like to hear other opinions on this.  How do people feel about A + B = B + A, and A < B /\ B < C -> A < C without hypotheses, even though the operations and relations are intended only for CC and RR?


By the way, I have one last thing to add to this thread: I just discovered, in the course of the df-op update, that both of these are already theorems in set.mm: addcomgi is in Andrew Salmon's mathbox, and xrletr2 is in FL's mathbox (although it uses <_ instead of <). Also, my suggested reverse closure theorem ( A <_ B -> ( A e. RR* /\ B e. RR* ) ) is mlteqer in FL's mathbox.

Mario