Overall it seems to me that IT is still shying away from its huge need for identifiers in order to be written and to function properly. People don't like huge flat lists, but there are many of them, such as ISBNs for books, without which not much would function. There is also a tendency to always mix naming and classification, which is very understandable when using keywords as IDs, but this reason disappears when using number-like IDs; after all, a memory address space is also a huge flat list. Of course, if these IDs are "administrative", the question of their sources and distribution principles comes up. I have thought about this for some time already, and came up with an idea (filed and published as a patent) that I think would provide the necessary flexibility while keeping the IDs at a fixed length.

Besides, "name spaces" that are usually described as "hierarchical" are in fact not hierarchical at all. By that I mean that a file path, for instance, is very often more an s-expression over a flat list of symbols than a "hierarchical path": /mysoft/lib/doc is something like (doc (lib (mysoft))), the documentation part of the library part of mysoft, so that the symbols "lib" and "doc" are part of a global namespace and refer to the same ones as in /anothersoft/lib/doc.
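The reading of a path as a nested expression over a flat symbol list can be sketched in a few lines of Python. This is only an illustration: the `symbols` table and the helper names `intern_symbol`, `path_to_sexp` and `show` are hypothetical and are not part of the proposal described in the post.

```python
# Illustrative sketch: one flat, global symbol table shared by all paths.
symbols = {}  # name -> symbol object (hypothetical representation)


def intern_symbol(name):
    """Return the unique symbol for `name`, creating it on first use."""
    return symbols.setdefault(name, ("symbol", name))


def path_to_sexp(path):
    """Turn '/mysoft/lib/doc' into the nested form (doc (lib (mysoft)))."""
    expr = None
    for part in path.strip("/").split("/"):
        sym = intern_symbol(part)
        expr = (sym,) if expr is None else (sym, expr)
    return expr


def show(expr):
    """Render the nested tuples in Lisp-like notation."""
    head = expr[0][1]
    return f"({head})" if len(expr) == 1 else f"({head} {show(expr[1])})"


a = path_to_sexp("/mysoft/lib/doc")
b = path_to_sexp("/anothersoft/lib/doc")
print(show(a))       # (doc (lib (mysoft)))
print(show(b))       # (doc (lib (anothersoft)))
print(a[0] is b[0])  # True: both paths use the very same "doc" symbol
```

The two paths differ only in their innermost symbol; "lib" and "doc" are looked up in the same flat table in both cases.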
More on this here: http://iiscn.wordpress.com/about/ (unfortunately only in French for the time being; I hope to find time to make an English version).
Note: all of the above refers to publications, programs, works, lists or sets of these things, and other concepts or conventions such as Unicode code points, not to individuals or persons. On that point I am perfectly in line with the IEEE position statement below: http://web.archive.org/web/20041106011802/http://www.ieeeusa.org/poli...
Which, strangely, has disappeared from the IEEE web site.
In other words, much more than new syntaxes or models, what IT needs is a common labels distributor, or a common huge stack of pins.
> In other words, much more than new syntaxes or models, what IT needs > is a common labels distributor, or a common huge stack of pins.
1. UUIDs: <http://en.wikipedia.org/wiki/Universally_unique_identifier> give you *independent* unique "pins," but these are, quite intentionally, *not* a "common labels distributor." The practical solution is to *not* coordinate, but to make the likelihood of collision very small through the length of the ID. The whole point of UUIDs is to avoid the computational, network, and administrative overhead of global UUID coordination.
And there are platform specific calls for UUIDs as well.
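As one example of such a call, Python's standard library exposes `uuid.uuid4()`. The sketch below also estimates how unlikely a collision is, using the usual birthday-bound approximation; the helper name `collision_probability` and the sample sizes are illustrative choices, not anything from the thread.

```python
import math
import uuid

# Standard-library call; other platforms expose their own equivalents.
u = uuid.uuid4()
print(u, u.version)

RANDOM_BITS = 122  # a version 4 UUID carries 122 random bits (6 bits are fixed)


def collision_probability(n, bits=RANDOM_BITS):
    """Birthday-bound approximation: p ~ 1 - exp(-n^2 / (2 * 2^bits))."""
    return -math.expm1(-(n * n) / (2.0 * 2.0 ** bits))


for n in (10**6, 10**9, 10**12):
    p = collision_probability(n)
    print(f"{n:.0e} IDs -> collision probability ~ {p:.3e}")
```

Even with a trillion independently generated IDs, the estimated probability of any collision stays far below one in a trillion, which is the sense in which no coordination is needed.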
What you're asking for is a globally coordinated, UUID database. I think this will have to wait for a repressive world government or robot overlords, whichever comes first ;^). IOW, it's only really useful if every programmer on the planet can be compelled to use it for every software entity that sees the light of day (i.e., interacts with other software or users over any network). Enforcing that would require an authoritarian world government. Remember, unless you can enforce registration with this global database, you'll still run the risk of collisions.
> Note : All of the above refers to publications, programs, works, list > or set of these things, other concepts or conventions such as unicode > code points, and not to individuals or persons, on this I am perfectly > in line with below IEEE position statement : > http://web.archive.org/web/20041106011802/http://www.ieeeusa.org/poli...
> Which, that is a bit strange, has disappeared from the ieee web site.
2. Actually, the position statement and its background are still on the ieee website:
3. Beyond this clerical matter, I think it approaches doublespeak to say that software entities should be uniquely identified, but that people shouldn't, since people are frequently components of software systems (customers, account holders, voters, taxpayers, deed holders, prisoners, etc.).
IOW, either universal identifiers are a good idea or they aren't, and the ieee position paper suggests that they are not. If the ieee position is correct, then if you want security and privacy built into your systems from the ground up, you want "a family of identifiers [that] would allow different identifiers to be used, as appropriate to the security needs, privacy desires, and other tradeoffs of different transactions or situations," e.g., my social security number should not be included as part of the "From:" field of this Usenet post (and of course, it isn't).
Now, within any given domain, we certainly want to require uniqueness - for example, no two people should have the same email address - but I'm pretty sure that this need for uniqueness within a given domain has been well known for a very long time.
<raffaelcavall...@pas.despam.s.il.vous.plait.mac.com> wrote: > On 2011-06-22 05:50:09 +0000, yves75 said:
Thanks for your answer, Raffael. I know perfectly well about UUIDs, but they do not provide the necessary "functionality". Typically UUIDs are used for objects with a short or medium "life span", and/or for objects whose reference is never seen and never needs to be typed. But nobody would think of using UUIDs as order IDs, for instance, or for documents such as the ones with DOIs, or for all the products carrying product IDs as GS1 bar codes, or for ISANs, and even less for signs and characters such as Unicode code points. Why would "IT" think its problem is different? The point is that it isn't at all, and a major part of the mess is due to not recognizing that.
About authoritarian world government: not at all. The ISBN org, for instance, is a non-profit organisation located in Germany I think, far from being a world government, and I doubt the ISBN.org organisation ever had a single word to say about which book should be published or not! (apart from books about its own standard, maybe)
And in fact the proposed solution removes the fixed number of segments in IDs (the usual case today), as well as a defined size for each segment (it is in fact based on a mathematical encode/decode algorithm), so that it would, I think, provide the necessary flexibility, as well as the ability to integrate most of the flat ID spaces already defined today.
The idea is then to use these "ISCNs" in a very "native" way in the programming environment, typically by building a Lisp world where they are used exactly (or almost) as memory locations are used today in a "classic" Lisp environment.
And overall the message is also that there is no "solution" to be found in a new syntax or model (as we hear time and time again), but really in having these shared ID spaces. (People always tend to hear "syntax, grammar, structure, rules" when they hear "language", and especially "programming language", but in the end the most important thing is the dictionary, which for programming languages means more or less the libraries and defined symbols.)
As to "I think it approaches doublespeak to say that software entities should be uniquely identified, but that people shouldn't, since people are frequently components of software systems (customers, account holders, voters, taxpayers, deed holders, prisoners, etc.)": I am not sure why you say so. A publication is a publication; the fact that it is given a number is independent of its author.

But in fact the idea presented on my "blog" on this aspect is that there should be a clear separation between the entities responsible for maintaining a list of "personal contracts" on people's "accounts" and the entities providing the associated services or contents. If you look at possible protocols, it is very easy to define them such that no account ID at all has to be transmitted between "account holders" and "service providers". Today we are heading in a direction where it is seen as inevitable that either service providers also maintain the account, or that a common ID has to be used in all cases. Not the case at all!

Today it is a bit as if you needed an account with each company issuing shares in order to buy some. Clearly there must be a separation between the entities responsible for maintaining your "personal bookshelf and key data, that is, references" (and only doing that, while being strictly forbidden from either looking at it or publishing it) and the entities responsible for providing services/contents. There should be several of each kind, which is perfectly feasible, and you should be able to transfer your complete account from one "account holder" entity to another.
Also don't forget that 128 bits (the size of an IPv6 address; IP addresses are also distributed in an "administrative" fashion, by the way) corresponds to about 6.7 × 10^17 numbers/IDs per square millimeter of the Earth's surface, or about 10^18 numbers/IDs per second and per inhabitant over a millennium for a population of 10 billion.
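A quick back-of-the-envelope check of the size of a 128-bit space, assuming an Earth surface of about 510 million km² and a 365.25-day year (both round values chosen for this sketch). Under these assumptions the result is roughly 6.7 × 10^17 IDs per square millimeter and roughly 10^18 IDs per person per second over a millennium.

```python
# Back-of-the-envelope arithmetic; surface area and year length are assumed round values.
ID_SPACE = 2 ** 128

EARTH_SURFACE_KM2 = 510e6   # assumed: about 510 million km^2
MM2_PER_KM2 = 1e12          # (10^6 mm per km) squared
ids_per_mm2 = ID_SPACE / (EARTH_SURFACE_KM2 * MM2_PER_KM2)

SECONDS_PER_YEAR = 365.25 * 24 * 3600
POPULATION = 10e9           # 10 billion inhabitants
ids_per_person_second = ID_SPACE / (POPULATION * 1000 * SECONDS_PER_YEAR)

print(f"{ids_per_mm2:.1e} IDs per mm^2 of Earth surface")
print(f"{ids_per_person_second:.1e} IDs per person per second over a millennium")
```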
With 128 bits the proposal would lose some of these bits, but there is still a lot. And the point is really to improve the ability to write and modify the thing, not to oblige everyone to use it.
> 3. Beyond this clerical matter, I think it approaches doublespeak to > say that software entities should be uniquely identified, but that > people shouldn't, since people are frequently components of software > systems (customers, account holders, voters, taxpayers, deed holders, > prisioners, etc.).
French law prohibits the use of universal identifiers for people.
For example, enterprises should not use the Sécurité Sociale number as an identifier in their databases.
> IOW, either universal identifiers are a good idea or they aren't, and > the ieee position paper suggests that they are not. If the ieee > position is correct, then if you want security and privacy built into > your systems from the ground up, you want "a family of identifiers > [that] would allow different identifiers to be used, as appropriate to > the security needs, privacy desires, and other tradeoffs of different > transactions or situations," e.g., my social security number should > not be included as part of the "From:" field of this Usenet post (and > of course, it isn't).
Well, in France, since you must not use the SSN as an identifier, it doesn't matter if you publish an SSN. Actually, most of them are a matter of public record:
S YY MM DD CCC NNN KK
S = sex (1 = male, 2 = female, other values are possible...)
YY = birth year modulo 100
MM = birth month
DD = département
CCC = commune
NNN = number of your entry in the birth register of the commune (for that year and month)
KK = check "sum"
So you can compute the SSN for normal people. There are other numbers allocated for aliens, or for other situations, that are less guessable, but as mentioned above, it doesn't matter, since only the Sécurité Sociale uses them as identifiers. Each administration and each enterprise has its own identifier scheme.
Now, determining the degree to which this law is a law forbidding gravity is left as an exercise to the reader.
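A minimal sketch of how such a number can be assembled from the fields listed above. The post only says that KK is a check "sum"; the formula used here (97 minus the 13 leading digits taken modulo 97) is the commonly documented rule for this number and is an addition to the post. The function names and the sample field values are made up for illustration, and the Corsican 2A/2B département codes are not handled.

```python
def nir_key(first13: str) -> str:
    """Two-digit key for the 13 leading digits: 97 - (number mod 97)."""
    if len(first13) != 13 or not first13.isdigit():
        raise ValueError("expected exactly 13 digits (2A/2B codes not handled)")
    return f"{97 - int(first13) % 97:02d}"


def build_nir(sex, yy, mm, dept, commune, rank):
    """Assemble S YY MM DD CCC NNN KK from its components."""
    body = f"{sex:d}{yy:02d}{mm:02d}{dept:02d}{commune:03d}{rank:03d}"
    return body + nir_key(body)


def parse_nir(nir: str) -> dict:
    """Split a 15-digit number back into the fields of the layout above."""
    return {
        "S": nir[0], "YY": nir[1:3], "MM": nir[3:5], "DD": nir[5:7],
        "CCC": nir[7:10], "NNN": nir[10:13], "KK": nir[13:15],
        "key_ok": nir_key(nir[:13]) == nir[13:15],
    }


# Made-up example values, not a real person's number.
example = build_nir(sex=1, yy=85, mm=7, dept=75, commune=56, rank=123)
print(example)
print(parse_nir(example))
```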
> Now, within any given domain, we certainly want to require uniqueness > - for example, no two people should have the same email address - but > I'm pretty sure that this need for uniqueness within a given domain > has been well known for a very long time.
That doesn't prevent a lot of couples from using the same email address (and therefore from registering only once on such domains, or from using the + trick).
Really, the proposal isn't about people identifiers at all. To tell the truth, I "hate" this whole real-names, non-anonymous mindset, whether on the "coolest" side à la Facebook or on the most "technoid" one, like an RFID chip under everybody's skin (there are some people, even academics, who say they are in favor of such a thing). Pseudonyms have been used for a very long time!

The proposal is really about the "book of technology" (which can be seen as all machines and programs stopped, considering only their evolution through new versions, with the connections between different instances somehow also being part of the writing; then you can also consider the contents library). For people, on the other hand, it recognizes the need for a "personal contracts, accounts, and licenses library" to be handled by X or Y and transferable, but basically that doesn't change anything you can do now (several emails, shared emails, etc.), and it wouldn't impose any common per-person ID to be used in the systems of different actors. It is also about having something which isn't intrinsically monopoly-based.

And for contents, I would really appreciate being able to "buy a website" (access to it) for life in a single action, for instance the Petit Robert 2011, or many other possible things (which is basically the same as buying an iPhone app, by the way).
> solution is to *not* coordinate, but make the liklihood of collision > very small because of the length of the ID. The whole point of UUIDs
One way to make the probability of collision zero would be to derive the UUID from the date and time and the latitude and longitude of the place where it was generated. If the time resolution is fine enough, each new UUID would be different because it would be generated at a different time. And if the geographic resolution is fine enough, each place where UUIDs are generated would generate different ones. If you need them more frequently than clock cycles, you can count up from zero at the start of each clock cycle.
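A rough sketch of that scheme, under stated assumptions: the class name `SpaceTimeIdGenerator`, the 1e-7 degree position grid and the tuple layout are all hypothetical choices made for illustration. Uniqueness here depends on no two generators sharing the same quantized position and on the clock never stepping backwards.

```python
import time


class SpaceTimeIdGenerator:
    """Hypothetical generator yielding (timestamp, latitude, longitude, counter)."""

    def __init__(self, lat_deg, lon_deg, clock=time.time_ns):
        # Quantize the position to 1e-7 degree (about a centimetre of latitude).
        self.lat = round((lat_deg + 90.0) * 1e7)
        self.lon = round((lon_deg + 180.0) * 1e7)
        self.clock = clock
        self.last_tick = None
        self.counter = 0

    def next_id(self):
        tick = self.clock()
        if tick == self.last_tick:
            self.counter += 1  # several IDs requested within one clock tick
        else:
            self.last_tick, self.counter = tick, 0
        return (tick, self.lat, self.lon, self.counter)


# A frozen clock shows the counter taking over within a single tick.
frozen = SpaceTimeIdGenerator(48.8566, 2.3522, clock=lambda: 1000)
print(frozen.next_id())
print(frozen.next_id())

# With the real clock, successive IDs differ by timestamp and/or counter.
live = SpaceTimeIdGenerator(48.8566, 2.3522)
print(live.next_id() != live.next_id())
```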
>> solution is to *not* coordinate, but make the liklihood of collision >> very small because of the length of the ID. The whole point of UUIDs
> One way to make the probability of collision zero would be to derive the > UUID from the date and time and the latitude and longitude of the place > where it was generated.
We already have computer systems around or on different bodies in the solar system. So you will have to add the body index relative to the Sun.
We already have identified bodies around other stars, so you will have to add the star index in the Milky Way.
And a galaxy index, and a cluster index, and a super cluster index.
I guess until we find a way to travel "faster than light", we can skip the light cone identifier and the universe identifier.
> About authoritarian world government not at all, the ISBN org for > instance, is a non profit organisation located in Germany I think, far > from being a world government, and I doubt the ISBN.org organisation > ever had a single word at all to say about which book should be > published or not !
Malicious parties can spoof/counterfeit IDs and only law enforcement can stop it.
To use your ISBN example, there are already fraudulent printers who print unauthorized books, in violation of the Berne Convention, using the ISBN code for the book they are counterfeiting. ISBN codes are being hacked *already*. UPC codes have been hacked to allow lower price purchase of merchandise.
A global identification database is only as useful as the degree of compliance. Compliance is a law enforcement matter; the only reason ISBN numbers work as well as they do is because national governments take a dim view of book counterfeiting and periodically raid such fraudulent printers; the only reason UPC codes work as well as they do is because police arrest people counterfeiting them.
Enforcing *universal* compliance is an impossibility. That's why programmers wisely chose a system (UUIDs) that doesn't require such enforcement in order to function reliably.
<raffaelcavall...@pas.despam.s.il.vous.plait.mac.com> wrote: > On 2011-06-22 16:49:00 +0000, yves75 said:
Raffael, first of all, the most usual way to generate UUIDs is to take a MAC address (MAC addresses are distributed administratively by the IEEE for every Ethernet or layer 2 device) and "lengthen" it up to 128 bits with random numbers, time stamps, or a mix of both (which in itself can raise "traceability" questions), or sometimes to take a "domain name" as the base (also distributed or "managed" administratively) and lengthen it in the same way. So in a sense UUIDs are an extension of administratively distributed IDs.

But the key point is this: take the "primary_keys" or "foreign_keys" columns of 99% of today's RDBMS tables, and I don't think you will find any UUIDs in there. Again, UUIDs are mostly used for "transient" OS objects, not for "real" persistent objects, be they business oriented such as "order ids", "shipment ids", "product ids", "ticket numbers", "SIRET numbers", network oriented such as "node ids", "gateway ids", "port IDs", "all the IDs associated to phones or mobile devices", or contents oriented such as "document ids", "ISANs", "IMDB ids", "youtube video IDs", and even less for all the attributes or functions defined and used on these. I don't think anybody is really thinking of using them for these, as noted for instance on page 2 of the document below: http://bibnum.bnf.fr/identifiants/identifiants-200605.pdf

Then, about faking existing ISBNs, ISCNs or other administratively distributed IDs: I don't see why UUIDs would make any difference there. Any existing ID can be "faked", that is, copied and used in a context where it shouldn't have been, and UUIDs are in no way different. In the end this is based on a "faith or trust chain": if I order a book with a given ISBN on Amazon or another shop, I trust Amazon to provide me the right one, and in turn Amazon trusts the publisher, printer or distributor it is dealing with to provide the right book.
It is the same if I buy an Ethernet card: I trust the card maker to have put in it a MAC address obtained through the IEEE distribution process. Or if I receive a package with a given ID from UPS or another delivery company, I trust them to deliver the package that was given this ID at shipment time. I really don't see how UUIDs, administrative IDs or not, change anything there.

However, one thing is for sure: people always complain that everything works in "silos" in "IT", but this is primarily because, if primary keys are defined in different ID spaces at the beginning, as is the case most of the time, then this becomes part of the system's TEXT (or code) and data, and the system can only evolve that way. Using IDs in the same space from the beginning, on the other hand, does not oblige anyone to put everything in the same system, but it allows later evolution or refactoring that is made almost impossible in the first case.

Cheers, Yves
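For reference, the MAC-based and name-based UUID variants mentioned at the start of this message correspond to UUID versions 1 and 5, both available in Python's standard `uuid` module. The MAC address and domain name below are made-up values used only for illustration.

```python
import uuid

# Version 1: 48-bit node (normally the MAC address) combined with a
# 60-bit timestamp and a 14-bit clock sequence.
node = 0x001122334455  # made-up MAC address, for illustration only
u1 = uuid.uuid1(node=node)
print(u1, u1.version, hex(u1.node))

# Version 5: SHA-1 hash of a namespace UUID plus a name, here a DNS name.
u5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")
print(u5, u5.version)

# Name-based UUIDs are deterministic: same namespace and name, same UUID.
print(u5 == uuid.uuid5(uuid.NAMESPACE_DNS, "example.org"))
```

The node field of a version 1 UUID can be read back directly, which is the "traceability" concern raised above.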
> >> solution is to *not* coordinate, but make the liklihood of collision > >> very small because of the length of the ID. The whole point of UUIDs
> > One way to make the probability of collision zero would be to derive the > > UUID from the date and time and the latitude and longitude of the place > > where it was generated.
> We already have computer systems around or on different bodies in the > solar system. So you will have to add the body index relative to the > Sun.
> We already have identified bodies around other stars, so you will have > to add the star index in the Milky Way.
> And a galaxy index, and a cluster index, and a super cluster index.
> I guess until we find a way to travel "faster than light", we can skip > the light cone identifier and the universe identifier.
Pascal, as already pointed out in fr.comp.lang.lisp, I truly don't understand your point regarding space travel and this ID business. Computers sent to space are just a "projection" full of the same human conventions and signs used on Earth; "IT" doesn't "model" the world in any way (apart maybe from climate modeling and the like), it just writes part of it, as does "la technique" in general.
> don't see why UUIDs would make any difference there, > any existing ID can be "faked
Version 4 UUIDs can't be easily faked because an attacker would have only a 1 in 10^36 chance of guessing correctly. Under your system, the actual IDs are entered in a *public* database making them trivial to counterfeit. This is why it is so trivial (and common) to counterfeit ISBN numbers for example; they're in a public database! This public database only works because of *law enforcement*, not because everyone on the planet magically agrees to be nicey-nice about book printing and the associated copyrights.
The only remedy is to associate your public tags with *private* keys known only to authorized parties, but then, of course, you no longer have a *single* identifier, which is what the ieee (and I) have been saying for this entire thread. If you want security, you must use multiple IDs for different purposes/levels of privacy/levels of authentication/levels of authorization.
OR just go the easy, probabilistic route, and use version 4 UUIDs, which are effectively impossible to spoof because there are more than 10^36 of them.
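The "more than 10^36" figure follows from the 122 random bits of a version 4 UUID. A small sketch of the arithmetic; the attacker speed is an arbitrary assumption chosen only to give a sense of scale, and the estimate presumes the UUIDs come from a good random source.

```python
V4_RANDOM_BITS = 122  # 128 bits minus 4 version bits and 2 variant bits
total = 2 ** V4_RANDOM_BITS
print(f"{total:.2e} possible version 4 UUIDs")

GUESSES_PER_SECOND = 1e12  # assumed attacker speed, purely illustrative
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years = total / 2 / GUESSES_PER_SECOND / SECONDS_PER_YEAR
print(f"~{years:.1e} years on average to guess one specific UUID")
```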
<raffaelcavall...@pas.despam.s.il.vous.plait.mac.com> wrote: > On 2011-06-23 06:58:16 +0000, yves75 said:
Sorry Raffael, but I think you mix up two very different things.
1) Counterfeiting a publication that already has an ID (for the publication itself, not for copy instances): this ID is public whatever system you use, as its --purpose-- is to provide access to this publication or to information related to it. So whatever ID is used to identify the publication, it is public, and you can just paste it on your counterfeit copy and that is it. Again, here you rely on a trust chain to get a valid copy. And by the way, ISBNs aren't in a "public database" at all. An ISBN is a three-segment identifier: the first segment is associated with a country (or sometimes a publication language, as for German), the second with the publisher, and the third with the book or other published item within that publisher. Then you have databases of "ISBN-identified published items" related to libraries, online or offline bookshops, etc., but there isn't a "central public database" of all "published" ISBNs (those actually used to identify a publication). You could say there could be problems of "stealing ISBNs" by a publisher that doesn't want to bother getting a publisher prefix for its publications (a bit like "unofficial port numbers"), but this isn't at all the same issue as guessing or counterfeiting ISBNs. And by the way, I don't think there are that many counterfeit books these days, at least in Europe or the US, but this hasn't always been the case, by far, as described for instance by Diderot here: http://classiques.uqac.ca/classiques/Diderot_denis/lettre_commerce_li...
2) The problem of valid license numbers or instance identifiers: there I understand your point, but this is completely different from the publication identification aspect (the set of all copies). Here I agree that it should be the responsibility of every publisher to manage the corresponding IDs, and not to use patterns that allow "license key generators" if possible. But here we get into "DRM" issues, and I think that if there were "private bookshelf account holders" (or shelves for movies, records, etc.) recording that a given account has bought this publication (the access to it), at purchase time, in an interaction between the account holder and the shop, this would be sufficient for further access to the publication/service. Although yes, copy instance IDs (or license IDs) with "privately held databases of all assigned copy or license IDs" could also be used in some cases, and they could also be picked from within a given ISCN prefix.
But again, in the "writing of the book", the issue is primarily related to "concept" and publication IDs, much more than to copy/instance IDs, or to other things such as order ids, shipment ids, etc., where this issue of "guessing" an ID doesn't really come up.
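The segment structure described in point 1) can be illustrated with a hyphenated ISBN-10, where the three segments are followed by a single check character (ISBN-13 additionally carries a 978/979 prefix). The function names are hypothetical, and the sample number is a commonly cited example rather than anything taken from the thread.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """Check an ISBN-10: weighted sum (weights 10..1) must be divisible by 11."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, ch in enumerate(chars):
        if ch == "X" and position == 9:
            value = 10  # 'X' stands for 10, allowed only as the check character
        elif ch.isdigit():
            value = int(ch)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


def isbn10_segments(isbn: str) -> dict:
    """Split a hyphenated ISBN-10 into its segments."""
    group, publisher, item, check = isbn.split("-")
    return {"group": group, "publisher": publisher, "item": item, "check": check}


sample = "0-306-40615-2"
print(isbn10_segments(sample))
print(isbn10_is_valid(sample))           # True
print(isbn10_is_valid("0-306-40615-3"))  # False: wrong check digit
```

The check character only catches typing errors; as the post notes, it does nothing against someone copying a valid number onto another item.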
> Sorry Raffael, but I think you mix up two very different things.
It is you who are confused. You specifically want a *public* database of *unique* tags. This is precisely what the IEEE recommends *against*.
Creating unique identifiers is *not* a technical problem; known solutions are in widespread use. But making such a system work as desired when the tags are *publicly* known is a political, social, legal, compliance, and enforcement problem, not a technical one.
A *public* database is only as valuable as the level of compliance. Compliance can only be enforced by governments/police. You seem to assume that everyone on the planet will willingly cooperate just because it would be convenient for you if they did. This is childishly naive.
If compliance is not enforced you get:
a. Collisions. When people make up their own tags without registering or requesting them from your global issuing authority, you *will* have collisions. Since the only purpose of your tag database is uniqueness, collisions render your whole system worthless.
b. Counterfeit. If governments/police do not enforce compliance, you *will* have bad faith actors counterfeiting tags for profit and/or malicious purposes. We already see this with ISBN codes (your own hand picked example, btw), and UPC codes. And that's *with* law enforcement trying to track down counterfeiters and stop them. What government or police organization is going to enforce *global* compliance with your proposed database?
So your only practical options are either:
1. Private, not public. Make your *unique* tags private to your internal systems. This way you avoid collisions within all of your own systems. You could use UUIDs for this as I suggested previously. The chance of a collision is practically 0, and the opportunities for malicious parties to guess an internally used UUID are likewise essentially 0.
OR
2. Not unique. You publish one type of tag, but you map it to a different *private* key for each purpose of identification, authentication, and authorization, and for each desired degree of privacy (this is what the IEEE position paper is talking about).
You can't have a *public* and *unique* ID database without enforced compliance. Compliance is a political, social, legal, and enforcement problem, not a technical one. Moreover, compliance is a very difficult problem to completely solve - that's why we have identity theft and other forms of spoofing and counterfeiting. That's why the IEEE came out *against* unique identifiers.
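The two options above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: option 1 uses random (version 4) UUIDs from the standard library, and option 2 uses an HMAC over "purpose:tag" as one possible way to map a published tag to per-purpose private keys. The HMAC mapping is an assumption chosen for the example, not something taken from the IEEE paper, and the names `new_private_tag` and `derive_private_key` are hypothetical.

```python
import hashlib
import hmac
import uuid


def new_private_tag() -> str:
    """Option 1: a random UUID used only inside your own systems."""
    return str(uuid.uuid4())


def derive_private_key(secret: bytes, public_tag: str, purpose: str) -> str:
    """Option 2: map one published tag to a distinct private key per purpose.

    Assumes `purpose` contains no ':' so that the purpose/tag pair is unambiguous.
    """
    message = f"{purpose}:{public_tag}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

With the same secret and the same published tag, `derive_private_key(secret, tag, "authentication")` and `derive_private_key(secret, tag, "authorization")` give unrelated-looking values, so knowing the public tag does not reveal the key used for either purpose.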
<raffaelcavall...@pas.despam.s.il.vous.plait.mac.com> wrote: > On 2011-06-23 18:32:19 +0000, yves75 said:
> warmest regards,
> Ralph
> -- > Raffael Cavallaro
Sorry Raffael, you just don't understand what I'm talking about, or the issue being addressed, and putting little stars around every other word doesn't change much about it.
Just consider that
- **all GS1** codes are public, and administratively distributed
- **all ISBNs** codes are public, and administratively distributed
- **all ISSNs** are public, and administratively distributed
- **all DOIs** are public, and administratively managed
- **all UNICODE code points** are public, and administratively managed
- no supermarket in the world would work without GS1 codes
- no library in the world, and no bookstore, online or not, would work without ISBN codes (which are also GS1 codes)
- much of the referencing of research papers would not work without DOIs
- no multilingual system would work without UNICODE code points
- the above represent ARTEFACT publication domains
- "IT" is also an artefact publication domain
- "IT", in all its glory and awareness, pretends it could do without IDs equivalent to the above (although it is *wiser* in some domains, such as UNICODE for instance)
> Just consider that
> - **all GS1** codes are public, and administratively distributed
> - **all ISBNs** codes are public, and administratively distributed
> - **all ISSNs** are public, and administratively distributed
...
Aren't we forgetting something?
Internet domain names, and by extension all URLs. There's ICANN at the top deciding what tlds exist and who administers them; then the .com, .net, etc. registrars; then individual server operators who have foo.com and can create bar.foo.com and other subdomains of foo.com; and then individual site operators who have a website at bar.foo.com and can create http://bar.foo.com/my/website/index.htm and the like.
Ultimately the one central point is ICANN deciding what tlds exist and what registrars can create domains in each. Mostly you can get by ignoring ICANN if you don't want a whole new tld. If you want a .com you talk to a registrar. If you want yourname.blogspot.com you only need to fill in a web form at Google and Google computers somewhere will make sure it's not already taken.
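The delegation described above can be made visible with a short Python sketch that lists, for a URL, each successive name and the level at which it is handed out (TLD, registered domain, subdomain, then paths chosen by the site operator). The function name `delegation_chain` is made up for this illustration; it only does string splitting and knows nothing about real registries.

```python
from urllib.parse import urlsplit


def delegation_chain(url: str) -> list:
    """List the names from the TLD down to the full path, one per delegation step."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError("URL has no host")
    labels = parts.hostname.split(".")
    # Host part: "com", then "foo.com", then "bar.foo.com".
    chain = [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]
    # Path part: each segment is decided locally by the site operator.
    path = ""
    for segment in parts.path.split("/"):
        if segment:
            path += "/" + segment
            chain.append(parts.hostname + path)
    return chain
```

For `http://bar.foo.com/my/website/index.htm` this gives `com`, `foo.com`, `bar.foo.com`, and then the three path levels under `bar.foo.com`; only the first step involves ICANN, the second a registrar, and everything after that is local.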
Green Council" <fp-eotbp...@ibm.com> wrote:
> On 23/06/2011 4:18 PM, yves75 wrote:
> > Just consider that
> > - **all GS1** codes are public, and administratively distributed
> > - **all ISBNs** codes are public, and administratively distributed
> > - **all ISSNs** are public, and administratively distributed
> ...
Yes, fully agree with that, and the point is not to replace URLs (or URIs, or URNs) at all. The point is that:
- as said in the first message, "hierarchical" name spaces are in fact --quite often-- not hierarchical at all, but are more like s-expressions on a flat symbol name space (or part of it)
- the perception that these "hierarchical" name spaces "scale up" is an illusion, due to the item above
- IT, and the web, like any other publication, artefact, or "conceptual" domain, needs a much flatter, numerical-style ID space
- although I don't like the word "semantic" much, being more of a "tel quelist", or Rimbaldist if you prefer (or maybe JY Girardist), realizing something like the "semantic web", as defined by Tim Berners-Lee, means having something like that
- a symbol, a sign, an ideogram, a letter, and its associated UNICODE code point isn't a "resource", and it isn't an object either, but these are key conventions in IT, and not only there; a lot of things are similar
- if you buy the set (box) of the first 3 volumes of Knuth's "The Art of Computer Programming", this set (box) is a product having an ISBN --next to--, or alongside if you prefer, the ISBNs of each of its volumes
- any "primary key" in today's database systems is an identifier for an IT "object" (and also often an ID for something in the "real" world, if you want); "foreign key" is just a funny term for a reference to another object
- you can check the little wars about whether a DOI should be a URI/URN or not
- today, to insert a video in a forum or blog post, you don't use a URL, you use something like [youtube]E&pgtrf[/youtube], where "E&pgtrf" is a flat, administratively managed public ID in the context of youtube (and similar things for other video hosters)
- the vision isn't a top level with as many levels as needed below it, but more a source of IDs, or a source of "convention atoms", or a source of "symbolic gas or pins", to facilitate the writing of the "technical book" and the ability to modify it
- the hierarchical aspect in these IDs is just there for distribution purposes, and can be limited to 7 for instance (and doesn't need to be fixed)
- a set or list of things can have one of these IDs alongside others
- software is as hard as hardware, even if the "edit test edit" loop is quite convenient in allowing complex things to be set up
- the complexity in current systems is due as much to not "taking care" of these IDs as to the inherent complexity of the thing being written, if not much more
- any ID can "traverse" any syntax or model, and this happens all the time in IT
- talking about or looking for an "envelope syntax" doesn't make sense, given the point above and the fact that you can always define another syntax and use previous (already defined) IDs in it
- I am not sure whether this is related to Gödel's theorem or not
- I am fully in line with the IEEE position statement referenced above
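The first item in the list above (a path read as an s-expression over a flat set of symbols) can be sketched in Python as follows. The paths `/mysoft/lib/doc` and `/anothersoft/lib/doc` come from the first message of the thread; the function names `path_to_sexpr` and `shared_symbols` are made up for this illustration, and the reading of a path as nested application is the interpretation proposed in this thread, not a property of any file system.

```python
def path_to_sexpr(path: str) -> str:
    """Rewrite '/a/b/c' as the nested form '(c (b (a)))'."""
    expr = ""
    for symbol in (s for s in path.split("/") if s):
        expr = f"({symbol} {expr})" if expr else f"({symbol})"
    return expr


def shared_symbols(*paths: str) -> set:
    """Return the symbols that occur in every given path, i.e. the shared flat names."""
    symbol_sets = [set(s for s in p.split("/") if s) for p in paths]
    return set.intersection(*symbol_sets) if symbol_sets else set()
```

Here `path_to_sexpr("/mysoft/lib/doc")` yields `(doc (lib (mysoft)))`, and `shared_symbols("/mysoft/lib/doc", "/anothersoft/lib/doc")` yields the set containing `lib` and `doc`: the two paths differ only in their innermost symbol and reuse the same flat names around it.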
yves75 <yt75...@gmail.com> writes:
> Yes fully agree with that, and the point is not to replace URLs (or
> URIs, or URNs) at all, but the point is that :
> - as said in the first message, "hierarchical" name spaces aren't in
> fact --quite often-- hierarchic at all, but are more s-expression on a
> flat symbol name space (or part of it)

This is wrong.

> - today to insert a video in a forum or blog post, you don't use an
> URL, you use something like [youtube]E&pgtrf[/youtube], where
> "E&pgtrf" is a flat administratively managed public ID in the context
> of youtube (and similar things for other video hosters)
Why can't you see that "youtube" is a node in the hierarchical name space, with "E&pgtrf" being a child?

looks quite hierarchical, and functional to me...

If you need to incorporate them in a lisp system, however, it would be smarter to use another data structure than a package (with a single hash-table of symbols) to store them, because you might have quite a lot of leaves in those trees...
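As a rough illustration of the data-structure remark (a tree of names instead of one flat table of symbols), here is a minimal sketch. It is written in Python rather than Lisp and uses nested dictionaries; the helper names `insert_name` and `children` are hypothetical, and "E&pgtrf" is simply the example ID used in this thread.

```python
def insert_name(tree: dict, path: list) -> None:
    """Insert a name given as a list of labels, creating intermediate nodes as needed."""
    node = tree
    for label in path:
        node = node.setdefault(label, {})


def children(tree: dict, path: list) -> list:
    """Return the sorted child labels of the node reached by following `path`."""
    node = tree
    for label in path:
        node = node[label]
    return sorted(node)
```

Each node keeps its own small table, so a host with a very large number of leaf IDs does not fill up a single shared symbol table, and the same leaf label can exist under two different parents without clashing.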
What is wrong, Pascal? That keyword "hierarchical" namespaces or paths are often more s-expressions than real hierarchical spaces? No, I think it is right: don't you think "README", for instance, is used as the same symbol in every directory where it appears? The "README" file being the image of a "README" function for the package or directory in which it appears? And by the way this isn't a criticism of this usage at all.
As to youtube, ISBN or other things, I'm just considering what appears today in current TEXT (or databases).
And never said everything should be in a single system.
yves75 <yt75...@gmail.com> writes:
> What is wrong, Pascal? That keyword "hierarchical" namespaces or
> paths are often more s-expressions than real hierarchical spaces?
> No, I think it is right: don't you think "README", for instance, is
> used as the same symbol in every directory where it appears? The "README"
> file being the image of a "README" function for the package or
> directory in which it appears?
> And by the way this isn't a criticism of this usage at all.
Common Lisp has no hierarchical packages. That's enough of a problem in this day and age...
On Jun 24, 8:45 am, "Pascal J. Bourguignon" <p...@informatimago.com> wrote:
> yves75 <yt75...@gmail.com> writes:
> > What is wrong, Pascal? That keyword "hierarchical" namespaces or
> > paths are often more s-expressions than real hierarchical spaces?
> > No, I think it is right: don't you think "README", for instance, is
> > used as the same symbol in every directory where it appears? The "README"
> > file being the image of a "README" function for the package or
> > directory in which it appears?
> > And by the way this isn't a criticism of this usage at all.
> Common Lisp has no hierarchical packages. That's enough of a problem in > this day and age...
> - **all ISBNs** codes are public, and administratively distributed
And they only work because governments and police enforce copyright laws.
What part of "compliance" do you not understand?
You propose a global, public, unique tag database. Who is going to enforce compliance?
We can't even get a few Lisp implementors to agree to update some aspects of an already existing spec. You propose that every programmer on the planet will agree to exclusively use tags assigned by a single global authority in all software that runs on the internet. This borders on delusional.
On 2011-06-23 20:34:16 +0000, Fuschia, President-Elect of the Bright Purplish-Green Council said:
> Aren't we forgetting something?
> Internet domain names, and by extension all URLs.
Which, again, only work because governments and the police enforce them. Try spoofing the domain name of a major corporation and see how long before the police kick in your door.
Wake up children - global schemes like these only function to the extent that they do because of government and police enforcement, not because everyone on the planet magically decides to be 100% compliant.