Discussion on json-schema-possible-formats

Ferry Phang

Feb 24, 2008, 8:03:43 PM
to JSON Schema
What about a format for passwords?

Scott Clarke

Feb 24, 2008, 8:11:46 PM
to json-...@googlegroups.com
Since it's not a good idea to pass unencrypted (plain-text) passwords in requests, for security reasons, I doubt that a password format is necessary. Instead, I would recommend encrypting the password for transit; the password value is then just a plain old string, which is useless to anyone without the decryption key.

Just my 2 cents.

Kris Zyp

Feb 24, 2008, 11:12:59 PM
to json-...@googlegroups.com
Perhaps a "password" format might be useful for indicating that sensitive data is in the instance property and should be encrypted, and of course for using non-visible text in a user interface. Interesting idea...
Kris
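
A minimal sketch of the user-interface half of that idea, assuming a hypothetical "password" value for "format" (it is only being proposed in this thread). The function names `display_value` and `render` are made up for illustration.

```python
# Hypothetical schema using the proposed "password" format. The format name
# is an assumption taken from the discussion, not an established keyword value.
schema = {
    "type": "object",
    "properties": {
        "username": {"type": "string"},
        "password": {"type": "string", "format": "password"},
    },
}


def display_value(prop_schema, value):
    """Return the text a UI might show, hiding password-format fields."""
    if prop_schema.get("format") == "password":
        return "*" * len(str(value))
    return str(value)


def render(schema, instance):
    """Map each property of the instance to its display text."""
    props = schema.get("properties", {})
    return {key: display_value(props.get(key, {}), val)
            for key, val in instance.items()}
```

This only covers the non-visible-text use; whether the same marker should also imply encryption is the open question in the replies.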

Scott Clarke

Feb 27, 2008, 11:57:39 AM
to json-...@googlegroups.com
I've been thinking about this one a bit and thought I would ask: instead of having just one field called password (or whatever) that is encrypted, how about we add an optional property (called "encrypted" in the example below) whose value is the encryption type used? Let me try an example and see if I get this right:

{
  "description": "A person",
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "gender": {
      "type": "string",
      "options": [
        {"value": "male", "label": "Guy"},
        {"value": "female", "label": "Gal"}
      ]
    },
    "address": {"type": "object"},
    "ssNumber": {"type": "string", "encrypted": "blowfish"},
    "PIN": {"type": "integer", "encrypted": "md5", "minLength": 4, "maxLength": 4},
    "personalUrl": {"type": "url", "encrypted": "urlEncrypted"}
  }
}

(I'm not sure whether the "personalUrl" entry would be useful or not.)

Thoughts, ideas?

Thanks

Scott

Matt (MPCM)

Feb 27, 2008, 12:25:16 PM
to JSON Schema
I can see the value in sharing the metadata about a string (the result
of encryption). There are really three layers we are talking about so
far in this thread: the UI reaction of treating an input box as a
password box to protect from local onlookers, the transport layer
using SSL/SSH, and strings that would need metadata to be decoded.

There is little benefit from talking about the transport, as all
sensitive data should usually be on a secure channel anyway. What is
sensitive? These days I would say almost anything.

Perhaps the meta data is the most interesting, as you need to tell the
client how to decrypt the message. I am not sure this fits deeply into
json-schema as a type or format however. Seems like it could become a
microformat unto itself.

Would people do this on a field-by-field basis? In my experience this
ends up being done at higher levels for an entire object, with much
less fine-grained trust levels. Or, if the personal data is needed
long term, it is shipped into a different silo and the public-facing
silo replaces the last n characters with *'s.
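
The masking step mentioned above can be sketched in a few lines. The helper name `mask_tail` and its defaults are assumptions made for illustration, not part of any schema proposal.

```python
def mask_tail(value: str, n: int, mask_char: str = "*") -> str:
    """Replace the last n characters of value with mask_char.

    If n exceeds the length of the string, the whole string is masked;
    if n is zero or negative, the string is returned unchanged.
    """
    if n <= 0:
        return value
    n = min(n, len(value))
    return value[: len(value) - n] + mask_char * n
```

For instance, masking the last four characters of "123-45-6789" leaves "123-45-****", which is all the public-facing silo would ever store.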

Are there use cases you have in mind where a single field should
include metadata about the encryption method?

--
Matt (MPCM)

Scott Clarke

Feb 27, 2008, 4:13:28 PM
to json-...@googlegroups.com
Yeah, you make a really valid point: if you are going to encrypt one string, you may as well encrypt the whole JSON packet during transport and be done with it. Then the schema doesn't need to care about security, and life is much simpler for everyone involved.
I am interested in the concept of metadata though, so for the sake of discussion, how about this possible use case. Say a server is returning user account data to a client, and some, but not all, of the values being returned are stored in the database encrypted (such as a password), with the expectation that the client will be responsible for decrypting any string marked as encrypted prior to validating it. In that case, the client would be able to tell when it needs to decrypt a value and when not to, while still being able to validate the final result against its defined data format. I believe that instead of ending up with a microformat, we would end up just adding more detail to the existing formats. I may be way off base here, and if so please tell me, but I think there is some value in this.
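
A rough sketch of that client-side flow, assuming a hypothetical "encrypted" keyword on each property schema. The `DECODERS` registry and `decrypt_marked_fields` are invented names, and ROT13 stands in for a real cipher purely so the example runs without keys or third-party libraries.

```python
import codecs

# Stand-in "ciphers" for illustration only. ROT13 is NOT encryption; a real
# client would register actual decryption routines (and obtain keys) here.
DECODERS = {
    "rot13": lambda text: codecs.decode(text, "rot13"),
}


def decrypt_marked_fields(schema, instance):
    """Return a copy of instance in which every property whose schema
    carries the hypothetical "encrypted" keyword has been decoded.
    Properties without the keyword pass through untouched."""
    result = dict(instance)
    for name, prop_schema in schema.get("properties", {}).items():
        method = prop_schema.get("encrypted")
        if method is None or name not in instance:
            continue
        if method not in DECODERS:
            raise ValueError(f"no decoder registered for {method!r}")
        result[name] = DECODERS[method](instance[name])
    return result
```

The decoded copy is what would then be handed to an ordinary validator, so the rest of the schema ("type", "format", and so on) describes the plain value rather than the ciphertext.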

Thanks

Matt (MPCM)

Feb 27, 2008, 5:01:07 PM
to JSON Schema
<Sorta thinking out loud />
Passwords are a bad example, as you tend to want those to be strong
one-way encryption.

Let's assume there is a Swiss bank account field or something equally
sensitive. At a minimum you need the cipher and the key (perhaps more
keys) to do the decryption, and we will assume the client knows how to
do it. Let's say the field is a 16-digit number, so validation would
require a union type of a number or a string (?).
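
To make the union-type question concrete, here is a small sketch of a check that accepts a 16-digit value as either an integer or a string. The name `is_account_number` is hypothetical, and the rule itself is only one reading of the paragraph above.

```python
import re

SIXTEEN_DIGITS = re.compile(r"[0-9]{16}")


def is_account_number(value) -> bool:
    """Accept a 16-digit account number given as an int or as a string."""
    if isinstance(value, bool):  # bool is a subclass of int; reject it
        return False
    if isinstance(value, int):
        # An integer cannot carry leading zeros, so only 10**15..10**16-1 fit.
        return 10**15 <= value < 10**16
    if isinstance(value, str):
        return SIXTEEN_DIGITS.fullmatch(value) is not None
    return False
```

The integer branch shows why a plain string is usually the safer representation: leading zeros are lost, and some 16-digit values exceed 2^53, the limit for exact integers in a JavaScript number.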

It is kind of silly to store both together in the real world, unless
the server is acting as an agent of some sort. If the schema dictated
the cipher, it would be a comment on how to interpret the string in
the client.

Knowing the limited range of input weakens the encryption, so a
specific pre-encryption schema requirement is also a bit silly, though
perhaps useful.

I'm not sure of a real-world example where encryption is done by the
client after input validation, and the client then tells the storage
medium the key and cipher, while also telling the limited input range.
I can see it all being done during storage behind the scenes, and the
client supplying details to cause a return of the original data or the
encrypted data, but that's somewhat outside the schema range.

I guess it depends on which you trust less, and if you control the
client? There can be a lot of layers in the whole thing, but often
putting encryption all over the place does not offer much gain,
especially depending on where you decrypt it...

--
Matt (MPCM)

Scott Clarke

Feb 28, 2008, 10:51:24 AM
to json-...@googlegroups.com
I didn't really touch on cipher keys because I don't have a great idea of how they would be handled. I guess in my head I was thinking that the client would auto-magically have the keys before ever making a request. I agree that we would never want to pass the key with the encrypted value, because that's like locking your car and leaving the keys on the roof. Zero security ;)
Sooooo I guess my flimsy argument didn't hold up at all. You're right, Matt, that we should just encrypt the whole packet (if encryption is needed) and be done with it. The schema doesn't need to know about it at all; it would just make things unnecessarily complicated with little to no benefit. Seemed like a good idea at first though ;)

Anyway, thanks for the discussion. It was interesting.

Matt (MPCM)

Feb 28, 2008, 10:55:24 AM
to JSON Schema
> Anyway, thanks for the discussion. It was interesting.

You are quite welcome, perhaps someone wiser than me can offer a
better insight. I'm hardly a crypto wizard in any sense.

--
Matt (MPCM)

Kris Zyp

Feb 29, 2008, 10:50:50 AM
to json-...@googlegroups.com
I have updated my JavaScript implementation of JSON Schema with the latest
modifications to the JSON Schema proposal:
http://json-schema.googlegroups.com/web/jsonschema.js
I also updated the common schemas for the most recent changes as well.
http://www.json-schema.org/
(This includes the schema for a schema: http://json-schema.org/schema)
I have also done some corrections/updates to the SMD definition (mostly
typos and such):
http://groups.google.com/group/json-schema/web/service-mapping-description-proposal
The SMD definition will probably soon be hosted by Dojo as well.
Thanks,
Kris

brent

Aug 27, 2009, 2:23:52 PM
to JSON Schema
The Google Data APIs support two formats for expressing postal
addresses (postalAddress and structuredPostalAddress). Looks like you
have the structured address covered, but a simple string
representation of an address (that knows it's an address) might be
useful as well.

http://code.google.com/apis/gdata/docs/2.0/elements.html

Jim Keener

Mar 15, 2011, 12:19:25 PM
to JSON Schema
I feel that format should be a real format, and should be statically
validatable. It shouldn't (just) be a marker for what's in the field.

A MIME type is not a format. It is a MIME type. We should not be
responsible for decoding, for example, a base64 JPEG and making sure
it's a valid JPEG. What if it's a non-standard type? How do we
validate that? I would suggest adding a mime attribute to the field
instead.

>Any valid MIME media type may be used as a format value, in which case the instance property value must be a string, representing
>the contents of the MIME file.


Should the time field allow the use of milliseconds?
time
This should be a time in the format of hh:mm:ss(.millisec(\d{6}). It
is recommended that you use the "date-time" format instead of "time"
unless you need to transfer only the time part. The type of value must
be a string.
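
One way to read the fractional-seconds question is as an optional suffix on hh:mm:ss. The sketch below assumes up to six fractional digits (taken from the \d{6} above) and a 24-hour clock; both are assumptions, not something the proposal states, and `is_time` is a hypothetical name.

```python
import re

# hh:mm:ss with an optional fractional part of 1 to 6 digits.
TIME_RE = re.compile(
    r"([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\.[0-9]{1,6})?"
)


def is_time(value) -> bool:
    """Check a string against the assumed hh:mm:ss[.fraction] layout."""
    return isinstance(value, str) and TIME_RE.fullmatch(value) is not None
```

Making the fraction optional keeps existing hh:mm:ss values valid, so allowing it would not break instances written against the current wording.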

What is the format of a phone number? Should we force E.123 values?
Otherwise this is just a text blob, not a format.
>phone
>This should be a phone number (format may follow E.123). This value must be a string.

While everyone "knows" what a URI and a URL are, I feel that we should
reference the RFC or other source where each format is defined (e.g.,
RFC 1738 for URLs, RFC 3986 for URIs, and RFC 5322/5321 for emails).
>uri
>This should be a URI (that is it should be a url or urn). The value must be a string.
>url
>This should be a URL. Relative urls should be allowed and when possible, should be resolved relative the url used to retrieve the
>instance data. The value must be a string.
>email
>This should be an email address.
>ip-address
>This should be an ip version 4 address.
>ipv6
>This should be an ip version 6 address.
>urn
>This should be a URN. The value must be a string.

Can ip-address be renamed ipv4, or can we add ipv4 and allow
ip-address to accept both IPv6 and IPv4?
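
The second option (a general ip-address that accepts either version) is easy to sketch with Python's standard `ipaddress` module. The function name `ip_format` and its return values are illustrative only.

```python
import ipaddress


def ip_format(value):
    """Return "ipv4" or "ipv6" for a valid address string, else None."""
    if not isinstance(value, str):
        return None
    try:
        addr = ipaddress.ip_address(value)
    except ValueError:
        return None
    return "ipv4" if addr.version == 4 else "ipv6"
```

Under this reading, a hypothetical "ip-address" format passes whenever the result is not None, while "ipv4" and "ipv6" each require the matching result.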

Also, for urn, is there a way to specify that something must be of
namespace isbn or uuid? Also, are formats checked for correctness,
i.e., will the validator validate that a uuid has the correct
'format' (8-4-4-4-12) and the correct internal format (that it's
v1 or v4, etc., and not just completely random)? If this isn't
implemented in the validator, everyone will need to reparse and
validate this field themselves. If this is done, it would need to be
noted exactly which namespaces are being validated in this manner.
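
As an illustration of the two levels of checking described above, the sketch below first tests the 8-4-4-4-12 layout of a urn:uuid value and then reads the version bits using Python's standard `uuid` module. The name `uuid_urn_version` is hypothetical, and handling only the uuid namespace is an assumption made to keep the example short.

```python
import re
import uuid

UUID_URN_RE = re.compile(
    r"urn:uuid:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)
PREFIX_LEN = len("urn:uuid:")


def uuid_urn_version(value):
    """Return the UUID version for a well-formed urn:uuid string.

    Returns None when the layout is wrong, or when the variant bits are
    not the RFC 4122 variant (in which case the version is undefined).
    """
    if not isinstance(value, str) or not UUID_URN_RE.fullmatch(value):
        return None
    return uuid.UUID(value[PREFIX_LEN:]).version
```

A value can therefore look right (pass the layout check) and still fail the internal check, which is the distinction the question is asking the specification to spell out.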

This is not a format. How do we validate this? Should the validator
be required to fetch the resource and make sure it's the correct
type? What if there is no internet connection? Would this fail
validation?
>image
>This should be an image-ref or image-attachment. The value must be a string.
>image-ref
>This should be a URL that references an image. The value must be a string.
>image-attachment
>This should be a MIME attachment with an image as the attachment. The value must be a string.

These are also not formats. I feel like they should be removed.
These are just text blobs. Validation of these is more complex than
what a data-serialization format is responsible for. I feel like it
would be a bad precedent to put these in a _format_ section. These
are not formats in the canonical sense of the word.
>street-address
>This should be a street address.
>locality
>This should be a city or town.
>region
>This should be a region (a state in the US, province in Canada, etc.)
>postal-code
>This should be a postal code (AKA zip code).

Is this just ISO 3166 country codes? I feel that would make most
sense.
>country
>This should be the name of a country.

Is it important to have country in here? If so, would language (as
specified by ISO 639) and currency (as specified by ISO 4217) be in
there? What about a timezone format? Would it be zoneinfo format, a
UTC offset, or a name like EST? UTC offsets and names like EST have
certain problems when talking about a timezone that will be applied to
all forthcoming times: 900EST is different from 900EDT (900 at UTC-5
versus 900 at UTC-4), whereas 900 America/New_York will always be the
same local 900. (Sorry if that's poorly worded; I hope I got my point
across. Think of it in terms of bounding: this user can do this
between x and y, relative to the current timezone including DST.
900EDT is 800EST, so a bound stored as 900EDT would fall at 800EST
once DST ends.)
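
The offset arithmetic behind that point can be checked with fixed-offset timezones from Python's standard `datetime` module. The `EST` and `EDT` objects below are defined by hand for illustration, and the date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, defined by hand: they never follow daylight-saving changes.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

nine_edt = datetime(2011, 7, 1, 9, 0, tzinfo=EDT)
nine_est = datetime(2011, 7, 1, 9, 0, tzinfo=EST)

# The same wall-clock reading under the two labels is an hour apart.
gap = nine_est - nine_edt

# Expressing the EDT instant in EST shifts the wall-clock time.
nine_edt_as_est = nine_edt.astimezone(EST)
```

Here `gap` is one hour and `nine_edt_as_est` reads 08:00, so a bound stored with a fixed abbreviation or offset drifts by an hour against local time when DST changes, which a zone name such as America/New_York avoids.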

I also think that encodings that can be validated would be nice to
have as well, such as 'url encoded', 'html encoded', base64, or
UUEncoded. These could simply be listed as formats and used on text
blobs.
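
Two of those encodings are simple to check statically, which is what would make them reasonable format candidates. The sketch below uses Python's standard library; the function names are hypothetical, and the percent-encoding check assumes the strictest reading (only RFC 3986 unreserved characters and %XX escapes are allowed).

```python
import base64
import binascii
import re

# Unreserved characters from RFC 3986, or a percent sign plus two hex digits.
URL_ENCODED_RE = re.compile(r"(?:[A-Za-z0-9\-._~]|%[0-9A-Fa-f]{2})*")


def is_base64(value) -> bool:
    """True if value is a string that decodes as standard, padded base64."""
    if not isinstance(value, str):
        return False
    try:
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def is_url_encoded(value) -> bool:
    """True if value contains only unreserved characters and %XX escapes."""
    return isinstance(value, str) and URL_ENCODED_RE.fullmatch(value) is not None
```

Both checks look only at the text itself and never at what the decoded bytes mean, which keeps them within the statically validatable kind of format argued for earlier in this message.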

Jim

Robert Turner

Apr 28, 2011, 7:58:44 PM
to JSON Schema
What is the best way to model a currency property?

Could it be expressed as:

"type" : "number",
"format" : "?currency?"

A regex format could be used, but the representation will change
between locations. Would also be nice to express the currency type
(eg. USD) so that tools can convert between local/global values.

Gary Court

May 10, 2011, 6:41:35 PM
to json-...@googlegroups.com
Currency is a political concept, and therefore varies between
different countries (like timezones). Therefore, you can't have a
format "currency", as it is not universally standardized. The best
way I have found to model currency is to use a structure like this:

{
  "title" : "Price",
  "type" : "object",
  "properties" : {
    "value" : {
      "title" : "Currency Value",
      "type" : "number"
    },
    "currency" : {
      "title" : "ISO 4217 Currency Code",
      "type" : "string",
      "format" : "iso4217",
      "pattern" : "^[A-Z]{3}$"
    }
  }
}
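
For illustration, a hand-written check equivalent to that structure might look like the sketch below. The name `is_price` is hypothetical, and treating both properties as required is an assumption (the schema above does not say so).

```python
import re

CURRENCY_CODE_RE = re.compile(r"[A-Z]{3}")


def is_price(value) -> bool:
    """Check a dict for a numeric "value" and a three-letter "currency"."""
    if not isinstance(value, dict):
        return False
    amount = value.get("value")
    code = value.get("currency")
    # bool is a subclass of int in Python, so exclude it explicitly.
    amount_ok = isinstance(amount, (int, float)) and not isinstance(amount, bool)
    code_ok = isinstance(code, str) and CURRENCY_CODE_RE.fullmatch(code) is not None
    return amount_ok and code_ok
```

The pattern only checks the shape of the code (three uppercase letters); confirming that a code is actually assigned in ISO 4217 would need a lookup table, which is presumably what the "iso4217" format name is meant to signal.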

-Gary
