Custom CssSchema & Properties


wynn...@gmail.com

Mar 22, 2018, 1:02:08 PM
to OWASP Java HTML Sanitizer Support
Hi there, 
I'm looking to create my own custom schema to allow both css properties that aren't included in CssSchema.java (e.g. align-content), and also values that aren't listed either (e.g. background 'url' value).
I understand anything left out of the class was left out for good reason, but I have a requirement to implement a more lenient CSS whitelist to preserve existing data.
From looking through the code it looks like this could be quite an involved process. 
To get the ball rolling though, what is the 'bits' variable in a Property object? What function do they serve? How will I figure out the correct 'bits' value for css properties that aren't included in the file?
I'm assuming the bits var is not an indication of how many bits the value can use in memory, as some of the properties are declared with 0 bits.
Can you recommend some reading / articles discussing this approach to bits / bit fields?
Thanks!

Mike Samuel

Mar 22, 2018, 1:21:15 PM
to OWASP Java HTML Sanitizer Support
On Thu, Mar 22, 2018 at 1:02 PM, <wynn...@gmail.com> wrote:
Hi there, 
I'm looking to create my own custom schema to allow both css properties that aren't included in CssSchema.java (e.g. align-content), and also values that aren't listed either (e.g. background 'url' value).
I understand anything left out of the class was left out for good reason, but I have a requirement to implement a more lenient CSS whitelist to preserve existing data.
From looking through the code it looks like this could be quite an involved process. 
To get the ball rolling though, what is the 'bits' variable in a Property object? What function do they serve? How will I figure out the correct 'bits' value for css properties that aren't included in the file?

Sorry for the lack of internal docs.
Bits is a bitwise-OR of the BIT_* fields.
It serves two goals:
*  It determines what kinds of tokens are allowed in the value for a property.  So a token is allowed in a value if it is in literals, is a function described by fnKeys, or if the type of literal is in that bitfield.
*  It is used to transform some tokens.  If a string is there, and BIT_URL is set and BIT_STRING is not, then it's safe to rewrite the string token to a url("...") token after applying the policy's URL transform.  If BIT_STRING is set and BIT_URL is not, then it's safe to allow the string token without URL rewriting.
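
The two roles above can be sketched roughly as follows. This is an illustration, not the sanitizer's code: the class, the enum, and both method names are hypothetical, fnKeys handling is left out, and BIT_URL = 16 is an assumption to verify against CssSchema.java (the other values are the ones quoted elsewhere in this thread). The thread does not say what happens when both or neither of BIT_STRING and BIT_URL are set, so the sketch conservatively rejects the string in those cases.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch only; not the sanitizer's actual implementation. */
final class BitsSketch {
  // Values quoted in this thread, except BIT_URL, which is an assumption.
  static final int BIT_QUANTITY = 1;
  static final int BIT_NEGATIVE = 4;
  static final int BIT_STRING = 8;
  static final int BIT_URL = 16; // assumption: check CssSchema.java
  static final int BIT_UNRESERVED_WORD = 64;

  enum StringAction { KEEP_AS_STRING, REWRITE_AS_URL, REJECT }

  /** Role 1: a token passes if it is a listed literal or its kind's bit is set. */
  static boolean allows(int bits, Set<String> literals, String token, int tokenKindBit) {
    return literals.contains(token) || (bits & tokenKindBit) != 0;
  }

  /** Role 2: decide how a quoted string token should be treated. */
  static StringAction stringAction(int bits) {
    boolean url = (bits & BIT_URL) != 0;
    boolean string = (bits & BIT_STRING) != 0;
    if (url && !string) { return StringAction.REWRITE_AS_URL; }
    if (string && !url) { return StringAction.KEEP_AS_STRING; }
    return StringAction.REJECT; // ambiguous or not allowed (assumed behavior)
  }

  public static void main(String[] args) {
    Set<String> overflowLiterals = new HashSet<>(Arrays.asList("hidden", "visible"));
    // bits == 0: only the listed literals get through.
    System.out.println(allows(0, overflowLiterals, "hidden", BIT_UNRESERVED_WORD));
    System.out.println(allows(0, overflowLiterals, "12px", BIT_QUANTITY));
    // A property that accepts strings but not URLs keeps strings as strings.
    System.out.println(stringAction(BIT_UNRESERVED_WORD | BIT_STRING));
    // A URL-only property would have its strings rewritten to url("...").
    System.out.println(stringAction(BIT_URL));
  }
}
```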

The former happens throughout

The latter kind of transform happens at:
https://github.com/OWASP/java-html-sanitizer/blob/c638258c6fe03cf4cbb61c5a1a6e6d3f9d4e4aa7/src/main/java/org/owasp/html/StylingPolicy.java#L195-L200
https://github.com/OWASP/java-html-sanitizer/blob/c638258c6fe03cf4cbb61c5a1a6e6d3f9d4e4aa7/src/main/java/org/owasp/html/StylingPolicy.java#L147-L164

Figuring out the bits property involves going through CSS specifications (fun) and figuring out what kinds of tokens you want to allow, what kind can be part of the value grammar, and then intersecting those.
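
As a rough illustration of the "intersecting" step: if one bit mask records the token kinds the value grammar admits and another records the kinds you are willing to allow, the bits value is their bitwise AND. The property and both masks below are made up for the example and do not describe any real CSS property.

```java
/** Illustrative sketch only; the masks are hypothetical, not taken from a spec. */
final class BitsIntersection {
  // Flag values as quoted in this thread.
  static final int BIT_QUANTITY = 1;
  static final int BIT_NEGATIVE = 4;
  static final int BIT_STRING = 8;
  static final int BIT_UNRESERVED_WORD = 64;

  /** Keep only the token kinds that are both grammatical and wanted. */
  static int bitsFor(int grammarAdmits, int policyWants) {
    return grammarAdmits & policyWants;
  }

  public static void main(String[] args) {
    // Hypothetical property: the grammar admits quantities, negatives, and strings.
    int grammarAdmits = BIT_QUANTITY | BIT_NEGATIVE | BIT_STRING;
    // Hypothetical policy: willing to allow quantities, negatives, and bare words.
    int policyWants = BIT_QUANTITY | BIT_NEGATIVE | BIT_UNRESERVED_WORD;
    System.out.println(bitsFor(grammarAdmits, policyWants)); // prints 5
  }
}
```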



 

--
You received this message because you are subscribed to the Google Groups "OWASP Java HTML Sanitizer Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to owasp-java-html-sanitizer-support+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

wynn...@gmail.com

Mar 26, 2018, 11:52:17 AM
to OWASP Java HTML Sanitizer Support
Thanks for the reply! I just want to make sure I'm correct in my thinking before moving forward.

The property border-spacing has a bits value of '5', 
    this means that (ignoring the literals & functions) border-spacing can accept Quantity values which may be negative 
           ([BIT_NEGATIVE=4] + [BIT_QUANTITY=1])  = 5

Similarly, font-family has a bits value of 72
  meaning it can accept Unreserved words and strings ([BIT_UNRESERVED_WORD=64] + [BIT_STRING=8]) = 72

Thus overflow which has a bits value of 0 can accept nothing (again, ignoring literals & functions) 

But what about those like color(bits=258) or cursor(bits=272)? Would we not require a BIT_* field representing 256? What values can these properties accept?

Also I can't see where the BIT_UNICODE_RANGE field is ever used in StylingPolicy or elsewhere, is it used?

Thanks again!

wynn...@gmail.com

Mar 27, 2018, 8:58:30 AM
to OWASP Java HTML Sanitizer Support
Oh, another, probably more important, question:
  Is it even possible to create a CssSchema with custom properties that aren't defined in your class? I see the constructor is private and the only other methods that return CssSchema are withProperties (which will throw an IllegalArgumentException if you pass it a property not defined in the class) and union (which requires the CssSchema objects as params).
I could be entirely wrong, but it seems to me the only way to create custom CssSchemas with custom Properties (i.e. those not included in CssSchema.DEFINITIONS) would be to hack the actual jar itself, no?

Cheers!

wynn...@gmail.com

Mar 29, 2018, 11:05:35 AM
to OWASP Java HTML Sanitizer Support
Hi again, did you get a chance to review this yet? From your initial reply I got the impression that creating a schema with properties not included in the definitions is possible, but the code seems to contradict that.
Thanks!

Mike Samuel

Mar 29, 2018, 2:39:37 PM
to OWASP Java HTML Sanitizer Support
Sorry.  Missed this.  Responses inline.

On Tue, Mar 27, 2018 at 8:58 AM, <wynn...@gmail.com> wrote:
Oh another probably more important step, 
  Is it even possible to create a CssSchema with custom properties that aren't defined in your class? I see the constructor is private and the only other methods that return CssSchema are withProperties (which will throw an IllegalArgumentException if you pass it a property not defined in the class) and union (which requires the CssSchema objects as params).
I could be entirely wrong, but it seems to me the only way to create custom CssSchemas with custom Properties (i.e. those not included in CssSchema.DEFINITIONS) would be to hack the actual jar itself no?

There's always Method.setAccessible but let's not encourage that :)
I'll put out a version that provides access.


 
Cheers!


On Monday, March 26, 2018 at 4:52:17 PM UTC+1, John Wynne wrote:
Thanks for the reply!, so I just want to make sure I'm correct in my thinking before moving forward

The property border-spacing has a bits value of '5', 
    this means that (ignoring the literals & functions) border-spacing can accept Quantity values which may be negative 
           ([BIT_NEGATIVE=4] + [BIT_QUANTITY=1])  = 5
 
Similarly, font-family has a bits value of 72
  meaning it can accept Unreserved words and strings ([BIT_UNRESERVED_WORD=64] + [BIT_STRING=8]) = 72

Yep.  (BIT_NEGATIVE | BIT_QUANTITY) == 5 && (BIT_UNRESERVED_WORD | BIT_STRING) == 72

 
Thus overflow which has a bits value of 0 can accept nothing (again, ignoring literals & functions) 

Exactly.
 
But what about those like color(bits=258) or cursor(bits=272)? Would we not require a BIT_* field representing 256? What values can these properties accept?

Hmm.  IIRC, I generated this list from a curated list from the Caja project.  That was long enough ago that I don't remember the details, but it's possible they used a bit that I didn't bother to reproduce.
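
To see which part of a bits value is unaccounted for, one can mask out the known flags and look at what is left. The sketch below is a hypothetical helper, not part of the library. BIT_QUANTITY, BIT_NEGATIVE, BIT_STRING, and BIT_UNRESERVED_WORD use the values confirmed above; BIT_HASH_VALUE = 2 and BIT_URL = 16 are assumptions that should be checked against CssSchema.java.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; a hypothetical helper for inspecting bits values. */
final class BitsDecoder {
  static final Map<String, Integer> KNOWN = new LinkedHashMap<>();
  static {
    KNOWN.put("BIT_QUANTITY", 1);
    KNOWN.put("BIT_HASH_VALUE", 2);       // assumption: verify in CssSchema.java
    KNOWN.put("BIT_NEGATIVE", 4);
    KNOWN.put("BIT_STRING", 8);
    KNOWN.put("BIT_URL", 16);             // assumption: verify in CssSchema.java
    KNOWN.put("BIT_UNRESERVED_WORD", 64);
  }

  /** Names of the known flags that are set in bits. */
  static List<String> names(int bits) {
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, Integer> e : KNOWN.entrySet()) {
      if ((bits & e.getValue()) != 0) {
        out.add(e.getKey());
      }
    }
    return out;
  }

  /** Whatever remains after clearing every known flag. */
  static int unknownBits(int bits) {
    int known = 0;
    for (int v : KNOWN.values()) {
      known |= v;
    }
    return bits & ~known;
  }

  public static void main(String[] args) {
    for (int bits : new int[] {5, 72, 0, 258, 272}) {
      System.out.println(bits + " -> " + names(bits) + ", unknown=" + unknownBits(bits));
    }
  }
}
```

Under those assumptions, 258 and 272 each leave 256 behind, which is consistent with a flag that was present in the original list but has no BIT_* constant here.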
 
Also I can't see where the BIT_UNICODE_RANGE field is ever used in StylingPolicy or elsewhere, is it used?

Quite possibly not.  I was working from a version of the CSS grammar that specified a unicode range as a kind of token, but I don't know what it's used for.

Jim Manico

Apr 3, 2018, 9:58:45 PM
to owasp-java-html-...@googlegroups.com, Mike Samuel

I want to put something out there that was discussed at a conference I'm at.

Accepting CSS from a user is heavily discouraged, and there is no good way to lock it down. CSS can be used to clobber existing CSS, modify major portions of the page, and much worse. This set of features is something Mike supports because it's asked about so much, but it's a sign of very bad design that is going to be dangerous no matter how much we validate untrusted CSS.

We advise staging up CSS that is static and allowing untrusted HTML to reference only those classes instead of providing new CSS styles.

With respect,

Jim


Mike Samuel

Apr 4, 2018, 10:04:18 AM
to Jim Manico, OWASP Java HTML Sanitizer Support
On Tue, Apr 3, 2018 at 9:58 PM, Jim Manico <jim.m...@owasp.org> wrote:

I want to put something out there that was discussed at a conference I'm at.

If you're accepting CSS from a user, this is something heavily discouraged and there is no good way to lock this down. 

CSS can be used to clobber existing CSS, modify major portions of the page, and much worse.

We disallow position:sticky and position:fixed so that client code can use a position:relative;overflow:hidden to contain self-styling sanitized snippets.
Embedders of sanitized content do have to consistently do that and make sure that contributed content is clearly demarcated.

Most of the "much worse" attacks require a payload to specify selectors, which the sanitizer should not allow. Unproxied images do allow tracking and, by positioning below the fold, can track whether a user scrolls down. Embedders do need to use URL rewriting if they allow background styling, and should use sensible Referrer-Policy and related headers.

That said, even if care is taken, CSS has a large attack surface, so not using it puts you in a safer place.

This set of features is something Mike supports because it's asked about so much, but it's a sign of very bad design that is going to be dangerous no matter how much we validate un-trusted CSS.

We advise staging up CSS that is static and allow untrusted HTML to only reference those classes instead of providing new CSS style.

Agreed.  One way to give flexibility to embedded content providers is to define a curated set of .contributed-foo{...} styles and whitelist class attributes matching /^((?:^|\s+)contributed-\w*)\s*$/.
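
A minimal sketch of that idea in plain Java, without any sanitizer API: the class and method names are hypothetical, and instead of the single pattern above it checks the attribute token by token, so that a value carrying several class names passes only if every one of them has the contributed- prefix.

```java
import java.util.regex.Pattern;

/** Illustrative sketch only; a hypothetical check for class attribute values. */
final class ContributedClassFilter {
  // Same per-class shape as in the pattern above: the prefix plus word characters.
  private static final Pattern CLASS_TOKEN = Pattern.compile("contributed-\\w*");

  /** True only if the value is non-empty and every class name is a contributed-* one. */
  static boolean isAllowed(String classAttributeValue) {
    String trimmed = classAttributeValue.trim();
    if (trimmed.isEmpty()) {
      return false;
    }
    for (String token : trimmed.split("\\s+")) {
      if (!CLASS_TOKEN.matcher(token).matches()) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isAllowed("contributed-note"));
    System.out.println(isAllowed("contributed-note contributed-wide"));
    System.out.println(isAllowed("contributed-note admin-banner"));
  }
}
```

Rejecting the whole attribute when any token fails is a design choice made for the sketch; dropping only the offending class names would be an equally reasonable alternative.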

Jim Manico

Apr 4, 2018, 12:04:52 PM
to owasp-java-html-...@googlegroups.com, Mike Samuel, Jim Manico

This is very helpful information Mike, I'll add it to the wiki.

Aloha, Jim


-- 
Jim Manico
Manicode Security
https://www.manicode.com

Jim Manico

Apr 9, 2018, 6:11:54 AM
to owasp-java-html-...@googlegroups.com, Mike Samuel, Jim Manico

I added a brief note about this on the wiki.

https://www.owasp.org/index.php/OWASP_Java_HTML_Sanitizer_Project#tab=CSS_Sanitization

Aloha, Jim


wynn...@gmail.com

Nov 8, 2018, 8:07:59 AM
to OWASP Java HTML Sanitizer Support
Hey guys, regarding your comment on March 29th, 
"There's always Method.setAccessible but lets not encourage that :)
I'll put out a version that provides access."
Was there ever a version released with this access granted?
Thanks in advance! 

Mike Samuel

Nov 13, 2018, 11:10:58 AM
to owasp-java-html-...@googlegroups.com
Sorry, John, I dropped the ball on this one.

If https://github.com/OWASP/java-html-sanitizer/commit/1e6b03527a9b02c158ff92fc50fc20bfecc2da2d looks good to you, I'll push a new version to maven central.
