I'm not - I believe you misinterpreted my statement. Oohembed is currently
returning titles that contain HTML entities, so it is oohembed that is
assuming that oembed responses are used exclusively in the context of
HTML.
> In any case, the HTML spec itself says that TITLE should be text but
> entities are allowed.
Right, no dispute there. I'm talking about the oembed title specification,
which indicates text, not HTML.
> Further, how do you propose handling cases like this? Should I just strip
> the entities out? That may actually leave the title meaningless. Or should
> I maintain some kind of conversion table and try converting every entity
> encountered into suitable unicode?
You should convert the HTML entities encountered in the HTML <title> into
Unicode for the oembed title. For example, PHP has the html_entity_decode
function (http://us2.php.net/manual/en/function.html-entity-decode.php),
which takes a string that contains HTML entities and converts those
entities into their Unicode equivalents.
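
If the provider side is written in Python, the same conversion could look
roughly like the sketch below. It uses the standard library's html.unescape;
the function name oembed_title_from_html and the sample title are made up
for illustration and are not taken from the oohembed code.

```python
import html


def oembed_title_from_html(raw_title: str) -> str:
    """Return plain text suitable for the oembed "title" property.

    Hypothetical helper: raw_title is assumed to be the text content of an
    HTML <title> element, which may still contain character references.
    """
    # html.unescape turns named and numeric references (&amp;, &#8211;, ...)
    # into the Unicode characters they stand for.
    return html.unescape(raw_title)


if __name__ == "__main__":
    print(oembed_title_from_html("Tom &amp; Jerry &#8211; Episode 1"))
```

No hand-maintained conversion table is needed, and nothing gets stripped
out of the title.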
> I still think you should just take the suggestion in the link I provided
> in the previous comment.
I think I am handling it correctly now. The oembed spec says that the
title property is text, and I display it as such. It does not say the
title property is HTML.
Thanks,
~Craig