HTML or Unicode in titles?

Jörg Prante

Aug 1, 2024, 4:44:15 AM

to OpenAlex Community

Hello,

I would like to know if there are any plans to clean up HTML tags in titles such as in W2315752015:

The First Example of a Crystalline Subvalent Organolanthanum Complex: [K([18]crown-6)- (η2-C6H6)2][(LaCptt2)2(μ-η6:η6-C6H6)]•2C6H6 (Cptt = η5-C5H3But2-1,3)

Probably, HTML tags like <sub> or <sup> are preferred, or are simply carried over from the original sources and kept in OpenAlex for reference or title matching.

Since these HTML tags can be replaced by Unicode characters (see https://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts), it would seem feasible and helpful to the community to use Unicode instead.
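
To illustrate the kind of replacement I have in mind, here is a minimal Python sketch. It assumes the tags in question are <sub> and <sup>, and it only converts digits and a few signs, because Unicode has no subscript or superscript form for most letters. The function name and the mapping tables are my own and are not part of any OpenAlex tooling.

```python
import re

# Characters that have both a subscript and a superscript form in Unicode.
# The tables are deliberately limited to digits and a few signs.
PLAIN = "0123456789+-=()"
SUB_MAP = dict(zip(PLAIN, "₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎"))
SUP_MAP = dict(zip(PLAIN, "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾"))

# Matches a non-nested <sub>...</sub> or <sup>...</sup> pair.
TAG = re.compile(r"<(sub|sup)>(.*?)</\1>", re.IGNORECASE | re.DOTALL)


def tags_to_unicode(title):
    """Replace <sub>/<sup> pairs by Unicode where every character is mappable."""

    def repl(match):
        table = SUB_MAP if match.group(1).lower() == "sub" else SUP_MAP
        inner = match.group(2)
        if all(ch in table for ch in inner):
            return "".join(table[ch] for ch in inner)
        # Keep the original markup when a character has no mapping,
        # e.g. the letter in Bu<sup>t</sup>.
        return match.group(0)

    return TAG.sub(repl, title)


print(tags_to_unicode("C<sub>6</sub>H<sub>6</sub>"))
```

Tags whose content cannot be mapped completely are left as they are, so the conversion never loses information; nested tags are not handled by this sketch.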

There are rare examples of titles that seem to exceed the maximum length and, as a result, have unbalanced HTML tags. Unfortunately, I do not have the W identifiers of such titles at hand.
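
Such titles could probably be found with a simple balance check. The following Python sketch is only a heuristic under my own assumptions: it treats anything that looks like <name> or </name> as a tag, and the function name is hypothetical.

```python
import re

# Anything that looks like an opening or closing tag, with optional attributes.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^<>]*>")


def has_balanced_tags(title):
    """Return True if every opening tag is closed in the right order."""
    stack = []
    for match in TAG.finditer(title):
        if match.group(0).endswith("/>"):
            continue  # self-closing tag such as <br/>, nothing to balance
        closing, name = match.group(1), match.group(2).lower()
        if closing:
            # A closing tag must match the most recently opened tag.
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(name)
    # Leftover opening tags mean the title was cut off before they closed.
    return not stack
```

A title that was cut off in the middle of a tag name (for example ending in "<su") is not detected by this check, and a literal "<" followed by a letter could be mistaken for a tag.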

Is it advisable to start a private effort to clean up title strings?

Best regards,

Jörg
