Now what's weird here is that the smart quotes have been correctly transcoded to UTF-8; however, the HTML escape sequences are mangled: \xc2\x93 is not a quotation mark in UTF-8 (it is the encoding of U+0093, a C1 control character), whereas \x93 is the correct windows-1252 byte for the opening double quote.
So somehow the escaped sequences have been (correctly) resolved to their windows-1252 values, but then incorrectly translated to UTF-8.
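To make the byte-level claim concrete, here is a small standard-library sketch of the two readings of 0x93. It does not involve Beautiful Soup at all. The `&#147;` form in the last step is my assumption about what the escaped sequences in the markup look like; the original markup is not reproduced here.

```python
import html

# Byte 0x93 interpreted as windows-1252 is the left double quotation mark.
left_quote = b"\x93".decode("windows-1252")
correct_utf8 = left_quote.encode("utf-8")
print(repr(left_quote), correct_utf8)

# Code point U+0093 (a C1 control character) encoded as UTF-8 gives the
# two bytes that show up in the mangled output.
mangled = chr(0x93).encode("utf-8")
print(mangled)

# Assumption: the escapes are numeric character references such as &#147;.
# The standard library resolves that reference to the curly quote.
resolved = html.unescape("&#147;")
print(repr(resolved))
```

So the mangled bytes are what you get when 0x93 is treated as a Unicode code point instead of being looked up in windows-1252 first.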
What's going on? Interestingly, html5lib works correctly, but both html.parser and lxml fail:
In [51]: diagnose(a)
Diagnostic running on Beautiful Soup 4.4.1
Python version 3.4.3 (default, Nov 28 2017, 16:40:41)
[GCC 4.8.4]
Found lxml version 3.8.0.0
Found html5lib version 1.0b3
Trying to parse your markup with html.parser
Here's what html.parser did with the markup:
<html>
<head>
<title>
Message: “Our Line’s Been Changed Again”
</title>
</head>
<p>
Message: “Our Line’s Been Changed Again”
</p>
<p>
But... “What Does It Mean?—Not Very Much.”
</p>
</html>
--------------------------------------------------------------------------------
Trying to parse your markup with html5lib
Here's what html5lib did with the markup:
<html>
<head>
<title>
Message: “Our Line’s Been Changed Again”
</title>
</head>
<body>
<p>
Message: “Our Line’s Been Changed Again”
</p>
<p>
But... “What Does It Mean?—Not Very Much.”
</p>
</body>
</html>
--------------------------------------------------------------------------------
Trying to parse your markup with lxml
Here's what lxml did with the markup:
<html>
<head>
<title>
Message: “Our Line’s Been Changed Again”
</title>
</head>
<body>
<p>
Message: “Our Line’s Been Changed Again”
</p>
<p>
But... “What Does It Mean?—Not Very Much.”
</p>
</body>
</html>
--------------------------------------------------------------------------------
Trying to parse your markup with ['lxml', 'xml']
Here's what ['lxml', 'xml'] did with the markup:
<?xml version="1.0" encoding="utf-8"?>
<html>
<head>
<title>
Message: “Our Line’s Been Changed Again”
</title>
</head>
<p>
Message: “Our Line’s Been Changed Again”
</p>
<p>
But... “What Does It Mean?—Not Very Much.”
</p>
</html>
--------------------------------------------------------------------------------
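Since C1 control characters are invisible in most terminals, the good and bad outputs can look identical on screen. A minimal sketch for checking a parser's output for them, standard library only. The helper name `find_c1_controls` and the `bad` sample string are hypothetical, made up for illustration; they are not taken from the actual output.

```python
def find_c1_controls(text):
    """Return (index, code point) pairs for characters in U+0080..U+009F."""
    return [
        (i, "U+%04X" % ord(ch))
        for i, ch in enumerate(text)
        if 0x80 <= ord(ch) <= 0x9F
    ]

# Properly transcoded text: real curly quotes, no C1 controls.
good = "Message: \u201cOur Line\u2019s Been Changed Again\u201d"

# Hypothetical mangled text: windows-1252 byte values used as code points.
bad = "Message: \x93Our Line\x92s Been Changed Again\x94"

print(find_c1_controls(good))
print(find_c1_controls(bad))
```

Running this over the string form of each parser's result (for example `str(soup)`) should show which parsers leave the control characters behind.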