Every once in a while, someone includes special characters or HTML
code in a tweet's body, and that trips up DOMDocument (when I call
->saveXML()). Would anyone have advice on how to strip special
characters and HTML code from a string?
What kind of error message do you get? What do you mean by "special
characters"? You could probably use a regular expression such as
$string = preg_replace('#\W#', '', $string);, but to give an optimal
solution, more information would be needed.
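To illustrate what that pattern does, here is a small sketch. The sample string and the function name remove_non_word_chars are made up for this example. \W matches anything that is not a letter, digit, or underscore, so spaces and punctuation are removed too, which may be more aggressive than wanted for tweet text.

```php
<?php
// Hypothetical helper wrapping the suggested pattern.
function remove_non_word_chars(string $string): string
{
    // \W = any character that is not [A-Za-z0-9_]
    return preg_replace('#\W#', '', $string);
}

// Made-up sample input, not taken from the real feed.
echo remove_non_word_chars("Hello, world! <b>tweet</b> #php"), "\n";
// prints "Helloworldbtweetbphp"
```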
Warning: DOMDocument::load() [domdocument.load]: PCDATA invalid Char
value 11
What about using <![CDATA[ ........ ]]>,
or HTML-encoding the data?
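A rough sketch of the encoding idea, assuming the tweet text is UTF-8. The helper name tweet_to_xml_text is hypothetical. It drops HTML tags with strip_tags() and escapes the remaining markup characters with htmlspecialchars(). Note that escaping by itself leaves a control character such as byte 0x0B in the string, so it may not be enough to cure a "PCDATA invalid Char value" warning.

```php
<?php
// Hypothetical helper: remove HTML tags, then escape what is left for XML.
function tweet_to_xml_text(string $tweet): string
{
    $plain = strip_tags($tweet);
    // ENT_XML1 selects XML 1.0 escaping rules; ENT_QUOTES also escapes ' and ".
    return htmlspecialchars($plain, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Made-up sample input.
echo tweet_to_xml_text("I <b>love</b> PHP & XML"), "\n";
// prints "I love PHP &amp; XML"

// The vertical tab (0x0B) is still present after escaping.
var_dump(strpos(tweet_to_xml_text("a\x0Bb"), "\x0B") !== false);
```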
This suggests that the XML file is not valid. Can you load the XML file
in a web browser? Can you provide a sample?
--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My satin-finish humor site: http://www.demogracia.com
--
The XML file structure is valid. One of the elements is labeled
"Text". It works fine, with the exception of one line that has the
following text in it: " (:€:shorty:„:)"
Note that the character that follows the first " is invisible, and
that's what's causing the problem. It's a control character below
hex 20 in ASCII. Specifically, it is ASCII 11 (hex 0b), the vertical
tab, which matches the "Char value 11" in the warning.
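
One possible way to deal with this, sketched under the assumption that the text is UTF-8: remove every character that XML 1.0 does not allow before handing the string to DOMDocument. XML 1.0 permits tab, line feed, carriage return, and code points from hex 20 upward (minus surrogates, FFFE and FFFF), so the vertical tab is rejected. The function name strip_invalid_xml_chars is made up for this example.

```php
<?php
// Hypothetical helper: drop characters that are not legal in XML 1.0.
function strip_invalid_xml_chars(string $text): string
{
    // Allowed: #x9 | #xA | #xD | #x20-#xD7FF | #xE000-#xFFFD | #x10000-#x10FFFF
    $clean = preg_replace(
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        '',
        $text
    );

    // With the /u flag, preg_replace() returns null when the input is not
    // valid UTF-8. Fall back to a byte-level strip of ASCII control
    // characters (keeping tab, LF and CR).
    if ($clean === null) {
        $clean = (string) preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $text);
    }

    return $clean;
}

// Made-up sample with a vertical tab (0x0B) in front of the text.
echo strip_invalid_xml_chars("\x0B(:shorty:)"), "\n";
// prints "(:shorty:)"
```

Running this on each tweet body before building the document should avoid the warning while leaving ordinary accented or non-Latin text untouched.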