.NET - How can I prevent strange characters when pulling the Atom feed from a WordPress 3.0 blog?
I have an Atom feed on a WordPress blog here: http://blogs.legalview.info/auto-accidents/feed/atom
When I download the text of the file and display it on my site, strange characters such as an accented 'A' appear, for example here:
recent studies showing car accident Â-related fatalities have declined 10% since 2008. reason
I am using the following code in a C# web application to download the feed:
    WebClient client = new WebClient();
    client.Headers.Add("Accept-Language", "en-US,en");
    client.Headers.Add("Accept-Charset", "utf-8");
    string xml_text = client.DownloadString(_atom_url);
With this code, xml_text.Contains("Â") returns true, whereas if I download the feed in a browser, no such character exists. I'm pretty sure this is a character set issue, but I can't figure out why. Examining client.ResponseHeaders, I can see that I am in fact downloading the text as UTF-8, and the response on my .NET site is UTF-8 as well, so I can't figure out why this weirdness appears.
I get ...fatalitiesÂ when I force my browser to interpret the feed as ISO-8859-1 instead of UTF-8 (which is the correct character set for the feed).
I'm pretty sure that either WebClient somehow defaults to ISO-8859-1, or the output encoding on your site is ISO-8859-1, which garbles the UTF-8 input.
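To see how that kind of garbling produces an "Â", here is a small sketch that can be run offline. It assumes the character behind the stray "Â" is a no-break space (U+00A0), which is a common cause of this symptom but is not confirmed for this feed. The class and method names are made up for illustration.

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // Hypothetical helper: encode text as UTF-8, then decode the same
    // bytes with the wrong charset (ISO-8859-1).
    public static string MisreadUtf8AsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);
    }

    static void Main()
    {
        // Assumption: a no-break space (U+00A0) sits before the hyphen.
        string original = "accident\u00A0-related";

        // In UTF-8, U+00A0 is the two bytes 0xC2 0xA0. Read as ISO-8859-1,
        // 0xC2 becomes "Â" (U+00C2) and 0xA0 becomes the no-break space again.
        string garbled = MisreadUtf8AsLatin1(original);

        Console.WriteLine(garbled.Contains("\u00C2"));        // True
        Console.WriteLine(garbled.Length - original.Length);  // 1
    }
}
```

If the same bytes are decoded as UTF-8 instead, the original string comes back unchanged, which matches what the browser shows.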
Maybe start by checking your site's output first. If that is UTF-8, take a look at WebClient.
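If WebClient does turn out to be the culprit, one thing to try is setting its Encoding property before downloading. This is only a sketch under that assumption: WebClient.Encoding is the encoding used to upload and download strings, and on .NET Framework it defaults to the system default encoding rather than UTF-8. The helper name DownloadFeedAsUtf8 is hypothetical and not from the original code.

```csharp
using System;
using System.Net;
using System.Text;

static class FeedDownloader
{
    // Hypothetical helper: download a feed and decode the response as UTF-8.
    public static string DownloadFeedAsUtf8(string atomUrl)
    {
        using (var client = new WebClient())
        {
            // Tell WebClient which encoding to use when turning bytes into a string.
            client.Encoding = Encoding.UTF8;
            return client.DownloadString(atomUrl);
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("usage: FeedDownloader <atom-url>");
            return;
        }

        string xmlText = DownloadFeedAsUtf8(args[0]);

        // Same check as in the question: is a stray "Â" (U+00C2) still present?
        Console.WriteLine(xmlText.Contains("\u00C2"));
    }
}
```

If the check still reports a stray "Â" with this in place, the problem is more likely in how the site writes its output than in the download.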