WordPress Encoding Woes

WordPress is starting to really, really annoy me with its text encoding quirks. I’m not sure why, but I’ll post something that has foreign characters in it, and more or less on its own, the foreign text – whether it is umlauts or Cyrillic – gets garbled into weird characters. I’ve tried changing the text encoding setting in WordPress to UTF-8, which doesn’t seem to help. If I change the text encoding in Firefox to UTF-8, though, the garbled text is fixed, and I see what I actually wrote.
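For anyone curious what this kind of garbling looks like under the hood, here is a minimal Python sketch. It assumes the page is sent as UTF-8 bytes and the browser decodes them as Latin-1 (ISO-8859-1). That specific wrong encoding is a guess for illustration, not something confirmed for this site. The helper name `misread` and the sample text are made up for the example.

```python
def misread(text, wrong_encoding="latin-1"):
    """Encode text as UTF-8, then decode the bytes with the wrong encoding.

    This imitates a browser that guesses the page encoding incorrectly.
    """
    return text.encode("utf-8").decode(wrong_encoding)


original = "für Привет"        # umlaut plus Cyrillic, as in the post
garbled = misread(original)    # what a mis-detecting browser would show

# Each non-ASCII character became two unrelated characters,
# e.g. "ü" turns into "Ã¼".
print(garbled)

# The bytes themselves are intact, so reading them as UTF-8 again
# recovers the text. This is what switching Firefox to UTF-8 does.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)
```

The point of the sketch is that nothing is lost on the server side: the same bytes display correctly or incorrectly depending only on which encoding the reader's browser assumes.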

Does anyone know how to go about fixing this permanently? Telling people to use a certain browser and set their text encoding to a particular setting isn’t exactly a “fix” in my book.

August 11th, 2010 | All Entries | 4 Comments


  1. Keith August 11, 2010 at 6:32 pm

    It seems your browser is not correctly identifying the text encoding. There are some auto-detect settings, so maybe it is detecting the foreign characters incorrectly. What do you have for your auto-detect settings?

    The meta tag in your template might be incorrect. Here is an example I found (without the open/closing brackets):
    meta http-equiv="Content-Type" content="text/html; charset=UTF-8"

    Your document also doesn’t have a proper header. Perhaps changing your template would help.

  2. Joshua J. Slone August 12, 2010 at 9:07 am

    Yeah, perhaps it is on your end. Going back a few pages to find something with unusual text, the June 29 entry with Russian looks fine to me.

    I see that your page headers don’t have ALL the parts Keith mentions (my page displaying Japanese stuff has it all, though in a slightly different order), but I don’t know if just having

    meta charset="utf-8"

    is actually a problem; I’d think if WordPress is doing things that way it realizes the other stuff is the assumed default or whatever.

  3. doviende August 13, 2010 at 2:03 pm

    Ya, I’m with Keith. You really need the proper header to identify it as UTF-8, and then every browser should be fine. UTF-8 is a very standard encoding, and I think even a lot of users of other encodings will be switching over to it in the near future. It’s supported by pretty much everything. I think your problem is just an incorrect meta tag that prevents the autodetection.

  4. Josh August 29, 2010 at 9:02 am

    Thanks guys, I think that was indeed the problem; after messing around with the encoding some more, things seem to be displaying correctly now. If you see any bizarre characters, let me know, please.
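For reference, the two meta declarations mentioned in the comments (the long http-equiv form and the short charset form) both tell the browser the same thing. Below is a rough Python sketch, using only the standard library, that checks which encoding a page's HTML declares. The names `CharsetFinder` and `declared_charset` are invented for this example, and it only looks at meta tags; an encoding sent in the HTTP Content-Type header is not inspected here.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Record the first charset declared in a meta tag, if any."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        # Short form: meta charset="utf-8"
        if attrs.get("charset"):
            self.charset = attrs["charset"].strip().lower()
            return
        # Long form: meta http-equiv="Content-Type" content="text/html; charset=UTF-8"
        if (attrs.get("http-equiv") or "").lower() == "content-type":
            for part in (attrs.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip().lower()


def declared_charset(html):
    """Return the lowercased charset declared in the HTML, or None."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


short_form = '<head><meta charset="utf-8"></head>'
long_form = '<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head>'
print(declared_charset(short_form), declared_charset(long_form))
```

If a template declares no charset at all, `declared_charset` returns `None`, which is the situation where browsers fall back to auto-detection.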
