
json_encode() returns unicode

12 replies to this topic

#1
__ak


    Newbie

  • Members
  • 24 posts
Hi all,

I'm currently working on a project where the user types data into an HTML form and the PHP backend combines it into a JSON string.
The user has two input fields: "Title" and "value".
But "value" may itself contain a JSON object, which is then stored under "Title" as its key in the outer object.

I do this on the backend:

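    $json = array();
    foreach ($_POST['jsonArguments'] as $key => $arg) {
        $val = $_POST['jsonArguments'][$key];
        // If the value is itself valid JSON, keep the decoded structure
        $json_parse = json_decode($val);
        if ($json_parse) $val = $json_parse;
        $json[$arg] = $val;
    }
    // Encode the collected settings as one JSON string
    $parser_setting_new = json_encode($json);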


And when you save and refresh the page, the saved JSON settings should be shown as you wrote them.
The problem comes when I try to use Scandinavian (Danish, actually) characters like æ, ø and å.
The word "måge" (seagull) becomes "måu005ege" because "å" is replaced with "u005e".

My question is: how do I handle this correctly, so that I see the actual characters in my frontend instead of Unicode escapes?

#2
Orjan


    Writes binary right handed and hex left handed

  • Moderators
  • 3,299 posts
  • Location:Karlstad, Sweden
  • Programming Language:C, Java, C++, C#, PHP, JavaScript, Pascal
  • Learning:Java, C#
Hello Neighbor!

If it replaced å with u005e, måge would become mu005ege, not måu005ege as you wrote. Is it a typing mistake, or is there some other error?
__________________________________________
I study Information Systems at Karlstad University when I'm not on CodeCall

#3
Alexander


    It's Science!

  • Moderators
  • 4,124 posts
  • Location:Vancouver, Eh! Cleverness: 200
json_encode() and json_decode() work on UTF-8, so you must make sure the strings you pass to them are UTF-8 encoded.

Orjan is right to be suspicious: måu005ege is not an artefact of a UTF encoding issue (it would not display at all). Was the data you passed to json_encode() malformed in transit?
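
One way to check the encoding point above is to test whether a string is valid UTF-8 before encoding it. The sketch below is only an illustration: the two sample strings are hypothetical stand-ins for form input, written as byte escapes so the result does not depend on how the script file itself is saved. On current PHP versions json_encode() returns false for invalid UTF-8 (older versions behaved differently).

```php
<?php
// Hypothetical inputs: the word "måge" in two different encodings.
$utf8   = "m\xC3\xA5ge"; // UTF-8: "å" is the two bytes C3 A5
$latin1 = "m\xE5ge";     // ISO-8859-1: "å" is the single byte E5

// With the /u modifier, PCRE refuses subjects that are not valid UTF-8.
$utf8IsValid   = preg_match('//u', $utf8) === 1;
$latin1IsValid = preg_match('//u', $latin1) === 1;

$encodedUtf8   = json_encode($utf8);    // succeeds
$encodedLatin1 = @json_encode($latin1); // fails: input is not UTF-8
$latin1Error   = json_last_error();     // JSON_ERROR_UTF8

var_dump($utf8IsValid, $latin1IsValid, $encodedUtf8, $encodedLatin1);
```

If the second case is what happens in the real application, the fix belongs wherever the data enters the script (page charset, form encoding, database connection), not in the JSON step.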

Edited by Alexander, 16 March 2011 - 11:47 PM.

Be sure to read the updated FAQ! || Health is achieved through the same 10,000 steps.
If a suggested code/method fails, informing us is less important than telling us why or what errors occurred.

#4
__ak

Sorry mate, I meant "mu005ege".

But okay, so I just need to utf8_encode all data before saving it to the database?
Do I need to make sure the encoding in the database is UTF-8 as well?

I do SET NAMES 'utf8' when creating the connection.

#5
__ak

I believe it's json_encode that eats my special characters.
This script does the same as my original:


    <?php
    $s = 'måge';
    echo json_encode($s);
    ?>
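
For reference, a minimal sketch of what that script produces and how to get the literal characters instead. For a UTF-8 "å" (code point U+00E5) the default escape is \u00e5, and it decodes back to the original string. The JSON_UNESCAPED_UNICODE flag is available from PHP 5.4, so whether it can be used here depends on the PHP version in use. The string is written as byte escapes so the example does not depend on the file's own encoding.

```php
<?php
$s = "m\xC3\xA5ge"; // "måge" as UTF-8 bytes

// Default: non-ASCII characters are escaped, the output is plain ASCII.
$escaped = json_encode($s);                         // "m\u00e5ge"

// PHP 5.4+: keep the characters as they are.
$literal = json_encode($s, JSON_UNESCAPED_UNICODE); // "måge"

// Both forms are valid JSON and decode to the same original string.
$roundTripEscaped = json_decode($escaped);
$roundTripLiteral = json_decode($literal);

echo $escaped, "\n", $literal, "\n";
```

Either form is valid JSON, so a JSON parser on the frontend shows "måge" in both cases as long as the backslash is still there when the string reaches it.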



#6
__ak

Uhh, but I see here that I get m\u005ege instead of mu005ege (I must have missed a stripslashes somewhere). That means I could run a preg_replace on the Unicode escapes, but I don't like making hacks like that all the time.
I want it to work as intended.

But I get Unicode escapes every time I use json_encode.
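
A small sketch of why a stray stripslashes() would produce exactly this symptom, and why no preg_replace should be needed when the backslash survives. The variable names are only for illustration, and the escape shown is the one PHP emits for a UTF-8 "å" (\u00e5).

```php
<?php
$stored  = json_encode("m\xC3\xA5ge"); // "m\u00e5ge" (with the backslash)
$damaged = stripslashes($stored);      // "mu00e5ge"  (backslash removed)

// With the backslash intact, json_decode() restores the character.
$ok = json_decode($stored);   // måge

// Without it, the escape is just ordinary text and stays that way.
$bad = json_decode($damaged); // mu00e5ge

echo $ok, "\n", $bad, "\n";
```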

#7
__ak

@Alexander, off-topic: sorry about the double posts and double threads I've made in this category. I keep forgetting that someone needs to approve them.

#8
Alexander

It is fine about the postings. JSON produces JavaScript-compatible arrays or objects for serialization purposes. By default json_encode() escapes non-ASCII characters to the \uXXXX format so that the output is plain ASCII, and a JSON parser (JavaScript's included) converts those escapes back into the original characters. Do you require JavaScript compatibility, or can you use the serialization functions (serialize() and unserialize()) that PHP provides for database storage?
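
As a rough illustration of that choice, the sketch below stores the same string both ways. The variable names are only for illustration. serialize() keeps the raw UTF-8 bytes, but its output is a PHP-specific format that JavaScript cannot parse, whereas the JSON form can be handed straight to the frontend.

```php
<?php
$s = "m\xC3\xA5ge"; // "måge" as UTF-8 bytes

// PHP-only format: s:<byte length>:"<raw bytes>";
$serialized = serialize($s);            // s:5:"måge";
$restored   = unserialize($serialized); // måge

// JSON: readable by JavaScript, non-ASCII escaped by default.
$asJson = json_encode($s);              // "m\u00e5ge"

echo $serialized, "\n", $asJson, "\n";
```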

#9
__ak

Yeah, I kinda need JS compatibility, because the JSON is split up and put into text fields on the frontend through JS.
Could it be PDO that's stripping the backslashes before the 'u'?

#10
Alexander

If you use PDO prepared statements, no manual escaping is required and PDO does not filter the bound values, so it would not strip the backslashes. It must be happening before that.
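
This can be checked in isolation with a small round trip. The sketch assumes the pdo_sqlite driver is available and uses an in-memory SQLite database as a stand-in for the real one. The table and column names are hypothetical.

```php
<?php
// Assumption: pdo_sqlite is installed; this is not the poster's real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE settings (id INTEGER PRIMARY KEY, body TEXT)');

$json = json_encode(array('title' => "m\xC3\xA5ge")); // {"title":"m\u00e5ge"}

// Bind the JSON string as a parameter; no manual escaping is involved.
$insert = $pdo->prepare('INSERT INTO settings (body) VALUES (:body)');
$insert->execute(array(':body' => $json));

// Read it back: the backslash is still there.
$stored = $pdo->query('SELECT body FROM settings')->fetchColumn();

echo $stored, "\n";
```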

#11
__ak

Okay, so PDO has no role here.

I've tried to utf8 en/decode in the way I think is right, but I end up with "null" in the DB.


    foreach ($_POST['jsonArguments'] as $key => $arg) {
        $val = $_POST['jsonArguments'][$key];
        $json_parse = json_decode(utf8_decode($val));
        if ($json_parse) $val = $json_parse;
        $json[$arg] = $val;
    }
    $result = json_encode(utf8_encode($json));
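
One likely explanation for the "null": utf8_encode() expects a string, so passing it the whole $json array does not produce a usable value, and json_encode(null) is the string "null". If the form data is already UTF-8 (an assumption, it depends on the page and form encoding), no utf8_* calls are needed at all. The sketch below shows the same loop without them. The $fields array is a hypothetical stand-in for the submitted data, keyed by title.

```php
<?php
// Hypothetical stand-in for the submitted form data: title => raw value.
// Assumption: the strings are already valid UTF-8.
$fields = array(
    'bird'    => "m\xC3\xA5ge",                      // plain text: måge
    'options' => '{"colour":"gr' . "\xC3\xA5" . '"}', // a JSON object
);

$json = array();
foreach ($fields as $title => $raw) {
    $parsed = json_decode($raw, true);
    // Keep the decoded structure only when the value was a JSON
    // object or array; otherwise store the raw text unchanged.
    $json[$title] = is_array($parsed) ? $parsed : $raw;
}

$result   = json_encode($json);         // what would be saved
$restored = json_decode($result, true); // what the page reads back

echo $result, "\n";
```

The saved string contains \u00e5 escapes, but decoding it gives back "måge" and "grå" unchanged.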



#12
Alexander

That should not matter. Something, somewhere, is stripping the backslashes, which is why you should look at the data ($_POST['jsonArguments']) and check whether the backslashes are intact there. Also check again after every step that modifies that data.