Jump to content

Mysql & utf-8

- - - - -

  • Please log in to reply
10 replies to this topic

#1
Vaielab

Vaielab

    Programming God

  • Members
  • PipPipPipPipPipPipPip
  • 547 posts
Hello,

On one of my site I have installed a jquery autocomplete and everything work great... most of the time.
For fetching data from my database I use this command:

select id, name from person where name like '%$term%'

Where $term is what I wrote in the autocomplete.
Everything work great, except when I type the letter "a"
When I type the letter "a" it output every word with the letter "a" in it... and every word with the letter "è" in it too like Geneviève, which does not have any a in it, but it output it anyway.

I looked in this issue, when I do a request from the mysql query browser, "Geneviève" output like "Geneviève", so i'm guessing that where mysql find my "a"
And on the website, on the autocomplete word Geneviève don't have an error

Not too sure, why it didn't use utf-8 encoding... when I create my database I started with
CREATE DATABASE IF NOT EXISTS gestion
CHARACTER SET utf8;

And every request from php (I use PDO) I specified to use utf-8 too like so:
db->setAttribute(PDO::MYSQL_ATTR_INIT_COMMAND, "SET NAMES 'utf8'");

so... as far as I know, everything should be in a correct utf-8 format

I even try to force it with putting the default character and the default collate to utf8 and utf8_general_ci in mysql administrator (of course, i rebooted the server, flush the database, and add everything back after that) but still the same error.

So... anyone have an idea?

Thank you

#2
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others
select id, name from person where name like _utf8'%$term%'
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#3
Vaielab

Vaielab

    Programming God

  • Members
  • PipPipPipPipPipPipPip
  • 547 posts
Is like_utf8 a function?

When I tried it, I have an error:
exception 'PDOException' with message 'SQLSTATE[42000]: Syntax error or access violation: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'like_utf8'%a%' at line 1' in /home/etienne/Desktop/Projets/PHP/Includes/SubModules/Manage/Autocomplete.php:43

I use mysql version 5.1.54-1ubuntu4 (Ubuntu)

#4
Vaielab

Vaielab

    Programming God

  • Members
  • PipPipPipPipPipPipPip
  • 547 posts
Maybe this could help, here my show create table person

CREATE TABLE `person` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`id_Group` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `id_Group` (`id_Group`),
CONSTRAINT `person_ibfk_1` FOREIGN KEY (`id_Group`) REFERENCES `Group` (`id_Group`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8

I don't think I can be more utf-8


And also, I find on a other site I could use this select id, name from person where name like '%$term%' COLLATE utf8_general_ci
But when I do, I got an error PDOException' with message 'SQLSTATE[42000]: Syntax error or access violation: 1253 COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'latin1'' in ...
Where did he find an latin1 character set?!?!

#5
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others
there's a space between like and _utf8. _utf8 tells it the string is a UTF-8 string, instead of an ANSI string.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#6
Vaielab

Vaielab

    Programming God

  • Members
  • PipPipPipPipPipPipPip
  • 547 posts
Thank for the tip... but sadly it didn't work... I still have Geneviève in my result :(

But I find an other possibility, I have this error on other place on my website... actually I have it everywhere where I use jquery to send data, so i'm starting to think it's not a mysql or php problem, but a javascript problem...

#7
Vaielab

Vaielab

    Programming God

  • Members
  • PipPipPipPipPipPipPip
  • 547 posts
Sadly no... this had nothing to do with jquery or ajax, it was htmlentities that I forgot to put the charset
So now I guess I should reinvestigate mysql & php

OK, so if I use the collation utf8_bin Geneviève dosen't show when I press "a"
But it's now case sensitive so if I press "g" it dosen't work, I have to press "G"

Edited by Vaielab, 28 July 2011 - 07:35 AM.


#8
Vaielab

Vaielab

    Programming God

  • Members
  • PipPipPipPipPipPipPip
  • 547 posts
If someone is interested in the answer there it is:
The problem was with PDO
That dosen't not communicate with mysql in utf8, even if I added the attribute PDO::MYSQL_ATTR_INIT_COMMAND, "SET NAMES utf8" or used ;charset=UTF-8 or ;charset=UTF8 at the end of the connection string
Each time I create a connection, I have to execute this ->exec("set names utf8");
I know, this means each time I create a connection, I will double the number of query, so I will make everything slower, but this is the solution I finded.

Hope this help someone!

#9
Orjan

Orjan

    Writes binary right handed and hex left handed

  • Moderators
  • 3,299 posts
  • Location:Karlstad, Sweden
  • Programming Language:C, Java, C++, C#, PHP, JavaScript, Pascal
  • Learning:Java, C#
try to uppercase everything before doing your comparison, and it becomes easier in that case.
__________________________________________
I study Information Systems at Karlstad University when I'm not on CodeCall

#10
Vaielab

Vaielab

    Programming God

  • Members
  • PipPipPipPipPipPipPip
  • 547 posts
You mean using utf8_bin and putting everything in upper case?
That would have been a pretty nice alternative, but if I type "e" the word with an "é" for exemple would not have come up since it's in utf8_bin
That why I keep trying to figure a way to do it in utf8_general_ci or utf8_unicode_ci

But thx for the advice, I didn't thought of it!

#11
Orjan

Orjan

    Writes binary right handed and hex left handed

  • Moderators
  • 3,299 posts
  • Location:Karlstad, Sweden
  • Programming Language:C, Java, C++, C#, PHP, JavaScript, Pascal
  • Learning:Java, C#
you could save a search string along with the real string where all "strange" letters are replaced with a normal one, so when you search, you go for the search string and otherwise display the real string. But there is probably nicer solutions out there.
__________________________________________
I study Information Systems at Karlstad University when I'm not on CodeCall




1 user(s) are reading this topic

0 members, 1 guests, 0 anonymous users