Issues with non-ASCII characters for my application.

ascii

2 replies to this topic

#1 Graphene


Posted 23 February 2011 - 08:30 PM

I have started writing an application for a client where articles will be submitted. The client caters mostly to readers of other languages (Croatian and other languages from that region).

I don't know how PHP handles characters above code point 127. A quick example of what I am having trouble with:

echo strlen("ĀāĂăĄąĆćĈĉĊċ");


I hope they display correctly here. Those are twelve characters, yet strlen reports 24! This is troublesome because some statistics and ordering rely on the length of the article.

How would I go about fixing this issue? What would I need to know?

#2 Alexander


Posted 23 February 2011 - 08:50 PM

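PHP's native strings are plain byte sequences with no concept of character encoding. In UTF-8, each character in your string takes two bytes, so internally it is 24 bytes long:
ĀāĂăĄąĆćĈĉĊċ
As a result, strlen reports a length of 24, because it counts bytes rather than characters. Originally you would have had to account for this manually, but later versions of PHP include a multibyte string library (mbstring, the mb_ functions). It is available in most installations, although it is an optional extension that has to be enabled in the PHP build or configuration.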
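
To see where the 24 comes from, here is a minimal sketch that prints the byte count and the raw bytes of the sample string. It assumes the script file itself is saved as UTF-8; the variable name is only for illustration.

```php
<?php
// Assumption: this source file is saved as UTF-8.
$text = "ĀāĂăĄąĆćĈĉĊċ"; // twelve characters, U+0100 through U+010B

// strlen counts bytes, not characters.
echo strlen($text), "\n";  // 24

// Each character is encoded as two bytes (c4 80, c4 81, ...).
echo bin2hex($text), "\n"; // c480c481c482c483c484c485c486c487c488c489c48ac48b
```

Every character in this range needs two bytes in UTF-8, which is why the byte count is exactly double the character count. Other characters may take one, three, or four bytes, so dividing by two is not a general fix.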

mb_internal_encoding("UTF-8"); // the default is system dependent, so be explicit
echo mb_strlen("ĀāĂăĄąĆćĈĉĊċ"); // use the mb_ counterpart of strlen
Because it goes through a library (much as the PCRE functions do), it will naturally be slower than the standard string functions, which may matter if you process text in bulk.
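
Since the question mentions ordering by article length, here is a rough sketch of how that might look. The helper name article_length is hypothetical (it is not a PHP built-in), the sample titles are made up, and the fallback branch is just one possible approach for installations where mbstring is not loaded. It assumes PHP 5.3 or later (for the closure) and a UTF-8 source file.

```php
<?php
// Hypothetical helper (not a PHP built-in): length of a UTF-8 string in characters.
function article_length($text)
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($text, 'UTF-8');
    }
    // Fallback when mbstring is unavailable: count matches of "any character"
    // in PCRE's UTF-8 mode (returns false if $text is not valid UTF-8).
    return preg_match_all('/./us', $text, $unused);
}

// Assumption: this file is saved as UTF-8. Sample data for illustration only.
$titles = array("ĆćĈĉ", "abcde", "Āā");

// Order by character count, shortest first.
usort($titles, function ($a, $b) {
    return article_length($a) - article_length($b);
});

print_r($titles); // "Āā" (2), "ĆćĈĉ" (4), "abcde" (5)
```

Sorting the same array with strlen would put "abcde" (5 bytes) ahead of "ĆćĈĉ" (8 bytes), which is the kind of skew described in the first post.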

Some useful documentation:
PHP: Multibyte String - Manual

All new problems require investigation, and so if errors are problems, try to learn as much as you can and report back.


#3 Graphene


Posted 23 February 2011 - 09:49 PM

It returns 12. This is great! Thank you for taking the time to explain this to me in depth.