I am working on a CodeIgniter project that has information stored in the database with extended ASCII characters (128-255) and UTF-8 encoding. As most of you know, CodeIgniter is fairly strict about what is allowed in a URL (they only allow what SHOULD be permitted per RFC 1738 ;)). Furthermore, since you can create basically any character by allowing a percent sign (%) in the URL, allowing that was not a viable option for me.
For instance, you cannot use “ñ” (n with a tilde) in the URL parameters. Replacing it with “n” for the database query will, however, still return a valid match. Problem solved! I can just use Normalizer or iconv, right?
Unfortunately, I do not have access to Normalizer::normalize or iconv on the server in question, so I had to write my own function for the conversion. I HATE hard-coding lookup values inside a function, but in this case it was required, since storing key/value pairs in the database would have been overkill. So here is my function…
function _normalize($str)
{
    // Single-byte character => ASCII stand-in.
    // Entries 128-159 follow the Windows-1252 layout; 160-255 are ISO-8859-1.
    $table = array(
        chr(128) => 'EUR', chr(129) => ' ',    chr(130) => ',',   chr(131) => 'f',
        chr(132) => '"',   chr(133) => '...',  chr(134) => 't',   chr(135) => 'tt',
        chr(136) => '^',   chr(137) => '0/00', chr(138) => 'S',   chr(139) => '<', // was a duplicate chr(138) key
        chr(140) => 'OE',  chr(141) => ' ',    chr(142) => 'Z',   chr(143) => ' ',
        chr(144) => ' ',   chr(145) => '\'',   chr(146) => '\'',  chr(147) => '"',
        chr(148) => '"',   chr(149) => '*',    chr(150) => '-',   chr(151) => '-',
        chr(152) => '~',   chr(153) => 'TM',   chr(154) => 's',   chr(155) => '>',
        chr(156) => 'oe',  chr(157) => ' ',    chr(158) => 'z',   chr(159) => 'Y',
        chr(160) => ' ',   chr(161) => '!',    chr(162) => 'c',   chr(163) => '#',
        chr(164) => '$',   chr(165) => 'Y',    chr(166) => '|',   chr(167) => 'S',
        chr(168) => '..',  chr(169) => '(c)',  chr(170) => 'a',   chr(171) => '<<',
        chr(172) => '<>',  chr(173) => ' ',    chr(174) => '(r)', chr(175) => '_',
        chr(176) => '',    chr(177) => '+/-',  chr(178) => '(2)', chr(179) => '(3)',
        chr(180) => '`',   chr(181) => 'u',    chr(182) => 'P',   chr(183) => '*',
        chr(184) => ',',   chr(185) => '(1)',  chr(186) => '',    chr(187) => '>>',
        chr(188) => '1/4', chr(189) => '1/2',  chr(190) => '3/4', chr(191) => '?',
        chr(192) => 'A',   chr(193) => 'A',    chr(194) => 'A',   chr(195) => 'A',
        chr(196) => 'A',   chr(197) => 'A',    chr(198) => 'AE',  chr(199) => 'C',
        chr(200) => 'E',   chr(201) => 'E',    chr(202) => 'E',   chr(203) => 'E',
        chr(204) => 'I',   chr(205) => 'I',    chr(206) => 'I',   chr(207) => 'I',
        chr(208) => 'D',   chr(209) => 'N',    chr(210) => 'O',   chr(211) => 'O',
        chr(212) => 'O',   chr(213) => 'O',    chr(214) => 'O',   chr(215) => 'x',
        chr(216) => 'O',   chr(217) => 'U',    chr(218) => 'U',   chr(219) => 'U',
        chr(220) => 'U',   chr(221) => 'Y',    chr(222) => 'B',   chr(223) => 'S',
        chr(224) => 'a',   chr(225) => 'a',    chr(226) => 'a',   chr(227) => 'a',
        chr(228) => 'a',   chr(229) => 'a',    chr(230) => 'ae',  chr(231) => 'c',
        chr(232) => 'e',   chr(233) => 'e',    chr(234) => 'e',   chr(235) => 'e',
        chr(236) => 'i',   chr(237) => 'i',    chr(238) => 'i',   chr(239) => 'i',
        chr(240) => 'o',   chr(241) => 'n',    chr(242) => 'o',   chr(243) => 'o',
        chr(244) => 'o',   chr(245) => 'o',    chr(246) => 'o',   chr(247) => '/',
        chr(248) => 'o',   chr(249) => 'u',    chr(250) => 'u',   chr(251) => 'u',
        chr(252) => 'u',   chr(253) => 'y',    chr(254) => 'b',   chr(255) => 'y'
    );

    // utf8_decode() turns the UTF-8 input into single-byte ISO-8859-1,
    // so every character lines up with one chr() key in the table.
    return strtr(utf8_decode($str), $table);
}
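If you want to try the idea out quickly, here is a minimal, self-contained sketch. It is not the full function above: `normalize_demo` and its five-entry table are hypothetical names I made up for illustration. It also swaps in a slightly different technique, keying the table on the raw UTF-8 byte sequences instead of calling utf8_decode(), because utf8_decode() is deprecated as of PHP 8.2. Treat it as a starting point under those assumptions, not a drop-in replacement.

```php
<?php
// Hypothetical demo helper, not the full table from the post:
// maps a handful of UTF-8 encoded characters straight to ASCII stand-ins.
function normalize_demo($str)
{
    // Keys are raw UTF-8 byte sequences, so no decoding step is needed.
    $table = array(
        "\xC3\xB1" => 'n',  // ñ (U+00F1)
        "\xC3\x91" => 'N',  // Ñ (U+00D1)
        "\xC3\xA9" => 'e',  // é (U+00E9)
        "\xC3\xBC" => 'u',  // ü (U+00FC)
        "\xC3\xA6" => 'ae', // æ (U+00E6)
    );

    // With an array argument, strtr() tries the longest keys first and
    // never re-scans text it has already replaced.
    return strtr($str, $table);
}

// "mañana" written with explicit UTF-8 bytes so the file encoding does not matter.
echo normalize_demo("ma\xC3\xB1ana"), "\n"; // prints "manana"
```

Plain ASCII passes through untouched, so the result can be dropped into a URL segment without tripping CodeIgniter's permitted-characters check.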
Hope this is what you were looking for.