regex - Regular Expressions - Parsing a URL
I've delved into regular expressions for one of the first times in order to parse a URL. Without going into too much depth, I want friendly URLs and I'm saving each permalink in the database, but because of the differences in languages and pages I only want to save one permalink and parse the URL for the page and language. So if I'm getting something like this:
http://domain.com/lang/fr/category/9/category_title/page/3.html
All I want is the bit "category/9/category_title" to know which page I'm on. I've come up with this function:
$return = array();
$string = 'http://domain.com/lang/fr/category/9/category_title/page/3.html';

// remove domain and http
$string = preg_replace('@^(?:http://)?([^/]+)@i', '', $string);

if (preg_match('/^\/lang\/([a-z]{2})/', $string, $langmatches)) {
    $return['lang'] = $langmatches[1];
    // remove lang
    $string = preg_replace('/^\/lang\/[a-z]{2}/', '', $string);
} else {
    $return['lang'] = 'en';
}

// get extension
$bits = explode(".", strtolower($string));
$return['extension'] = end($bits);

// remove extension
$string = preg_replace('/\.[^.]+$/', '', $string);

if (preg_match('/page\/([0-9]+)$/', $string, $pagematches)) {
    $return['page'] = $pagematches[1];
    // remove page
    $string = preg_replace('/page\/[0-9]+$/', '', $string);
} else {
    $return['page'] = 1;
}

// remove additional slashes at beginning and end
$string = preg_replace('#^(/?)|(/?)$#', '', $string);
$return['permalink'] = $string;

print_r($return);
Which returns this for the above example:
Array
(
    [lang] => fr
    [extension] => html
    [page] => 3
    [permalink] => category/9/category_title
)
This is perfect and exactly what I want. My question is: have I gone about using regular expressions correctly? Is there a better way to do this, for instance could I strip the domain, the extension and the additional slashes at the beginning and end with one kick-ass expression?
You should use parse_url to split the URL into its components. And when you have the URL path, you can use explode to split the path into its segments, array_slice to get specific segments, and pathinfo to get the extension.
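As a rough illustration of that suggestion, here is a minimal sketch that combines the four functions. The function name parseFriendlyUrl and the default values are assumptions made for this example and are not part of the original question or answer. It also assumes the language always appears as a leading "lang/xx" pair and the page number as a trailing "page/N" pair, as in the example URL.

```php
<?php
// Hypothetical helper (name and defaults are assumptions for illustration).
function parseFriendlyUrl(string $url): array
{
    $result = ['lang' => 'en', 'extension' => '', 'page' => 1, 'permalink' => ''];

    // parse_url() isolates the path, dropping the scheme and host.
    $path = (string) parse_url($url, PHP_URL_PATH);

    // pathinfo() reports the extension of the last path segment, if there is one.
    $extension = pathinfo($path, PATHINFO_EXTENSION);
    if ($extension !== '') {
        $result['extension'] = strtolower($extension);
        // Cut the dot and the extension off the end of the path.
        $path = substr($path, 0, -(strlen($extension) + 1));
    }

    // explode() splits the path into segments; trim() drops the outer slashes.
    $segments = explode('/', trim($path, '/'));

    // Assumed layout: an optional leading "lang/xx" pair.
    if (count($segments) >= 2 && $segments[0] === 'lang'
        && preg_match('/^[a-z]{2}$/', $segments[1])) {
        $result['lang'] = $segments[1];
        $segments = array_slice($segments, 2);
    }

    // Assumed layout: an optional trailing "page/N" pair.
    $count = count($segments);
    if ($count >= 2 && $segments[$count - 2] === 'page'
        && ctype_digit($segments[$count - 1])) {
        $result['page'] = (int) $segments[$count - 1];
        $segments = array_slice($segments, 0, -2);
    }

    // Whatever is left over is the permalink.
    $result['permalink'] = implode('/', $segments);

    return $result;
}

print_r(parseFriendlyUrl('http://domain.com/lang/fr/category/9/category_title/page/3.html'));
```

For the example URL this should produce the same four values as the regex version in the question. The only regular expression left is the two-letter check on the language code, and multi-digit page numbers are handled by ctype_digit.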