How do I check for valid (not dead) links programmatically using PHP?


Given a list of URLs, I would like to check that each URL:

  • returns a 200 OK status code
  • returns a response within X amount of time

The end goal is a system capable of flagging URLs as potentially broken so that an administrator can review them.

The script will be written in PHP and will most likely run on a daily basis via cron.

The script will be processing approximately 1000 URLs at a go.

The question has two parts:

  • Are there any big-time gotchas with an operation like this, or issues that you have run into?
  • What is the best method for checking the status of a URL in PHP, considering both accuracy and performance?

Use the PHP cURL extension. Unlike fopen(), it can also make HTTP HEAD requests, which are sufficient to check the availability of a URL and save you a ton of bandwidth, as you don't have to download the entire body of the page to check.

As a starting point you could use a function like this:

function is_available($url, $timeout = 30) {
    $ch = curl_init(); // get cURL handle

    // set cURL options
    $opts = array(CURLOPT_RETURNTRANSFER => true, // do not output to browser
                  CURLOPT_URL => $url,            // set URL
                  CURLOPT_NOBODY => true,         // do a HEAD request only
                  CURLOPT_TIMEOUT => $timeout);   // set timeout in seconds
    curl_setopt_array($ch, $opts);

    curl_exec($ch); // do it!

    $retval = curl_getinfo($ch, CURLINFO_HTTP_CODE) == 200; // check if HTTP OK

    curl_close($ch); // close handle

    return $retval;
}

However, there's a ton of possible optimizations: you might want to re-use the cURL instance and, if you are checking more than one URL per host, even re-use the connection.
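A minimal sketch of the handle re-use idea, assuming the URLs arrive as a plain array. The name `check_urls()` is hypothetical and not part of the function above. Keeping one handle alive lets cURL re-use an open connection when consecutive URLs share a host, so sorting the list by host first may help.

```php
<?php
// Sketch only: check_urls() is an illustrative helper, not part of the original answer.
function check_urls(array $urls, $timeout = 30) {
    $results = array();
    if (count($urls) === 0) {
        return $results; // nothing to check, so no handle is created
    }

    $ch = curl_init(); // one handle for the whole batch
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,     // do not output to browser
        CURLOPT_NOBODY         => true,     // HEAD request only
        CURLOPT_TIMEOUT        => $timeout, // per-request timeout in seconds
    ));

    foreach ($urls as $url) {
        curl_setopt($ch, CURLOPT_URL, $url); // only the URL changes between requests
        $ok = curl_exec($ch) !== false;      // false means the transfer itself failed
        $results[$url] = $ok && curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;
    }

    return $results; // the handle is released once $ch goes out of scope
}
```

The result maps each URL to true (200 OK) or false, so the false entries are the ones to flag for review.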

Oh, and is_available() checks strictly for HTTP response code 200. It does not follow redirects (302) -- but there is also a cURL option for that.
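A hedged sketch of the redirect option, combined with the response-time check from the question. `CURLOPT_FOLLOWLOCATION`, `CURLOPT_MAXREDIRS` and `CURLINFO_TOTAL_TIME` are existing cURL constants. The function names `probe_url()` and `needs_review()`, the cap of 5 redirects and the 10-second threshold are assumptions chosen for illustration.

```php
<?php
// Sketch only: probe_url() and needs_review() are hypothetical helpers.

// Request the headers of $url, following redirects, and report what happened.
function probe_url($url, $timeout = 30) {
    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,     // do not output to browser
        CURLOPT_URL            => $url,
        CURLOPT_NOBODY         => true,     // HEAD request only
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects such as 301/302
        CURLOPT_MAXREDIRS      => 5,        // assumed cap to stop redirect loops
        CURLOPT_TIMEOUT        => $timeout, // timeout in seconds
    ));
    $ok = curl_exec($ch) !== false; // false means the transfer itself failed

    return array(
        // status code of the last response received, or 0 if the transfer failed
        'code'    => $ok ? curl_getinfo($ch, CURLINFO_HTTP_CODE) : 0,
        // total transfer time in seconds, redirects included
        'seconds' => curl_getinfo($ch, CURLINFO_TOTAL_TIME),
    );
}

// Decide whether an administrator should look at this result.
function needs_review(array $result, $maxSeconds = 10.0) {
    return $result['code'] !== 200 || $result['seconds'] > $maxSeconds;
}
```

Some servers answer HEAD requests differently from GET requests, so a flagged URL is a candidate for review rather than proof of a dead link.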
