Why is this returning a "Not Found" with PHP and cURL?
My script works with the other links I have tried, but this one gives the same response with cURL (this version is a lot smaller, so here is the code):
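<?php
// Read the target URL from the query string ($_GET is case-sensitive).
$url = $_GET['url'];
// Passing 1 makes get_headers() return an associative array.
$header = get_headers($url, 1);
print_r($header);

function get_url($u, $h) {
    if (preg_match('/200/', $h[0])) {
        // Success: print the page body.
        echo file_get_contents($u);
    } elseif (preg_match('/301/', $h[0])) {
        // Permanent redirect: fetch the new location's headers and retry.
        $nh = get_headers($h['Location'], 1);
        get_url($h['Location'], $nh);
    }
}

get_url($url, $header);
?>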
(and other Anthropologie product links). I'm assuming there are other sites I have not yet found that act this way also. Here is the header response:
array
(
    [0] => http/1.1 200 ok
    [server] => apache
    [x-powered-by] => servlet 2.4; jboss-4.2.0.ga_cp05 (build: svntag=jbpapp_4_2_0_ga_cp05 date=200810231548)/jbossweb-2.0
    [x-atg-version] => version=rentlufeqyxbvedqbgf0zm9ybs85ljfwmsxbremgwybeufnmawnlbnnllzagif0=
    [content-type] => text/html;charset=iso-8859-1
    [date] => sat, 24 jul 2010 23:47:47 gmt
    [content-length] => 21669
    [connection] => keep-alive
    [set-cookie] => array
        (
            [0] => jsessionid=65ca111adbf267a3b405c69a325576f8.app46-node2; path=/
            [1] => visitcount=1; expires=fri, 29-may-2026 00:41:07 gmt; path=/
            [2] => uoccii:=; expires=mon, 23-aug-2010 23:47:47 gmt; path=/
            [3] => lastvisited=2010-07-24; expires=fri, 29-may-2026 00:41:07 gmt; path=/
        )
)
I'm guessing maybe it has something to do with cookies? Any ideas?
Install Fiddler and see what is being sent.
You can try setting the User-Agent to that of a real browser. Some sites try to prevent scraping by checking this.
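As a rough sketch of that suggestion, the snippet below sends a browser-like User-Agent through a stream context, which works with the same file_get_contents() call the question uses. The helper names browser_context and fetch_as_browser and the default User-Agent string are made up for illustration, and whether this particular site actually checks the header is only a guess.

```php
<?php
// Hypothetical helper: build an HTTP stream context that identifies itself
// with a browser-like User-Agent. The default string is only an example
// value; any real browser's User-Agent could be used instead.
function browser_context($userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36')
{
    return stream_context_create([
        'http' => [
            'method'          => 'GET',
            'user_agent'      => $userAgent, // sent as the User-Agent header
            'follow_location' => 1,          // follow redirects automatically
            'max_redirects'   => 5,
            'timeout'         => 10,         // seconds
            'ignore_errors'   => true,       // still return the body on 4xx/5xx
        ],
    ]);
}

// Hypothetical helper: fetch a URL using the given context.
// Returns the response body as a string, or false if the request fails.
function fetch_as_browser($url, $context)
{
    return file_get_contents($url, false, $context);
}
```

With these in place, `echo fetch_as_browser($_GET['url'], browser_context());` would print the page body. Because follow_location is enabled, the manual 301 handling from the question should not be needed in this variant.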