We use the function below to fetch web page content with PHP's cURL extension.
$ch = curl_init(); // Start a new cURL session.
curl_setopt($ch, CURLOPT_COOKIEFILE, $ckfile); // The name of the file containing the cookie data. It can be in Netscape format or just plain HTTP-style headers dumped into a file.
curl_setopt($ch, CURLOPT_URL, $urlValue); // The URL to fetch. This can also be set when initializing a session with curl_init().
curl_setopt($ch, CURLOPT_COOKIEJAR, $ckfile); // The name of a file to save all internal cookies to when the connection closes.
curl_setopt($ch, CURLOPT_HEADER, 1); // TRUE to include the response headers in the output.
curl_setopt($ch, CURLOPT_AUTOREFERER, 1); // TRUE to automatically set the Referer: field in requests that follow a Location: redirect.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // TRUE to return the transfer as a string from curl_exec() instead of outputting it directly.
curl_setopt($ch, CURLOPT_POST, 0); // TRUE would do a regular application/x-www-form-urlencoded POST, as used by HTML forms; 0 leaves POST off.
return $data = curl_exec($ch);
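For reference, the options above might be wrapped in a complete function roughly as follows. This is only a sketch: the names `build_curl_options` and `get_page_content`, their parameters, and the error logging are assumptions made for illustration, not the original function.

```php
<?php
// Hypothetical helper: collects the options listed above into one array.
function build_curl_options(string $urlValue, string $ckfile): array
{
    return [
        CURLOPT_URL            => $urlValue, // page to fetch
        CURLOPT_COOKIEFILE     => $ckfile,   // read cookies from this file
        CURLOPT_COOKIEJAR      => $ckfile,   // write cookies back to the same file
        CURLOPT_HEADER         => true,      // include response headers in the output
        CURLOPT_AUTOREFERER    => true,      // set Referer: when following redirects
        CURLOPT_RETURNTRANSFER => true,      // return the response instead of printing it
        CURLOPT_POST           => false,     // do not send a POST
    ];
}

// Hypothetical wrapper: returns the response as a string, or false on failure.
function get_page_content(string $urlValue, string $ckfile)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($urlValue, $ckfile));

    $data = curl_exec($ch);
    if ($data === false) {
        // curl_error() describes transport-level failures (DNS, timeout, ...).
        error_log('cURL error: ' . curl_error($ch));
    }

    // The handle is released when $ch goes out of scope.
    return $data;
}
```

Keeping the option list in its own function makes it easy to inspect or extend the settings without touching the code that performs the request.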
It worked fine for many websites.
But it started returning a “500 Internal Server Error” message when we used it for one specific website.
I found out that some web servers block requests from unidentified user agents (browsers).
We resolved this issue by adding the lines below to the function, so that the request identifies itself as Firefox 2.0.
$useragent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
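To confirm that the missing user agent is really what triggers the error, one option is to compare the HTTP status code with and without the header. The sketch below assumes this approach; `status_options` and `fetch_status` are hypothetical names, and the URL in the usage comment is a placeholder.

```php
<?php
// Hypothetical helper: options for a simple status check.
// CURLOPT_USERAGENT is only set when a user agent string is supplied.
function status_options(string $url, ?string $userAgent = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // keep the body out of the script output
    ];
    if ($userAgent !== null) {
        $options[CURLOPT_USERAGENT] = $userAgent;
    }
    return $options;
}

// Hypothetical helper: returns the HTTP status code, or 0 if no response arrived.
function fetch_status(string $url, ?string $userAgent = null): int
{
    $ch = curl_init();
    curl_setopt_array($ch, status_options($url, $userAgent));
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Usage (placeholder URL; $useragent is the string defined above):
// echo fetch_status('http://example.com/'), "\n";
// echo fetch_status('http://example.com/', $useragent), "\n";
```

If the first call reports 500 and the second reports 200, the server is most likely filtering on the User-Agent header.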