There are various ways to fetch web page content in PHP, and cURL is one of the most frequently used. cURL stands for Client URL. To use the cURL library, the cURL extension must be enabled in PHP. On Windows, edit the php.ini file, uncomment this line, and restart the web server: extension=php_curl.dll (on PHP 7.2 and later, the line may instead read extension=curl).
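Before going further, it can help to confirm that the extension is actually loaded. The following is a minimal sketch using the standard `extension_loaded()` and `curl_version()` functions; the variable name `$status` and the message wording are only illustrative.

```php
<?php
// Report whether the cURL extension is available in this PHP installation.
if (extension_loaded('curl')) {
    $info   = curl_version(); // associative array describing the linked libcurl
    $status = 'cURL is enabled (libcurl ' . $info['version'] . ')';
} else {
    $status = 'cURL is not enabled; check the extension line in php.ini';
}

echo $status, PHP_EOL;
```

If the second message appears after editing php.ini, the file being edited may not be the one PHP actually loads; `php --ini` on the command line shows which file is in use.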
Keep in mind that a cURL request always follows these steps:
- Create a cURL handle using curl_init().
- Set up the request using curl_setopt() or curl_setopt_array().
- Request the page using curl_exec().
- Check if an error occurred using curl_errno().
- Get information about the transfer, such as the HTTP status code, using curl_getinfo().
- Close the cURL handle using curl_close().
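The basic syntax for cURL is as follows:

    // Create a cURL handle
    $ch = curl_init();

    // Set the URL to request
    curl_setopt($ch, CURLOPT_URL, 'http://www.example.com');
    // Include the response headers in the output
    curl_setopt($ch, CURLOPT_HEADER, 1);
    // Return the response as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

    // Perform the request; the handle must be passed to curl_exec()
    $data = curl_exec($ch);

    // Close the handle
    curl_close($ch);

Breaking down the code, we have the following: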
Code | Description |
---|---|
curl_init() | Initialize a cURL session and return its handle |
curl_setopt() | Set an option for the request, such as the URL to load |
curl_exec() | Perform the cURL request |
curl_close() | Close the cURL session |
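The same request can also be configured with a single curl_setopt_array() call instead of several curl_setopt() calls. The sketch below assumes the same example URL as above and adds a `CURLOPT_TIMEOUT` of 10 seconds, which is not part of the basic example, so that a failed connection does not block for long. The variable name `$ok` is only illustrative.

```php
<?php
// Same request as the basic example, configured in one call.
$ch = curl_init();

// curl_setopt_array() returns true only if every option was set successfully.
$ok = curl_setopt_array($ch, [
    CURLOPT_URL            => 'http://www.example.com',
    CURLOPT_HEADER         => true, // include response headers in the output
    CURLOPT_RETURNTRANSFER => true, // return the response instead of printing it
    CURLOPT_TIMEOUT        => 10,   // assumption: give up after 10 seconds
]);

if ($ok) {
    $data = curl_exec($ch); // string on success, false on failure
    if ($data === false) {
        echo 'Request failed: ', curl_error($ch), PHP_EOL;
    } else {
        echo 'Received ', strlen($data), ' bytes', PHP_EOL;
    }
}

curl_close($ch);
```

Because `CURLOPT_RETURNTRANSFER` is set, nothing is printed by curl_exec() itself; without it, the response would be written directly to the output and curl_exec() would return only true or false.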
Many configuration options are available when executing a request. You can find all of them in the official PHP documentation (http://php.net/manual/en/book.curl.php). The functions provided by the cURL extension are listed below:
Function | Description |
---|---|
curl_copy_handle | Copy a cURL handle along with all of its preferences |
curl_errno | Return the last error number |
curl_error | Return a string containing the last error for the current session |
curl_escape | URL encodes the given string |
curl_exec | Perform a cURL session |
curl_file_create | Create a CURLFile object |
curl_getinfo | Get information regarding a specific transfer |
curl_init | Initialize a cURL session |
curl_multi_add_handle | Add a normal cURL handle to a cURL multi handle |
curl_multi_close | Close a set of cURL handles |
curl_multi_errno | Return the last multi curl error number |
curl_multi_exec | Run the sub-connections of the current cURL handle |
curl_multi_getcontent | Return the content of a cURL handle if CURLOPT_RETURNTRANSFER is set |
curl_multi_info_read | Get information about the current transfers |
curl_multi_init | Returns a new cURL multi handle |
curl_multi_remove_handle | Remove a cURL handle from a cURL multi handle |
curl_multi_select | Wait for activity on any curl_multi connection |
curl_multi_setopt | Set an option for the cURL multi handle |
curl_multi_strerror | Return string describing error code |
curl_pause | Pause and unpause a connection |
curl_reset | Reset all options of a libcurl session handle |
curl_setopt_array | Set multiple options for a cURL transfer |
curl_setopt | Set an option for a cURL transfer |
curl_share_close | Close a cURL share handle |
curl_share_errno | Return the last share curl error number |
curl_share_init | Initialize a cURL share handle |
curl_share_setopt | Set an option for a cURL share handle |
curl_share_strerror | Return string describing the given error code |
curl_strerror | Return string describing the given error code |
curl_unescape | Decodes the given URL encoded string |
curl_version | Gets cURL version information |
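To show how the curl_multi_* functions from the table fit together, here is a rough sketch that fetches several URLs concurrently. The function name `fetch_all()`, its return shape, and the 30-second timeout are assumptions made for illustration; they are not part of the cURL API.

```php
<?php
/**
 * Fetch several URLs concurrently (hypothetical helper, not part of cURL).
 * Returns an array with the same keys as $urls; each value holds the
 * HTTP status code (0 if no response was received) and the response body.
 */
function fetch_all(array $urls): array
{
    $multi   = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true, // required for curl_multi_getcontent()
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_TIMEOUT        => 30,   // assumption: 30 seconds per transfer
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none of them is still running.
    $running = 0;
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // wait up to one second for activity
        }
    } while ($running > 0 && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = [
            'http_code' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
            'content'   => curl_multi_getcontent($ch),
        ];
        curl_multi_remove_handle($multi, $ch);
        curl_close($ch);
    }
    curl_multi_close($multi);

    return $results;
}
```

A call such as `fetch_all(['home' => 'http://www.example.com'])` would return an array with a `home` entry. For per-transfer error codes, curl_multi_info_read() can be called after the loop; this sketch only inspects the HTTP status code.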
Now, let’s write something useful for a real-life scenario. Let’s create a function that takes a URL as a parameter, performs the request with cURL, and returns the page content together with the transfer information and any error details.

    <?php
    function get_web_page( $url )
    {
        // Example value; replace it with the user agent string you want to send
        $user_agent = 'Mozilla/5.0 (compatible; ExampleClient/1.0)';

        $options = array(
            CURLOPT_CUSTOMREQUEST  => "GET",        // set request method type post or get
            CURLOPT_POST           => false,        // set to GET
            CURLOPT_USERAGENT      => $user_agent,  // set user agent
            CURLOPT_COOKIEFILE     => "cookie.txt", // set cookie file
            CURLOPT_COOKIEJAR      => "cookie.txt", // set cookie jar
            CURLOPT_RETURNTRANSFER => true,         // return web page
            CURLOPT_HEADER         => false,        // don't return headers
            CURLOPT_FOLLOWLOCATION => true,         // follow redirects
            CURLOPT_ENCODING       => "",           // handle all encodings
            CURLOPT_AUTOREFERER    => true,         // set referer on redirect
            CURLOPT_CONNECTTIMEOUT => 120,          // timeout on connect
            CURLOPT_TIMEOUT        => 120,          // timeout on response
            CURLOPT_MAXREDIRS      => 10,           // stop after 10 redirects
        );

        $ch      = curl_init( $url );
        curl_setopt_array( $ch, $options );
        $content = curl_exec( $ch );
        $err     = curl_errno( $ch );
        $errmsg  = curl_error( $ch );
        $header  = curl_getinfo( $ch );
        curl_close( $ch );

        // Add the error details and the page content to the transfer information
        $header['errno']   = $err;
        $header['errmsg']  = $errmsg;
        $header['content'] = $content;
        return $header;
    }
    ?>
Now we will call this function and handle the response. For this, we can write code like the following:

    // Call the function to execute the cURL request
    $url    = 'http://www.example.com';
    $result = get_web_page( $url );

    if ( $result['errno'] != 0 ) {
        // error: bad url, timeout, redirect loop
        die( 'cURL error: ' . $result['errmsg'] );
    }

    if ( $result['http_code'] != 200 ) {
        // error: no page, no permissions, no service
        die( 'Unexpected HTTP status: ' . $result['http_code'] );
    }

    $page = $result['content'];
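The two checks can also be wrapped in a small helper so that every caller handles failures the same way. This is only a sketch: `describe_failure()` is a hypothetical name, and the sample arrays below are hand-written to mimic the shape returned by get_web_page(); they are not real responses.

```php
<?php
/**
 * Return a description of what went wrong, or null if the request succeeded.
 * Hypothetical helper; expects the array shape returned by get_web_page().
 */
function describe_failure(array $result): ?string
{
    if ($result['errno'] !== 0) {
        // Transport-level failure: bad URL, timeout, redirect loop, ...
        return 'cURL error ' . $result['errno'] . ': ' . $result['errmsg'];
    }
    if ($result['http_code'] !== 200) {
        // The server answered, but not with 200 OK.
        return 'Unexpected HTTP status ' . $result['http_code'];
    }
    return null;
}

// Hand-written sample results, shaped like the return value of get_web_page().
$ok       = ['errno' => 0,  'errmsg' => '', 'http_code' => 200, 'content' => '<html></html>'];
$timedOut = ['errno' => 28, 'errmsg' => 'Operation timed out', 'http_code' => 0, 'content' => false];
$missing  = ['errno' => 0,  'errmsg' => '', 'http_code' => 404, 'content' => ''];

var_dump(describe_failure($ok));           // NULL
echo describe_failure($timedOut), PHP_EOL; // cURL error 28: Operation timed out
echo describe_failure($missing), PHP_EOL;  // Unexpected HTTP status 404
```

Checking `errno` first matters: when the transfer itself fails, `http_code` is typically 0, so the status check alone would not explain the cause.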