
Category Archives: Mixed

The Hitchhikers Guide to PHP Load Balancing

There was once a time when running a big (or popular) web application meant running a big web server. As your application attracted more users you would add more memory and processors to your server.

Today, the ‘one huge server’ paradigm has been replaced with the idea of having a large number of smaller servers which employ one or more methods of balancing the load (known as ‘load balancing’) across the entire group (known as a ‘farm’ or ‘cluster’). This is partly down to the fall in hardware prices which made this approach more viable.

The advantages of the ‘many small servers’ approach over the old ‘big server’ method are twofold:

  1. If a server falls over then the load balancing system is normally clever enough to stop sending requests to the crashed machine and distribute the load across the remaining healthy servers.
  2. It now becomes much easier to scale your infrastructure. All you need to do is plug a few new servers into the load balancer and away you go. No downtime necessary.

So there must be a catch… right? Well yes, it can make developing your applications a little more complex and this is what the remainder of the blog post will be covering.

At this point you may be saying to yourself, ‘OK, but how do I know if I am using this load balancing thingy?’ The honest answer is that if you have to ask, you are probably not using a load balanced system and you don’t need to worry too much. Most people explicitly set up load balancing when their applications grow to a size that demands it. However, I have occasionally seen web hosting companies that load balance their shared hosting accounts. These companies may tell you that they do this, or you may have to work it out for yourself based on the remainder of this post.

Before we continue I would like to point out that this post focuses on load balancing from the perspective of PHP (or your chosen server side language). I may write an additional blog post on database load balancing in the future, but for now that will have to wait.

A note on ‘web applications’: You may notice that I keep referring to ‘web applications’ rather than websites. I do this to distinguish between sites which simply display static content, and more advanced sites which are typically powered by databases and server-side programming.

Your PHP Files

The first question you may have is, ‘if there are lots of servers, how do I get my files onto all of them?’
There are a few ways we can distribute our files to all of our servers:

  1. Upload all of our files to each server separately, just as we used to do with one server. This way clearly sucks because a) imagine doing this for 20 servers and b) it is far too easy to get something wrong and have different versions of your files on different servers.
  2. Use ‘rsync’ (or similar). Such tools can sync local directories to multiple remote locations. An example of this would be syncing your single staging server with your multiple live servers.
  3. Use a version control system (such as subversion). This is my preferred method as I can maintain my code in subversion and then run an ‘svn update’ command on each live server when I am ready to make my changes live. This also makes rolling back changes particularly easy.
  4. Use a file server (you may find NFS useful). In this case you use a file server to store your web root directory/directories and then mount the share onto each of your web servers. Of course, if your file server crashes then all your sites go down, and there can be a lot of overhead in pulling files from remote machines, especially if each request involves a large number of files.

The option you choose will depend on your requirements and the skills at your disposal. If you use a version control system then you may want to devise a way of running the update command on all servers simultaneously, whereas if you use a file server you may want to implement some form of failover system in case the file server crashes or becomes unreachable.
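As a rough illustration of the version control option, here is a minimal sketch that builds one ‘svn update’ command per live server, to be run over SSH. The function name `buildUpdateCommands()`, the host names and the web root are all made up for the example, and the commands are only printed here, not executed (and they would run one after another, not simultaneously).

```php
<?php
/**
 * Build one "svn update" command per live server, to be run over SSH.
 * The host names and web root used below are placeholders.
 */
function buildUpdateCommands(array $hosts, $webRoot)
{
    $commands = array();
    foreach ($hosts as $host) {
        // escapeshellarg() guards against spaces or shell metacharacters.
        $remote = 'svn update ' . escapeshellarg($webRoot);
        $commands[$host] = 'ssh ' . escapeshellarg($host) . ' ' . escapeshellarg($remote);
    }
    return $commands;
}

$commands = buildUpdateCommands(
    array('web1.example.com', 'web2.example.com'),
    '/var/www/site'
);

foreach ($commands as $host => $command) {
    // A real deploy script might pass $command to exec() and check the exit code.
    echo $host . ': ' . $command . "\n";
}
```

Keeping the list of hosts in one place means a new server only has to be added once for it to receive every future update.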

Uploads

A file upload is still a simple request, so the file will only be sent to one server. This is, of course, not a problem when you only have one server, but what do we do when there are several machines that the file needs to be placed on?

The problem of handling file uploads is very similar to that of distributing your files across the server farm, and has the following potential solutions:

  1. Store your uploaded files in a database. Most databases allow you to store files in the form of binary data, so you can use a database to store your file uploads rather than storing them as files on disk. When you come to send the file to a user you can pull the file data from the database along with some extra data such as the file name and MIME type. Before continuing down this route you should consider how much database storage you have at your disposal. This method also has quite a high overhead as the request needs to be passed through PHP, which then has to pull the data from the database.
  2. Store your uploaded files on a file server. Here, as with the previous section, you mount a fileserver share on each of your web servers and place your uploads there. This way the upload is available on all your web servers instantly. However, if the file server is unavailable then you may end up with broken images or downloads. Also, there is still an overhead in pulling files off a file server.
  3. Design your upload handling code to transfer the file to each server. This option does not have the disadvantages of using a file server or database but will probably add complexity to your code. For example, what happens if one server is down at the time of the upload?

It is possible to reduce the overhead with the database storage option by keeping a local file cache. When a server receives a request for an uploaded file it first checks its local cache of uploaded files. If the file is found then it can be sent from the cache, otherwise it can be pulled from the database and the cache updated with the file.
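The cache-then-database lookup described above could be sketched roughly like this. The function name `fetchUpload()` and the `$loadFromStore` callback are hypothetical; the callback stands in for whatever code pulls the file data out of your own database.

```php
<?php
/**
 * Return the contents of an uploaded file, checking a local cache
 * directory before falling back to a slower shared store.
 *
 * $loadFromStore is a hypothetical callback standing in for your own
 * database lookup; it receives the file ID and returns the raw bytes.
 */
function fetchUpload($fileId, $cacheDir, $loadFromStore)
{
    // Hash the ID so it is always safe to use as a file name.
    $cacheFile = rtrim($cacheDir, '/') . '/' . sha1($fileId);

    if (is_file($cacheFile)) {
        return file_get_contents($cacheFile);   // cache hit: no database call
    }

    $data = call_user_func($loadFromStore, $fileId);   // cache miss
    file_put_contents($cacheFile, $data, LOCK_EX);
    return $data;
}

// Usage with a fake store that counts how often it is queried.
$cacheDir = sys_get_temp_dir() . '/upload_cache_' . uniqid();
mkdir($cacheDir);

$storeCalls = 0;
$store = function ($id) use (&$storeCalls) {
    $storeCalls++;
    return "contents of $id";
};

$first  = fetchUpload('avatar-42', $cacheDir, $store);  // pulled from the store
$second = fetchUpload('avatar-42', $cacheDir, $store);  // served from the local cache
```

Bear in mind that each web server keeps its own cache, so if an upload can be replaced or deleted you will also need some way of clearing the stale copies.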

Sessions

If you are familiar with PHP’s built in session handling you will probably know that by default it stores session data in temporary session files on the server. Again, this file is only present on the server which handled the request, but subsequent requests which use the session could be passed to any server in your farm. The result is that sessions are frequently not recognised, and the user is (for example) continually logged out.

The solution I recommend here is to either override PHP’s built in session handling and store the session data in your database, or implement some guaranteed way of always sending users to the same server. It may be easier to use the former given that larger applications will probably implement their own session handling anyway.
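To give an idea of what overriding the built in session handling can look like, here is a minimal sketch of a database-backed handler. It assumes PHP 8 or later and a PDO connection; the class name `DbSessionHandler` and the `sessions` table with its `id`, `data` and `updated_at` columns are my own choices for the example, not anything PHP requires.

```php
<?php
/**
 * Minimal database-backed session handler (sketch, assumes PHP 8+).
 * Assumes a table created beforehand, for example:
 *   CREATE TABLE sessions (id VARCHAR(128) PRIMARY KEY, data TEXT, updated_at INTEGER)
 */
class DbSessionHandler implements SessionHandlerInterface
{
    private PDO $db;

    public function __construct(PDO $db)
    {
        $this->db = $db;
    }

    public function open(string $path, string $name): bool
    {
        return true;   // the PDO connection is already open
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        $stmt = $this->db->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute([$id]);
        $data = $stmt->fetchColumn();
        return $data === false ? '' : (string) $data;   // empty string = no session yet
    }

    public function write(string $id, string $data): bool
    {
        // REPLACE INTO inserts the row, or overwrites the one with the same id.
        $stmt = $this->db->prepare('REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, ?)');
        return $stmt->execute([$id, $data, time()]);
    }

    public function destroy(string $id): bool
    {
        return $this->db->prepare('DELETE FROM sessions WHERE id = ?')->execute([$id]);
    }

    public function gc(int $max_lifetime): int|false
    {
        // Remove sessions that have not been written to recently.
        $stmt = $this->db->prepare('DELETE FROM sessions WHERE updated_at < ?');
        $stmt->execute([time() - $max_lifetime]);
        return $stmt->rowCount();
    }
}

// Usage (connection details are placeholders):
// $db = new PDO('mysql:host=db.example.com;dbname=app', 'user', 'secret');
// session_set_save_handler(new DbSessionHandler($db), true);
// session_start();
```

Because every web server reads and writes the same table, it no longer matters which machine handles a given request. Note that this sketch does no locking, so two simultaneous requests for the same session could overwrite each other’s changes.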

Configuration

I feel that configuration is worth covering here even though the topic is not just isolated to PHP. When running a server farm it is an excellent idea to have some way of keeping your configuration files in sync between servers, whether they are related to PHP or not. If your configuration files become out of sync it can lead to some very strange and intermittent behaviour that can be difficult to track down.

I recommend keeping most, if not all, of your configuration files in a separate area of your version control system. This way you can store different PHP configuration files for different installations of your projects, as well as keep all your server configuration files in sync.

Logging

Like configuration, logging is not solely related to PHP but it is still very important for keeping an eye on your servers’ health. Without a proper logging system how will you know if your PHP code starts producing errors (you do have display_errors set to ‘off’, right?)

There are a few ways you can implement logging:

  1. Log on each server – This is the simplest method available to you (aside from not logging at all!). Each machine simply logs to a file as it would if it were not part of a larger farm. This has the benefit of being simple and requiring very little (if any) work to set up. However, as the number of servers grows you may find that monitoring these individual log files becomes untenable.
  2. Log to a share – In this method each machine still has its own log files, but they are stored on a central file server and are written to via a share. This can make log monitoring simpler as all the logs are stored centrally, but there is an overhead in writing to log files on a share. Also, if the file server becomes unreachable your servers or applications may do anything from simply not logging messages to completely crashing.
  3. Log to a logging server – You can use a logging application such as syslog to perform all your logging on a central server. Although this method requires the most configuration, it also provides the most robust solution.
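For the third option, PHP can hand messages to the local syslog daemon, which can then be configured to forward them to a central server. A minimal sketch, where the function name `logToSyslog()` and the ‘myapp’ ident are placeholders of mine:

```php
<?php
/**
 * Send an application message to the local syslog daemon.
 * Forwarding to a central logging server is configured in syslog
 * itself, not in PHP. The "myapp" ident is a placeholder.
 */
function logToSyslog($priority, $message)
{
    // LOG_PID adds the process ID to each line; LOG_USER is the
    // generic facility for user-level messages.
    openlog('myapp', LOG_PID, LOG_USER);
    $ok = syslog($priority, $message);
    closelog();

    return $ok;
}

$logged = logToSyslog(LOG_WARNING, 'Upload directory is 90% full');
```

The application code stays the same whether the messages end up in a local file or on a remote machine, which is a large part of what makes this option robust.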

I have only covered a small area of logging here; the (very exciting! Ahem) topic of logging best practices could justify an entire blog post to itself. As always, I recommend choosing the technique best suited to your situation.

Conclusion

This blog post has given you an introduction to running your PHP applications on a larger scale. Most of the problems listed here stem from working with files, and the fact that these files often need to be shared between all the servers in your farm. It is therefore a good idea to think about the implications of load balancing when working with local files.

You can apply the techniques here if you are developing a large scale web application or if you work on projects which are distributed to a number of users or clients. For example, if you contribute to an open source project it is likely that some installations of your application will be run across several servers. It is therefore important to be aware of this when designing and creating your application.

 

Posted by on June 9, 2011 in Mixed, PHP

 

Some techniques to optimize MySQL queries

I would like to introduce you to some very simple techniques that can speed up your queries.

1) LIMIT:


Adding a LIMIT clause is the first and simplest optimization, because the DBMS (database management system) can stop searching as soon as it has found the requested number of records. This reduces query time, especially on tables with lots of records.

examples :

SELECT * FROM table WHERE field = 'value' LIMIT 10
UPDATE LOW_PRIORITY table SET field = 'value' WHERE field = 'value' LIMIT 1

 

2) UPDATE LOW_PRIORITY


When you are updating tables whose data will not be needed immediately, you can add LOW_PRIORITY to the UPDATE statement. MySQL then delays the update until no other clients are reading from the table, so reads are not held up by the write. Note that this only affects storage engines that use table-level locking (such as MyISAM), and that the statement itself waits until it can run. This type of query is well suited to statistics, session control and rating tables.

example :

UPDATE LOW_PRIORITY table SET field = 'value' WHERE field = 'value' LIMIT 1

 

3) Always search indexed fields


When you write SELECT statements, try to filter on indexed fields in the WHERE clause. To add an index to a field, use this command:

ALTER TABLE `table` ADD INDEX ( `field` )

 

4) INSERT DELAYED statement


When you insert a record that does not need to be available immediately, and you do not need the insert ID of the record, you can use INSERT DELAYED. The server puts the row in a buffer and writes it to the table when the table is not in use by other clients. This way the client gets its response faster, because it does not wait for the insert to complete. Note that INSERT DELAYED only works with certain storage engines (such as MyISAM) and is deprecated in later MySQL versions.

example :

INSERT DELAYED INTO log_table VALUES ('1', '2', '3' )
 

Posted by on June 8, 2011 in Mixed, PHP

 

Some caching techniques with PHP

In this article I will try to explain what custom caching with PHP is, and why and how we can use it.

These days most sites are database driven. That means your site is actually an application which retrieves data from a DBMS (database management system, e.g. MySQL), processes the data and shows the result to the user. Most of this data changes rarely or not at all; the reason we use a database is that it lets us update the site and its content easily.

A problem with this process is server overhead. Every time we run a query, our script calls the DBMS and then waits for the DBMS to send back the results. This is time consuming, and for sites with heavy traffic it is a real problem.

How can we solve this problem?

There are two ways to make your site faster. The first is optimizing the queries, which we will not discuss in this article. The second, and usually the more valuable, is using some kind of custom caching technique.

Custom caching with php

First let me explain the idea behind custom caching. When we have dynamic pages whose data is not updated frequently, we can use a ‘system’ that creates the page once and then stores it for later use. After the page has been created, our application will not run the queries again in order to display it; it will show the cached copy instead. Of course this system must only keep cached pages for a time period that we set.

Let’s code it

Here is a simple class that will do the job. Let’s see the code first :

<?php
class cache
{
    public $cache_dir = './tmp/cache/'; // Directory where the cache files are stored (must exist and be writable)
    public $cache_time = 1000;          // How long to keep the cache files, in seconds

    public $caching = false;
    public $file = '';

    function __construct()
    {
        // Current PHP versions look for __construct(); a method named after
        // the class is no longer called automatically in PHP 8.
        $this->cache();
    }

    function cache()
    {
        // Set-up logic, run once when the object is created
        $this->file = $this->cache_dir . urlencode( $_SERVER['REQUEST_URI'] );
        // filemtime() is when the cache file was last written. fileatime() can be
        // refreshed by every read, so a busy page might never expire.
        if ( file_exists ( $this->file ) && ( filemtime ( $this->file ) + $this->cache_time ) > time() )
        {
            //Grab the cache:
            readfile( $this->file );
            exit();
        }
        else
        {
            //create cache :
            $this->caching = true;
            ob_start();
        }
    }

    function close()
    {
        //You should have this at the end of each page
        if ( $this->caching )
        {
            //You were caching the contents so display them, and write the cache file
            $data = ob_get_clean();
            echo $data;
            $fp = fopen( $this->file , 'w' );
            fwrite ( $fp , $data );
            fclose ( $fp );
        }
    }
}

//Example :
$ch = new cache();
echo date("D M j G:i:s T Y");
$ch->close();
?>

Now let me explain:

function cache()

This function runs when the object is created. Its job is to check whether there is a usable cached file for the requested page, or whether one should be created. Here is how this is done:

$this->file = $this->cache_dir . urlencode( $_SERVER['REQUEST_URI'] );

This line creates the file name of our cached page. So the cached file will be something like /path/to/cache/dir/request_uri

if ( file_exists ( $this->file ) && ( filemtime ( $this->file ) + $this->cache_time ) > time() )

Here we check whether there is a cached version of this page and whether it is still fresh. The check uses the file’s modification time, because the access time can be refreshed whenever the cache is read. If a fresh cached file exists, the function shows the cached page and then exits. I will explain later why it exits. If the cached file must be created, this code will be executed:

$this->caching = true;
ob_start();

The first statement indicates to the close() function that it is creating the cache file, and the ob_start() will start buffering the output. The buffer’s data will be used later by the close() function to save the cache file.

function close()

This function must be called at the end of your script, and it does the rest of the job. It is only needed when we are in the process of caching, which is why it starts with the statement if ( $this->caching )

Let me explain what is happening here :

$data = ob_get_clean();

Here we fetch everything from the output buffer, clear the buffer, and put the data in the $data variable. The four statements that follow show the data and then write the cache file.
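One thing to watch with the write step in close(): if a visitor requests the page while the cache file is still being written, they could be served a half-written file. A minimal sketch of one way around that is to write to a temporary file and then rename it into place. `writeCacheAtomically()` is a hypothetical helper of mine, not part of the class.

```php
<?php
/**
 * Write $data to $file so that readers never see a partial file.
 * Sketch only: writeCacheAtomically() is a hypothetical helper.
 */
function writeCacheAtomically($file, $data)
{
    // Write to a uniquely named temporary file in the same directory...
    $tmp = tempnam(dirname($file), 'cache_');
    if ($tmp === false || file_put_contents($tmp, $data) === false) {
        return false;
    }

    // ...then move it into place. On most Unix-like systems rename()
    // within one filesystem swaps the file in a single step, so a
    // reader gets either the old version or the new one.
    return rename($tmp, $file);
}

$target  = sys_get_temp_dir() . '/page_' . uniqid() . '.html';
$written = writeCacheAtomically($target, '<p>cached page</p>');
```

In the class, the fopen(), fwrite() and fclose() calls in close() could be swapped for a single call to a helper like this.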

Troubleshooting

This is a very simple class, and its purpose is to show how you can implement a caching solution for your site. The restriction when using this class is that you must use it only in this form:

<?php
 $a = new cache();
 ....
 ....
 ....
 $a->close();
?>

If you have code after the $a->close() statement, the class will not work right: when a page is served from the cache, the exit() statement in the cache() function ends the script, so that code never runs.

Of course you can take this code and make it work for your own needs.

A quick solution is to remove the exit() statement in the cache() function and then use the class this way :

<?php
 $a = new cache();
 if ( $a->caching )
 {
 ....
 ....
 ....
 }
 $a->close();
?>

Hope this helped.

 

Posted by on June 8, 2011 in Mixed, PHP

 

Creating web services using PHP

First of all I want to define what a web service is. You can think of web services as DLL files over the web: you write the code once and everyone can use it, whatever language they are using, except that on the web you don’t compile anything. When I started using web services I was writing C# code, so .NET was doing everything for me. I couldn’t understand what was happening behind the scenes; I just typed [webservice] above the function and everything ran. So in this article I am going to explain what is happening.

How it works?

1) WSDL

First you have to define what you are serving. Think of it as a menu in a restaurant. You define this by creating a WSDL (Web Services Description Language) file, which lists the functions you offer along with their inputs and outputs.

2) SOAP

You can think of SOAP as the waiter in a restaurant. He writes down your order, delivers it to the kitchen and brings the food back to you. That is what SOAP does: you wrap your request in a standard format that matches the definitions in your WSDL file, and the server in return wraps the result in a standard format, also based on the WSDL file. Since the WSDL file is the menu, you have to order something from the menu, and the kitchen has to deliver what you requested according to the details on the menu.

What do you need?

PHP SOAP

If you are using PHP 5 or later on Windows, go to your php.ini and uncomment the line extension=php_soap.dll. When you run phpinfo() you should then see the extension listed.

If you are using Linux (on a distribution that uses yum), you can install it using the following line:

yum install php-soap

If you are using PHP 4 you can use NuSOAP instead.

Creating your first web service

We are going to create a web service for a library. Users can search for authors and books, and browse each author’s books.

Each author has the following :

author_id

author_name

And each book has the following:

book_id

book_name

author_id

We are going to assume that each book is written by only one author, to simplify the example.

Creating the WSDL file

I am now going to create the WSDL file (the “menu”). Here is the code:

<?xml version="1.0" encoding="UTF-8"?>

<wsdl:definitions name="Library" xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="Library" xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:tns="Library" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"  >

    <xsd:documentation></xsd:documentation>
    <wsdl:types>
     <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="Library">
            <xsd:complexType name="Book">

                <xsd:sequence>
                 <xsd:element name="ID" type="xsd:integer"></xsd:element>
                 <xsd:element name="NAME" type="xsd:string" ></xsd:element>

                 <xsd:element name="AUTHOR" type="tns:Author" ></xsd:element>
                </xsd:sequence>
            </xsd:complexType>

            <xsd:complexType name="Author">

                <xsd:sequence>
                 <xsd:element name="ID" type="xsd:integer"></xsd:element>
                 <xsd:element name="NAME" type="xsd:string" ></xsd:element>

              <xsd:element name="BOOKS" type="tns:Book"
               minOccurs="0" maxOccurs="unbounded">
              </xsd:element>

             </xsd:sequence>
            </xsd:complexType>

            <xsd:complexType name="Authors">
                <xsd:sequence>
              <xsd:element name="AUTHOR" type="tns:Author"

               minOccurs="0" maxOccurs="unbounded">
              </xsd:element>
             </xsd:sequence>
            </xsd:complexType>

            <xsd:complexType name="Books">
                <xsd:sequence>
              <xsd:element name="BOOK" type="tns:Book"
               minOccurs="0" maxOccurs="unbounded">

              </xsd:element>
             </xsd:sequence>
            </xsd:complexType>
     </xsd:schema></wsdl:types>
  <wsdl:message name="searchAuthorsRequest">

   <wsdl:part name="NAME" type="xsd:string"></wsdl:part>
  </wsdl:message>
  <wsdl:message name="searchAuthorsResponse">
   <wsdl:part name="AUTHORS" type="tns:Authors"></wsdl:part>

  </wsdl:message>
  <wsdl:message name="searchBooksRequest">
   <wsdl:part name="NAME" type="xsd:string"></wsdl:part>
  </wsdl:message>

  <wsdl:message name="searchBooksResponse">
   <wsdl:part name="BOOKS" type="tns:Books"></wsdl:part>
  </wsdl:message>
  <wsdl:portType name="Library">

    <wsdl:operation name="searchAuthors">
      <wsdl:input message="tns:searchAuthorsRequest"/>
      <wsdl:output message="tns:searchAuthorsResponse"/>
    </wsdl:operation>

    <wsdl:operation name="searchBooks">
     <wsdl:input message="tns:searchBooksRequest"></wsdl:input>
     <wsdl:output message="tns:searchBooksResponse"></wsdl:output>
    </wsdl:operation>

 </wsdl:portType>
  <wsdl:binding name="Library" type="tns:Library">
   <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http" />

   <wsdl:operation name="searchAuthors">
    <soap:operation soapAction="http://localhost/Blog/Library/library.php" />
    <wsdl:input>
     <soap:body use="literal" namespace="Library" />

    </wsdl:input>

    <wsdl:output>
     <soap:body use="literal" namespace="Library" />
    </wsdl:output>

   </wsdl:operation>

   <wsdl:operation name="searchBooks">
    <soap:operation soapAction="http://localhost/Blog/Library/library.php" />
    <wsdl:input>

     <soap:body use="literal" namespace="Library" />
    </wsdl:input>
    <wsdl:output>
     <soap:body use="literal" namespace="Library" />

    </wsdl:output>
   </wsdl:operation>
  </wsdl:binding>
  <wsdl:service name="Library">
    <wsdl:port binding="tns:Library" name="YLibrary">

      <soap:address location="http://localhost/Blog/Library/library.php" />
    </wsdl:port>
  </wsdl:service>
</wsdl:definitions>

The WSDL file contains four main parts:

  1. types
    Here you define the types that you are going to use. It is similar to the variable declaration section in a Pascal program.
  2. messages
    Each function in the web service has two messages: one for the input and one for the output. You can also add a message to handle exceptions.
  3. port type
    The port type combines one or more messages to represent a function. For example, the “searchBooks” operation combines two messages: one for the input and one for the response.
  4. binding
    In the binding section, you define the protocol details for the web service.
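To make the relationship between port types, operations and messages a little more concrete, here is a small sketch that reads the operations out of a WSDL document using PHP’s DOM extension. The WSDL string is a cut-down version of the file above containing only the port type, and `listWsdlOperations()` is a helper name I made up for the example.

```php
<?php
// A cut-down WSDL with only the portType section, for illustration.
$wsdl = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<wsdl:definitions name="Library" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
  <wsdl:portType name="Library">
    <wsdl:operation name="searchAuthors">
      <wsdl:input message="tns:searchAuthorsRequest"/>
      <wsdl:output message="tns:searchAuthorsResponse"/>
    </wsdl:operation>
    <wsdl:operation name="searchBooks">
      <wsdl:input message="tns:searchBooksRequest"/>
      <wsdl:output message="tns:searchBooksResponse"/>
    </wsdl:operation>
  </wsdl:portType>
</wsdl:definitions>
XML;

/**
 * Return an array of operation name => array(input message, output message).
 */
function listWsdlOperations($wsdlXml)
{
    $doc = new DOMDocument();
    $doc->loadXML($wsdlXml);

    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('wsdl', 'http://schemas.xmlsoap.org/wsdl/');

    $operations = array();
    foreach ($xpath->query('//wsdl:portType/wsdl:operation') as $op) {
        // Each operation pairs one input message with one output message.
        $input  = $xpath->query('wsdl:input', $op)->item(0);
        $output = $xpath->query('wsdl:output', $op)->item(0);
        $operations[$op->getAttribute('name')] = array(
            $input->getAttribute('message'),
            $output->getAttribute('message'),
        );
    }
    return $operations;
}

print_r(listWsdlOperations($wsdl));
```

Each entry in the result is one “dish on the menu”: the operation name, the message a client sends and the message it gets back.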

I will create two classes to hold the data, one for books and the other for authors. Please notice that you have to name the members exactly as you did in the WSDL file.

The BookData class

<?php
class BookData {
     public $ID; 
     public $NAME; 
     public $AUTHOR; 
}
?>

The AuthorData class

<?php
class AuthorData {
 public $ID; 
 public $NAME;
 public $BOOKS; 
}
?>

Now I am going to create the wrapper classes.

Book class

<?php
class Book {
 private $_nID; 
 private $_strName; 

 public function __construct($ID,$Name){
  $this->SetID($ID); 
  $this->SetName($Name); 
 }
 public function SetID($ID){
  $this->_nID=mysql_real_escape_string($ID); 
 }

 public function GetID(){
  return $this->_nID; 
 }

 public function SetName($Name){
  $this->_strName=mysql_real_escape_string($Name); 
 }

 public function GetName () {
  return $this->_strName; 
 }

 public function CreateBook (){
  //Code to create Book
 }

 public function UpdateBook () {
  //code to update book 
 }

 public function DeleteBook () {
  //code to delete book 
 }
 public function SearchBooks () {
  $SQL = "SELECT * FROM books 
          INNER JOIN authors on books.author_id= authors.author_id
          WHERE books.book_name like '%{$this->_strName}%'";
  $Query = mysql_query($SQL) or die (mysql_error()); 
  $NumOfBooks = mysql_num_rows($Query); 
  $Result= array () ; 
  for ($i=0;$i<$NumOfBooks;$i++){
    $Author = new AuthorData();
    $Author->ID=mysql_result($Query, $i,"author_id");
    $Author->NAME=mysql_result($Query, $i,"author_name");
    //we will set this when we search by author name
    $Author->BOOKS=array(); 
    $Book = new BookData(); 
    $Book->ID = mysql_result($Query, $i,"book_id"); 
    $Book->NAME=mysql_result($Query, $i,"book_name");
    $Book->AUTHOR=$Author;

    $Result[]= $Book; 
  }
  return $Result; 
 }
}
?>

Author Class

<?php
class Author {

 private $_nID; 
 private $_strName; 

 public function __construct($ID,$Name){
  $this->SetID($ID); 
  $this->SetName($Name); 
 }

 public function SetID($ID){
  $this->_nID=mysql_real_escape_string($ID); 
 }

 public function GetID(){
  return $this->_nID; 
 }

 public function SetName($Name){
  $this->_strName=mysql_real_escape_string($Name); 
 }

 public function GetName () {
  return $this->_strName; 
 }

 public function CreateAuthor (){
  //Code to create Author
 }

 public function UpdateAuthor() {
  //code to update Author
 }

 public function DeleteAuthor() {
  //code to delete Author
 }

 public function SearchAuthors() {
  $SQL = "SELECT * FROM authors WHERE authors.author_name like '%{$this->_strName}%'"; 
  $Query = mysql_query($SQL) or die (mysql_error()); 
  $NumOfAuthors= mysql_num_rows($Query); 

  $Result= array () ; 

  for ($i=0;$i<$NumOfAuthors;$i++){
    $Author = new AuthorData();
    $Author->ID=mysql_result($Query, $i,"author_id");
    $Author->NAME=mysql_result($Query, $i,"author_name");
    $Author->BOOKS=$this->GetBooksByAuthorID($Author->ID);  
    $Result[]= $Author; 
  }
  return $Result; 
 }

 public function GetBooksByAuthorID($ID){

  $SQL = "SELECT * FROM books WHERE author_id = $ID"; 

  $Query = mysql_query($SQL);  
  $NumOfBooks = mysql_num_rows($Query); 

  $Result = array () ; 

  for($i=0;$i<$NumOfBooks;$i++){
    $Book = new BookData();
    $Book->ID=mysql_result($Query, $i,"books.book_id");
    $Book->NAME=mysql_result($Query, $i,"books.book_name");

   $Result[]= $Book ; 

  }
  return $Result; 
 }
}
?>

Finally we will create the PHP file that receives requests from SOAP clients and returns the results. This is the file named in the soapAction attribute and the soap:address location in the WSDL file.

library.php

<?php

mysql_connect("localhost","root",""); 
mysql_select_db("library");

function __autoload($ClassName) {
 require_once $ClassName.".php";
}

function searchAuthors ($NAME) {
 $Author = new Author(null, $NAME); 
 return $Author->SearchAuthors();
}

function searchBooks($NAME){
 $Book = new Book(null, $NAME); 
 return $Book->searchBooks();
}

ini_set("soap.wsdl_cache_enabled", "0"); 

$classmap = array(
 "Book"=>"BookData",
 "Author"=>"AuthorData"
);
$server = new SoapServer("Library.wsdl",array("classmap"=>$classmap));
$server->addFunction("searchAuthors"); 
$server->addFunction("searchBooks"); 
$server->handle();
?>

Please notice that you have to create a map between your complex types and your classes (the ones that hold the data).

For example, here I mapped Book (the complex type in the WSDL file) to BookData (the PHP class).

Now everything is ready. Let’s call the web service:

index.php

<?php
$client = new SoapClient("http://localhost/Blog/Library/Library.wsdl",array("trace" => 1)); 
try
{
 $response = $client->searchAuthors("Beh");
 //$response = $client->searchBooks("comp");

 var_dump($response);
 echo "Request".htmlspecialchars($client->__getLastRequest())."";
 echo "Response".htmlspecialchars($client->__getLastResponse())."";

}catch(Exception $e){
 var_dump($e);
 echo $client->__getLastRequest();
 echo $client->__getLastResponse();
}
?>
 

Posted by on June 8, 2011 in Mixed, PHP

 

Sorting 2D-arrays in PHP – anecdotes and reflections

One of the first things you might run into as a PHP developer is having to sort a two dimensional array (table) by an arbitrary field. Consider the following array for instance:

$customers = array(
    array("name" => "David", "age" => 32),
    array("name" => "Bernard", "age" => 45)
);

What if you want to sort it so that Bernard ends up on top instead? I ended up writing this function:

static function sort2d_asc(&$arr, $key){
    //we loop through the array and fetch a value that we store in tmp
    for($i = 0; $i < count($arr); $i++){
        $tmp = $arr[$i];
        $pos = false;
        //we check if the key we want to sort by is a string
        $str = is_numeric($tmp[$key]);
        if(!$str){
            //we loop the array again to compare against the temp value we have
            for($j = $i; $j < count($arr); $j++){
                if(StringManip::is_date($tmp[$key])){
                    if(StringManip::compareDates($arr[$j][$key], $tmp[$key], $type = 'asc')){
                        $tmp = $arr[$j];
                        $pos = $j;
                    }
                //we string compare, if the string is "smaller" it will be assigned to the temp value
                }else if(strcasecmp($arr[$j][$key], $tmp[$key]) < 0){
                    $tmp = $arr[$j];
                    $pos = $j;
                }
            }
        }else{
            for($j = $i; $j < count($arr); $j++){
                if($arr[$j][$key] < $tmp[$key]){
                    $tmp = $arr[$j];
                    $pos = $j;
                }
            }
        }
        if($pos !== false){
            $arr[$pos] = $arr[$i];
            $arr[$i] = $tmp;
        }
    }
}

You pass in the array you want to sort as &$arr and the key you want to sort by as $key, and the array gets sorted in ascending order. This works OK as long as you have a two dimensional array like the one above. But it won’t work if your array looks like this:

$customers = array(
    "cus1" => array("name" => "David", "age" => 32),
    "cus2" => array("name" => "Bernard", "age" => 45)
);

This array does not have numerical keys, so the results of the above mega-function will not be reliable. Usually this is not a problem, since 2D arrays retrieved from the database will be indexed numerically. The above function actually worked fine until I wanted to sort an array like the one just above. It wouldn’t work, of course, so I reviewed the manual and found the array_multisort function. I realized that it can be used to sort arrays just like I wanted. This is the result:

static function multi2dSortAsc(&$arr, $key){
  $sort_col = array();
  foreach ($arr as $sub) $sort_col[] = $sub[$key];
  array_multisort($sort_col, $arr);
}

The crux is that we have to create the 1D array we want to sort by on the fly. Once we have this array, it can be used to sort the parent 2D array, which is passed as the second argument. This function will work on both of the customer arrays.
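For comparison, a different technique from the one I used is PHP’s uasort() with a comparison callback, which also keeps string keys such as “cus1” attached to their rows. This is only a sketch, and `sortByColumn()` is a name I have made up here:

```php
<?php
/**
 * Sort a 2D array by one column using uasort(), which keeps the outer
 * keys ("cus1", "cus2") attached to their rows.
 */
function sortByColumn(array &$rows, $key)
{
    uasort($rows, function ($a, $b) use ($key) {
        if ($a[$key] == $b[$key]) {
            return 0;
        }
        return ($a[$key] < $b[$key]) ? -1 : 1;
    });
}

$customers = array(
    "cus1" => array("name" => "David", "age" => 32),
    "cus2" => array("name" => "Bernard", "age" => 45),
);

sortByColumn($customers, "name");
print_r(array_keys($customers)); // cus2 (Bernard) now comes before cus1 (David)
```

The callback is also a convenient place to put custom comparisons, for example dates, without a separate branch for each data type.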

 

Posted by on June 8, 2011 in Mixed, PHP

 

How to Debug in PHP

This article breaks down the fundamentals of debugging in PHP, helps you understand PHP’s error messages and introduces you to some useful tools to help make the process a little less painful.

Doing your Ground Work

It is important that you configure PHP correctly and write your code in such a way that it produces meaningful errors at the right time. For example, it is generally good practice to turn on a verbose level of error reporting on your development platform. This probably isn’t such a great idea, however, on your production server(s). In a live environment you want neither to confuse a genuine user nor to give malicious users too much information about the inner workings of your site.

So, with that in mind, let’s talk about the all too common “I’m getting no error message” issue. This is normally caused by a syntax error on a platform where the developer has not done their ground work properly. First, you should turn display_errors on. This can be done either in your php.ini file or at the head of your code like this:

<?php
ini_set('display_errors', 'On');

Next, you will need to set an error reporting level. By default PHP 4 and 5 do not show PHP notices, which can be important in debugging your code (more on that shortly). Notices are generated by PHP whether they are displayed or not, so deploying code that generates twenty notices adds overhead to your site. To ensure notices are displayed, set your error reporting level either in your php.ini or in your runtime code like this:

<?php
ini_set('display_errors', 'On');
error_reporting(E_ALL);

Tip: E_ALL is a constant so don’t make the mistake of enclosing it in quotation marks.

With PHP 5 it’s also a good idea to turn on the E_STRICT level of error reporting. E_STRICT is useful for ensuring you’re coding to the best possible standards; for example, it warns you when you’re using a deprecated function. (From PHP 5.4 onwards E_STRICT is already included in E_ALL.) Here’s how to enable it at runtime:

<?php
ini_set('display_errors', 'On');
error_reporting(E_ALL | E_STRICT);

It is also worth mentioning that on your development platform it is often a good idea to make these changes in your php.ini file rather than at runtime. This is because if you experience a syntax error with these options set in your code and not in the php.ini you may, depending on your setup, be presented with a blank page. Likewise, it is worth noting that if you’re setting these values in your code, a conditional statement might be a good idea to avoid these settings accidentally being deployed to a live environment.
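One way such a conditional might look is sketched below. The APP_ENV environment variable and the configureErrorDisplay() helper are assumptions made for illustration; any reliable way of telling development from live would do.

```php
<?php
// APP_ENV is a hypothetical environment variable; use whatever your setup provides.
function configureErrorDisplay($sEnvironment) {
    if ($sEnvironment === 'development') {
        ini_set('display_errors', 'On');
        error_reporting(E_ALL);
        return true;
    }
    // Anything else is treated as live: hide errors from visitors, keep them in the log.
    ini_set('display_errors', 'Off');
    ini_set('log_errors', 'On');
    return false;
}

$sEnvironment = getenv('APP_ENV');
configureErrorDisplay($sEnvironment === false ? 'production' : $sEnvironment);
```

Treating an unset variable as production means a forgotten setting fails safe rather than exposing errors.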

What Type of Error am I Looking at?

As with most languages, PHP’s errors may appear somewhat esoteric, but there are in fact only four key types of error that you need to remember:

Syntax Errors

Syntactical errors, or parse errors, are generally caused by a typo in your code: for example, a missing semicolon, quotation mark, brace or parenthesis. When you encounter a syntax error you will receive an error similar to this:

Parse error: syntax error, unexpected T_ECHO in Document/Root/example.php on line 6

In this instance it is important that you check the line above the line quoted in the error (in this case line 5) because while PHP has encountered something unexpected on line 6, it is common that it is a typo on the line above causing the error. Here’s an example:

<?php
ini_set('display_errors', 'On');
error_reporting(E_ALL);

$sSiteName = "Think Vitamin"
echo $sSiteName;

In this example I have omitted the semi-colon from line 5, however, PHP has reported an error occurred on line 6. Looking one line above you can spot and rectify the problem.

Warnings

Warnings aren’t deal breakers like syntax errors. PHP can cope with a warning, but it knows that you probably made a mistake somewhere and is notifying you about it. Warnings often appear for the following reasons:

1. Headers already sent. Try checking for white space at the head of your code or in files you’re including.
2. You’re passing an incorrect number of parameters to a function.
3. Incorrect path names when including files.

Notices

Notices aren’t going to halt the execution of your code either, but they can be very important in tracking down a pesky bug. Often you’ll find that code that’s working perfectly happily in a production environment starts throwing out notices when you set error_reporting to E_ALL.

A common notice you’ll encounter during development is:

Notice: Undefined index: FirstName in /Document/Root/views/userdetails.phtml on line 55

This information can be extremely useful in debugging your application. Say you’ve done a simple database query and pulled a row of user data from a table. For presentation in your view you’ve assigned the details to an array called $aUserDetails. However, when you echo $aUserDetails['FirstName'] on line 55 there’s no output and PHP throws the notice above. In this instance the notice you receive can really help.

PHP has helpfully told us that the FirstName key is undefined so we know that this isn’t a case of the database record being NULL. However, perhaps we should check our SQL statement to ensure we’ve actually retrieved the user’s first name from the database. In this case the notice has helped us rule out a potential issue which has in turn steered us towards the likely source of our problem. Without the notice our likely first stop would have been the database record, followed by tracing back through our logic to eventually find our omission in the SQL.
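The distinction between an undefined key and a NULL value can also be checked in code. The following is a minimal sketch; the row contents and the describeField() helper are hypothetical and stand in for a real query result.

```php
<?php
// Hypothetical row from a query that selected LastName and MiddleName but not FirstName.
$aUserDetails = array('LastName' => 'Smith', 'MiddleName' => null);

function describeField(array $aRow, $sKey) {
    if (!array_key_exists($sKey, $aRow)) {
        return 'not selected';      // the key is undefined: check the SQL
    }
    if ($aRow[$sKey] === null) {
        return 'NULL in database';  // the key exists but holds no value
    }
    return 'present';
}

echo describeField($aUserDetails, 'FirstName') . "\n";
echo describeField($aUserDetails, 'MiddleName') . "\n";
echo describeField($aUserDetails, 'LastName') . "\n";
```

array_key_exists() is used rather than isset() because isset() returns false for a key whose value is NULL, which would blur exactly the two cases we want to separate.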

Fatal Errors

Fatal Errors sound the most painful of the four but are in fact often the easiest to resolve. What it means, in short, is that PHP understands what you’ve asked it to do but can’t carry out the request. Your syntax is correct, you’re speaking its language but PHP doesn’t have what it needs to comply. The most common fatal error is an undefined class or function and the error generated normally points straight to the root of the problem:

Fatal error: Call to undefined function create() in /Document/Root/example.php on line 23

Using var_dump() to Aid Your Debugging

var_dump() is a native PHP function which displays structured, human-readable information about one (or more) expressions. This is particularly useful when dealing with arrays and objects, as var_dump() displays their structure recursively, giving you the best possible picture of what’s going on. Here’s an example of how to use var_dump() in context:

Below I have created an array of scores achieved by users but one value in my array is subtly distinct from the others, var_dump() can help us discover that distinction.

<?php
ini_set('display_errors', 'On');
error_reporting(E_ALL);
$aUserScores = array('Ben' => 7,'Linda' => 4,'Tony' => 5,'Alice' => '9');
var_dump($aUserScores);

Tip: Wrap var_dump() in <pre> tags to aid readability.

The output from var_dump() will look like this:

array(4) {
  ["Ben"]=>
  int(7)
  ["Linda"]=>
  int(4)
  ["Tony"]=>
  int(5)
  ["Alice"]=>
  string(1) "9"
}

As you can see var_dump tells us that $aUserScores is an array with four key/value pairs. Ben, Linda, and Tony all have their values (or scores) stored as integers. However, Alice is showing up as a string of one character in length.

If we return to my code, we can see that I have mistakenly wrapped Alice’s score of 9 in quotation marks causing PHP to interpret it as a string. Now, this mistake won’t have a massively adverse effect, however, it does demonstrate the power of var_dump() in helping us get better visibility of our arrays and objects.
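To see where a stray string like this can matter, and how it might be caught without reading the dump by eye, here is a small sketch. The findNonIntegerScores() helper is hypothetical and not part of the original example.

```php
<?php
$aUserScores = array('Ben' => 7, 'Linda' => 4, 'Tony' => 5, 'Alice' => '9');

// Hypothetical helper: list the keys whose values are not integers.
function findNonIntegerScores(array $aScores) {
    $aSuspects = array();
    foreach ($aScores as $sName => $mScore) {
        if (!is_int($mScore)) {
            $aSuspects[] = $sName;
        }
    }
    return $aSuspects;
}

// A strict comparison notices the type difference; a loose one does not.
var_dump($aUserScores['Alice'] === 9);
var_dump($aUserScores['Alice'] == 9);
print_r(findNonIntegerScores($aUserScores));
```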

While this is a very basic example of how var_dump() functions it can similarly be used to inspect large multi-dimensional arrays or objects. It is particularly useful in discovering if you have the correct data returned from a database query or when exploring a JSON response from say, Twitter:

<?php
ini_set('display_errors', 'On');
error_reporting(E_ALL);
$sJsonUrl = 'http://search.twitter.com/trends.json';
$sJson = file_get_contents($sJsonUrl); // the optional arguments are not needed here
$oTrends = json_decode($sJson);
var_dump($oTrends);

Useful Tools to Consider when Debugging

Finally, I want to point out a couple of useful tools that I’ve used to help me in the debugging process. I won’t go into detail about installing and configuring these extensions and add-ons, but I wanted to mention them because they can really make our lives easier.

Xdebug

Xdebug is a PHP extension that aims to lend a helping hand in the process of debugging your applications. Xdebug offers features like:

  • Automatic stack trace upon error
  • Function call logging
  • Display features such as enhanced var_dump() output and code coverage information.

Xdebug is highly configurable, and adaptable to a variety of situations. For example, stack traces (which are extremely useful for monitoring what your application is doing and when) can be configured to four different levels of detail. This means that you can adjust the sensitivity of Xdebug’s output helping you to get granular information about your app’s activity.

Stack traces show you where errors occur, allow you to trace function calls and detail the originating line numbers of these events. All of which is fantastic information for debugging your code.

Tip: By default Xdebug limits var_dump() output to three levels of recursion. You may want to change this in your xdebug.ini file by setting xdebug.var_display_max_depth to a number that makes sense for your needs.

Check out Xdebug’s installation guide to get started.

FirePHP

For all you Firebug fans out there, FirePHP is a really useful little PHP library and Firefox add-on that can really help with AJAX development.

Essentially FirePHP enables you to log debug information to the Firebug console using a simple method call like so:

<?php
$sSql = 'SELECT * FROM tbl';
FB::log('SQL query: ' . $sSql);

In an instance where I’m making an AJAX search request, for example, it might be useful to pass back the SQL string my code is constructing so that I can ensure my code is behaving correctly. All data logged to the Firebug console is sent via response headers and therefore doesn’t affect the page being rendered by the browser.

Warning: As with all debug information, this kind of data shouldn’t be for public consumption. The downside of having to add the FirePHP method calls into your PHP is that before you go live you will either have to strip all these calls out or set up an environment based conditional statement which establishes whether or not to include the debug code.
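An environment-based conditional of that kind could be as simple as the wrapper sketched here. The debugLog() function and its $bDebugEnabled flag are assumptions for illustration; only FB::log() comes from FirePHP itself.

```php
<?php
// Hypothetical wrapper: only forwards to FirePHP when debugging is switched on
// and the FB class from the FirePHP library has actually been loaded.
function debugLog($sMessage, $bDebugEnabled) {
    if (!$bDebugEnabled || !class_exists('FB', false)) {
        return false; // nothing is sent to the browser
    }
    FB::log($sMessage);
    return true;
}

$sSql = 'SELECT * FROM tbl';
debugLog('SQL query: ' . $sSql, false); // live site: silently skipped
```

With the calls routed through one function, going live means flipping a single flag rather than stripping calls out of the code base.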

You can install the Firefox add-on at FirePHP’s website and also grab the PHP libs there too. Oh, and don’t forget if you haven’t already installed Firebug, you’ll need that too.

 

Posted by on June 8, 2011 in Mixed, PHP

 

Array Pagination in PHP

Pagination in PHP is a topic covered by a lot of tutorials and is therefore quite saturated. Although I’m not going to introduce any wild new concepts into this tutorial I will explain how you can use pagination for data held within an array.

Normally you’d only need to paginate your data if you’ve got quite a lot of it, in which case you’re most likely using some sort of database. With database systems, of course, pagination can be achieved relatively easily by specifying the offset parameters in an SQL query. But what if your data didn’t come from a database table and instead came from a flat file?

Take a look at the following code which shows exactly how it’s done.

  1. // Data, normally from a flat file or some other source
  2. $data = "Item 1|Item 2|Item 3|Item 4|Item 5|Item 6|Item 7|Item 8|Item 9|Item 10";
  3. // Put our data into an array
  4. $dataArray = explode('|', $data);
  5. // Get the current page (0 if none was provided)
  6. $currentPage = isset($_REQUEST['page']) ? (int) trim($_REQUEST['page']) : 0;
  7. // Pagination settings
  8. $perPage = 3;
  9. $numPages = ceil(count($dataArray) / $perPage);
  10. if($currentPage < 0 || $currentPage >= $numPages)
  11.     $currentPage = 0;
  12. $start = $currentPage * $perPage;
  13. $end = ($currentPage * $perPage) + $perPage;
  14. $pagedData = array(); // Extract the ones we need into this array
  15. foreach($dataArray AS $key => $val)
  16. {
  17.     if($key >= $start && $key < $end)
  18.         $pagedData[] = $dataArray[$key];
  19. }

Now for the explanation. To begin with I have created the $data variable, which contains a long line of items split up by the pipe ‘|’ character. Realistically this would be real data, however just for this tutorial I’ll keep it simple. Then, using the explode() function, I’ve cut up the $data variable into an array using ‘|’ as the delimiter.

Line 6 simply gets the current page number if one is provided.

Lines 8 to 13 are all to do with the simple math calculations which make this array pagination work. Firstly, on line 8, we set how many items we’d like to display per page in the variable $perPage. In the example above I’ve set this to 3.

On line 9 we’re working out how many pages there are going to be. This can be done by dividing the total number of items (by using the count() function) by the items per page value. Notice on this line that I’m also using the ceil() function. This basically rounds the number up (e.g. 5.134 becomes 6).

We then have a simple if statement on lines 10 and 11. It’s basically saying that if no valid page number has been provided, or if the provided page number is beyond the last page, set it to 0. This stops people from trying to access pages which have no items.

On lines 12 and 13 we’re setting the $start and $end variables, which you might recognize if you’ve done pagination using SQL queries before. The $start variable holds the lowest ID of the item which can be displayed on this page. The $end variable is the upper cap on the item IDs to be displayed (actually, it’s one above the last displayed ID, but this depends on how you write your if statement on line 17).

Great, we’re nearly there. Now, on line 15 we start a foreach statement which loops through each of our data items. Inside this loop is a simple if statement to see if the ID of the current data item is above (or equal to) the $start value and below the $end value. If it is then we place a copy of it into our $pagedData array.
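For comparison, the same extraction can be sketched with PHP’s built-in array_slice() function, which takes an offset and a length. This swaps a different technique in for the foreach and if pair described above, and the getPage() wrapper is a hypothetical name used only for this illustration.

```php
<?php
// Hypothetical helper; array_slice() replaces the foreach/if pair.
function getPage(array $dataArray, $currentPage, $perPage) {
    $numPages = (int) ceil(count($dataArray) / $perPage);
    if ($currentPage < 0 || $currentPage >= $numPages) {
        $currentPage = 0; // fall back to the first page
    }
    return array_slice($dataArray, $currentPage * $perPage, $perPage);
}

$dataArray = explode('|', 'Item 1|Item 2|Item 3|Item 4|Item 5|Item 6|Item 7|Item 8|Item 9|Item 10');
print_r(getPage($dataArray, 1, 3)); // the second page, since pages count from 0
```

Because array_slice() stops at the end of the array, the last page simply comes back shorter than the others.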

Once the foreach loop has finished the $pagedData array now contains all of the data items which we should be displaying on the current page. All we have to do now is to loop through and display them. This has been shown in the following code snippet.

foreach($pagedData AS $item)
    echo $item . "<br>";

As far as displaying the data goes that’s it. All we need to do now is to display the pagination links to let you navigate your way through the pages. Here’s the code for that.

if($currentPage > 0 && $currentPage < $numPages)
    echo '<a href="?page=' . ($currentPage - 1) . '">« Previous page</a><br>';
if($numPages > $currentPage && ($currentPage + 1) < $numPages)
    echo '<a href="?page=' . ($currentPage + 1) . '" class="right">Next page »</a><br>';

The above snippet consists of two quite simple if statements. The first is for displaying the “previous page” link and the second is for displaying the “next page” link.

Starting with the first, the if statement checks whether the current page is more than 0 (you wouldn’t have a previous link on the very first page) and whether the current page is less than the total number of pages (to avoid displaying it on pages with no data).

The second if statement checks whether the total number of pages is more than the current page number (so you’re not viewing the last page) and whether there are any more pages after the current page.

 

Posted by on June 8, 2011 in Mixed, PHP

 

9 PHP Debugging Techniques You Should Be Using

Enable Notices

The wonderful developers of PHP (no, not PHP developers) bestowed a great gift when they created the language we all know and love, and that was the gift of notices.

What are notices? A notice is a type of error message which is less severe than parse, fatal and warning error messages. A notice may tell you something along the lines of “Hey, you have used that variable without defining it! What gives?”

Notices are a very useful tool for avoiding bugs during development, as they can catch problems caused by mistyped variable names or non-existent array indexes. This will save you a lot of headaches.

If you have been developing with notices turned off then please go and turn them on. Please! You can do this in one of three ways. Firstly, you can do this in your php.ini file:

error_reporting = E_ALL

Or you can insert the following line at the start of your PHP code:

ini_set('error_reporting', E_ALL);

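Lastly, you can set this per virtual host in Apache HTTPD using the following. PHP constants such as E_ALL are not available in Apache configuration, so the integer value is used; 8191 corresponds to E_ALL | E_STRICT on PHP 5.2, and the value of E_ALL differs in other versions: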

php_value error_reporting 8191

If you want to test that notices are turned on you can simply create a file with the following:

<?php
echo "You should see a notice below this line\n";
echo $thisVariableDoesNotExist;
?>

At this point I should stress that you should always ensure that the display_errors ini setting is set to ‘Off’ for any live/production sites. It is a very bad idea to allow the public to view your error messages because a) it looks bad, and b) it can give away a lot of information to potential crackers.

Now that you have turned notices on you may find that your code creates a lot of them. I would start correcting these at the first chance you get, as you will never spot new notices amid such clutter. You may find empty() and isset() useful.
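As a small sketch of how those two functions help, the following reads optional array keys without triggering an “Undefined index” notice. The getOption() helper and the option names are hypothetical.

```php
<?php
// Hypothetical helper: read an optional key without triggering a notice.
function getOption(array $aOptions, $sKey, $mDefault) {
    return isset($aOptions[$sKey]) ? $aOptions[$sKey] : $mDefault;
}

$aOptions = array('page' => '0', 'sort' => '');

echo getOption($aOptions, 'page', '1') . "\n";   // key exists, value kept
echo getOption($aOptions, 'limit', '10') . "\n"; // key missing, default used

// empty() also treats '0' and '' as empty, which isset() does not.
var_dump(empty($aOptions['page']), isset($aOptions['page']));
```

The difference in the last line is worth remembering: empty() is the wrong test when '0' is a legitimate value.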


Use a Logging System

A logging system can be very useful in tracking down bugs, especially when they happen in a production environment. Such a system can also be useful in debugging during development but I find it much easier to use an IDE to debug my development environment (more on this shortly).

I feel that it is important to log any significant actions performed (e.g. user created, group deleted, registration email sent) as well as any errors that occur. Some people advocate logging a message every time you enter or leave a function. I find this method to be a bit too verbose, especially with more complex applications. I prefer to ditch the reams of log messages for a good IDE and debugger.

When you actually come to log your messages I would recommend having a fallback mechanism (you never know, the logger itself could have just broken!) For example, you could try logging to a database, and failing that you can append to a log file, and failing that you can send an email. You may find exceptions useful for this.
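A fallback chain like this might be sketched as follows. Everything named here (logWithFallback(), the logger names, the log file location) is hypothetical, and the database logger is a stand-in that always fails so the fallback can be seen in action.

```php
<?php
// Try each logger in turn; a logger signals failure by throwing an exception.
// Returns the name of the logger that succeeded, or false if all of them failed.
function logWithFallback($sMessage, array $aLoggers) {
    foreach ($aLoggers as $sName => $fnLogger) {
        try {
            call_user_func($fnLogger, $sMessage);
            return $sName;
        } catch (Exception $oException) {
            // This logger is broken; move on to the next one.
        }
    }
    return false;
}

$sLogFile = sys_get_temp_dir() . '/app-fallback.log'; // hypothetical log location

$aLoggers = array(
    // Stand-in for a database logger that is currently unavailable.
    'database' => function ($sMessage) {
        throw new RuntimeException('Database is down');
    },
    'file' => function ($sMessage) use ($sLogFile) {
        if (file_put_contents($sLogFile, $sMessage . PHP_EOL, FILE_APPEND) === false) {
            throw new RuntimeException('Could not write log file');
        }
    },
);

echo logWithFallback('User created', $aLoggers) . "\n";
```

An email logger could be appended to the array as a last resort; the order of the array is the order of preference.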


Log Errors

We have to accept that, despite our best efforts, errors can (and do) occur in production environments. When these hiccups do arise we have to ensure that they are dealt with quickly, otherwise users (or even, gasp, clients) get angry.

Some types of errors we cannot do anything about (think parse errors) and we just have to ensure that we have a close eye on our error logs so we notice when they occur. Fortunately, it is these types of bugs that are normally caught very quickly during development and testing.

As for all the other errors, we need to make sure that they are caught and dealt with properly. Make sure the user is shown a nice error page (with a suitably cute ‘oops-back-soon’ picture) and then log, log everything in sight! I recommend storing the following:

  • The current stack trace (see debug_backtrace() and debug_print_backtrace()).
  • The output of get_defined_vars(). However, this is only useful if you call it at the point the error occurs, not at the point the error is logged. This includes global variables.
  • Any and all information about the remote user (IP address, user agent, session data)
  • All global variable data (which includes the contents of $_COOKIE, $_SERVER etc.)
  • Any other status data which is specific to your application

This information will be invaluable when you come to tracking down errors in production code.

How you choose to actually capture your errors is your own choice. You can use trigger_error and a custom error handler, but I prefer to use a custom exception class which takes care of logging for me.
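To give an idea of the exception approach, here is a minimal sketch. The LoggedException class and the deleteGroup() function are hypothetical, and the context gathered is only a subset of the list above.

```php
<?php
// Hypothetical exception class that records context the moment it is created.
class LoggedException extends Exception {
    private $aContext;

    public function __construct($sMessage, array $aVars = array()) {
        parent::__construct($sMessage);
        $this->aContext = array(
            'message'    => $sMessage,
            'trace'      => $this->getTraceAsString(),
            'vars'       => $aVars, // pass get_defined_vars() at the throw site
            'remote_ip'  => isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : null,
            'user_agent' => isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : null,
        );
        // error_log() writes to whatever destination PHP's error_log setting names.
        error_log(print_r($this->aContext, true));
    }

    public function getContext() {
        return $this->aContext;
    }
}

function deleteGroup($iGroupId) {
    if ($iGroupId <= 0) {
        // get_defined_vars() is called here, at the point the error occurs.
        throw new LoggedException('Invalid group id', get_defined_vars());
    }
    return true;
}
```

Calling get_defined_vars() at the throw site, rather than inside the exception class, is what keeps the captured variables relevant.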


Check Function Parameters

Checking function parameters can help you catch a lot of bugs before the erroneous data passes too far through your application. I like to test that the input parameters are of the expected type and are reasonably sane.

To avoid too much clutter I generally only do this on utility methods (as these are used often and in many places) and methods which permanently manipulate data such as files and databases (as errors here have the potential to destroy a lot of data).
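A parameter check on a utility function might look something like this sketch. The truncateText() function is a hypothetical example, not something from a real library.

```php
<?php
// Hypothetical utility with its inputs checked before any work is done.
function truncateText($sText, $iMaxLength) {
    if (!is_string($sText)) {
        throw new InvalidArgumentException('$sText must be a string');
    }
    if (!is_int($iMaxLength) || $iMaxLength < 1) {
        throw new InvalidArgumentException('$iMaxLength must be a positive integer');
    }
    return substr($sText, 0, $iMaxLength);
}

echo truncateText('Hello world', 5) . "\n";

try {
    truncateText('Hello world', '5'); // a string where an integer is expected
} catch (InvalidArgumentException $oException) {
    echo 'Rejected: ' . $oException->getMessage() . "\n";
}
```

Failing loudly at the function boundary means the bad value is reported where it entered, not several calls later.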


Use an Integrated Development Environment and Debugger

I created my first ever website on a Geocities account. To do this I used a textarea field in the Geocities admin area. It was OK for the simple HTML I was writing, but I would not like to create an entire PHP application in this way!

I have now moved on to using an Integrated Development Environment (IDE) for the majority of my development, and I highly recommend that you do the same. I use Zend Studio and you only have to look at the feature list to see why I find it completely indispensable, especially for debugging. If you have not used an IDE before I recommend you have a look at one of the applications listed at the end of this section.

I also use a remote debugger (ZendDebugger), which ties into the IDE. The remote debugger is a PHP module that allows you to debug code on your server using the IDE on your local machine. You can set breakpoints, inspect variables, examine stack traces, profile code and all the other benefits you would expect from a debugger. And no, Zend does not sponsor me.


Unit Testing

Unit testing may not be everyone’s idea of fun, but it can be very effective for developing larger projects. It can give you confidence when you have to make significant changes to the code base, as well as point out problems before your code goes into production.

There are two catches with unit tests, the biggest of which is that you have to actually write the unit tests themselves. Although this should save time in the long run (or at least lead to a more robust product), it is hard to avoid thinking that you could be spending the time developing functionality.

The second catch is that it often forces you to refactor your code into more test-friendly chunks. This is probably a good thing but it will take more time. The best approach would be to write unit tests from the very start of the project or, for an existing project, you can write a unit test for every bug that is fixed.
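To illustrate the “one test per fixed bug” idea without assuming any particular testing framework, here is a bare-bones sketch. The countPages() function and the imagined bug (rounding down instead of up) are hypothetical; in practice a framework would supply the assertion and reporting machinery.

```php
<?php
// Hypothetical function; imagine an earlier version rounded down and lost the last page.
function countPages($iItemCount, $iPerPage) {
    return (int) ceil($iItemCount / $iPerPage);
}

// Regression test written alongside the fix; returns a list of failure messages.
function testCountPages() {
    $aCases = array(
        array(10, 3, 4), // the input that exposed the imagined bug
        array(9, 3, 3),
        array(0, 3, 0),
    );
    $aFailures = array();
    foreach ($aCases as $aCase) {
        list($iItems, $iPerPage, $iExpected) = $aCase;
        $iActual = countPages($iItems, $iPerPage);
        if ($iActual !== $iExpected) {
            $aFailures[] = "countPages($iItems, $iPerPage) returned $iActual, expected $iExpected";
        }
    }
    return $aFailures;
}

$aFailures = testCountPages();
echo $aFailures ? implode("\n", $aFailures) . "\n" : "All tests passed\n";
```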

If you are using unit tests you should also be aware of the concept of code coverage. This is a metric which shows what percentage of your code is run during the testing process. A higher value generally indicates a more thorough set of unit tests, although it does not guarantee that the tests check the right things. You can calculate your code coverage using a debugger, as was discussed in the previous section.


No Magic! (Or, Avoid Side Effects)

A side effect can be described as a non-obvious effect that was caused by performing an action (see Wikipedia for a more technical description). For example:
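The listing below is a sketch of the kind of code being discussed, written under assumptions. The names $radius, $area and getSquareArea come from the surrounding text, while the use of a global variable and the doubling of the radius are guesses at how the side effect arises. A side-effect-free alternative is included at the end for contrast.

```php
<?php
$radius = 2;
$area = pi() * $radius * $radius;
echo "Circle area: $area\n";

// Returns the area of the square that encloses the circle...
function getSquareArea() {
    global $radius;
    $radius = $radius * 2; // ...but also overwrites the caller's $radius (the side effect)
    return $radius * $radius;
}

echo "Square area: " . getSquareArea() . "\n";
echo "Radius is now: $radius\n"; // no longer 2

// Side-effect-free alternative: take the value in, hand the result back.
function squareAreaFromRadius($radius) {
    $side = $radius * 2;
    return $side * $side;
}
```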

So what is going on here? You can see that we start with a $radius and $area variable which we use to show the area of the circle. We then want to display the area of a square with the same dimensions, so we call getSquareArea. Although this function does what its name implies, it also alters the $radius variable (intentionally or not). This is definitely non-obvious in the rest of our code and can cause severe headaches when it comes to debugging, especially in more complex applications. Of course, you should also avoid global variables for similar reasons.

This also applies to modifying function parameters that were passed by reference. If you find yourself doing this then you should probably refactor your code: either return the new value rather than modifying the parameter, or split the function into several smaller functions. Also, don’t forget that in PHP 5 objects are passed by handle, so a function that changes an object’s properties changes them for the caller too.

Use Manual Redirects When Debugging

Many developers (including myself) will make use of redirects when developing web applications. To refresh your memory, a redirect in PHP is done by sending a Location header with header() and then ending the script with exit.

This technique can be very useful for sending the user to the correct page, but it can also be very problematic for debugging. For example, do you keep getting sent off to a bizarre area of your application? Do you know if it is just one redirect sending you there or many? What do you do when you get trapped in an infinite redirect loop?

My answer to these problems was to introduce the concept of manual redirects which would only be used for debugging. Rather than sending a header to the client, I would send a link to the target page as well as a stack trace. This would allow me to monitor the redirects that were happening in my application and clearly see what was happening if the application went wrong.

The code I use looks something like this:

<?php
function redirect($url, $debug = false) {
    // If manual page redirects have been enabled then print some HTML
    if ($debug) {
        $safeUrl = htmlspecialchars($url, ENT_QUOTES);
        echo "Redirect: You are being redirected to: <a href=\"$safeUrl\">$safeUrl</a>\n";
        echo "<pre>Backtrace:\n";
        debug_print_backtrace();
        echo "</pre>";
    } else {
        header("Location: $url");
    }

    exit(0);
}
?>

You may find it useful to pull the $debug value from your configuration system of choice rather than having to pass it for each function call, but it works in this example.

Keep Things Simple

I think this rule probably exists for every profession out there, so it should be no surprise that it applies to software development.

It is good practice to write software using a clear structure and using standard design patterns, but this is only a high level approach to keeping things simple. We also need to keep our individual algorithms as simple as possible, as this will make the code easier to understand in six months when it needs fixing, and will also make it easier to fix.

Here are some ways you can achieve this:

  1. Keep an eye on functions that are growing. You may find that you can split the code into several smaller functions.
  2. Functions that are only called in one place may be too specific. You can either bring the code inline, or generalise using several smaller functions. You can always keep the specific function and just use that to call and aggregate the new, smaller, functions.
  3. Watch out for functions with very long names or lots of arguments. This can be a sign that the function could be split into several smaller functions, or it could even be replaced with a class.
  4. Use built in functions where possible. This will help avoid spurious amounts of PHP code and there is a good chance the internal function will be faster as it is written in C (and by the pros!) Some of the most underappreciated internal functions are the array functions.
  5. If you really must have long and complex sections of code, then make sure you add some documentation. You and your fellow developers will be thankful of this when it comes to debugging.

Most of these points are about splitting large functions into smaller ones. It is also important to ensure you do not end up with lots of tiny functions, but I feel this is a much more unusual problem.
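As a quick illustration of the point about built-in functions, the sketch below computes the same total twice, once by hand and once with array_sum(), and then uses max() and array_search() to find the top scorer. The score data is made up for the example.

```php
<?php
$aUserScores = array('Ben' => 7, 'Linda' => 4, 'Tony' => 5, 'Alice' => 9);

// Hand-rolled total: works, but is more code to read and maintain.
$iTotal = 0;
foreach ($aUserScores as $iScore) {
    $iTotal += $iScore;
}

// The same result, plus the top scorer, from built-in array functions.
$iBuiltInTotal = array_sum($aUserScores);
$iBest = max($aUserScores);
$sBestUser = array_search($iBest, $aUserScores);

echo "$iTotal $iBuiltInTotal $sBestUser\n";
```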


 

Posted by on June 7, 2011 in Mixed, PHP

 

The Truth About PHP Variables

I wanted to write this post to clear up what seems to be a common misunderstanding in PHP: that using references when passing around large variables is a good way to save memory. To fully explain this I will need to explain how PHP handles variables internally. I hope that you will find this interesting and useful and that it helps dispel some myths around references and memory management in PHP.

Basic References in PHP

(Note: If you are already familiar with references in PHP then feel free to skip this section)

In PHP it is possible to assign variables by value or by reference. The former method is the most common, and should look very familiar to you:

<?php
//Example 1: Assigning variables by value (the 'standard' way)
$var1 = 'hello!';
$var2 = $var1;
$var2 = 'goodbye!';
echo $var1; // Produces: hello!
echo "\n";
echo $var2; // Produces: goodbye!
?>

This should be no surprise to you, just simple assigning of variables in PHP. The next example is very similar, but we assign $var2 by reference rather than by value.

<?php
//Example 2: Assigning variables by reference
$var1 = 'hello!';
$var2 =& $var1; // Notice the ampersand. This means $var2
                // is a reference to $var1
$var2 = 'goodbye!'; // because $var2 is a reference to $var1,
                    // both variables now have the value 'goodbye!'
echo $var1; // Produces: goodbye!
echo "\n";
echo $var2; // Produces: goodbye!
?>

This may be more surprising to some of you, so I will explain what is happening. The first step is no different, we simply initialise $var1 with a value of ‘hello!’. However, in the next step we assign $var1 to $var2 using the ‘=&’ operator, which causes a reference to $var1 to be passed, rather than the actual contents of $var1. This means that both variables point to the same data in memory, so any changes to either variable will affect the other.

How PHP Handles Variables Internally (using zvals!)

While the above explanation of references is sufficient for a general understanding, it is often useful to understand how PHP handles variable assignment internally. This is where we introduce the concept of the zval.

zvals are an internal PHP structure which are used for storing variables. Each zval contains various pieces of information, and the ones we will be focusing on here are as follows:

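  • The actual data stored within the zval – in our example this would be either ‘hello!’ or ‘goodbye!’.
  • is_ref – a flag indicating whether variables address this zval by value or by reference.
  • ref_count – the number of variables that currently address this zval.

The zval also knows the type of data it contains, but this is not especially relevant here so it has been omitted from the above list.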

The first item in our list, the actual data, does not require much explanation. The second item on this list (is_ref) indicates if variables should address this zval by value or by reference, the implications of which are addressed shortly. The third item (ref_count) stores the number of variables that currently address this zval. If ref_count ever reaches zero (for example, if you call unset()) then PHP assumes that it can remove the zval and free up the memory it was using.

Now this bit is important: You may think that the ref_count value is only used when dealing with a reference (i.e. when is_ref=true), but this is not the case. The ref_count variable is used regardless of the value of is_ref. So what does this mean?

Being A Little Bit Clever

This is where, as the headline suggests, PHP is a little bit clever. When you assign a variable by value (such as in example 1) it does not create a new zval, it simply points both variables at the same zval and increases that zval’s ref_count by one. “Wait!” I hear you cry, “Isn’t that passing by reference?” Well, although it sounds the same, all PHP is doing is postponing any copying until it really has to – and it knows this because is_ref is still false. “Hum, so how does it work?” Ok, here is an example:

<?php
//Example 3a: Assigning variables by value (but with more detail)

//Here our zval is created for $var1.
$var1 = 'hello!';
//Our zval now has ref_count=1, is_ref=false

//We now assign $var1 to $var2
$var2 = $var1;
//Our zval now has ref_count=2, is_ref=false

debug_zval_dump($var2); //Produces: string(6) "hello!" refcount(3)
//(Why refcount(3)? See "An important note on debug_zval_dump()")

//We now assign a new value to $var2. So what happens to our zval?
$var2 = 'goodbye!';
//Read on to find out...

?>

An important note on debug_zval_dump(): php.net says this function “dumps a string representation of an internal Zend value to output.” This is true, but calling this function inherently causes another reference to the variable to be created, so you can (in these examples) subtract one from the ref_count value given in the output. The figures shown here are from PHP 5; the internals changed substantially in PHP 7, so newer versions report different numbers.

In the above example we see how both $var1 and $var2 refer to the same zval (as can be seen by the call to debug_zval_dump()). So what happens on the last line when we assign a new value to $var2? Does $var1 change too? Of course the answer is no, but why?

When we assign ‘goodbye!’ to $var2 in the example above, PHP examines the is_ref value of the underlying zval. If is_ref is false (as it is in this example) PHP knows that it can only change the value of the zval if the ref_count is 1 (as the change will not affect any other variables). However, in our example the ref_count is 2, therefore PHP realises that it is not allowed to change the zval’s value and so creates another zval with which $var2 is then associated. This is illustrated by the finished example below:

<?php
//Example 3b: Assigning variables by value (the complete example)

//Here our zval is created for $var1.
$var1 = 'hello!';
//Our zval now has value='hello!', ref_count=1, is_ref=false

//We now assign $var1 to $var2
$var2 = $var1;
//Our zval now has value='hello!', ref_count=2, is_ref=false

debug_zval_dump($var2); //Produces: string(6) "hello!" refcount(3)
//(Why refcount(3)? See "An important note on debug_zval_dump()")

//We now assign a new value to $var2
$var2 = 'goodbye!';
//We now have two zvals:
//   The first: value='hello!', ref_count=1, is_ref=false
//   The second: value='goodbye!', ref_count=1, is_ref=false

?>

So we can see that, in the case of passing-by-value, PHP only copies data if a value is changed.

For the sake of completeness, here is an example where we pass-by-reference:

<?php
//Example 4: Assigning variables by reference (the complete example)

//Here our zval is created for $var1.
$var1 = 'hello!';
//Our zval now has value='hello!', ref_count=1, is_ref=false

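//We now assign $var1 to $var2 by reference
$var2 =& $var1;
//Our zval now has value='hello!', ref_count=2, is_ref=true

//Note: call-time pass-by-reference (the & inside the calls below) was
//deprecated in PHP 5.3 and removed in PHP 5.4, so the
//debug_zval_dump() lines in this example only run on older versions.
debug_zval_dump(&$var2); //Produces: &string(6) "hello!" refcount(3)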
//(Why refcount(3)? See "An important note on debug_zval_dump()")

//We now assign a new value to $var2
$var2 = 'goodbye!';
//We still have one zval, but with a
//new value: value='goodbye!', ref_count=2, is_ref=true

debug_zval_dump(&$var1); //Produces: &string(8) "goodbye!" refcount(3)
debug_zval_dump(&$var2); //Produces: &string(8) "goodbye!" refcount(3)
//(Why refcount(3)? See "An important note on debug_zval_dump()")

?>

As expected, we can see that the zval for both $var1 and $var2 has changed to a value of ‘goodbye!’ and has a ref_count of 2.

A Little More Complex

So now we know how PHP handles values and references, and isn’t it all wonderfully exciting? “Oh yes! Please tell me more!” I hear you say? Ok then…

There is one last thing to mention in this area, which I think is especially relevant to those of you who love to (ahem) save memory by passing around references – what happens when values and references meet.

You may have noticed that the zval’s is_ref flag does not permit a zval to be both a reference and a value at the same time (as it is either true or false). On the face of it this is probably for the best as I suspect it could lead to all kinds of strangeness from an internal perspective. However, a result of this is that if you are using a variable by value in several places (i.e. the variable’s underlying zval has a ref_count greater than 1) and then pass it by reference (for example, to a function), PHP will have to copy the value into an entirely new zval in order to set the is_ref flag to true. The following example illustrates how this can result in substantially increased memory usage:

<?php
//Example 5: Showing how mixing references and values can lead
//           to increased memory consumption

memory_show_usage(); //Zero bytes

$v1 = str_repeat('0', 100000);//Generate 100kb of dummy data
memory_show_usage(); //100kb

$v2 = $v1;
//We now have two variables pointing to a zval in the form:
//   is_ref=false, ref_count=2
memory_show_usage(); //100kb

$r1 =& $v2; //We now assign our value by reference
memory_show_usage(); //200kb
//PHP has now had to create a second zval in the form:
//   is_ref=true, ref_count=1

$v3 = $r1; //We now assign second zval by value
memory_show_usage(); //300kb
//PHP has now had to create a third zval in the form:
//   is_ref=false, ref_count=1

$v4 = $v3; //Now assign by value
memory_show_usage(); //300kb (no increase)
//Our third zval now has a ref_count of 2

//Both $v3 and $v4 now have the same zval, which may only be
//passed by value as it has a ref_count greater than one

$r2 =& $v3; //So now we assign $v3 by reference
memory_show_usage(); //400kb
//Here PHP has been forced to create a fourth zval with yet
//another copy of the data. The new zval is in the form:
//    is_ref=true, ref_count = 1

//Simple function to show memory use from a baseline
function memory_show_usage(){
    static $baseline = null;
    if(is_null($baseline)){
        //Initialise to get an accurate memory use value
        $baseline = 1;
        $baseline = memory_get_usage();
    }

    echo (memory_get_usage() - $baseline) . " bytes\n";
}

?>

Although this example only assigns variables directly, the same principles apply when performing function calls where parameters are passed by reference. You can see that, unless the developer is completely consistent, passing variables by reference can easily lead to increased memory usage.
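To make the function-call case concrete, here is a minimal sketch of the two calling styles. The function names are made up for illustration, and the example only shows the visible behaviour (who sees the change), not memory figures, which vary between PHP versions.

```php
<?php
// Hypothetical helper names, used only for illustration.

// Receives the value: changes stay local to the function.
function append_by_value($text)
{
    $text .= ' world';
    return $text;
}

// Receives a reference: changes are visible to the caller.
function append_by_reference(&$text)
{
    $text .= ' world';
}

$greeting = 'hello';

$result = append_by_value($greeting);
$afterValueCall = $greeting;      // the caller's variable is untouched
echo $afterValueCall . "\n";
echo $result . "\n";

append_by_reference($greeting);
echo $greeting . "\n";            // the caller's variable has been modified
```

The by-value version keeps the caller's data safe without any effort from the developer, which is why it is usually the simpler choice.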

Conclusion

If your concern is to conserve memory then it is best to simply pass data by value as the PHP language is smart enough to conserve memory automatically. If you really must pass a value by reference then make sure that it is done consistently as this will avoid consuming many times more memory (and CPU cycles) than is necessary. Alternatively you could wrap your data in an object, as PHP5 (but not PHP4) passes objects by handle by default, which in practice behaves much like passing by reference.

As a side note I would like to point out that modifying function parameters as a side effect (which may be your intention if you are passing by reference) is generally discouraged as it can make some bugs very hard to track down (a similar argument to that against global variables).

 

Posted by on June 7, 2011 in Mixed, PHP

 

Advanced PHP tips to improve your programming

PHP programming has climbed rapidly since its humble beginnings in 1995. Since then, PHP has become the most popular programming language for Web applications. Many popular websites are powered by PHP, and an overwhelming majority of scripts and Web projects are built with the popular language.

Because of PHP’s huge popularity, it has become almost impossible for Web developers not to have at least a working knowledge of PHP. This tutorial is aimed at people who are just past the beginning stages of learning PHP and are ready to roll up their sleeves and get their hands dirty with the language. Listed below are 10 excellent techniques that PHP developers should learn and use every time they program. These tips will speed up proficiency and make the code much more responsive, cleaner and more optimized for performance.

1. Use an SQL Injection Cheat Sheet

SQL injection is a nasty thing. An SQL injection is a security exploit that allows a hacker to dive into your database using a vulnerability in your code. While this article isn’t about MySQL, many PHP programs use MySQL databases with PHP, so knowing what to avoid is handy if you want to write secure code.

Ferruh Mavituna has a very nifty SQL injection cheat sheet that has a section on vulnerabilities with PHP and MySQL. If you can avoid the practices the cheat sheet identifies, your code will be much less prone to injection attacks.
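One widely used defence, not taken from the cheat sheet itself, is to keep user input out of the SQL string by using prepared statements. The sketch below assumes the pdo_sqlite extension is available so that it can run without a MySQL server; the table and column names are invented for the example.

```php
<?php
// Assumes the pdo_sqlite extension is installed. The "users" table
// and its columns are made up for this example.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec("INSERT INTO users (name) VALUES ('alice')");
$db->exec("INSERT INTO users (name) VALUES ('bob')");

// The placeholder keeps the input as data, never as SQL.
$stmt = $db->prepare('SELECT name FROM users WHERE name = :name');

// A classic injection attempt that would match every row if it
// were concatenated straight into the query string.
$stmt->execute(array(':name' => "' OR '1'='1"));
$injectedRows = $stmt->fetchAll(PDO::FETCH_COLUMN);

// A legitimate lookup still works as expected.
$stmt->execute(array(':name' => 'alice'));
$legitRows = $stmt->fetchAll(PDO::FETCH_COLUMN);

echo count($injectedRows) . " row(s) for the injection attempt\n";
echo count($legitRows) . " row(s) for the legitimate lookup\n";
```

The same prepare/execute pattern applies when PDO is pointed at a MySQL database instead of SQLite.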

2. Know the Difference Between Comparison Operators

Comparison operators are a huge part of PHP, and some programmers may not be as well-versed in their differences as they ought. In fact, an article at I/O reader states that many PHP developers can’t tell the differences right away between comparison operators. Tsk tsk.

These are extremely useful, and most PHPers can’t tell the difference between == and ===. Essentially, == looks for equality, and by that PHP will generally try to coerce data into similar formats, e.g. 1 == '1' (true), whereas === looks for identity: 1 === '1' (false). The usefulness of these operators should be immediately recognized for common functions such as strpos(). Since zero in PHP is loosely equal to FALSE, without the === operator there would be no way to tell from the result of strpos() whether something is at the beginning of a string or whether strpos() failed to find anything. Obviously this has many applications elsewhere where returning zero is not equivalent to FALSE.

Just to be clear, == looks for equality, and === looks for identity. You can see a list of the comparison operators on the PHP.net website.
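The strpos() case described above can be sketched as follows. The variable names are chosen just for this illustration.

```php
<?php
$haystack = 'hello world';

// 'hello' starts at position 0, which == treats as equal to false.
$pos = strpos($haystack, 'hello');

$looseSaysMissing  = ($pos == false);  // wrongly looks like "not found"
$strictSaysMissing = ($pos === false); // 0 is a valid position

var_dump($looseSaysMissing);
var_dump($strictSaysMissing);

// A needle that really is absent makes strpos() return boolean false.
$reallyMissing = (strpos($haystack, 'xyz') === false);
var_dump($reallyMissing);
```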

3. Shortcut the else
It should be noted that tips 3 and 4 both might make the code slightly less readable. The emphasis for these tips is on speed and performance. If you’d rather not sacrifice readability, then you might want to skip them.

Anything that can be done to make the code simpler and smaller is usually a good practice. One such tip is to take the middleman out of else statements, so to speak. Christian Montoya has an excellent example of conserving characters with shorter else statements.

Usual else statement:
if ($condition)
{
    $x = 5;
}
else
{
    $x = 10;
}

If the $x is going to be 10 by default, just start with 10. No need to bother typing the else at all.
$x = 10;
if ($condition)
{
    $x = 5;
}
While it may not seem like a huge difference in the space saved in the code, if there are a lot of else statements in your programming, it will definitely add up.

4. Drop those Brackets

Much like using shortcuts when writing else statements, you can also save some characters in the code by dropping the brackets in a single expression following a control structure. Evolt.org has a handy example showcasing a bracket-less structure.

if ($gollum == 'halfling') {
    $height--;
}

This is the same as:

if ($gollum == 'halfling') $height--;

You can even use multiple instances:

if ($gollum == 'halfling') $height--;
else $height++;

if ($frodo != 'dead')
    echo 'Gosh darnit, roll again Sauron';

foreach ($kill as $count)
    echo 'Legolas strikes again, that makes ' . $count . ' for me!';

5. Favour str_replace() over ereg_replace() and preg_replace()

Speed tests show that str_replace() is 61% faster.

In terms of efficiency, str_replace() is much more efficient than regular expressions at replacing strings. In fact, according to Making the Web, str_replace() is 61% more efficient than regular expressions like ereg_replace() and preg_replace().

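If you genuinely need pattern matching, though, str_replace() cannot do the job and preg_replace() is the right tool. Note that ereg_replace() was deprecated in PHP 5.3 and removed in PHP 7, so new code should not rely on it.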
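As a rough guide to choosing between the two, the sketch below uses str_replace() for a fixed substring and preg_replace() only where a pattern is really needed. The sample strings are invented for the example.

```php
<?php
$text = 'The quick brown fox';

// A fixed substring: str_replace() is enough.
$plain = str_replace('quick', 'slow', $text);

// A pattern (any run of whitespace): this needs preg_replace().
$collapsed = preg_replace('/\s+/', ' ', "The   quick \t brown  fox");

echo $plain . "\n";
echo $collapsed . "\n";
```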

6. Use Ternary Operators

Instead of using an if/else statement altogether, consider using a ternary operator. PHP Value gives an excellent example of what a ternary operator looks like.

//PHP code example usage for: ternary operator
$action = (empty($_POST['todo'])) ? 'default' : $_POST['todo'];

// The above is identical to this if/else statement
if (empty($_POST['todo'])) {
    $action = 'default';
} else {
    $action = $_POST['todo'];
}

The ternary operator frees up line space and makes your code less cluttered, making it easier to scan. Take care not to use more than one ternary operator in a single statement: PHP evaluates chained ternaries from left to right, which often gives a different result from the one you intended unless you add parentheses.

7. Memcached

Memcached is an excellent database caching system to use with PHP.

While there are tons of caching options out there, Memcached keeps topping the list as the most efficient for database caching. It’s not the easiest caching system to implement, but if you’re going to build a website in PHP that uses a database, Memcached can certainly speed it up. The caching structure for Memcached was first built for the blogging website LiveJournal.

PHP.net has an excellent tutorial on installing and using memcached with your PHP projects.
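The usual pattern is to check the cache first and only query the database on a miss. The sketch below illustrates that idea with a hypothetical in-memory stand-in, so it runs without a memcached server; ArrayCache, FakeDatabase and get_user() are invented names. The stand-in copies only the get() and set() calls of the Memcached extension, where get() returns false on a miss.

```php
<?php
// Hypothetical stand-in for a Memcached object: same get()/set()
// shape, but stored in a plain array and ignoring expiry.
class ArrayCache
{
    private $items = array();

    public function get($key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set($key, $value, $expiration = 0)
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Hypothetical database layer that counts how often it is queried.
class FakeDatabase
{
    public $queries = 0;

    public function loadUser($id)
    {
        $this->queries++;
        return array('id' => $id, 'name' => 'user' . $id);
    }
}

// Try the cache first, fall back to the database on a miss.
function get_user($cache, $db, $id)
{
    $key = 'user_' . $id;
    $user = $cache->get($key);
    if ($user === false) {
        $user = $db->loadUser($id);
        $cache->set($key, $user, 300); // ask to keep it for five minutes
    }
    return $user;
}

$cache = new ArrayCache();
$db = new FakeDatabase();
$first = get_user($cache, $db, 42);
$second = get_user($cache, $db, 42);
echo $db->queries . " database call(s)\n";
```

With a real server you would pass a connected Memcached object in place of ArrayCache; the get_user() function itself would not need to change.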

8. Use a Framework

CakePHP is one of the top PHP frameworks.

You may not be able to use a PHP framework for every project you create, but frameworks like CakePHP, Zend, Symfony and CodeIgniter can greatly decrease the time spent developing a website. A Web framework is software that bundles commonly needed functionality to help speed up development. Frameworks help eliminate some of the overhead in developing Web applications and Web services.

If you can use a framework to take care of the repetitive tasks in programming a website, you’ll develop at a much faster rate. The less you have to code, the less you’ll have to debug and test.

9. Use the Suppression Operator Correctly

The error suppression operator (or, in the PHP manual, the “error control operator”) is the @ symbol. When placed in front of an expression in PHP, it simply tells any errors that were generated from that expression not to show up. This operator is quite handy if you’re not sure of a value and don’t want the script to throw out errors when run.

However, programmers often use the error suppression operator incorrectly. The @ operator is rather slow and can be costly if you need to write code with performance in mind.

Michel Fortin has some excellent examples on how to sidestep the @ operator with alternative methods. Here’s an example of how he used isset to replace the error suppression operator:

if (isset($albus)) $albert = $albus;
else $albert = NULL;

is equivalent to:

$albert = @$albus;

But while this second form is good syntax, it runs about two times slower. A better solution is to assign the variable by reference, which will not trigger any notice, like this:

$albert =& $albus;

It’s important to note that these changes can have some accidental side effects and should be used only in performance-critical areas and places that aren’t going to be affected.
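For reading values that may not exist, an explicit check avoids the @ operator without creating references. The following sketch uses an invented $settings array; the ?? (null coalescing) operator shown in the second half requires PHP 7 or later and is not part of the original examples.

```php
<?php
$settings = array('colour' => 'blue');

// Explicit check: nothing is raised for the missing key.
$size = isset($settings['size']) ? $settings['size'] : null;

// PHP 7 and later: the null coalescing operator does the same check.
$colour = $settings['colour'] ?? 'red';
$shape  = $settings['shape'] ?? 'square';

var_dump($size);
echo $colour . "\n";
echo $shape . "\n";
```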

10. Use isset instead of strlen

Switching isset for strlen makes calls about five times faster.

If you’re going to be checking the length of a string, use isset instead of strlen. The trick is to test for a character at a given offset: isset($str[5]) is true only when the string has at least six characters. By using isset, your calls will be about five times quicker. It should also be noted that by using isset, your call will still be valid if the variable doesn’t exist. The D-talk has an example of how to swap out isset for strlen.
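A minimal sketch of the swap, using made-up variable names and a minimum length of eight characters:

```php
<?php
$password = 'secret';

// strlen() version: is the string shorter than 8 characters?
$tooShortStrlen = strlen($password) < 8;

// isset() version: a string of 8 or more characters has an offset 7.
$tooShortIsset = !isset($password[7]);

var_dump($tooShortStrlen);
var_dump($tooShortIsset);

// A long enough string passes both checks.
$long = 'correcthorse';
$longTooShortStrlen = strlen($long) < 8;
$longTooShortIsset = !isset($long[7]);
var_dump($longTooShortStrlen);
var_dump($longTooShortIsset);
```

Both versions agree on the result; the isset form is less obvious to read, so a short comment next to it is worthwhile.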

 

Posted by on June 7, 2011 in Mixed, PHP