
PHP tips

I'm an enthusiastic participant on Stack Overflow, a question-and-answer site for programmers. From time to time I stumble over a problem that is interesting (at least to me); then I try to solve it and publish a small article about the solution here on this page.

If you have problems, questions or suggestions about the functions below, or if you simply find them useful, don't hesitate to send me an email.


Using X-Frame-Options and Content-Security-Policy with PHP

Most browsers today will help to protect your site from malicious attacks, but you have to tell them that they should. A widely supported method is setting the X-Frame-Options header. With this option set, the browser will not allow other sites to display your page inside an iframe. This protects against clickjacking attacks and should be used on all sensitive pages, like the login page.

// Adds X-Frame-Options to HTTP header, so that page can only be shown in an iframe of the same site.
header('X-Frame-Options: SAMEORIGIN'); // FF 3.6.9+ Chrome 4.1+ IE 8+ Safari 4+ Opera 10.5+

Users working with an up-to-date browser benefit automatically when a website sends a Content-Security-Policy (CSP) in the HTTP header. With a CSP you can specify from which locations you accept JavaScript, which sites are allowed to show your page inside an iframe, and many other things. If a browser supports CSP, this can be an effective protection against cross-site scripting.

The implementation in PHP is very straightforward, though some problems may arise from inline JavaScript. You get the most protection if you avoid all JavaScript inside the HTML files and always put it into separate *.js files. If this cannot be done (because of existing source code), there is an option to allow inline scripts.

// Adds the Content-Security-Policy to the HTTP header.
// JavaScript will be restricted to the same domain as the page itself.
header("Content-Security-Policy: default-src 'self'; script-src 'self';"); // FF 23+ Chrome 25+ Safari 7+ Opera 19+
header("X-Content-Security-Policy: default-src 'self'; script-src 'self';"); // IE 10+
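If inline scripts cannot be removed, a per-request nonce is usually a safer choice than allowing all inline scripts. The following is only a minimal sketch and not part of the examples above: the function name buildCspHeader() is made up for illustration, and the frame-ancestors directive is the CSP counterpart of X-Frame-Options.

```php
<?php
// Hypothetical helper: builds a CSP header that allows scripts from the
// own domain, plus inline scripts which carry the given nonce.
function buildCspHeader(string $nonce): string
{
    return "Content-Security-Policy: default-src 'self'; "
        . "script-src 'self' 'nonce-" . $nonce . "'; "
        . "frame-ancestors 'self'";
}

// A fresh, unpredictable nonce has to be generated for every response.
$nonce = base64_encode(random_bytes(16));
if (!headers_sent()) {
    header(buildCspHeader($nonce));
}

// Inline scripts are executed only if they carry the same nonce.
$scriptTag = '<script nonce="' . htmlspecialchars($nonce, ENT_QUOTES) . '">'
    . 'console.log("allowed");</script>';
```

The nonce must never be reused between responses, otherwise an attacker who has seen it once could attach it to injected scripts.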

If your site serves over HTTPS only (SSL for all pages), then it is a good idea to send the Strict-Transport-Security header. The first time a user visits your site, the browser will store this header. If the user later visits your site again, maybe using an unsafe WLAN connection, the browser remembers to call it exclusively with HTTPS. This would then protect from SSL-strip.

// Adds the HTTP Strict Transport Security (HSTS) (remember it for 1 year)
$isHttps = !empty($_SERVER['HTTPS']) && strtolower($_SERVER['HTTPS']) != 'off';
if ($isHttps)
{
  header('Strict-Transport-Security: max-age=31536000'); // FF 4 Chrome 4.0.211 Opera 12
}

Generating password hashes with bcrypt

PHP 5.5 offers its own functions password_hash() and password_verify() to simplify generating BCrypt password hashes. I strongly recommend using this excellent API, or its compatibility pack for earlier PHP versions. The usage is very straightforward, and the hash value can be stored in a database field of type varchar(255):

// Hash a new password for storing in the database.
// The function automatically generates a cryptographically safe salt.
$hashToStoreInDb = password_hash($_POST['password'], PASSWORD_DEFAULT);

// Check if the hash of the entered login password, matches the stored hash.
// The salt and the cost factor will be extracted from $existingHashFromDb.
$isPasswordCorrect = password_verify($_POST['password'], $existingHashFromDb);

// This way you can define a cost factor (by default 10). Increasing the
// cost factor by 1, doubles the needed time to calculate the hash value.
$hashToStoreInDb = password_hash($_POST['password'], PASSWORD_BCRYPT, array("cost" => 11));

This solves the task pretty well. If you want to know more about safe password storage, have a look at my in-depth tutorial about hashing passwords.
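When the cost factor is raised later, existing hashes can be upgraded the next time the user logs in, because only then is the plain-text password available. The sketch below uses PHP's password_needs_rehash() for this; the helper name verifyAndCheckRehash() and the example password are assumptions made for illustration.

```php
<?php
// Hypothetical helper: verifies a login password and, if the stored hash
// was created with other options than the wanted ones, returns a new hash
// in $newHash (otherwise $newHash is null).
function verifyAndCheckRehash(string $password, string $storedHash, array $options, &$newHash): bool
{
    $newHash = null;
    if (!password_verify($password, $storedHash)) {
        return false;
    }
    // Compares the algorithm and cost of the stored hash with the wanted ones.
    if (password_needs_rehash($storedHash, PASSWORD_DEFAULT, $options)) {
        $newHash = password_hash($password, PASSWORD_DEFAULT, $options);
    }
    return true;
}

// Usage: an old hash with cost 10 is upgraded to cost 11 at login.
$oldHash = password_hash('correct horse', PASSWORD_DEFAULT, ['cost' => 10]);
$isPasswordCorrect = verifyAndCheckRehash('correct horse', $oldHash, ['cost' => 11], $newHash);
if ($isPasswordCorrect && $newHash !== null) {
    // store $newHash in the database instead of $oldHash
}
```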


Secure password-reset function

In the article above, we saw how to store passwords safely, but this immediately leads to the next problem, the password-reset function. The best password hash function is worthless if we do not handle the password reset with the same care as storing the password itself.

The usual way is to send an email with a one-time token to the registered user. The token is stored in the database, and when the user clicks the link, we check the token and allow the user to set a new password.

Now imagine an attacker can read the database table with the tokens through SQL injection. He could then request a password reset for any e-mail address he likes, and because he can see the new token, he could use it to set his own password. An ideal password-reset function should fulfill all of these requirements:

  • The token must be unpredictable; that is best accomplished with a "really" random code which is not based upon a timestamp or values like the user-id.
  • Like a password, the token should be hashed before storing it in the database. This makes it useless for an attacker, even if the database is stolen.
  • The reset-link should preferably be short to avoid problems with email clients, and contain only safe characters 0-9 A-Z a-z (base62 encoded).
  • The token should have an expiry date. There is no advantage in a link that can still be clicked two years later. On the other hand, an attacker does not necessarily have to hack the e-mail account to read the e-mails; think of the open e-mail client in the office, or a lost mobile phone...
  • Of course the token should be marked as used after the user has successfully set a new password.

The class StoPasswordReset helps to generate such reset-links. The generated tokens are very strong (in contrast to weak passwords), so it is safe to store an unsalted hash, calculated with a fast algorithm.

Download: StoPasswordReset.zip.

https://www.example.com/forgot_password.php
// First we check whether a user with this email is registered.
$userId = findUserByEmail($_POST['email']);
if (!is_null($userId))
{
  // Generate a new token with its hash
  StoPasswordReset::generateToken($tokenForLink, $tokenHashForDatabase);

  // Store the hash together with the UserId and the creation date
  $creationDate = new DateTime();
  savePasswordResetToDatabase($tokenHashForDatabase, $userId, $creationDate);
  
  // Send link with the original token
  $emailLink = 'https://www.example.com/reset_password.php?tok=' . $tokenForLink;
  sendPasswordResetEmail($emailLink);
}
https://www.example.com/reset_password.php
// Validate the token
if (!isset($_GET['tok']) || !StoPasswordReset::isTokenValid($_GET['tok']))
  handleErrorAndExit('The token is invalid.');

// Search for the token hash in the database, retrieve UserId and creation date
$tokenHashFromLink = StoPasswordReset::calculateTokenHash($_GET['tok']);
if (!loadPasswordResetFromDatabase($tokenHashFromLink, $userId, $creationDate))
  handleErrorAndExit('The token does not exist or has already been used.');

// Check whether the token has expired
if (StoPasswordReset::isTokenExpired($creationDate))
  handleErrorAndExit('The token has expired.');

// Show password change form and mark token as used
letUserChangePassword($userId);
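To make the requirements listed above more tangible, here is a rough sketch of what token generation, hashing and the expiry check could look like. It is explicitly not the code of the StoPasswordReset class: all function names are hypothetical, the token is hex encoded instead of base62 to keep the sketch short, and the expiry period of one hour is an assumption.

```php
<?php
// Hypothetical helpers, NOT the API of the StoPasswordReset class.

// Creates an unpredictable token for the link and its hash for the database.
function sketchGenerateToken(&$tokenForLink, &$tokenHashForDatabase)
{
    // 128 bits from the random source of the operating system
    $tokenForLink = bin2hex(random_bytes(16));
    $tokenHashForDatabase = sketchCalculateTokenHash($tokenForLink);
}

// The token is strong, so a fast unsalted hash is sufficient.
function sketchCalculateTokenHash($token)
{
    return hash('sha256', $token);
}

// Returns true if the token is older than the allowed age in seconds.
function sketchIsTokenExpired(DateTimeInterface $creationDate, DateTimeInterface $now, $maxAgeSeconds = 3600)
{
    return ($now->getTimestamp() - $creationDate->getTimestamp()) > $maxAgeSeconds;
}

// Usage: the token goes into the email link, the hash into the database.
sketchGenerateToken($token, $hash);
```

Because only the hash is stored, an attacker who can read the table cannot reconstruct a working link from it.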

To come to the point, every website switching between insecure HTTP and encrypted HTTPS pages is inevitably prone to SSL-strip. A secure HTTPS connection remains untouched by this attack, but the unaware user is tricked into working with an HTTP connection while he thinks he is using an HTTPS connection.

Because one cannot expect users to be able to recognize an SSL-strip attack, one should absolutely think about using HTTPS for the whole site. Although this cannot prevent SSL-strip in every case either, it helps considerably. Because the following concept can have advantages for HTTPS-only sites too, I decided to keep the article.

The problem with the session-id

For every page request, a session-id has to be sent along, which allows the server to recognize the user. The session-id should be stored in a cookie, because passing it along in the URL makes session fixation much too easy. The session on the server holds the information whether the user is already logged in or not. The problem now is that an attacker who finds out this session-id (however he does it) can impersonate the user, and therefore has the same privileges as the user.

To exchange sensitive data, we absolutely need an HTTPS connection with SSL encryption. This makes sure that nobody between client and server can eavesdrop on our communication, and prevents a man-in-the-middle attack. Websites which switch between HTTP and HTTPS pages now have to decide whether they:

  1. send the session-cookie to HTTP and HTTPS pages, and thereby transmit the session-id unprotected as soon as they request an HTTP page (even for requests of pictures).
  2. or configure the session-cookie so that it is sent exclusively to HTTPS pages, and thereby lose the session as soon as an HTTP page is shown.

With option 1 we can stop the discussion right now; there is no such thing as security afterwards. Option 2 can be handled by using HTTPS only for the whole site. As already mentioned, this should really be done; today's servers shouldn't have any problems with it. In PHP you could then call the function session_set_cookie_params(...) and set the parameter $secure to true.
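For an HTTPS-only site, the call could look roughly like the sketch below. It assumes PHP 7.3 or newer, where session_set_cookie_params() also accepts an array of options; the two helper functions and the chosen samesite value are assumptions for illustration.

```php
<?php
// Hypothetical helper: cookie options for a site served over HTTPS only.
function secureSessionCookieParams(): array
{
    return [
        'lifetime' => 0,      // cookie ends with the browser session
        'path'     => '/',
        'secure'   => true,   // send the cookie over HTTPS only
        'httponly' => true,   // hide the cookie from JavaScript
        'samesite' => 'Lax',  // assumption: adjust to the needs of the site
    ];
}

// Hypothetical helper: has to be called before any output is sent,
// instead of calling session_start() directly.
function startSecureSession(): bool
{
    session_set_cookie_params(secureSessionCookieParams());
    return session_start();
}
```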

The authentication cookie

The idea of the authentication cookie is to create a second cookie in addition to the session cookie, as soon as the user increases his privileges (login). This second cookie is configured in such a way that it is sent back exclusively to HTTPS pages. Of course the login page itself has to use HTTPS.

https://www.example.com/login.php
<?php
  session_start();
  // regenerate session id to make session fixation more difficult
  session_regenerate_id(true);

  // generate random code for the authentication cookie and store it in the session
  $authCode = bin2hex(random_bytes(16));
  $_SESSION['authentication'] = $authCode;

  // create authentication cookie, and restrict it to HTTPS pages
  setcookie('authentication', $authCode, 0, '/', '', true, true);

  print('<h1>login</h1>');
  ...
?>

Now every page (HTTPS and HTTP) can use the insecure session-cookie; its purpose is merely to maintain the session. However, all pages with sensitive information can check for the secure authentication cookie.

https://www.example.com/secret.php
<?php
  session_start();

  // check that the authentication cookie exists, and that
  // it contains the same code which is stored in the session.
  $pageIsSecure = !empty($_COOKIE['authentication'])
    && isset($_SESSION['authentication'])
    && ($_COOKIE['authentication'] === $_SESSION['authentication']);

  if (!$pageIsSecure)
  {
    // do not display the page, redirect to the login page
  }

  ...
?>
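The comparison of the two codes can also be done with hash_equals(), which does not leak through timing where the two strings differ. The following sketch wraps the check into a function; the name isAuthenticationCookieValid() is made up for illustration, and passing the arrays as parameters is merely a design choice that makes the check testable.

```php
<?php
// Hypothetical helper: checks the authentication cookie against the code
// stored in the session. Expects the contents of $_COOKIE and $_SESSION.
function isAuthenticationCookieValid(array $cookies, array $session): bool
{
    $cookieCode = $cookies['authentication'] ?? '';
    $sessionCode = $session['authentication'] ?? '';

    // A missing or non-string value can never be valid.
    if (!is_string($cookieCode) || !is_string($sessionCode) || $sessionCode === '') {
        return false;
    }
    // Timing-safe comparison, the known value is passed first.
    return hash_equals($sessionCode, $cookieCode);
}

// Usage on a sensitive page, after session_start():
// $pageIsSecure = isAuthenticationCookieValid($_COOKIE, $_SESSION);
```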

An attacker could manipulate the session cookie, but he never has access to the authentication cookie, which is responsible for the authentication. Only the person who entered the password, can own the authentication cookie, it is sent exclusively over encrypted HTTPS connections.

By separating the two concerns "maintaining the PHP session" and "authentication", we can make the system a bit more robust. There are many ways to attack the session-cookie (server settings, php.ini, .htaccess, PHP code, browser settings, id in the URL, ...); with the separation such attacks are bound to fail.


UTF-8 for PHP and MySQL

Different character encodings can cause headaches; every developer who needs to write localized software knows that for sure. Maybe your page shows UTF-8 whereas the database delivers iso-8859-1; then you get these odd hieroglyphics, or even worse, the user may not even be able to log in anymore.

That's why Unicode was developed. I can't go into the details of Unicode here, but the goal is to represent the characters of all known languages, and other symbols as well (see this font character map). One of the most commonly used encodings for Unicode is UTF-8, because it is very compact (only 1 byte for common characters) and is understood by all of today's web browsers.

UTF-8 in a PHP page

First the HTML/PHP page itself should be stored in the UTF-8 file format. That means you need an editor which supports Unicode; fortunately most IDEs are able to do this. Normal characters are then stored with 1 byte, special characters need 2 to 4 bytes, but the editor displays the typed-in character. That means no HTML entities like &Auml; anymore(!), what you see is what you typed.

You should take care that the editor does not store the BOM header; this header is sometimes stored at the beginning of the file with the 3 bytes EF BB BF. The editor will hide them, so if you are not sure whether your file contains these bytes, you can either use a non-interpreting editor (hex editor), or this wonderful online W3C checker. The BOM header is treated as output by PHP, and this can cause nasty "Cannot modify header information - headers already sent" errors.
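Whether a file or string starts with a BOM can also be checked with a few lines of PHP. This is only a small sketch; the function names hasUtf8Bom() and stripUtf8Bom() are made up for illustration.

```php
<?php
// Hypothetical helper: checks for the 3 BOM bytes EF BB BF at the start.
function hasUtf8Bom(string $content): bool
{
    return strncmp($content, "\xEF\xBB\xBF", 3) === 0;
}

// Hypothetical helper: removes a leading BOM, other content stays untouched.
function stripUtf8Bom(string $content): string
{
    return hasUtf8Bom($content) ? substr($content, 3) : $content;
}

// Usage: the BOM is invisible in most editors, but it is part of the string.
$withBom = "\xEF\xBB\xBFhello";
$clean = stripUtf8Bom($withBom);
```

To check a source file, its content can be read with file_get_contents() and passed to hasUtf8Bom().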

Then you should add the encoding declaration to the top of the head element of your HTML/PHP page, right after the opening <head> tag.

HTML 4:  <meta http-equiv="Content-type" content="text/html;charset=UTF-8">
XHTML:   <meta http-equiv="Content-type" content="text/html;charset=UTF-8" />
HTML 5:  <meta charset="UTF-8">

UTF-8 in MySQL

There is a simple way to tell the database that it should deliver UTF-8 encoded strings, so they can be used in a UTF-8 web page. Instead of fiddling with the configuration of MySQL, just tell your connection object which character set you expect, and the database does the rest for you.

Queries will automatically return UTF-8 encoded strings, Ajax results can be used without cumbersome conversions, and other applications can request different encodings if necessary.

// tells the mysqli connection to deliver UTF-8 encoded strings.
$db = new mysqli($dbHost, $dbUser, $dbPassword, $dbName);
$db->set_charset('utf8mb4');

// tells the pdo connection to deliver UTF-8 encoded strings.
$dsn = "mysql:host=$dbHost;dbname=$dbName;charset=utf8mb4";
$db = new PDO($dsn, $dbUser, $dbPassword);

// tells the mysql connection to deliver UTF-8 encoded strings.
// (legacy mysql extension, removed in PHP 7; prefer mysqli or PDO)
$db = mysql_connect($dbHost, $dbUser, $dbPassword);
mysql_set_charset('utf8mb4', $db);

To get more information about the charset of your database, you can run a query like this:

SHOW VARIABLES LIKE "character%"

Equal or not equal

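What I'm missing most in PHP is the benefit of a strongly typed language. Dynamic typing may have its advantages, but would you have thought that the following comparisons return true? PHP makes it possible (since PHP 8, comparisons between numbers and non-numeric strings are stricter, and only the second one still returns true)...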

  • ('abc' == 0)
  • (0 == null)
  • (1 == '1w?z')

Of course you can use the === operator to check values and their types. Since PHP doesn't support you well with controlling types explicitly, I found it to be of not much use. That was the point when I started writing a class covering all the things I wished were built into the PHP language.

/**
  * Checks if two values are equal. In contrast to the == operator,
  * the values are considered different, if:
  * - one value is null and the other not, or
  * - one value is an empty string and the other not
  * This helps avoid strange behavior with PHP's type juggling,
  * all these expressions would return true:
  * 'abc' == 0; 0 == null; '' == null; 1 == '1y?z';
  * @param mixed $value1
  * @param mixed $value2
  * @return boolean True if values are equal, otherwise false.
  */
function sto_equals($value1, $value2)
{
  // identical in value and type
  if ($value1 === $value2)
    $result = true;
  // one is null, the other not
  else if (is_null($value1) || is_null($value2))
    $result = false;
  // one is an empty string, the other not
  else if (($value1 === '') || ($value2 === ''))
    $result = false;
  // identical in value and different in type
  else
  {
    $result = ($value1 == $value2);
    // test for wrong implicit string conversion, when comparing a
    // string with a numeric type. only accept valid numeric strings.
    if ($result)
    {
      $isNumericType1 = is_int($value1) || is_float($value1);
      $isNumericType2 = is_int($value2) || is_float($value2);
      $isStringType1 = is_string($value1);
      $isStringType2 = is_string($value2);
      if ($isNumericType1 && $isStringType2)
        $result = is_numeric($value2);
      else if ($isNumericType2 && $isStringType1)
        $result = is_numeric($value1);
    }
  }
  return $result;
}

Avoid functions with mixed-typed return values

Unfortunately it's a common practice in PHP, that functions return different types, depending on whether the function was successful or not.

// This kind of mixed-typed return value (boolean or string),
// can lead to unreliable code!
function precariousCheckEmail($input)
{
  $isValid = filter_var($input, FILTER_VALIDATE_EMAIL);
  if ($isValid)
    return true;
  else
    return 'E-Mail address is invalid.';
}

At first glance, this even looks convenient, but it's easier to get a nasty bug than to call this function correctly like this:

$result = precariousCheckEmail('nonsense');
if ($result === true)
  print('OK');
else
  print($result); // -> message will be given out

So where's the problem? Everybody using this function needs prior knowledge that he can only get by looking at the code or at the (good) documentation.

  • You must use the === operator to check the result (== will not work).
  • You must compare with either true (but not with false), or with false (but not with true), and you are never sure which.
  • The return value has to be stored in a variable for later checking/printing. It's difficult to find a descriptive name, because it can contain different things. This makes the code more prone to misunderstandings.

// All these checks will wrongly accept the email as valid!
$result = precariousCheckEmail('nonsense');
if ($result == true)
  print('OK'); // -> OK will be given out

if ($result)
  print('OK'); // -> OK will be given out

if ($result === false)
  print($result);
else
  print('OK'); // -> OK will be given out

if ($result == false)
  print($result);
else
  print('OK'); // -> OK will be given out

Instead of just telling what is bad, I would like to give a better alternative as well. The example below passes an additional parameter by reference. The calling code is very readable and it's nearly impossible to use it wrongly.

// This function with a return value (boolean) and a
// parameter passed by-reference (string) is robust.
function robustCheckEmail($input, &$errorMessage)
{
  // filter_var() returns the filtered string or false, convert it to a boolean
  $isValid = filter_var($input, FILTER_VALIDATE_EMAIL) !== false;
  $errorMessage = '';
  if (!$isValid)
    $errorMessage = 'E-Mail address is invalid.';
  return $isValid;
}

if (robustCheckEmail('nonsense', $error))
  print('OK');
else
  print($error);

Calculating distance between points on earth

To calculate the spherical distance between two points on the earth (great-circle distance), one can use the Haversine formula. With regard to rounding errors, this formula is numerically stable for small distances.
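Written out, with φ for the latitudes and λ for the longitudes (both in radians) and r for the earth radius, the Haversine formula reads as follows. The symbols are chosen here for illustration and do not appear in the code.

```latex
d = 2r \arcsin\left(\sqrt{\sin^2\left(\frac{\varphi_2 - \varphi_1}{2}\right)
    + \cos\varphi_1 \cos\varphi_2 \sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right)
```

As a quick plausibility check, take two points on the equator which are 90 degrees of longitude apart: the square root becomes sin(π/4), the arcsine gives π/4, and the distance is r·π/2, about 10 007 543 m with r = 6371000 m, a quarter of the circumference.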

/**
  * Calculates the great-circle distance between two points, with
  * the Haversine formula.
  * @param float $latitudeFrom Latitude of start point in [deg decimal]
  * @param float $longitudeFrom Longitude of start point in [deg decimal]
  * @param float $latitudeTo Latitude of target point in [deg decimal]
  * @param float $longitudeTo Longitude of target point in [deg decimal]
  * @param float $earthRadius Mean earth radius in [m]
  * @return float Distance between points in [m] (same unit as earthRadius)
  */
function haversineGreatCircleDistance(
  $latitudeFrom, $longitudeFrom, $latitudeTo, $longitudeTo, $earthRadius = 6371000)
{
  // convert from degrees to radians
  $latFrom = deg2rad($latitudeFrom);
  $lonFrom = deg2rad($longitudeFrom);
  $latTo = deg2rad($latitudeTo);
  $lonTo = deg2rad($longitudeTo);

  $latDelta = $latTo - $latFrom;
  $lonDelta = $lonTo - $lonFrom;

  $angle = 2 * asin(sqrt(pow(sin($latDelta / 2), 2) +
    cos($latFrom) * cos($latTo) * pow(sin($lonDelta / 2), 2)));
  return $angle * $earthRadius;
}

An alternative to the Haversine formula is the Vincenty formula. It is slightly more complex, but does not suffer from the weakness with antipodal points (rounding errors).
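The variant used here is the special case of the Vincenty formula for a sphere. With the same notation as for the Haversine formula (φ for latitude, λ for longitude, Δλ = λ₂ − λ₁, r for the earth radius; symbols chosen for illustration), it can be written as:

```latex
\Delta\sigma = \operatorname{atan2}\left(
    \sqrt{(\cos\varphi_2 \sin\Delta\lambda)^2
        + (\cos\varphi_1 \sin\varphi_2 - \sin\varphi_1 \cos\varphi_2 \cos\Delta\lambda)^2},\;
    \sin\varphi_1 \sin\varphi_2 + \cos\varphi_1 \cos\varphi_2 \cos\Delta\lambda
\right), \qquad d = r \, \Delta\sigma
```

The first argument of atan2 corresponds to $a under the square root, the second argument to $b in the function below.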

/**
  * Calculates the great-circle distance between two points, with
  * the Vincenty formula.
  * @param float $latitudeFrom Latitude of start point in [deg decimal]
  * @param float $longitudeFrom Longitude of start point in [deg decimal]
  * @param float $latitudeTo Latitude of target point in [deg decimal]
  * @param float $longitudeTo Longitude of target point in [deg decimal]
  * @param float $earthRadius Mean earth radius in [m]
  * @return float Distance between points in [m] (same unit as earthRadius)
  */
function vincentyGreatCircleDistance(
  $latitudeFrom, $longitudeFrom, $latitudeTo, $longitudeTo, $earthRadius = 6371000)
{
  // convert from degrees to radians
  $latFrom = deg2rad($latitudeFrom);
  $lonFrom = deg2rad($longitudeFrom);
  $latTo = deg2rad($latitudeTo);
  $lonTo = deg2rad($longitudeTo);

  $lonDelta = $lonTo - $lonFrom;
  $a = pow(cos($latTo) * sin($lonDelta), 2) +
    pow(cos($latFrom) * sin($latTo) - sin($latFrom) * cos($latTo) * cos($lonDelta), 2);
  $b = sin($latFrom) * sin($latTo) + cos($latFrom) * cos($latTo) * cos($lonDelta);

  $angle = atan2(sqrt($a), $b);
  return $angle * $earthRadius;
}