Something I made

Secure request links or session-free session data

by edA-qa mort-ora-y on Sunday July 20 2008 @ 08:57:39


The technique in this article can be used to create secure session data stored strictly on the client side (browser), or to do account confirmations without storing temporary data on the server. This allows easy failover between systems and avoids the maintenance of temporary data.

A full PHP example is provided at the bottom of this article.

Introduction

For the BrainBrain system I needed a way to have the user confirm changes to their profile. I wanted to do this via an email request, which would also double as a confirmation of the user's account. I also didn't want to store any temporary data on the server side for this request -- that would just introduce more maintenance work. The solution I decided upon was encrypted information in a URL.

I first came upon this idea when thinking about how to do authenticated web sessions without storing any server-side data. The goal was to keep users from being bound to a particular machine, so that they could be dispatched freely to any part of the cluster during their session. The technique is virtually the same as for the URL.

Requirements

Arbitrary Data

Clearly the system needs to handle arbitrary data. Once the system is in place it would be nice to use it for all such requests and session needs. The exact form of this data depends on the language you are using, and whether you are targeting a URL, a cookie, or a form field. In any case you will need to serialize the data -- and be able to deserialize it later.

Your language will likely have a way to serialize data for you: Java and PHP both do, and I'm guessing so do Ruby, Python, and many others. The problem with the default serializer is that it may produce excessively long serialized forms. These long forms are good in that they are easier to decode and help prevent corruption (see Secure below), but very bad for a URL, where they are simply too long.

For the URL form I created a simpler serialized form, essentially a name-value pair encoding. What is lost is type information -- when the data gets back to the server the user of the data will have to convert the type accordingly. For simple requests this is not a big loss, but for some session data this might be a lot of work. You'll have to decide on the balance between data size and programming convenience.
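
As a rough sketch of such a name-value encoding (not the exact code from BrainBrain; the helper names nv_serialize and nv_deserialize are made up for illustration), the following escapes names and values with rawurlencode so that a comma inside a value cannot break the format. Note how the integer comes back as a string.

```php
<?php
// Hypothetical helpers: the names are illustrative, not from the article's code.

// Serialize an array as name,value,name,value,...
// rawurlencode() turns a comma into %2C, so commas in the data are safe.
function nv_serialize( array $data ) {
  $parts = array();
  foreach( $data as $name => $value ) {
    $parts[] = rawurlencode( (string)$name );
    $parts[] = rawurlencode( (string)$value );
  }
  return implode( ',', $parts );
}

// Reverse of nv_serialize; returns null on malformed input.
// Every value comes back as a string: the type information is lost.
function nv_deserialize( $str ) {
  if( $str === '' )
    return array();
  $parts = explode( ',', $str );
  if( count( $parts ) % 2 != 0 )
    return null;
  $data = array();
  for( $i = 0; $i < count( $parts ); $i += 2 )
    $data[rawurldecode( $parts[$i] )] = rawurldecode( $parts[$i+1] );
  return $data;
}

$packed = nv_serialize( array( 'uid' => 42, 'email' => 'a,b@example.com' ) );
$back = nv_deserialize( $packed );  // 'uid' is now the string '42'
```

The caller has to convert 'uid' back to an integer itself, which is the trade-off between data size and programming convenience.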

Secure

Authentication data will be part of the packet being sent to the client. For a system with a login, this means that the client will actually hold the session key granting them access to the system. In my URL confirmations the user id is part of the data, and the server modifies that user without requiring a web login. For this reason the data needs to be very secure, and tamperproof.

One consideration is whether the client is allowed to inspect the data they are receiving. Usually since it is their own data it isn't much of an issue, but since it may contain data only the server should know, we must protect it from client inspection. Symmetric encryption does the trick. Once we've serialized the data we can simply call an encryption routine and give the encrypted form to the client. When the client submits the encrypted data, the server simply decrypts it.

What this doesn't prevent is tampering with the data. That is, if somebody were to alter the encrypted data, the server would be presented with a bad request. Because the data in a URL is rather compact, even small random changes might yield a valid request. The attacker may not know what they are modifying, but they will be modifying something. Certainly not a good situation.

Thankfully the cryptography folks have thought of this and have come up with a hundred and one digests which can be used to protect message integrity. A digest is computed on the message and then attached to that message; in our case the message is the serialized data. When the server receives the data it can recompute the digest and check that it matches what was provided. When the two don't match, the server knows that somebody has tampered with the data. In my case, to keep the URL small, I used a simple 32-bit CRC.

For my system a 32-bit CRC is good enough, and keeps the URLs short. However, if you have an ultra-secure installation, or are using cookies and don't have such a length limit, you may wish to use a more robust digest algorithm, such as a SHA-family hash or a keyed HMAC.

It matters when you add the digest. It has to cover all of the data relevant to the request, so it comes after the serialization, but it has to come before the encryption phase. This should be obvious, but can be easily overlooked. The digest needs to be encrypted, otherwise a person may tamper with the data and simply update the digest as well. By having the digest inside the encrypted data, the attacker now needs to alter the data at two points: once for what they wish to change in the request, and once to get the checksum correct.
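
One widely used stronger option is an HMAC, which mixes a server-side secret into the digest. The sketch below is only an illustration, not what BrainBrain uses: the helper names, the secret, and the choice to truncate the tag to 8 bytes (to keep links short) are all assumptions.

```php
<?php
// Hypothetical helpers; the secret and the 8-byte tag length are assumptions.

// Prepend a truncated HMAC-SHA256 tag to the serialized data.
function tag_message( $datastr, $secret, $taglen = 8 ) {
  $tag = substr( hash_hmac( 'sha256', $datastr, $secret, true ), 0, $taglen );
  return $tag . $datastr;
}

// Split off the tag, recompute it, and compare in constant time.
// Returns the data string, or null when the tag does not match.
function check_message( $msg, $secret, $taglen = 8 ) {
  if( strlen( $msg ) < $taglen )
    return null;
  $tag = substr( $msg, 0, $taglen );
  $datastr = (string)substr( $msg, $taglen );
  $expect = substr( hash_hmac( 'sha256', $datastr, $secret, true ), 0, $taglen );
  return hash_equals( $expect, $tag ) ? $datastr : null;
}

$msg = tag_message( 'uid,42', 'server-side-secret' );
$ok  = check_message( $msg, 'server-side-secret' );            // 'uid,42'
$bad = check_message( substr( $msg, 0, -1 ) . '3', 'server-side-secret' );  // null
```

Because the tag has a fixed length, it can be split off by position, so it does not matter which bytes the raw tag happens to contain.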

Time Limited

It makes sense that all of these tokens are only valid for a limited period of time: confirmation requests tend to time out after several days, and login tokens usually last no more than a couple of hours. There are two ways of making this happen.

The easiest is to simply add an expiration date to the data itself. When the server checks the incoming request and finds that it is too old, it can simply reject it. This does not require much of a change in the process. However, for the URL case, adding date information extends the URL length, which is not desired. To work around this I simply truncated my dates to increments of days, thereby shortening the serialized form.

Another option involves the key being used for the data. The server obviously needs to have a copy of the key used to encrypt the data. One way to implement timeouts is to simply rotate the keys which the server is using. That is, every day the server creates a new key, and when data comes to the system it iterates over the keys of the last several days, trying to find one that works (by decrypting and checking whether the CRC matches). If none is found, then it knows the data is invalid.

This technique may be rather slow, especially if you have a long timeout period. A way to avoid this is to indicate which key is being used directly in the transmitted data. That is, once the final encryption is done you can simply append a single character to that data. This character is an index into an array of keys on the server. Once a key has expired, it is replaced, and that index can be reused. Therefore you never have to guess which key is being used, and can avoid the inefficiency of trying multiple keys.

Rather than altering the keys frequently, you can also use multiple initialization vectors (IVs) for the encryption algorithm. IVs are a way to prevent dictionary attacks and the like. They are similar to what salting does for hashed data like passwords. In our case the IV is stored on the server, and thus effectively becomes part of the key. For transmitting larger data, however, it is recommended that each virtual session gets its own IV, which is appended to the transmitted data, in which case key rotation is still important to the system.
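
The day truncation can be sketched as follows. The function names are hypothetical, and the granularity of one day (86400 seconds) is just the value I happen to use; any unit works.

```php
<?php
// Hypothetical helpers; $base is the granularity in seconds (one day here).

// Expiry stamp in whole units of $base, e.g. days since the Unix epoch.
function expiry_stamp( $now, $lifetime, $base = 86400 ) {
  return intval( ( $now + $lifetime ) / $base );
}

// A token is expired once the current time passes the start of its stamp unit.
function is_expired( $stamp, $now, $base = 86400 ) {
  return $now > $stamp * $base;
}

$now = 1216544259;                          // 2008-07-20 08:57:39 UTC
$stamp = expiry_stamp( $now, 15 * 86400 );  // 14095: 5 digits instead of 10
```

Since the stamp is rounded down, a link may expire up to one unit earlier than the nominal lifetime. In this example the 15-day link is actually valid for a bit under 15 days.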
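
A minimal sketch of the key index idea, assuming a key ring indexed by a single character. The array layout, the function names, and the placeholder keys are invented for this illustration.

```php
<?php
// Hypothetical key ring: index character => key. The keys are placeholders.
$keyring = array(
  'a' => 'key-for-slot-a',
  'b' => 'key-for-slot-b',
  'c' => 'key-for-slot-c',
);

// Append the slot character of the key that was used to the encrypted token.
function tag_with_key_index( $token, $index ) {
  return $token . $index;
}

// Split the slot character off again and look up the matching key.
// Returns array( token, key ), or null for an unknown slot.
function resolve_key( $tagged, array $keyring ) {
  if( strlen( $tagged ) < 2 )
    return null;
  $index = substr( $tagged, -1 );
  if( !isset( $keyring[$index] ) )
    return null;
  return array( substr( $tagged, 0, -1 ), $keyring[$index] );
}

$tagged = tag_with_key_index( 'Zm9v', 'b' );     // 'Zm9vb'
$found  = resolve_key( $tagged, $keyring );      // array( 'Zm9v', 'key-for-slot-b' )
```

When a slot's key is replaced, tokens made with the old key simply fail the integrity check after decryption, which is what makes them expire.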
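
The per-session IV variant might look like the sketch below. It assumes the OpenSSL extension and PHP 7 or later (for random_bytes), and it uses AES-256-CFB purely as a stand-in; it is not the cipher from my configuration, and the function names are made up.

```php
<?php
// Assumes the OpenSSL extension and PHP 7+; AES-256-CFB is an illustrative
// stand-in, not the cipher used in the article's configuration.

// Encrypt with a fresh random IV and append that IV to the result.
function encrypt_with_iv( $plain, $key ) {
  $cipher = 'aes-256-cfb';
  $iv = random_bytes( openssl_cipher_iv_length( $cipher ) );
  $enc = openssl_encrypt( $plain, $cipher, $key, OPENSSL_RAW_DATA, $iv );
  return $enc . $iv;   // the IV is not secret, so it can travel with the data
}

// Split the IV off the end of the packet and decrypt the remainder.
function decrypt_with_iv( $packet, $key ) {
  $cipher = 'aes-256-cfb';
  $ivlen = openssl_cipher_iv_length( $cipher );
  if( strlen( $packet ) < $ivlen )
    return null;
  $iv = substr( $packet, -$ivlen );
  $enc = (string)substr( $packet, 0, -$ivlen );
  $plain = openssl_decrypt( $enc, $cipher, $key, OPENSSL_RAW_DATA, $iv );
  return $plain === false ? null : $plain;
}

$key = hash( 'sha256', 'example passphrase', true );  // 32 raw bytes
$packet = encrypt_with_iv( 'uid,42', $key );
$plain  = decrypt_with_iv( $packet, $key );            // 'uid,42'
```

Two packets for the same data differ because each gets its own IV. Decryption alone does not detect tampering, so the digest check described above is still needed.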

PHP Example

My high-level API consists of two functions: dt_encrypt_datalink and dt_decrypt_datalink.

dt_encrypt_datalink

This function takes an array of data and returns an encrypted form suitable for use in URLs or in a cookie. The data is limited to name-value pairs of strings, or types which can be safely converted to strings. An expiration date is added to the data as an additional name-value pair.

This return value is added as a parameter to the query string in a URL and sent to the user by email. It can just as easily be stored as a cookie.

On the first couple of lines you'll notice a base64_decode and a hash. These are just there to allow easier coding of the global configuration -- the global configuration itself could already have the data in this form if desired.

function dt_encrypt_datalink( $data ) {
  $ec = $GLOBALS['brain_encrypt_conf'];
  $iv = base64_decode( $ec['iv'] );
  $key = hash( $ec['keyhash'], $ec['key'] );
  
  //add an expiration date
  $data['x'] = intval( (time() + $ec['expire'])/$ec['expire_base'] );
  
  //serialize the data
  $datastr = '';
  foreach( $data as $name => $value ) {
    if( strlen( $datastr ) )
      $datastr .= ',';
    $datastr .= $name . ',' . $value;  //should escape commas in name/value for safety
  }
  
  //prepend hash value
  $hashval = hash( $ec['integrity_hash'], $datastr, TRUE );
  $datastr = $hashval . '|' . $datastr;
    
  //requires the mcrypt extension (deprecated in PHP 7.1, removed from core in 7.2)
  $enc = mcrypt_encrypt( $ec['cipher'], $key, $datastr, $ec['mode'], $iv );

  //make it url/email/everything safe
  return php_urlsafe_encode( $enc );
}

dt_decrypt_datalink

This function takes a string which was produced by dt_encrypt_datalink -- in my case this is a request parameter in the URL, so one of the variables in the $_GET global. If using cookies it comes from the $_COOKIE global.

function dt_decrypt_datalink( $pq ) {
  $ec = $GLOBALS['brain_encrypt_conf'];
  $iv = base64_decode( $ec['iv'] );
  $key = hash( $ec['keyhash'], $ec['key'] );
  
  try{
    $enc = php_urlsafe_decode( $pq );
    
    $un = mcrypt_decrypt( $ec['cipher'],  $key, $enc, $ec['mode'], $iv );
    
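    //cleanup padded 0's, they don't exist in our data
    $un = rtrim( $un, "\0" );
              
    //extract and check hash; the raw hash may itself contain a '|',
    //so split on its fixed length instead of using explode
    $hashlen = strlen( hash( $ec['integrity_hash'], '', TRUE ) );
    if( strlen( $un ) <= $hashlen || $un[$hashlen] !== '|' )
      throw new Exception( "hash parts failed" );
    $hashval = substr( $un, 0, $hashlen );
    $datastr = (string)substr( $un, $hashlen + 1 );
    if( $hashval !== hash( $ec['integrity_hash'], $datastr, TRUE ) )
      throw new Exception( "hash does not match" );
        
    //deserialize
    $parts = explode( ',', $datastr );
    if( count( $parts ) % 2 != 0 )
      throw new Exception( "unpaired name/value" );
    $data = array();
    for( $i = 0; $i < count($parts); $i+=2 ) {
      $data[$parts[$i]] = $parts[$i+1];
    }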
      
    //check expiration
    if( !isset( $data['x'] ) 
      || ( time() > $data['x']*$ec['expire_base'] ) )
      throw new Exception( 'Expired link' );
        
    return $data;
  } catch( Exception $ex ) {
    error_log( $ex );
    return null;
  }
}

brain_encrypt_conf

Both of the functions refer to a global variable called brain_encrypt_conf. This is simply an array which looks like the one below (it is the one I actually use, though with a different key of course).

$brain_encrypt_conf = array(
  //this first block of data need not be changed
  'cipher' => MCRYPT_CAST_256,
  'mode' => MCRYPT_MODE_CFB,
  'integrity_hash' => 'crc32',  //something small so links don't grow in size
  'keyhash' => 'ripemd128',
  
  //these items should be changed
  'iv' => "CgVB/wOx1PLElujbe7A55g==",  //base64 encoded
  'key' => "fNKca2389?Cr;2",  //will be hashed to form actual encryption key
  
  //these items can optionally be changed
  'expire' => 15 * 3600*24,  //links expire in 15 days
  'expire_base' => 3600*24,  //to reduce link sizes
);

php_urlsafe_encode

The two urlsafe encode and decode functions are used to create forms of the data which are suitable for use in a URL. For a cookie you might not need these functions, as the standard cookie functions will do an appropriate encoding and decoding for you.

/**
 * The php_urlsafe_encode and decode functions are
 * needed since PHP doesn't handle '+' correctly in URLs.
 * For additional safety (depending on your Apache redirects),
 * all possibly special characters are remapped.
 */
function php_urlsafe_encode( $data ) {
  $res = base64_encode( $data );
  return str_replace ( array('+','=','/'), array('_','~','-'), $res );
}

function php_urlsafe_decode( $data ) {
  $res = str_replace( array('_','~','-'), array('+','=','/'), $data );
  return base64_decode( $res );
}

create_iv

To be complete, the IVs come from this little piece of code below. If you are using cookies you would likely create a new IV for every new user session.

Note, you'll want to replace the variables here with whatever you chose in your config.

  $size = mcrypt_get_iv_size(MCRYPT_CAST_256, MCRYPT_MODE_CFB);
  $iv = mcrypt_create_iv($size, MCRYPT_DEV_RANDOM);
  print( base64_encode( $iv ) . "\n" );

© 2008 edA-qa mort-ora-y