Monthly Archives: May 2013

Prepared Statements

Any introduction to Prepared Statements has to start with SQL Statements.

 

Every SQL Statement arriving at a database is processed before executing and returning a resultset. The processing is a series of computations including: parsing, compiling, and query optimization.

The parsing, compiling, and query optimization can lead to a lot of wasted CPU if your queries are very similar, like this example from the widely available test database Sakila:

Example Queries:

  SELECT * FROM address WHERE address_id = 12;

  SELECT * FROM address WHERE address_id = 34;

  SELECT * FROM address WHERE address_id = 56;

Example Code, assuming PDO & PHP, to produce those queries:

  $ids = array(12,34,56);

  $sql = “SELECT * FROM address WHERE address_id = “;

  $databaseResourceHandle = CustomDB::getDBH();

  $rows = array();

  foreach ($ids as $id) {

   $result = $databaseResourceHandle->query($sql . $id);

   $rows = array_merge($rows, $result->fetchAll());

  }

 

Prepared Statements, on the other hand, are an advanced way to process SQL Statements with as little overhead as possible on the side of the database, swapping CPU cycles for a little RAM.

 

When comparing Prepared Statements and SQL Statements, it is important to consider both the benefits and drawbacks.

  • Benefit:
    Parameterized variables within the query are immune to SQL injection

  • Benefit:
    Processing overhead only occurs at Prepared Statement Handle instantiation
    This prevents parsing, compiling, and query optimization on future executions.

  • Benefit:
    PDO can avoid all possible statement preparation failures related to a full instance-wide Prepared Statement Handle pool

  • Drawback:
    Prepared Statement Handles consume handles from an instance-wide pool that can lead to failure when all handles in that pool are allocated

  • Drawback:
    From the MySQL docs: ‘Prepared Statement allocation is specific to a session until the end of a session or deallocation.’ Do not allocate prepared statements in stored procedures without risking memory leaks. Also, do not use stored procedures, but that topic is outside the scope of this discussion.

  • Drawback:
    Prepared Statement Handles consume memory on the mysql server. There is no visibility into the amount of memory they consume at any level.

 

Prepared Statements are a way to avoid paying the overhead of standard SQL Statements more than once.  The actual code implementation is very different, and for good reason. You’re supposed to pass the query in with parameterization for whatever part is going to change in future executions of the SQL Statement:

Example Queries:

  SELECT * FROM address WHERE address_id = 12;

  SELECT * FROM address WHERE address_id = 34;

  SELECT * FROM address WHERE address_id = 56;

Example Code, assuming PDO & PHP, to produce those queries using Prepared Statements:

  $ids = array(12,34,56);

  $sql = “SELECT * FROM address WHERE address_id = :address_id”;

  $databaseHandle = CustomDB::getDBH();

  $rows = array();

  $preparedStatementHandle = $databaseHandle->prepare($sql);

  foreach ($ids as $id) {

    $preparedStatementHandle->execute(array(‘address_id’ => $id));

    $result = $preparedStatementHandle->fetchAll();

    $rows = array_merge($rows, $result);

  }

 

 

The above code is the simplest way to set up efficient Prepared Statement handling, but it’s still mostly useless. The problem here is that my example can easily be rewritten into an IN clause:

  SELECT id FROM address WHERE address_id IN (12,34,56);

This rewritten SQL Statement would be one network trip and a single processing for all answers.

 

So when is a Prepared Statement useful?

Prepared Statements are useful when they are related to session-long caching alongside a database wrapper that’s just good enough to programmatically generate most typical lookups.

Unfortunately, “good enough” caching looks complicated:

Example Queries:

  SELECT * FROM address WHERE address_id = 12;

  SELECT * FROM address WHERE address_id = 34;

  SELECT * FROM address WHERE address_id = 56;

Example Code, assuming PDO & PHP, to produce those queries using Prepared Statements with caching:

  $ids = array(12,34,56);

  $sql = “SELECT * FROM address WHERE address_id = :address_id”;

  $databaseHandle = CustomDB::getDBH();

  Cache::setStrategy(Cache_LRU::getLabel());

  $rows = array();

  if (!Zend_Registry::isRegistered(self::PREP_STMT_CACHE_KEY)) {

    Zend_Registry::set(self::PREP_STMT_CACHE_KEY, Cache::getInstance());

  }

  $preparedStatementHandle = Zend_Registry::get(self::PREP_STMT_CACHE_KEY)->get($sql);

  if (!$preparedStatementHandle) {

    $preparedStatementHandle = $databaseHandle->prepare($sql);

    // Use the sql itself as the index/hash

    Zend_Registry::get(self::PREP_STMT_CACHE_KEY)->set($sql, $preparedStatementHandle);

}

  foreach ($ids as $id) {

    $preparedStatementHandle->execute(array(‘address_id’ => $id));

    $result = $preparedStatementHandle->fetchAll();

    $rows = array_merge($rows, $result);

  }

 

How can Prepared Statements fail?

  • Prepared Statements only exist for the current session, so holding onto a handle after a session closes will lead to failures
  • Each Prepared Statement consumes a handle from the instance’s Prepared Statement pool, which is the “max_prepared_stmt_count”
  • Out Of Memory on the client side

 

Case 1:

A connection closes while a cache of Prepared Statement Handles exists

Solution:

Update your PDO wrapping class to have a __destruct method defined to clear the relevant cache of Prepared Statement Handles before calling the parent’s destruct method.

 

Case 2:

The max_prepared_stmt_count value is reached on a database

Solution:

Immediately drop all local caches of Prepared Statements and try again. If there is still an issue, activate PDO’s ATTR_EMULATE_PREPARES flag to silently convert calls of ->prepare and ->exec into standard SQL Statements.

Note:

It appears that automatic escaping is retained in this circumstance.

 

Case 3:

Out Of Memory (OOM) on the client-side.

Solution:

Reduce the Prepared Statement Handle Cache size. The cache does not have to be large if it is well managed. Even my company’s complicated webapp’s web requests do not fill a 200-statement-long FIFO cache.

 

 

Remaining Points:

  1. Monitor (ie: Nagios) Prepared_stmt_count vs max_prepared_stmt_count

  2. Monitor (ie: StatsD) the Prepared Statement Handle Cache hit, miss, and purge rate.

  3. An LRU’s extra minimal overhead is only worthwhile over a simple FIFO if your Prepared Statement Handle Cache is too small for all the queries that should be cached.
    Note: Your cache should be small due to the unbounded and invisible memory consumption of Prepared Statement Handles on the database server

  4. A best-case Prepared Statement instantiation against localhost with a simple select costs me, on average, about 300 microseconds. Pulling a handle from a cache is about 6 microseconds.

  5. Coworkers have shown me that long lists of Named Parameters (ie: “:id1, :id2, :id3, […]”) get more expensive with quantity whereas long lists of Ordered Parameters (ie: “?,?,?,[…]”) remain cheap even in large number. Numerically quantifying this slowdown will be a future post.

  6. Ordered Parameters’ values are not decoded in SHOW PROCESSLIST. Named Parameters’ values are displayed, however, which makes them, to me, far preferable. 

Finally, note that the last implementation still is not ideal: Prepared Statement Handles should be contained within a connection object for proper management (e.g.: destruct on disconnect).