Improve performance of MySQL driver for RoR

Last week I was working on the performance tuning of a rails application. I ran a profiler and found something very interesting.

I found that there is a procedure that is called very often and takes a lot of time.

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 14.99    18.13     18.13    43430     0.42     0.85  Mysql#get_length
  8.84    28.82     10.69   172751     0.06     0.09  Kernel.===
  8.40    38.98     10.16   306964     0.03     0.04  Fixnum#==
  7.59    48.16      9.18     5566     1.65    30.60  Integer#times
  6.42    55.92      7.76     7582     1.02     1.63  Mysql::Net#read
...
  0.00   120.92      0.00        1     0.00 120920.00  #toplevel

The procedure was Mysql#get_length:

# File src/rails-1.2.3/activerecord/lib/active_record/vendor/mysql.rb
  def get_length(data, longlong=nil)
    return if data.length == 0
    c = data.slice!(0)
    case c
    when 251
      return nil
    when 252
      a = data.slice!(0,2)
      return a[0]+a[1]*256
    when 253
      a = data.slice!(0,3)
      return a[0]+a[1]*256+a[2]*256**2
    when 254
      a = data.slice!(0,8)
      if longlong then
        return a[0]+a[1]*256+a[2]*256**2+a[3]*256**3+
          a[4]*256**4+a[5]*256**5+a[6]*256**6+a[7]*256**7
      else
        return a[0]+a[1]*256+a[2]*256**2+a[3]*256**3
      end
    else
      c
    end
  end

There is obviously space for improvement! Replace the multiplications by shifts

class Mysql
  def get_length(data, longlong=nil)
    return if data.length == 0
    c = data.slice!(0)

    case c
    when 251
      return nil
    when 252
      a = data.slice!(0,2)
      return a[0]+(a[1]<<8)
    when 253
      a = data.slice!(0,3)
      return a[0]+(a[1]<<8)+(a[2]<<16)
    when 254
      a = data.slice!(0,8)
      if longlong then
        return a[0]+(a[1]<<8)+(a[2]<<16) +(a[3]<<24)+(a[4]<<32)+(a[5]<<40)+(a[6]<<48)+(a[7]<<56)
      else
        return a[0]+(a[1]<<8)+(a[2]<<16)+(a[3]<<24)
      end
    else
      c
    end
  end
end

I ran the profiler again and, well... a wisdom of my university times popped on my mind: “There is an elegant, simple, nice and obvious solution for each problem. Unfortunately, it is wrong!”

Performance remained the same. So, deeper investigation is needed! It was not difficult to find out that most of the times the “else” branch is executed. I tried something like this:

  def get_length(data, longlong=nil)
    return if data.length == 0
    c = data.slice!(0)

    return c if c<251

    case c
    when 251
      return nil
    when 252
      a = data.slice!(0,2)
      return a[0]+(a[1]<<8)
    when 253
      a = data.slice!(0,3)
      return a[0]+(a[1]<<8)+(a[2]<<16)
    when 254
      a = data.slice!(0,8)
      if longlong then
        return a[0]+(a[1]<<8)+(a[2]<<16) +(a[3]<<24)+(a[4]<<32)+(a[5]<<40)+(a[6]<<48)+(a[7]<<56)
      else
        return a[0]+(a[1]<<8)+(a[2]<<16)+(a[3]<<24)
      end
    else
      c
    end
  end

And the performance?

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 11.26    11.25     11.25    43430     0.26     0.38  Mysql#get_length
  8.73    19.97      8.72     5566     1.57    23.25  Integer#times
  7.77    27.73      7.76     7582     1.02     1.63  Mysql::Net#read
  7.27    34.99      7.26     5140     1.41     3.69  Array#each
  5.52    40.50      5.51   139526     0.04     0.05  Fixnum#==
...
  0.00    99.88      0.00        1     0.00 99880.00  #toplevel

Improved! The toplevel cumulative time is down by over 20 seconds.
Now, here's how you can embed this hack into your application:

  • Create a file called e.g. mysql_fix.rb
  • Add there
require 'active_record/vendor/mysql'

class Mysql
  def get_length(data, longlong=nil)
    return if data.length == 0
    c = data.slice!(0)

    return c if c < 251

    case c
    when 251
      return nil
    when 252
      a = data.slice!(0,2)
      return a[0]+(a[1]<<8)
    when 253
      a = data.slice!(0,3)
      return a[0]+(a[1]<<8)+(a[2]<<16)
    when 254
      a = data.slice!(0,8)
      if longlong then
        return a[0]+(a[1]<<8)+(a[2]<<16) +(a[3]<<24)+(a[4]<<32)+(a[5]<<40)+(a[6]<<48)+(a[7]<<56)
      else
        return a[0]+(a[1]<<8)+(a[2]<<16)+(a[3]<<24)
      end
    else
      c
    end
  end
end
  • Put it into e.g. lib/zmok directory
  • Add into your environment.rb following line
require File.join(File.dirname(__FILE__), '../lib/zmok/mysql_fix')

It is funny how my performance tuning session ended. Instead of changing my Rails application, I ended up improving the MySQL driver.