Loading data in migrations

I found migrations to be very useful tool to maintain data structures in a rails project. It is clean, easy to understand and fast when it comes to recreate a database.

During develoment, I faced several times the same problem: “How to import static data into the database?” Of course, one can create fixture and load it. But sometimes one needs more powerful tool.

So, I tried to import the data in the migrations themselves. Since the data to import was huge (tens of MBs), I did try several methods and approaches. Here is the result.

Fixtures in Migrations

This is method that I found on the web in an article of Adam Christensen Loading Fixtures in a Migration. It shows how to load the fixtures file into database.

require 'active_record/fixtures'

class CreateCategories < ActiveRecord::Migration
  def self.up
    create_table :categories do |t|
      t.column :name, :string
    end

    Fixtures.create_fixtures('test/fixtures', File.basename("categories.yml", '.*'))
  end

  def self.down
    drop_table :categories
  end
end

Active record way

If your ActiveRecord object already exists, you can use the following code. It is obviously not the way to import thousands of records, but for ten objects it is good enough.

class CreateAuthors < ActiveRecord::Migration
    def self.up
      create_table :authors do |t|
          t.column :name, :string, :length=>20, :null=>false
          t.column :surname, :string, :length=>50, :null=>false
          t.column :blog, :string, :length=>100, :default=>'http://blog.zmok.net' 
      end
      
      Author.create( :name=>'Roman', :surname=>'Mackovcak')
    end

    def self.down
      drop_table :authors
    end
end

External CSV

If the data came in the CSV format then a fastercsv library could be used.

First of all, the library needs to be installed.

sudo gem install fastercsv

then create the migration

require 'fastercsv'

class LoadAuthors < ActiveRecord::Migration

  def self.up
    FasterCSV.foreach('db/data/authors.csv') do |row|
      Author.create(:name=>row[0], :surname=>row[1], :blog=>row[2])
    end
  end
  
  def self.down
    Author.delete_all
  end
end

And place the CSV file (example below) into db/data folder.

"Miro","Skultety","http://blog.zmok.net"
"Abhishek","Balaria","http://blog.zmok.net"

Embedded CSV

Sometimes it is useful to bundle the data directly with the migration. In this example, the data are provided to the load_articles method from articles_data.

require 'csv'

class CreateArticles < ActiveRecord::Migration
  def self.up
    create_table :articles do |t|
      t.column :title, :string, :length=>20, :null=>false
      t.column :url, :string, :length=>100, :null=>false
      t.column :author_id, :integer, :null=>false
      t.column :abstract, :string, :length=>255
    end

    load_articles
  end

  def self.down
    drop_table :articles
  end

  def self.load_articles
    cnx = ActiveRecord::Base.connection
    ActiveRecord::Base.silence do
      reader = CSV::Reader.create(articles_data) 
      reader.each do |row|
        values = row.collect {|v| cnx.quote(v).gsub('\\n', "\n").gsub('\\r', "\r") }.join(', ')
        sql = "INSERT INTO articles(id, title, url, author_id, abstract) VALUES (#{values})"
        cnx.insert(sql) 
      end
    end
  end

  def self.articles_data
    <<'END_OF_DATA'
1,"Integration is the killer app (even in Web 2.0)","http://blog.zmok.net/articles/2006/08/20/integration-is-the-killer-app-even-in-web-2-0",3,"Integration is the key to success"
2,"Visualize your Rails schema","http://blog.zmok.net/articles/2006/11/13/visualize-your-rails-schema",2,"Exporting rails schema to UML"
END_OF_DATA
  end

end

ActiveRecord - native commands

Last, but not least is the method that uses native functions of MySQL database. This could work only if the rails application sits on the same machine as the MySQL.

require 'tempfile' 

class CreateReaders < ActiveRecord::Migration
    def self.up
      create_table :readers do |t|
          t.column :nickname, :string, :length=>20, :null=>false
          t.column :email, :string, :length=>50, :null=>false
      end
      load_from_file
    end

    def self.down
      drop_table :readers
    end
    
    def self.load_from_file
      file = 'db/data/readers.csv'
      tp = Tempfile.new(File.basename(file)).path
      File.copy(file, tp)
      File.chmod(0666, tp)
      ActiveRecord::Base.connection.execute("load data infile \'#{tp}\' into table readers fields terminated by \',\' enclosed by \'\"\' lines terminated by \'\\n\'")
    end

end

and the corresponding CSV file

1,"bob","bob@really-cool-office.org"
2,"sergey","sergey@hermitage-magazin.ru"

That's it! Enjoy it.

4 comments

  1. Thank you so much for this information.

    I had to add

    require ‘tempfile’

    to your “ActiveRecord – native commands” example for it to work.

    I added it to the end of my config/environment.rb file.

  2. So which one did you choose and why? You don’t mention it.
    But thanks for this overview, I was a bit procrastinating researching this, so now I don’t have to;)

  3. 2 Jan
    Every method has got its own pros and cons. I had a big CSV file, so External CSV and Native commands were my options.

Comments are closed.