Bayes classification in Ruby made easy

Recently I was experimenting with Bayes classification in Ruby. At first sight it looks like a difficult topic, but with the right libraries it is interesting and fun.

Before you start experimenting, you have to install 3 gems: classifier, madeleine, and a stemmer gem that classifier requires.

gem install classifier
gem install madeleine

When the installer asks, confirm the required stemmer gem as well.

To begin, let's experiment with the plain Bayes classifier.

require 'classifier'

bayes = Classifier::Bayes.new 'funny', 'sad', 'neutral'

# Train it slightly...
bayes.train 'funny', 'Finally all of them were smiling'
bayes.train :sad, 'Little ill puppy'
bayes.train :neutral, 'Tax declaration'
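
The calls above name categories both as a string ('funny') and as symbols (:sad, :neutral), and the classifier reports categories back as capitalized strings such as "Neutral". A minimal sketch of the kind of normalization that would make these forms interchangeable is shown here. The method name `normalize_category` is made up for illustration, and the whole thing is an assumption about the behavior, not code taken from the gem.

```ruby
# Hypothetical helper, not part of the classifier gem: map any category
# name (string or symbol) to one canonical capitalized string.
def normalize_category(name)
  name.to_s.tr("_", " ").capitalize
end

normalize_category('funny')   # "Funny"
normalize_category(:sad)      # "Sad"
normalize_category(:neutral)  # "Neutral"
```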

The classifier is “trained”, so let's ask it something interesting…

bayes.classify 'Everybody have to pay taxes'
=> "Neutral"

Hmmm… this does not look like the expected answer :o). We have probably trained it incorrectly. So, let’s undo it:

# Remove the incorrect statement
bayes.untrain :neutral, 'Tax declaration'

# Train it right
bayes.train :sad, 'Tax declaration'

# And provide something neutral (if a category has no statements, the classifier does not work as expected)
bayes.train :neutral, 'Rainbow is full of colors'

So, how does the classifier see it now?

bayes.classify 'Everybody have to pay taxes'
=> "Sad"

Yes, this is how people feel about it :o). For those who do not agree (and also for debugging purposes), it is possible to see the score for each category.

bayes.classifications 'Everybody have to pay taxes'
=> {"Sad"=>-9.43348392329039, "Neutral"=>-10.2035921449865, "Funny"=>-10.2035921449865}
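
The scores are negative because they are sums of logarithms of small ratios, and the category whose score is closest to zero wins. As a rough sketch, the rule below lets each query word contribute log(count / total words in the category), with an assumed count of 0.1 for a word the category has never seen. With the word lists given here it reproduces the Sad and Neutral numbers above. The helper name `log_score`, the stemmed word lists, and the set of dropped stop words are my assumptions, not something read out of the gem.

```ruby
# Hypothetical helper: sum of log(count / total) over the query words.
# A word the category has never seen gets the assumed count `unseen`.
def log_score(category_counts, query_words, unseen: 0.1)
  total = category_counts.values.sum.to_f
  query_words.inject(0.0) do |score, word|
    score + Math.log(category_counts.fetch(word, unseen) / total)
  end
end

# Assumed stemmed word counts after the retraining above
# (common words such as "is", "of", "have", "to" are assumed to be dropped).
sad     = { "littl" => 1, "ill" => 1, "puppi" => 1, "tax" => 1, "declar" => 1 }
neutral = { "rainbow" => 1, "full" => 1, "color" => 1 }
query   = %w[everybodi pay tax]

sad_score     = log_score(sad, query)      # about -9.4335
neutral_score = log_score(neutral, query)  # about -10.2036
empty_score   = log_score({}, query)       # Infinity: total is zero

puts sad_score, neutral_score, empty_score
```

Under this rule a category with no training data divides by zero and scores Infinity, so it would beat every trained category. That would explain why each category needs at least one statement before the answers make sense.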

The classifier that was created and trained is nice, but it disappears as soon as you close your Ruby console. To make it persistent, you have to use the Madeleine library.
“Madeleine is a Ruby implementation of Object Prevalence, that is, transparent persistence of business objects using command logging and complete system snapshots.”

require 'madeleine'

# Store the data into bayes-dir directory
madeleine = SnapshotMadeleine.new("bayes-dir") { bayes }

# Write the current state to disk
madeleine.take_snapshot

Next time, load the classifier with this command:

madeleine = SnapshotMadeleine.new("bayes-dir")

# Perform more training
madeleine.system.train "sad", "Many people were injured by the earthquake"

# Save the new state
madeleine.take_snapshot

# And test it once more
madeleine.system.classify 'smiling face'
madeleine.system.classify 'strong earthquake'
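
To get a feel for what "complete system snapshots" means, here is a small stand-in that uses only Marshal from Ruby's core library. It is not Madeleine and it has no command logging. The names `save_snapshot` and `load_snapshot`, the file name, and the hash standing in for a trained classifier are all hypothetical choices for this sketch.

```ruby
require 'tmpdir'

# Hypothetical helpers: write a whole object graph to disk and read it back.
def save_snapshot(object, path)
  File.binwrite(path, Marshal.dump(object))
end

def load_snapshot(path)
  Marshal.load(File.binread(path))
end

# Stand-in for a trained classifier: per-category word counts.
state = { "Sad" => { "tax" => 1 }, "Neutral" => { "rainbow" => 1 } }

Dir.mktmpdir do |dir|
  path = File.join(dir, "bayes.snapshot")
  save_snapshot(state, path)
  restored = load_snapshot(path)
  puts restored == state   # prints true
end
```

The point of the sketch is only that the whole object is written out and restored in one piece, so nothing has to be mapped to tables or files by hand.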

The classifier is a nice piece of code. I did enjoy it, and hope you will enjoy it too.

