Thursday, January 29, 2009

MySQL bench marks, comparing MyISAM to InnoDB and evaluating the performance hit of foreign keys

During an interview earlier this week, the question of how using foreign keys would impact performance on MySQL. I decided that it would be worthwhile to actually check this out in practice. I started out with a basic structure as follows:

CREATE TABLE IF NOT EXISTS `a` (
`id` int(11) NOT NULL,
`data` varchar(255) NOT NULL,
`data2` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS `b` (
`id` int(11) NOT NULL,
`a_id` int(11) NOT NULL,
`data3` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Then added a second database where we had a foreign key:

CREATE TABLE IF NOT EXISTS `b` (
`id` int(11) NOT NULL,
`a_id` int(11) NOT NULL,
`data3` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
KEY `a_id` (`a_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

ALTER TABLE `b`
ADD CONSTRAINT `b_ibfk_1` FOREIGN KEY (`a_id`) REFERENCES `a` (`id`) ON DELETE CASCADE;

And just to make things complete, I added a third database identical to the first, but using MyISAM as the engine.

The first thing is to do a data population. I wrote a perl script to generate 10,000 inserts of random data into table A, each with 0-5 random inserts into table B.
Next a selection of queries. I did 500 variations of SELECT * FROM a INNER JOIN b ON (a.id=b.a_id) WHERE a.id=?

For deletes, I deleted 500 rows, using DELETE FROM a WHERE id=?; DELETE FROM b WHERE a_id=? for the databases without the foreign key and the simple DELETE FROM a for the foreign key database.

The results for inserts:
databaseinsert time (s)select time (s)delete time (s)
InnoDB, no FK4.50.253
InnoDB, FK4.00.01
MyISAM1.00.750

I had always assumed that the advantage of MyISAM over InnoDB was the opposite of what it turned out to be with this benchmark. Perhaps even more surprisingly, the foreign key version of the InnoDB database was uniformly faster than the non FK version. I would have expected to see some overhead at least in the insertion stage, but even on multiple trials, I found a consistent advantage for the foreign keys on the insertion.

Saturday, January 17, 2009

Digging back in

Having recently been downsized from oversee.net, I'm using some of the time to revisit some of my old projects. At the top of the stack is lalocalfood.com, where the first phase is an improved version of my map of Los Angeles farmer's markets. Coming back to some of the old code, it's interesting to see how much time is necessary to recover my memories of some of this code I haven't touched in years. That said, once I get over that, I'm finding that my code has surprisingly good design where very little effort is required to handle new cases. I should be finished with the scraper code to collect the farmer's market data from the two main sites with one more day's work.

Thursday, April 24, 2008

Creating a controller in Catalyst

A new controller in Catalyst is registered by running script/APPNAME_create.pl CONTROLLER_NAME. This creates a module in the controller directory and a test in the test directory. These are apparently all that is needed for a controller to be created.

It's worth noting that /foo can activate either App::Controller::Foo::index() or App::Controller::Root::foo(). However, if both exist, App::Controller::Foo::index() takes precedence.

Monday, April 21, 2008

Module Monday: Quantum::Superpositions

A new weekly feature. I'm going to dig out some of the hidden gems of CPAN. One of the best places to look is Damian Conway's list of modules and this is one fun and interesting one.

Here's a selection of what Quantum::Superpositions can do:

$wanted = any("Mr","Ms").any(@names);
if ($name eq $wanted) { print "Reward!"; }

$okay = all(\&check1,\&check2);
die unless $okay->();

my $large =
all( BigNum->new($centillion),
BigNum->new($googol),
BigNum->new($SkewesNum)
);
@huge = grep {$_ > $large} @nums;

Sunday, April 20, 2008

Installing Rose::DB::Object

For the Backlog project, I was thinking this was going to be a good opportunity to also learn Rose::DB::Object, so I tried installing it via CPAN. Oops, it had some errors along the way. It turns out that SQL::ReservedWords has a dependency on Data::OptList which it doesn't declare. Installing Data::OptList solved that problem, but Rose::DB::Object is still failing its test suite:

t/db-object-manager..................7/3900 Can't locate object method "format_select_start_sql" via package "Rose::DB::__RoseDBPrivate__::Rose::DB::MySQL" at /Users/dhosek/.cpan/build/Rose-DB-Object-0.769-AXgKMz/blib/lib/Rose/DB/Object/QueryBuilder.pm line 725.
# Looks like you planned 3900 tests but only ran 827.
# Looks like your test died just after 827.
t/db-object-manager.................. Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 3073/3900 subtests
(less 783 skipped subtests: 44 okay)

My insomnia feels resolved, so I'll come back to this after sleep.

Update (21 April): I managed to solve the problem: Doing an install of Rose::DB::MySQL then re-trying to install Rose::DB::Object managed to do the trick.

Friday, April 18, 2008

Learning Catalyst: The Backlog project. Day 1

The Plan

What I'm looking to build is a web application which is designed for maintaining a product backlog. I see the following as the key user stories:
  • As a programmer, I want the system to use the wiki as its primary display and authentication system (so, users will log into Backlog using their wiki credentials, and the backlog is visible from the wiki).
  • As a programmer, I want the system to import the existing backlog from the wiki.
  • As a User, I want to be able to submit stories to the product backlog.
  • As a product owner, I want to be able to prioritize and arrange the backlog.


Installing Catalyst

This went far more easily than I expected. A simple CPAN install brought it in with no problems. I guess I was just lucky on that front.

Building the core

Step one was to build the core of the application. I ran catalyst.pl from the command line and got the following:

~/dev/Backlog> catalyst.pl Backlog
To use the Catalyst development tools including catalyst.pl and the
generated script/myapp_create.pl you need Catalyst::Helper, which is
part of the Catalyst-Devel distribution. Please install this via a
vendor package or by running one of -

perl -MCPAN -e 'install Catalyst::Devel'
perl -MCPANPLUS -e 'install Catalyst::Devel'


Apparently the Catalyst package is divided into two parts, one just for running Catalyst, one for developing Catalyst applications. Fortunately this install was also uneventful.

Rerunning it, I got

~/dev/Backlog> catalyst.pl Backlog
created "Backlog"
created "Backlog/script"
created "Backlog/lib"
created "Backlog/root"
created "Backlog/root/static"
created "Backlog/root/static/images"
created "Backlog/t"
created "Backlog/lib/Backlog"
created "Backlog/lib/Backlog/Model"
created "Backlog/lib/Backlog/View"
created "Backlog/lib/Backlog/Controller"
created "Backlog/backlog.yml"
created "Backlog/lib/Backlog.pm"
created "Backlog/lib/Backlog/Controller/Root.pm"
created "Backlog/README"
created "Backlog/Changes"
created "Backlog/t/01app.t"
created "Backlog/t/02pod.t"
created "Backlog/t/03podcoverage.t"
created "Backlog/root/static/images/catalyst_logo.png"
created "Backlog/root/static/images/btn_120x50_built.png"
created "Backlog/root/static/images/btn_120x50_built_shadow.png"
created "Backlog/root/static/images/btn_120x50_powered.png"
created "Backlog/root/static/images/btn_120x50_powered_shadow.png"
created "Backlog/root/static/images/btn_88x31_built.png"
created "Backlog/root/static/images/btn_88x31_built_shadow.png"
created "Backlog/root/static/images/btn_88x31_powered.png"
created "Backlog/root/static/images/btn_88x31_powered_shadow.png"
created "Backlog/root/favicon.ico"
created "Backlog/Makefile.PL"
created "Backlog/script/backlog_cgi.pl"
created "Backlog/script/backlog_fastcgi.pl"
created "Backlog/script/backlog_server.pl"
created "Backlog/script/backlog_test.pl"
created "Backlog/script/backlog_create.pl"

running ./Backlog/script/backlog_server.pl started up the web server on port 3000 and I was able to verify that I had a working Catalyst application to begin playing with.

Thursday, April 17, 2008

The @INC array can be more dynamic than you think

We can put subroutine references, array references and objects into the @INC array. See perldoc -f require for the hairy details.