Thursday, January 29, 2009

MySQL bench marks, comparing MyISAM to InnoDB and evaluating the performance hit of foreign keys

During an interview earlier this week, the question of how using foreign keys would impact performance on MySQL. I decided that it would be worthwhile to actually check this out in practice. I started out with a basic structure as follows:

CREATE TABLE IF NOT EXISTS `a` (
`id` int(11) NOT NULL,
`data` varchar(255) NOT NULL,
`data2` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS `b` (
`id` int(11) NOT NULL,
`a_id` int(11) NOT NULL,
`data3` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Then added a second database where we had a foreign key:

CREATE TABLE IF NOT EXISTS `b` (
`id` int(11) NOT NULL,
`a_id` int(11) NOT NULL,
`data3` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
KEY `a_id` (`a_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

ALTER TABLE `b`
ADD CONSTRAINT `b_ibfk_1` FOREIGN KEY (`a_id`) REFERENCES `a` (`id`) ON DELETE CASCADE;

And just to make things complete, I added a third database identical to the first, but using MyISAM as the engine.

The first thing is to do a data population. I wrote a perl script to generate 10,000 inserts of random data into table A, each with 0-5 random inserts into table B.
Next a selection of queries. I did 500 variations of SELECT * FROM a INNER JOIN b ON (a.id=b.a_id) WHERE a.id=?

For deletes, I deleted 500 rows, using DELETE FROM a WHERE id=?; DELETE FROM b WHERE a_id=? for the databases without the foreign key and the simple DELETE FROM a for the foreign key database.

The results for inserts:
databaseinsert time (s)select time (s)delete time (s)
InnoDB, no FK4.50.253
InnoDB, FK4.00.01
MyISAM1.00.750

I had always assumed that the advantage of MyISAM over InnoDB was the opposite of what it turned out to be with this benchmark. Perhaps even more surprisingly, the foreign key version of the InnoDB database was uniformly faster than the non FK version. I would have expected to see some overhead at least in the insertion stage, but even on multiple trials, I found a consistent advantage for the foreign keys on the insertion.

1 comment:

GreenLit Design said...

Interesting post. I'd be interested to see how you stored your results. Were you storing the "time taken" for each query in a table?