Thursday, November 6, 2008

Issues with Innodb and Statistics Create Poor Plans

We’ve recently roll out openx which is a web ad serving software. The application consists of 2 primary mysql components. The first is the backend where AD campaigns are uploaded into the system. The second is the front end where ads are served up through PHP. Mysql replication is used to move data from the backend to the front end. The frontend consists of 2 identical slaves for redundancy and load balancing with both containing the same data. The primary tables serving the ads are innodb, and quite small (at least for now).

Since its launch we’ve been seeing particularly high spiky CPU times. Sometimes it’s on one server sometimes on both, and then there are times when both are running smoothly. The spikiness was easy to figure out, in that it dealt with the query cache. Every time new data would be replicated out, it would flush the query cache and force mysql to rebuild rerun it queries, however it didn’t explain why they should be high in the first place. After running a long running queries trace I was able to determine it was being caused by a 6 table join that was scanning like crazy. My initial thought was this was due to the optimizer getting the wrong plan so I forced mysql to do an exhaustive optimization by setting optimizer_prune_level to 0. Unfortunately this didn’t do the job, but it told me the optimizer was getting the correct plan for information it was given. Through looking at the statistics of the indexes it was clear that innodb was off. After running an optimize table command against the bad tables this fixed the problems.

Trying to get row counts out of innodb is like playing roulette, you never know what number is going to come up. This is because unlike myisam that keeps the row count as part of the data structure, innodb calculates these on the fly. These same row counts are used in creating statistics for the optimizer. A long term solution could be to schedule out a job to run the optimize command regularly, which in reality just rebuilds the table and is not the best solution. A permanent solution is to change these tables to myisam.

For some further reading Peter from mysqlperfomanceblog has a great article.
Post

No comments: