Wednesday, November 19, 2008

Mysql Explain and Limit Clause

Be careful when using the explain command in mysql against a statement using the limit clause. When using explain it give the appearance that the whole table is scanned, before the limit number is extracted. For example I’ve created a table with more data off the world.city table which contains 261056 rows. If I want to select with a limit a 10 rather then showing 10 rows in the plan it show’s it as if it will run a total scan of the table, then extract the 10.


mysql> explain extended select * from citybig limit 10 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: citybig
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 261056
filtered: 100.00
Extra:
1 row in set, 1 warning (0.00 sec)

So how do you know mysql is actually only evaluating 10 rows and not scanning the entire table? This can be determined using the slow_log variable. By setting the slow query time to 0 (This can only be done in 5.1 and above) it allows you to capture all queries being executed. By logging to a table (Again This can only be done in 5.1 and above) you can easily see how many rows were examined.

For the query above I got the following results.

*************************** 21. row ***************************
start_time: 2008-11-19 22:06:34
user_host: kenk[kenk] @ localhost [127.0.0.1]
query_time: 00:00:00
lock_time: 00:00:00
rows_sent: 10
rows_examined: 10
db: world
last_insert_id: 0
insert_id: 0
server_id: 0
sql_text: select * from citybig limit 10
21 rows in set (0.00 sec)

Thursday, November 6, 2008

Issues with Innodb and Statistics Create Poor Plans

We’ve recently roll out openx which is a web ad serving software. The application consists of 2 primary mysql components. The first is the backend where AD campaigns are uploaded into the system. The second is the front end where ads are served up through PHP. Mysql replication is used to move data from the backend to the front end. The frontend consists of 2 identical slaves for redundancy and load balancing with both containing the same data. The primary tables serving the ads are innodb, and quite small (at least for now).

Since its launch we’ve been seeing particularly high spiky CPU times. Sometimes it’s on one server sometimes on both, and then there are times when both are running smoothly. The spikiness was easy to figure out, in that it dealt with the query cache. Every time new data would be replicated out, it would flush the query cache and force mysql to rebuild rerun it queries, however it didn’t explain why they should be high in the first place. After running a long running queries trace I was able to determine it was being caused by a 6 table join that was scanning like crazy. My initial thought was this was due to the optimizer getting the wrong plan so I forced mysql to do an exhaustive optimization by setting optimizer_prune_level to 0. Unfortunately this didn’t do the job, but it told me the optimizer was getting the correct plan for information it was given. Through looking at the statistics of the indexes it was clear that innodb was off. After running an optimize table command against the bad tables this fixed the problems.

Trying to get row counts out of innodb is like playing roulette, you never know what number is going to come up. This is because unlike myisam that keeps the row count as part of the data structure, innodb calculates these on the fly. These same row counts are used in creating statistics for the optimizer. A long term solution could be to schedule out a job to run the optimize command regularly, which in reality just rebuilds the table and is not the best solution. A permanent solution is to change these tables to myisam.

For some further reading Peter from mysqlperfomanceblog has a great article.
Post

Using Delimiter in mysql

The delimiter in mysql is the character or set of characters that tells mysql when to end a statement. This can be set at the session level or for the entire server. By default the delimiter uses a semi-colon “;” but this can be changed when starting the server using the delimiter option. The syntax for changing this is straight forward just type delimiter followed by the character. For example try typing
delimiter abc

If you now issue the command:
Show processlist abc

It’s as if you used a semi-colan. Also notice a semi-colon won’t work now, this is because you can only have one delimiter per session. If you open up a different mysql session and type
Show processlist abc

It wont return anything until you hit the semi-colon which in turn will return an error. The primary reason behind having this ability is play jokes on your co-workers. Set the delimiter to xxx then ask a cocky DBA if he can show you how many times the mysql has the query cache. When he issues the command laughs ensue.

Of course this isn’t the reason, rather it lies in the ability to issue several commands within a routine. For example at the command prompt you can’t write the following procedure because it will interpret each command as the only the first command terminated with a semi-colon.

create procedure multistatements()
begin
select * from t1;
select * from t2;
end;


However if you change it to:

delimiter //
create procedure multistatements()
begin
select * from t1;
select * from t2;
end;
//

It will now compile. However there’s on glitch in order to execute this within the same session it either needs to be called using “//” as the delimiter or switch back to a semi-colon.

Call multistatements()//
Or
Delimiter ;
Call multistatements();