Thursday, December 18, 2008

Troubleshooting and Trimming Innodb Data Files

Recently space became an issue with our innodb data file growing unmanageable. Since innodb doesn’t support a shrink function a reset needed to happen. In order to reclaim the disk space we needed to follow these steps.

1. Trim tables – Archive and Delete old Data
2. Run Backup – Get a backup of the complete DB
3. Drop DB
4. Stop Mysql
5. Delete Data File and Logs
6. Restart Mysql
7. Restore Backup

When we got to step 6 we were unable to restart mysql. We kept receiving the below errors in our logs


InnoDB: No valid checkpoint found.
InnoDB: If this error appears when you are creating an InnoDB database,
InnoDB: the problem may be that during an earlier attempt you managed
InnoDB: to create the InnoDB data files, but log file creation failed.
InnoDB: If that is the case, please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/error-creating-innodb.html
081218 10:19:22 [ERROR] Plugin 'InnoDB' init function returned error.
081218 10:19:22 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
081218 10:19:22 [ERROR] Unknown/unsupported table type: innodb
081218 10:19:22 [ERROR] Aborting


After 4 hours our mysql expert (Kolby Summers) and I got to the bottom. The issue was in the first error reported “No valid checkpoint found.” Essentially in step 5 we deleted the data file but not the logs. When mysql starts up with innodb enabled it will first search for the existence of these files. If they do not exists it will create them. However it’s not smart enough to know that if one exists and one doesn’t just creating the single file will most likely cause issues, as we saw. You need to manually delete both sets of files then restart.

One note here is that this error will only occur if innodb is set as the default engine. If myisam is the default data engine it will startup but without complain, however innodb won’t be available. This can be confirmed by issueing the show engines command. In addition if your backup is using mysqldump, and step seven succeeds while your default engine

Wednesday, November 19, 2008

Mysql Explain and Limit Clause

Be careful when using the explain command in mysql against a statement using the limit clause. When using explain it give the appearance that the whole table is scanned, before the limit number is extracted. For example I’ve created a table with more data off the world.city table which contains 261056 rows. If I want to select with a limit a 10 rather then showing 10 rows in the plan it show’s it as if it will run a total scan of the table, then extract the 10.


mysql> explain extended select * from citybig limit 10 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: citybig
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 261056
filtered: 100.00
Extra:
1 row in set, 1 warning (0.00 sec)

So how do you know mysql is actually only evaluating 10 rows and not scanning the entire table? This can be determined using the slow_log variable. By setting the slow query time to 0 (This can only be done in 5.1 and above) it allows you to capture all queries being executed. By logging to a table (Again This can only be done in 5.1 and above) you can easily see how many rows were examined.

For the query above I got the following results.

*************************** 21. row ***************************
start_time: 2008-11-19 22:06:34
user_host: kenk[kenk] @ localhost [127.0.0.1]
query_time: 00:00:00
lock_time: 00:00:00
rows_sent: 10
rows_examined: 10
db: world
last_insert_id: 0
insert_id: 0
server_id: 0
sql_text: select * from citybig limit 10
21 rows in set (0.00 sec)

Thursday, November 6, 2008

Issues with Innodb and Statistics Create Poor Plans

We’ve recently roll out openx which is a web ad serving software. The application consists of 2 primary mysql components. The first is the backend where AD campaigns are uploaded into the system. The second is the front end where ads are served up through PHP. Mysql replication is used to move data from the backend to the front end. The frontend consists of 2 identical slaves for redundancy and load balancing with both containing the same data. The primary tables serving the ads are innodb, and quite small (at least for now).

Since its launch we’ve been seeing particularly high spiky CPU times. Sometimes it’s on one server sometimes on both, and then there are times when both are running smoothly. The spikiness was easy to figure out, in that it dealt with the query cache. Every time new data would be replicated out, it would flush the query cache and force mysql to rebuild rerun it queries, however it didn’t explain why they should be high in the first place. After running a long running queries trace I was able to determine it was being caused by a 6 table join that was scanning like crazy. My initial thought was this was due to the optimizer getting the wrong plan so I forced mysql to do an exhaustive optimization by setting optimizer_prune_level to 0. Unfortunately this didn’t do the job, but it told me the optimizer was getting the correct plan for information it was given. Through looking at the statistics of the indexes it was clear that innodb was off. After running an optimize table command against the bad tables this fixed the problems.

Trying to get row counts out of innodb is like playing roulette, you never know what number is going to come up. This is because unlike myisam that keeps the row count as part of the data structure, innodb calculates these on the fly. These same row counts are used in creating statistics for the optimizer. A long term solution could be to schedule out a job to run the optimize command regularly, which in reality just rebuilds the table and is not the best solution. A permanent solution is to change these tables to myisam.

For some further reading Peter from mysqlperfomanceblog has a great article.
Post

Using Delimiter in mysql

The delimiter in mysql is the character or set of characters that tells mysql when to end a statement. This can be set at the session level or for the entire server. By default the delimiter uses a semi-colon “;” but this can be changed when starting the server using the delimiter option. The syntax for changing this is straight forward just type delimiter followed by the character. For example try typing
delimiter abc

If you now issue the command:
Show processlist abc

It’s as if you used a semi-colan. Also notice a semi-colon won’t work now, this is because you can only have one delimiter per session. If you open up a different mysql session and type
Show processlist abc

It wont return anything until you hit the semi-colon which in turn will return an error. The primary reason behind having this ability is play jokes on your co-workers. Set the delimiter to xxx then ask a cocky DBA if he can show you how many times the mysql has the query cache. When he issues the command laughs ensue.

Of course this isn’t the reason, rather it lies in the ability to issue several commands within a routine. For example at the command prompt you can’t write the following procedure because it will interpret each command as the only the first command terminated with a semi-colon.

create procedure multistatements()
begin
select * from t1;
select * from t2;
end;


However if you change it to:

delimiter //
create procedure multistatements()
begin
select * from t1;
select * from t2;
end;
//

It will now compile. However there’s on glitch in order to execute this within the same session it either needs to be called using “//” as the delimiter or switch back to a semi-colon.

Call multistatements()//
Or
Delimiter ;
Call multistatements();

Tuesday, October 21, 2008

Definer in Mysql Stored Procedures and Routines

The definer is a mysql security tactic used for executing mysql stored procedures. If you’re coming from an SQL Server world it’s similar to the schema that owns the SP. In other words when a definer is set it gives the user executing the SP the same perms as the definer in the context of the SP. This is much easier to explain by example.

Suppose you have a sp that runs a select on Table1. The definer must have select permissions to table1 while the user executing doesn’t need it. When the user is executing the sp they inherit the permissions as the account specified as the definer. This allows your application security to be setup so that application users only need to have execute perms on the SP’s and not perms to underlying objects.

Often when developing a SP this option is left blank and by default it is set to the user that created the sp. The problem with this is that when the SP gets pushed to production, the developer account doesn’t exist, and the SP won’t execute.

There are two workarounds here, the first is to create a standard user with the perms required to access the underlying objects in delivery. Also have this user in Test/Dev/Stress (Different Password then in production) and have this user as the definer. This is of course is the optimal solution, but requires quite a bit a work in developing your security infrastructure. The hack which works while you’re developing this infrastructure is to run the following statement in delivery after release.

update proc set definer='Defineruser@localhost';

This changes the definer to an appropriate user.

Thursday, October 16, 2008

Determine the Number of Connections Per Server in Mysql

Coming from the Dynamic View world of sql server 2005 the show commands of mysql take me back to the old days. The primarily issue with the “Show Commands” are they don’t return a result set you can work with. As a DBA one of the most common requests is finding all the processes connected to the server and how many connections each host has. Using show processlist requires capturing this as text then writing some script to parse and aggregate. To avoid this I use the information_schema database. Although not as robust as many rdbs’s there are some good tables to get information from. In this case I like to query the processlist table.


mysql> select host, count(*) as cnt from processlist group by host;
+----------------+-----+
| host | cnt |
+----------------+-----+
| localhost:4819 | 1 |
| localhost:4833 | 1 |
+----------------+-----+
2 rows in set (0.00 sec)

However the challenge here is that concatenated with the host name is the thread id, so you still can’t get a proper count on how many connections each server has. To solve this I used the ultra cool substring_index function of mysql. Here you pass it the string to be searched, the string to be found, and the count of occurrences you want to find.

In this example I was able to group by this function.

mysql> select substring_index(host,':',1) as hostname, count(*) as cnt from processlist group by substring_index(host,':',1);
+-----------+-----+
| hostname | cnt |
+-----------+-----+
| localhost | 2 |
+-----------+-----+
1 row in set (0.02 sec)

Here’s a link to mysql’s documentation

http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_substring-index

Friday, October 10, 2008

Pumping Out Warnings When Using Load Data From the Command Line

I’m currently writing a SQL Server SSIS package that leverages the Load data infile command of mysql, to import data. The problem I’ve been running into is that I need to find out any warnings that occur and can be reviewed. Since I’m calling the mysql executable from the command line, I’m unable to capture this data. The command currently looks like this:

C:\>mysql -uuser -ppassword --execute="load data local infile ‘c:\\test.txt'
Replace into table test Fields terminated by ',' " -hServer DB

To work around this I’ve added a second command inside the batch that leverages the “show warnings” command, and now looks like this.

C:\>mysql -uuser -ppassword --execute="load data local infile ‘c:\\test.txt'
Replace into table test Fields terminated by ','; show warnings " -hServer DB

This now pumps out the results. In my case within ssis I pump this out to a text file and email it as part of the results such as


C:\>mysql -uuser -ppassword --execute="load data local infile ‘c:\\test.txt'
Replace into table test Fields terminated by ',' " -hServer DB >c:\testout.txt