
MySQL Series: Performance Engineering – Find What April 14, 2015

Posted by msrviking in MySQL, Performance, Performance tuning.

In the post MySQL Performance Engineering series – The Goal I mentioned the problem statement and how I planned to go about it. I had a series of areas to attend to in order to stabilize performance, and, as the old adage goes, I started by picking the low-hanging fruit that was spoiling the backyard: the queries.

Being a first-timer at stabilizing the performance of a MySQL system, I was looking for a way to find out which queries were doing badly. In SQL Server we have DMVs that capture slow-performing queries, or traces that profile the database system for a set of events and columns (such as duration); there is something similar in the MySQL world, known as the "Slow Query Log". Luckily, I had this information handy from the Dev team, who were receiving hourly notifications on which queries were running slow on a particular day.
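As an aside: on MySQL 5.6, with the performance schema enabled, there is also a rough analogue to those SQL Server DMVs in performance_schema.events_statements_summary_by_digest. The query below is only an illustrative sketch, not something used on this assignment; the timer columns are in picoseconds, hence the division.

SELECT digest_text,
       count_star            AS calls,
       sum_timer_wait / 1e12 AS total_seconds,  -- timers are in picoseconds
       avg_timer_wait / 1e12 AS avg_seconds
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;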

Now to more details.

Slow Query Log, what is it?

Paraphrasing The Slow Query Log page: the slow query log consists of SQL statements that took longer than long_query_time ('n') seconds to execute and examined at least min_examined_row_limit ('m') rows of the table. If I remember right, we had set long_query_time to 3 seconds and left min_examined_row_limit at its default. This means mysqld would write to the log file any queries or stored procedures that need attention because they exceed the 3s execution limit.

There are a few other settings that can be configured before you enable logging of slow queries, and they can be pored over in the above link to the Slow Query Log page.
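For illustration, here is a minimal sketch of how those settings could be applied on a running MySQL 5.6 instance; the log file path is only an assumption, and the same values can instead be set in my.cnf.

SET GLOBAL slow_query_log = 'ON';                        -- start logging slow statements
SET GLOBAL slow_query_log_file = '/db01/mysql/slow.log'; -- illustrative path
SET GLOBAL long_query_time = 3;                          -- log statements taking longer than 3s
SET GLOBAL min_examined_row_limit = 0;                   -- default: no rows-examined filter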

A sample of the slow query log output is shown here:

/usr/sbin/mysqld, Version: 5.6.13-log (MySQL Community Server (GPL)). started with:
Tcp port: 3372 Unix socket: /db01/mysql/mysql.sock
Time Id Command Argument
# Time: 140128 5:12:07
# User@Host: user[pwd] @ [xxx.xxx.xxx.xx] Id: 11963111
# Query_time: 54.035533 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 204534
SET timestamp=1390907486;
call sp_name (412,15000);

Please read the above sample output for key information:

call -> the stored procedure (or query) that was executed

# Time -> the time at which the stored procedure or query executed

# Query_time -> the time taken, in seconds, for the stored procedure or query to complete

Rows_examined -> the number of rows looked through during the statement's execution

So what do you do with this raw data? My first thought was to run a utility that would read through the entire log and give me a summary of what is logged. Had I done this manually, I would have strained my brain through cycles of finding the right app or script, testing, reviewing, and abstracting the results out. I was lucky here and didn't have to spend that much time and effort. The next section describes what I did.

 

Raw data of Slow Query Log, what next?

In the above sub-section you would have noticed that the output of the slow query log is raw, yet information-rich. To get a decent report out of the raw data, I used the utility "mysqldumpslow" provided by MySQL. mysqldumpslow is a Perl (.pl) program that parses the slow query log and groups similar queries together, treating queries that differ only in their data values as the same. More here on this nice little program. I didn't use any additional parameters while parsing the log, though one could tweak the values of the options listed.

 

I had to do one more thing, install a Perl engine, before I could run mysqldumpslow.pl from its location on my system. Then I copied the raw slow query logs to the folder containing the standalone .pl file and ran the mysqldumpslow command from a command prompt in that path.
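An invocation could look like the line below; the log file name is an assumption, and -s t / -t 10 (sort by total query time, show only the top 10 entries) are optional tweaks from the documented options rather than what was used here.

perl mysqldumpslow.pl -s t -t 10 /db01/mysql/slow.log > slow-summary.txt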

 

The output looks like this

 

Count: 2 Time=57.39s (114s) Lock=0.00s (0s) Rows=313.0 (626), user[pwd]@[xxx.xxx.xx.xx] || SELECT * FROM t WHERE col = 'S'
Count: 617 Time=57.33s (35371s) Lock=0.12s (76s) Rows=1.0 (617), user[pwd]@4hosts || SELECT * FROM t1 WHERE col1 = 'S'
Count: 713 Time=56.26s (40116s) Lock=0.72s (516s) Rows=1.0 (713), user[pwd]@4hosts || SELECT * FROM t4 WHERE col2 = 'S'
Count: 3 Time=55.02s (165s) Lock=0.00s (0s) Rows=1.0 (3), user[pwd]@2hosts || select t1.col1 from t1 join(t2) where t1.col4 !=N and t1.col5=N and t1.col1=t2.col1 and t2.col2='S' and t1.col3=N

 

The above sample output has:

Count -> number of times the query was called in this log window

Time in () -> total response time in seconds

Time not in () -> average response time in seconds

Lock in () -> total locking time in seconds

Lock not in () -> average locking time in seconds

Rows in () -> total rows sent

Rows not in () -> average rows sent per call

User, pwd and hostname / host IP address -> speak for themselves

Query definition -> the query that was executed; mysqldumpslow abstracts literal values, showing 'S' for strings and N for numbers

 

Please note that the double pipe "||" after each summary line isn't produced by mysqldumpslow.pl. I modified the .pl file to emit the "||" so that each entry comes out on a single line; otherwise there would be a \n (newline) after the hostname / IP. I wanted all the data on one line so I could use text-to-columns in Excel for better readability and analysis.
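If you would rather not edit the .pl file, a hypothetical alternative is to post-process the output instead. This awk sketch joins each entry onto one line, assuming the default layout where a "Count:" header line is followed by the indented query text.

mysqldumpslow /db01/mysql/slow.log | awk '
  /^Count:/ { if (line) print line; line = $0 " ||"; next }   # start a new entry
  NF        { sub(/^[ \t]+/, ""); line = line " " $0 }        # append the query text
  END       { if (line) print line }                          # flush the last entry
'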

 

Now that the slow query log had been formatted by the utility, it was time to analyze it, and that wasn't as difficult as figuring out how to format it for better reporting.

Analyze the Slow Query log

Here is the sample report I put together in Excel for better analysis:

Query Profile

Calls Count | Average Time (s) | Total Time (s) | Row Lock (s) avg (total) | # Rows Sent | # Rows Examined | Query Text
11391 | 43.47 | 495193 | 0.00 (2) | 1 | 11391 | StoredProcedure
3757 | 34.77 | 130632 | 0.45 (1695) | 418 | 1570426 | Query 1
1788 | 14.42 | 25779 | 0.29 (514) | 1 | 1788 | Query 2
1684 | 48.6 | 81849 | 0.04 (73) | 1 | 1683 | Query 3
1117 | 52.14 | 58244 | 0.12 (137) | 1 | 1117 | Query 4

Are there tools that could help summarize all of this? I haven't used any myself, but there is one I have kept learning from here.

I decided to look first at queries with the lowest call counts but the highest average or total response times. This was the onset of my performance engineering of the MySQL systems toward a single goal.

 

Happy reading.


MySQL Series: Performance Engineering – The Goal March 24, 2015

Posted by msrviking in MySQL, Performance.

This post carries the one-line description that was the goal of the performance engineering assignment I had to work on, and the plan for how I went about it. The statement reads: "Ensure we have acceptable performance in the DB for 5000 concurrent users." I knew what was waiting for me with this one-line statement.

With no delay, I put up a plan that would touch the performance soft spots of the MySQL system. I wasn't sure how much performance improvement would be achieved to scale the system after addressing these areas.

Here is what one would want to address; I believe this will always help as a reference.

The table consists of the Areas to be looked into, the Dependencies (inputs needed from the team or system for me to assess), and the Priority of each Area.

# | Areas | Dependencies | Priority
1 | Identify poorly performing queries and all application queries | |
1.1 | Assess the query performance | Query list | High
| – Plan | Queries parameters | High
| – Concurrency | Access to the latest production DB in Dev environment | High
1.2 | Assess the table performance | Information on business (application) purpose of the query – application perspective |
| – Index | Inputs from existing query benchmarks – production or performance testing environment | High
| – Partition | | Medium
| – Structure | | Medium
2 | Identify configuration parameters | |
2.1 | Assess the impact of existing configuration parameters | Configuration parameters in production | Medium
2.2 | Recommend new configuration parameters | Architecture setup of MySQL instances in production |
3 | DB scaling (Up & Out) | |
3.1 | Assess the current performance of hardware and MySQL | Current hardware setup of MySQL instances | Medium
3.2 | Propose additional hardware based on findings | Architecture setup of MySQL instances in production | Medium
3.3 | Propose change in MySQL architecture | | Medium

 

I followed this plan throughout the assignment for project management reporting, tracking progress, and, at the end, quantifying the percentage of performance improvement in each area.

To re-emphasize, the intent of having this plan was to work bottom-up: the work had to start from the query, then move to indexes, tables, database, instance, and hardware, which is the standard approach for any performance optimization. The following posts will cover most of the above line items.

Happy reading.