jump to navigation

MySQL Series: Performance Engineering – Find What April 14, 2015

Posted by msrviking in MySQL, Performance, Performance tuning.
Tags: , , , , ,
add a comment

In the post MySQL Performance Engineering series – The Goal I had mentioned about the problem statement, and how I planned to go about it. I had series of areas that I need to attend to stabilize the performance and as an old adage I went by finding those low hanging fruits that were spoiling the backyard “the queries”.

Being a first timer on stabilizing the performance of MySQL system I was looking for a way out on, how would I know which queries are doing badly? As in SQL Server we have DMVs that capture the slow performing queries, or traces which profile the database system for a set events and columns (duration), there is something similar in MySQL world known as “Slow Query log”. Luckily I had this handy information from the Dev team who were receiving hourly notifications on which queries were running slow on a particular day.

Now to more details.

Slow Query Log, what is it?

Taking the text from The Slow Query Log page which says “A slow query log has those SQL statements that took longer than these ‘n’ seconds to fetch rows after examining ‘m’ number of rows of the table”. If I remember right, I believe we had set the value for long_query_time > 3s, and default value for min_examined_rows_limit. This would mean that mysqld would write to log file for any queries, stored procedures that need attention because they are exceeding the limit of execution of 3s.

There are few other settings too that could be configured before you enable logging of slow queries into the log, and could be pored through the above link of Slow Query Log.

A sample output of what the slow query log is here

/usr/sbin/mysqld, Version: 5.6.13-log (MySQL Community Server (GPL)). started with:

Tcp port: 3372 Unix socket: /db01/mysql/mysql.sock

Time Id Command Argument

SET timestamp=1390907486;

call sp_name (412,15000);

# Time: 140128 5:12:07

# User@Host: user[pwd] @ [xxx.xxx.xxx.xx] Id: 11963111

# Query_time: 54.035533 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 204534

Please read the above sample output for key information like

Call..-> is the stored procedure name that is called

Time -> is the time of execution of the stored procedure or a query

Query time -> the time taken in seconds for the stored procedure or query to complete

Rows examined -> number of rows looked through during the queries execution in the stored procedure

So what do you do with this raw data? I would think of running a utility that will read through the entire log and give me a summary of what is logged. Well, if I had to do this manually I would have had to use my brains stressed to a level of finding a right app, script, test, review (in cyclical pattern) and abstract the results out. I had been lucky here to and hadn’t spent that much time and effort. The next section will describe of what I did.

 

Raw data of Slow Query Log, what next?

In the above sub-section you would have noticed the output of the slow query log which is Raw, and information rich. Now, to get a decent report of the raw data I had to use the utility “mysqldumpslow” provided by MySQL. Mysqldumpslow is a perl (.pl) program that parses the slow query log and groups similar queries except if data values are different. More here on this nice little program. I didn’t use any additional parameters to parse through the log, and one could tweak the values of the options listed.

 

I had to do one more thing “Install a perl engine” before I could run the mysqldumpslow.pl in its location in my system. And then, I had to copy the raw slow query logs to the folder that standalone .pl file, and run the command mysqldumpslow through the command prompt in the path.

 

The output looks like this

 

Count: 2 Time=57.39s (114s) Lock=0.00s (0s) Rows=313.0 (626), user[pwd]@[xxx.xxx.xx.xx] || SELECT * FROM t WHERE col = ‘S’

Count: 617 Time=57.33s (35371s) Lock=0.12s (76s) Rows=1.0 (617), user[pwd]@4hosts || SELECT * FROM t1 WHERE col1 = ‘S’

Count: 713 Time=56.26s (40116s) Lock=0.72s (516s) Rows=1.0 (713), user[pwd]@4hosts || SELECT * FROM t4 WHERE col2 = ‘S’

Count: 3 Time=55.02s (165s) Lock=0.00s (0s) Rows=1.0 (3), user[pwd]@2hosts || select t1.col1 from t1 join(t2) where t1.col4 !=N and t1.col5=N and t1.col1=t2.col1 and t2.col2=’S’ and t1.col3=N

 

The above sample output has

Count -> number of times the query was called at that instance

Time in () -> total response time in seconds

Time not in () -> average response time in seconds

Lock in () -> total locking time in seconds

Lock not in () -> average response time in seconds

Rows in () -> rows examined

Rows not in () -> rows sent

User, pwd and hostname /host ip address -> speak for themselves

Query definition -> query that was executed with response time, lock time, rows examined, and sent

 

Please note the two | “||” after the query aren’t from the mysqldumpslow.pl. I had to modify the .pl file to include the || for bringing the line of output into single line, else I would have had a /n (enter character) after the hostname /ip. I wanted to pull all the data into one single line, and convert text to columns in excel for better readability and analysis.

 

Now the slow query log is formatted from the utility and it’s to be analyzed, and that wasn’t difficult as much figuring out on how to format for better reporting.

Analyze the Slow Query log

Here is the sample report I had to put in excel for better analysis

Query Profile
Calls Count Average Time (s) Total Time (s) Row Lock (s) # Rows Sent #Rows Examined Query Text
11391 43.47 495193 0.00 (2) 1 11391 StoredProcedure
3757 34.77 130632 0.45 (1695) 418 1570426 Query 1
1788 14.42 25779 0.29 (514) 1 1788 Query 2
1684 48.6 81849 0.04 (73) 1 1683 Query 3
1117 52.14 58244 0.12 (137) 1 1117 Query 4

Are there tools that could help you summarize all these, well I haven’t used any but yes there is one I have kept learning from here.

I decided to look for queries that had least calls count, and highest average or total response time. This was the onset of my performance engineering the MySQL systems for a single goal.

 

Happy reading.

Advertisements

MySQL Series: Performance Engineering – The Goal March 24, 2015

Posted by msrviking in MySQL, Performance.
Tags: , , , ,
add a comment

This post has that one line of description as the goal for performance engineering assignment I had to work on, and about the plan on how I went about. The statement says “Ensure we have acceptable performance in the DB for 5000 concurrent users.” I knew what was waiting from me with this one line statement.

And with no delay, I had put up the plan that would touch the performance soft spots of the MySQL system. I wasn’t sure how much performance improvement will be achieved to scale the system after touching these areas.

Here is what one would want to address, and I believe this will help as reference always.

The table consists of Areas to be looked into, Dependency as in inputs from the team or system for me to assess and the priority of each Area.

# Areas Dependencies Priority
1 Identify poorly performing queries and all application queries
1.1

Assess the query performance

Query list High
Plan Queries parameters High
Concurrency Access to the latest production DB in Dev environment High
1.2

Assess the table performance

Information on business (application) purpose of the query – application perspective
Index Inputs from existing query benchmarks – production or performance testing environment High
Partition Medium
Structure Medium
2 Identify configuration parameters
2.1 Assess the impact of existing configuration parameters Configuration parameters in production Medium
2.2 Recommend new configuration parameters Architecture setup of MySQL instances in production
3 DB scaling (Up & Out)
3.1 Assess the current performance of Hardware and MySQL Current hardware setup of MySQL instances Medium
3.2 Propose additional hardware based on findings Architecture setup of MySQL instances in production Medium
3.3 Propose change in MySQL architecture Medium

 

I had to follow this plan all throughout the assignment for project management reporting, tracking the progress, and at the end quantifying the % of performance improvement in each area.

Not to re-emphasize the intent of having this plan is to work bottom-up. The work had to start from query, index, tables, database, instance, and hardware which is the standard for any performance optimization. The following posts would be on most of the above line items.

Happy reading.

A year and more gone by..no posts March 23, 2015

Posted by msrviking in Configuration, Data Integration, Design, Integration, MySQL, Oracle, Performance tuning.
Tags: , , , , , ,
add a comment

I alone know on how I have missed blogging on what all I learnt on MySQL and Oracle RDS over last 14 months, and how much I wished that what all I did was on SQL Server. Well, not always it is the way you want it to be.

There was such a learning over this period on the way things are done in MySQL, Oracle at design, development and performance engineering that I might have to spend several days blogging about. In next few weeks to months I will be spending time on variety of learnings starting from design, coding, through performance tuning to achieve concurrency and scalability. Although these topics will primarily be around MySQL, Oracle I might map on some of the done work to SQL Server features and implementations.

A brief preview on what would be the topics on, and possibly will be having its own sub-series in detail with mapping

MySQL:

  1. What are the physical design checklist
  2. What are best instance level configurations
  3. What are the optimal working configuration values of host hardware and OS
  4. What are the ways to optimize execution plans
  5. What are the coding best practices
  6. What are the ways to troubleshoot deadlocking problems

Oracle RDS:

  1. What are the best practices for coding in PL/SQL
  2. What are the best practices for designing and building integration DB
  3. What are the ways to optimize execution plans
  4. What are the ways to monitor for any performance bottlenecks
  5. What are the ways to achieve concurrency, scalability and performance
  6. What is that not do-able when compared to on-premise instances
  7. How to administer the instance

Happy reading.

What am I doing now-a-days over last few months..? January 30, 2014

Posted by msrviking in General, MySQL, Performance tuning.
Tags: , , , ,
add a comment

I had been busy doing stuff on SSIS for few months, then on MySQL from the last post I have put up in the blog. What is that I am doing on MySQL and writing a post on a SQL Server blogging site? Well, I am kind a trying to get hold on how MySQL works, while I am trying to stabilize the performance of system. Surely I am not a MySQL geek to look at OS, Hardware, MySQL configurations deeply, but with little knowledge I had and gaining as time is going by I am trying to troubleshoot performance of the queries, indexing, and partitioning tables. These are the few things that I have been trying to put in place even before I get on other levels of performance engineering.

A thread is already on if I could shard the database, introduce read-write splitting to provide a scale-out solution by using out of the box features like MySQL cluster or customizing by partitioning tables, sharding them into different nodes, read-write splitting using MemCache.

These are lot of thinking in terms of MySQL but then I don’t have such flexibility in SQL Server although there are some implementations that use read-write splitting, using load balancers at application level and not database. I am highlighting some of those that are not there in SQL Server, and available to be used at fullest in MySQL. But then there are many of those missing in MySQL which is so good to use and work out things in SQL Server.

Some of the ones that I am missing MySQL badly are

  • Profiler
  • Graphical execution plan
  • Indexes with include
  • Free and easy to use monitoring tools for OS, Hardware
  • Multiple query plan algorithms
  • Proper documentation (implementation or bugs)

This post is to share what I am seeing those top few things in MySQL and few things that I am missing when I think of SQL Server. It’s kind of missing or home-sick post I would say.

I will keep writing as and when I learn new thing and definitely put a comparison with SQL Server features.

Happy reading.