MySQL Stored Procedures Ain t All That
I give quite a lot of presentations. A whole lot less than I used to, but still quite a few per year. Most of the time, the presentations are on performance tuning MySQL.
Almost every time I give a presentation on MySQL performance tuning and this happens 100% of the time if I am presenting to a Windows SQL Server crowd I get the following question:
Why don t you cover using stored procedures in order to increase performance? Wouldn t that be the easiest way to get better performance since the stored procedures will only be parsed once and then the compiled bytecode would be efficiently executed from then on?
Every person that asks this question assumes something about MySQL s stored procedure implementation; they incorrectly believe that stored procedures are compiled and stored in a global stored procedure cache. similar to the stored procedure cache in Microsoft SQL Server  or Oracle  .
This is wrong. Flat-out incorrect.
Here is the truth: Every single connection to the MySQL server maintains it s own stored procedure cache.
This means two very important things that users of stored procedures should understand:
- If you operate in a shared-nothing environment for example, the majority of PHP and Python applications that do not use connection pooling or persistent connections if your application uses stored procedures, the connection is compiling the stored procedure, storing it in a cache, and destroying that cache every single time you connect to the database server and issue a CALL statement
- If you use stored procedures, the memory usage of every single connection that uses those stored procedures is going to increase, and will increase substantially if you use many stored procedures
Ooops, I Invalidated Everything Again
So, what happens when you CREATE. ALTER. or DROPany stored procedures. Since MySQL stores all stored procedure execution code on the connection threads, each of those connection threads must invalidate the procedure in its caches that has changed, right?
No, it s worse. Every time ANY stored procedure is added, dropped, or updated, ALL stored procedures on ALL connection threads will be invalidated and must be re-compiled. Here is how the caches are invalidated:
from /sql/sp_cache.cc. lines 193-197, in MySQL 5.5
It s a bit misleading, since it actually doesn t invalidate anything at all. What the above code does is increment the global Cversion variable. When a connection thread attempts to execute, drop or insert a new procedure, it will notice that it s local cache s version number is less than this Cversion number and will destroy the entire cache and rebuild it gradually as procedures are affected or executed.
So, Should You Use Stored Procedures in MySQL?
The above warning doesn t necessarily mean that you should never use stored procedures? No. What it means (besides being a bit of a rant on the implementation of MySQL s stored procedures) is that you should be aware of these issues and use stored procedures where they make the most sense:
- When you know that you will be executing the stored procedure over and over again on the same connection for instance, in a bulk loading script or similar
- When you know that you will not be disconnecting from the MySQL server at the end of script execution for instance, if you use JDBC connection pooling
- When you know that you have a limited number of stored procedures and the memory usage of connections won t be an issue
Finally, if you see benchmarks that purport to show a huge performance increase from using stored procedures in MySQL, be careful to understand what the benchmark is doing and whether that benchmark represents your real-world environment. For instance, if you see a huge performance increase in sysbench when using stored procedures, but you have a PHP shared-nothing environment, understand that those benchmark results mean very little to you, since sysbench connections don t get destroyed until the end of the run
 From my copy of Inside SQL Server 2000, Delaney (2001), pages 852-865. For a short, but decent, online explanation of SQL Server s stored procedure cache, see here
 Oracle s stored procedures are stored in the shared pool of the Oracle system global area (SGA)
Previous Article Now Recording Drizzle Contributor Tutorial
Next Article Developing Nova on Linux Getting Started
Your last points on when to use stored procedures are spot on again, assuming that the goal is faster due to compiling once .
I am onsite at a client today (a rarity for Pythian) and they have an ETL server that s CPU bound that processes 40,000 queries per second, with about 6,000 commands sent to the mysql server (basically on average a stored procedure does about 7 queries) ..
They aren t using persistent connections or calling the same queries over and over in one connection, but they found that with the *volume* of queries they were able to do, actually having less network overhead made a huge difference for them (ie, instead of making 7 calls to the server, making 1 call to a stored procedure to do those 7 tasks).
Everyone s mileage may vary, and you re 100% correct with your points .it s ironic that today in particular I ran across someone using stored procedures correctly and improving performance but not because of compilation.
I would be curious as to your thoughts on this from a Drizzle perspective. I know there are plans on implementing stored procs in Drizzle will it be with the same centralized cache architecture as SQL Server and Oracle?
http://joinfu.com/ Jay Pipes