Connection pooling is a technique of creating and managing a pool of connections that are ready for use by any thread that needs them. Connection pooling can greatly increase the performance of your Java application, while reducing overall resource usage.
Most applications only need a thread to have access to a JDBC connection when they are actively processing a transaction, which often takes only milliseconds to complete. When not processing a transaction, the connection sits idle. Connection pooling enables the idle connection to be used by some other thread to do useful work.
In practice, when a thread needs to do work against a MySQL or other database with JDBC, it requests a connection from the pool. When the thread is finished using the connection, it returns it to the pool, so that it can be used by any other threads.
When the connection is loaned out from the pool, it is used
exclusively by the thread that requested it. From a programming
point of view, it is the same as if your thread called
DriverManager.getConnection()
every time it
needed a JDBC connection. With connection pooling, your thread may
end up using either a new connection or an already-existing
connection.
The main benefits to connection pooling are:
Reduced connection creation time.
Although this is not usually an issue with the quick connection setup that MySQL offers compared to other databases, creating new JDBC connections still incurs networking and JDBC driver overhead that will be avoided if connections are recycled.
Simplified programming model.
When using connection pooling, each individual thread can act as though it has created its own JDBC connection, allowing you to use straightforward JDBC programming techniques.
Controlled resource usage.
If you create a new connection every time a thread needs one rather than using connection pooling, your application's resource usage can be wasteful, and it could lead to unpredictable behaviors for your application when it is under a heavy load.
The concept of connection pooling in JDBC has been standardized through the JDBC 2.0 Optional interfaces, and all major application servers have implementations of these APIs that work with MySQL Connector/J.
Generally, you configure a connection pool in your application server configuration files, and access it through the Java Naming and Directory Interface (JNDI). The following code shows how you might use a connection pool from an application deployed in a J2EE application server:
Example 7.1 Connector/J: Using a connection pool with a J2EE application server
import java.sql.Connection; import java.sql.SQLException; import java.sql.Statement; import javax.naming.InitialContext; import javax.sql.DataSource; public class MyServletJspOrEjb { public void doSomething() throws Exception { /* * Create a JNDI Initial context to be able to * lookup the DataSource * * In production-level code, this should be cached as * an instance or static variable, as it can * be quite expensive to create a JNDI context. * * Note: This code only works when you are using servlets * or EJBs in a J2EE application server. If you are * using connection pooling in standalone Java code, you * will have to create/configure datasources using whatever * mechanisms your particular connection pooling library * provides. */ InitialContext ctx = new InitialContext(); /* * Lookup the DataSource, which will be backed by a pool * that the application server provides. DataSource instances * are also a good candidate for caching as an instance * variable, as JNDI lookups can be expensive as well. */ DataSource ds = (DataSource)ctx.lookup("java:comp/env/jdbc/MySQLDB"); /* * The following code is what would actually be in your * Servlet, JSP or EJB 'service' method...where you need * to work with a JDBC connection. */ Connection conn = null; Statement stmt = null; try { conn = ds.getConnection(); /* * Now, use normal JDBC programming to work with * MySQL, making sure to close each resource when you're * finished with it, which permits the connection pool * resources to be recovered as quickly as possible */ stmt = conn.createStatement(); stmt.execute("SOME SQL QUERY"); stmt.close(); stmt = null; conn.close(); conn = null; } finally { /* * close any jdbc instances here that weren't * explicitly closed during normal code path, so * that we don't 'leak' resources... */ if (stmt != null) { try { stmt.close(); } catch (sqlexception sqlex) { // ignore, as we can't do anything about it here } stmt = null; } if (conn != null) { try { conn.close(); } catch (sqlexception sqlex) { // ignore, as we can't do anything about it here } conn = null; } } } }
As shown in the example above, after obtaining the JNDI
InitialContext
, and looking up the
DataSource
, the rest of the code follows
familiar JDBC conventions.
When using connection pooling, always make sure that connections, and anything created by them (such as statements or result sets) are closed. This rule applies no matter what happens in your code (exceptions, flow-of-control, and so forth). When these objects are closed, they can be re-used; otherwise, they will be stranded, which means that the MySQL server resources they represent (such as buffers, locks, or sockets) are tied up for some time, or in the worst case can be tied up forever.
Each connection to MySQL has overhead (memory, CPU, context switches, and so forth) on both the client and server side. Every connection limits how many resources there are available to your application as well as the MySQL server. Many of these resources will be used whether or not the connection is actually doing any useful work! Connection pools can be tuned to maximize performance, while keeping resource utilization below the point where your application will start to fail rather than just run slower.
The optimal size for the connection pool depends on anticipated load and average database transaction time. In practice, the optimal connection pool size can be smaller than you might expect. If you take Oracle's Java Petstore blueprint application for example, a connection pool of 15-20 connections can serve a relatively moderate load (600 concurrent users) using MySQL and Tomcat with acceptable response times.
To correctly size a connection pool for your application, create load test scripts with tools such as Apache JMeter or The Grinder, and load test your application.
An easy way to determine a starting point is to configure your connection pool's maximum number of connections to be unbounded, run a load test, and measure the largest amount of concurrently used connections. You can then work backward from there to determine what values of minimum and maximum pooled connections give the best performance for your particular application.
MySQL Connector/J can validate the connection by executing a lightweight ping against a server. In the case of load-balanced connections, this is performed against all active pooled internal connections that are retained. This is beneficial to Java applications using connection pools, as the pool can use this feature to validate connections. Depending on your connection pool and configuration, this validation can be carried out at different times:
Before the pool returns a connection to the application.
When the application returns a connection to the pool.
During periodic checks of idle connections.
To use this feature, specify a validation query in your connection
pool that starts with /* ping */
. Note that the
syntax must be exactly as specified. This will cause the driver
send a ping to the server and return a dummy lightweight result
set. When using a ReplicationConnection
or
LoadBalancedConnection
, the ping will be sent
across all active connections.
It is critical that the syntax be specified correctly. The syntax needs to be exact for reasons of efficiency, as this test is done for every statement that is executed:
protected static final String PING_MARKER = "/* ping */"; ... if (sql.charAt(0) == '/') { if (sql.startsWith(PING_MARKER)) { doPingInstead(); ...
None of the following snippets will work, because the ping syntax is sensitive to whitespace, capitalization, and placement:
sql = "/* PING */ SELECT 1"; sql = "SELECT 1 /* ping*/"; sql = "/*ping*/ SELECT 1"; sql = " /* ping */ SELECT 1"; sql = "/*to ping or not to ping*/ SELECT 1";
All of the previous statements will issue a normal
SELECT
statement and will
not be transformed into the
lightweight ping. Further, for load-balanced connections, the
statement will be executed against one connection in the internal
pool, rather than validating each underlying physical connection.
This results in the non-active physical connections assuming a
stale state, and they may die. If Connector/J then re-balances, it
might select a dead connection, resulting in an exception being
passed to the application. To help prevent this, you can use
loadBalanceValidateConnectionOnSwapServer
to
validate the connection before use.
If your Connector/J deployment uses a connection pool that allows
you to specify a validation query, take advantage of it, but
ensure that the query starts exactly with
/* ping */
. This is particularly important if
you are using the load-balancing or replication-aware features of
Connector/J, as it will help keep alive connections which
otherwise will go stale and die, causing problems later.