DataSource

NOTE: The material in this chapter is based on JDBC™ API Tutorial and Reference, Second Edition: Universal Data Access for the Java™ 2 Platform, published by Addison Wesley as part of the Java series, ISBN 0-201-43328-1.

4.1 DataSource Overview

A DataSource object is the representation of a data source in the Java programming language. In basic terms, a data source is a facility for storing data. It can be as sophisticated as a complex database for a large corporation or as simple as a file with rows and columns. A data source can reside on a remote server, or it can be on a local desktop machine. Applications access a data source using a connection, and a DataSource object can be thought of as a factory for connections to the particular data source that the DataSource instance represents. The DataSource interface provides two methods for establishing a connection with a data source.

Using a DataSource object is the preferred alternative to using the DriverManager for establishing a connection to a data source. They are similar to the extent that the DriverManager class and DataSource interface both have methods for creating a connection, methods for getting and setting a timeout limit for making a connection, and methods for getting and setting a stream for logging.

Their differences are more significant than their similarities, however. Unlike the DriverManager, a DataSource object has properties that identify and describe the data source it represents. Also, a DataSource object works with a Java™ Naming and Directory Interface™ (JNDI) naming service and is created, deployed, and managed separately from the applications that use it. A driver vendor will provide a class that is a basic implementation of the DataSource interface as part of its JDBC 2.0 or 3.0 driver product. Later sections of this chapter describe how a system administrator registers a DataSource object with a JNDI naming service and how an application then uses that registered object to get a connection to a data source.

Being registered with a JNDI naming service gives a DataSource object two major advantages over the DriverManager. First, an application does not need to hardcode driver information, as it does with the DriverManager. A programmer can choose a logical name for the data source and register the logical name with a JNDI naming service. The application uses the logical name, and the JNDI naming service will supply the DataSource object associated with the logical name. The DataSource object can then be used to create a connection to the data source it represents.

The second major advantage is that the DataSource facility allows developers to implement a DataSource class to take advantage of features like connection pooling and distributed transactions. Connection pooling can increase performance dramatically by reusing connections rather than creating a new physical connection each time a connection is requested. The ability to use distributed transactions enables an application to do the heavy-duty database work of large enterprises.

Although an application may use either the DriverManager or a DataSource object to get a connection, using a DataSource object offers significant advantages and is the recommended way to establish a connection.

4.1.1 Properties

A DataSource object has a set of properties that identify and describe the real-world data source that it represents. These properties include information like the location of the database server, the name of the database, the network protocol to use to communicate with the server, and so on. DataSource properties follow the JavaBeans design pattern and are usually set when a DataSource object is deployed.

To encourage uniformity among DataSource implementations from different vendors, the JDBC 2.0 API specifies a standard set of properties and a standard name for each property. The following table gives the standard name, the data type, and a description for each of the standard properties. Note that a DataSource implementation does not have to support all of these properties; the table just shows the standard name that an implementation should use when it supports a property.

Table 4.1 Standard Data Source Properties

Property Name	Type	Description
databaseName	`String`	the name of a particular database on a server
dataSourceName	`String`	the logical name for the underlying `XADataSource` or `ConnectionPoolDataSource` object; used only when pooling of connections or distributed transactions are implemented
description	`String`	a description of this data source
networkProtocol	`String`	the network protocol used to communicate with the server
password	`String`	the user's database password
portNumber	`int`	the port number where a server is listening for requests
roleName	`String`	the initial SQL rolename
serverName	`String`	the database server name
user	`String`	the user's account name

A DataSource object will, of course, have to support all of the properties that the data source it represents needs for making a connection, but the only property that all DataSource implementations are required to support is the description property. This standardizing of properties makes it possible, for instance, for a utility to be written that lists available data sources, giving a description of each along with the other property information that is available.

A DataSource object is not restricted to using only those properties specified in Table 4.1. A vendor may add its own properties, in which case it should give each new property a vendor-specific name.

If a DataSource object supports a property, it must supply getter and setter methods for it. The following code fragment illustrates the methods that a DataSource object ds would need to include if it supports, for example, the property serverName.

ds.setServerName("my_database_server");
String serverName = ds.getServerName();

Properties will most likely be set by a developer or system administrator using a GUI tool as part of the installation of the data source. Users connecting to the data source do not get or set properties. The DataSource interface reinforces this by not declaring getter and setter methods for properties; they are supplied only by a particular implementation. Keeping these methods in the implementation and out of the public interface separates the management API for DataSource objects from the API used by applications. Management tools can get at properties by using introspection.
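
As a rough illustration of how a management tool might discover properties through introspection, the following sketch uses the standard java.beans.Introspector class. The nested VendorDataSource class and the DataSourcePropertyLister name are hypothetical stand-ins invented for this example; a real vendor class would also implement javax.sql.DataSource.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

public class DataSourcePropertyLister {

    /** Hypothetical stand-in for a vendor class; only its bean properties matter here. */
    public static class VendorDataSource {
        private String serverName;
        private int portNumber;
        private String description;

        public String getServerName() { return serverName; }
        public void setServerName(String serverName) { this.serverName = serverName; }
        public int getPortNumber() { return portNumber; }
        public void setPortNumber(int portNumber) { this.portNumber = portNumber; }
        public String getDescription() { return description; }
        public void setDescription(String description) { this.description = description; }
    }

    /** Returns property name -> simple type name for each property with a getter and a setter. */
    public static Map<String, String> listProperties(Class<?> beanClass) {
        try {
            // Stop at Object so that the inherited "class" property is not reported.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            Map<String, String> result = new TreeMap<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    result.put(pd.getName(), pd.getPropertyType().getSimpleName());
                }
            }
            return result;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        listProperties(VendorDataSource.class)
            .forEach((name, type) -> System.out.println(name + " : " + type));
    }
}
```

Because the tool works only from the getter and setter naming pattern, it needs no compile-time knowledge of the vendor class, which is what lets a generic utility list the properties of any DataSource implementation.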

4.1.2 Using JNDI

JNDI provides a uniform way for an application to find and access remote services over the network. The remote service may be any enterprise service, including a messaging service or an application-specific service, but, of course, a JDBC application is interested mainly in a database service. Once a DataSource object is created and registered with a JNDI naming service, an application can use the JNDI API to access that DataSource object, which can then be used to connect to the data source it represents.

4.1.3 Creating and Registering a DataSource Object

A DataSource object is usually created, deployed, and managed separately from the Java applications that use it. For example, the following code fragment creates a DataSource object, sets its properties, and registers it with a JNDI naming service. Note that a DataSource object for a particular data source is created and deployed by a developer or system administrator, not the user. The class VendorDataSource would most likely be supplied by a driver vendor. (The code example in the next section will show the code that a user would write to get a connection.) Note also that a GUI tool will probably be used to deploy a DataSource object, so the following code, shown here mainly for illustration, is what such a tool would execute.

VendorDataSource vds = new VendorDataSource();

vds.setServerName("my_database_server");
vds.setDatabaseName("my_database");
vds.setDescription("the data source for inventory and personnel");

Context ctx = new InitialContext();

ctx.bind("jdbc/AcmeDB", vds);

The first four lines represent API from a vendor's class VendorDataSource, an implementation of the javax.sql.DataSource interface. They create a DataSource object, vds, and set its serverName, databaseName, and description properties. The fifth and sixth lines use JNDI API to register vds with a JNDI naming service. The fifth line calls the default InitialContext constructor to create a Java object that references the initial JNDI naming context. System properties, which are not shown here, tell JNDI which naming service provider to use. The last line associates vds with a logical name for the data source that vds represents.

The JNDI namespace consists of an initial naming context and any number of subcontexts under it. It is hierarchical, similar to the directory/file structure in many file systems, with the initial context being analogous to the root of a file system and subcontexts being analogous to subdirectories. The root of the JNDI hierarchy is the initial context, here represented by the variable ctx. Under the initial context there may be many subcontexts, one of which is jdbc, the JNDI subcontext reserved for JDBC data sources. (The logical data source name may be in the subcontext jdbc or in a subcontext under jdbc.) The last element in the hierarchy is the object being registered, analogous to a file, which in this case is a logical name for a data source. The result of the preceding six lines of code is that the VendorDataSource object vds is associated with jdbc/AcmeDB. The following section shows how an application uses this to connect to a data source.
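
The hierarchical structure of a logical name can be inspected without any naming service by using the standard javax.naming.CompositeName class, which splits a name on the "/" separator. The sketch below is only an illustration; the class name JndiNameParts, the describe method, and the name jdbc/inventory/AcmeDB are made up for this example.

```java
import javax.naming.CompositeName;
import javax.naming.InvalidNameException;
import javax.naming.Name;

public class JndiNameParts {

    /** Splits a logical name into its subcontext path and the final bound name. */
    public static String describe(String logicalName) {
        try {
            Name name = new CompositeName(logicalName);
            if (name.isEmpty()) {
                throw new IllegalArgumentException("empty name");
            }
            int last = name.size() - 1;
            String subcontexts = name.getPrefix(last).toString(); // components before the last one
            String boundName = name.get(last);                    // entry the object is bound under
            return "subcontext=" + subcontexts + ", name=" + boundName;
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("jdbc/AcmeDB"));
        System.out.println(describe("jdbc/inventory/AcmeDB"));
    }
}
```

For jdbc/AcmeDB the subcontext is jdbc and the bound name is AcmeDB, which mirrors the directory and file analogy used above.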

4.1.4 Connecting to a Data Source

In the previous section, a DataSource object, vds, was given properties and bound to the logical name jdbc/AcmeDB. The following code fragment shows application code that uses this logical name to connect to the database that vds represents. The code then uses the connection to print the name and title of each member of the sales and customer service departments.

Context ctx = new InitialContext();

DataSource ds = (DataSource)ctx.lookup("jdbc/AcmeDB");
Connection con = ds.getConnection("genius", "abracadabra");
con.setAutoCommit(false);
PreparedStatement pstmt = con.prepareStatement(
                            "SELECT NAME, TITLE FROM PERSONNEL WHERE DEPT = ?");
pstmt.setString(1, "SALES");
ResultSet rs = pstmt.executeQuery();

System.out.println("Sales Department:");
while (rs.next()) {
        String name = rs.getString("NAME");
        String title = rs.getString("TITLE");
        System.out.println(name + "     " + title);
}
pstmt.setString(1, "CUST_SERVICE");
rs.close();                 // release the first result set before reusing the variable
rs = pstmt.executeQuery();

System.out.println("Customer Service Department:");
while (rs.next()) {
        String name = rs.getString("NAME");
        String title = rs.getString("TITLE");
        System.out.println(name + "     " + title);
}
rs.close();
pstmt.close();

con.commit();               // end the single transaction that covered both queries
con.close();

The first two lines use JNDI API; the third line uses DataSource API. After the first line creates an instance of javax.naming.Context for the initial naming context, the second line calls the method lookup on it to get the DataSource object associated with jdbc/AcmeDB. Recall that in the previous code fragment, the last line of code associated jdbc/AcmeDB with vds, so the object returned by the lookup method represents the same data source that vds did. However, the return type of the method lookup is Object, the most generic of types, so the result must be cast to the narrower type DataSource before it can be assigned to the DataSource variable ds.

At this point ds refers to the same data source that vds referred to previously, the database my_database on the server my_database_server. Therefore, in the third line of code, calling the method DataSource.getConnection on ds and supplying it with a user name and password is enough to create a connection to my_database.

The rest of the code fragment uses a single transaction to execute two queries and print the results of each query. The DataSource implementation in this case is a basic implementation included with the JDBC driver. If the DataSource class had been implemented to work with an XADataSource class, and the preceding code example was executed in the context of a distributed transaction, the code could not have called the method Connection.commit. It also would not have set the auto-commit mode to false because that would have been unnecessary. The default for newly-created connections that can participate in distributed transactions is to have auto-commit mode turned off. The next section will discuss the three broad categories of DataSource implementations.

In addition to the version of getConnection that takes a user name and password, the DataSource interface provides a version of the method DataSource.getConnection that takes no parameters. It is available for situations where a data source does not require a user name and password because it uses a different security mechanism or where a data source does not restrict access.

4.1.5 DataSource Implementations

The DataSource interface may be implemented to provide three different kinds of connections. As a result of DataSource objects working with a JNDI service provider, all connections produced by a DataSource object offer the advantages of portability and easy maintenance, which are explained later in this chapter. Implementations of DataSource that work with implementations of the more specialized ConnectionPoolDataSource and XADataSource interfaces produce connections that are pooled or that can be used in distributed transactions. The following list summarizes the three general categories of classes that implement the DataSource interface:

Basic DataSource class
- provided by: driver vendor
- advantages: portability, easy maintenance
DataSource class implemented to provide connection pooling
- provided by: application server vendor or driver vendor
- works with: a ConnectionPoolDataSource class, which is always provided by a driver vendor
- advantages: portability, easy maintenance; increased performance
DataSource class implemented to provide distributed transactions
- provided by: application server vendor such as an EJB server vendor
- works with: an XADataSource class, which is always provided by a driver vendor
- advantages: portability, easy maintenance; ability to participate in distributed transactions

Note that a DataSource implementation that supports distributed transactions is almost always implemented to support connection pooling as well.

An instance of a class that implements the DataSource interface represents one particular data source. Every connection produced by that instance will reference the same data source. In a basic DataSource implementation, a call to the method DataSource.getConnection returns a Connection object that, like the Connection object returned by the DriverManager facility, is a physical connection to the data source. Appendix A of the specification for the JDBC 2.0 Standard Extension API (available at http://java.sun.com/products/jdbc) gives a sample implementation of a basic DataSource class.
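
To make the idea of a basic implementation concrete, here is a minimal sketch, not the sample from Appendix A. It assumes that each call to getConnection simply opens a new physical connection through the DriverManager. The class name SimpleDataSource, the helper method buildUrl, and the "jdbc:vendor:" URL scheme are invented for illustration; a real driver defines its own URL format.

```java
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.logging.Logger;
import javax.sql.DataSource;

/** Illustrative basic DataSource: every getConnection call opens a new physical connection. */
public class SimpleDataSource implements DataSource {
    // Standard property names from Table 4.1.
    private String serverName = "";
    private int portNumber;
    private String databaseName = "";
    private String description = "";
    private PrintWriter logWriter;  // null means logging is disabled
    private int loginTimeout;       // seconds; stored only, this sketch does not enforce it

    public String getServerName() { return serverName; }
    public void setServerName(String serverName) { this.serverName = serverName; }
    public int getPortNumber() { return portNumber; }
    public void setPortNumber(int portNumber) { this.portNumber = portNumber; }
    public String getDatabaseName() { return databaseName; }
    public void setDatabaseName(String databaseName) { this.databaseName = databaseName; }
    public String getDescription() { return description; }
    public void setDescription(String description) { this.description = description; }

    /** Builds the driver URL from the properties; the "jdbc:vendor:" scheme is made up. */
    public String buildUrl() {
        return "jdbc:vendor://" + serverName + ":" + portNumber + "/" + databaseName;
    }

    private void log(String message) {
        if (logWriter != null) {
            logWriter.println("SimpleDataSource: " + message);
            logWriter.flush();
        }
    }

    @Override
    public Connection getConnection() throws SQLException {
        log("connecting to " + buildUrl());
        return DriverManager.getConnection(buildUrl());
    }

    @Override
    public Connection getConnection(String user, String password) throws SQLException {
        log("connecting to " + buildUrl() + " as " + user);
        return DriverManager.getConnection(buildUrl(), user, password);
    }

    @Override public PrintWriter getLogWriter() { return logWriter; }
    @Override public void setLogWriter(PrintWriter out) { this.logWriter = out; }
    @Override public int getLoginTimeout() { return loginTimeout; }
    @Override public void setLoginTimeout(int seconds) { this.loginTimeout = seconds; }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        throw new SQLFeatureNotSupportedException("java.util.logging is not used");
    }

    @Override
    public <T> T unwrap(Class<T> iface) throws SQLException {
        if (iface.isInstance(this)) {
            return iface.cast(this);
        }
        throw new SQLException("Not a wrapper for " + iface.getName());
    }

    @Override
    public boolean isWrapperFor(Class<?> iface) {
        return iface.isInstance(this);
    }

    public static void main(String[] args) {
        SimpleDataSource ds = new SimpleDataSource();
        ds.setServerName("my_database_server");
        ds.setPortNumber(940);
        ds.setDatabaseName("my_database");
        System.out.println(ds.buildUrl());
    }
}
```

Calling getConnection on this class succeeds only if a driver that accepts the constructed URL is registered with the DriverManager; the property setters and the URL construction work without one.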
  
  DataSource objects that implement connection pooling likewise produce a connection to the particular data source that the DataSource class represents. The Connection object that the method DataSource.getConnection returns, however, is a handle to a PooledConnection object rather than being a physical connection. An application uses the Connection object just as it usually does and is generally unaware that it is in any way different. Connection pooling has no effect whatever on application code except that a pooled connection, as is true with all connections, should always be explicitly closed. When an application closes a connection that is pooled, the connection joins a pool of reusable connections. The next time DataSource.getConnection is called, a handle to one of these pooled connections will be returned if one is available. Because connection pooling avoids creating a new physical connection every time one is requested, it can help to make applications run significantly faster. Connection pooling is generally used, for example, by a web server that supports servlets and JavaServertm Pages.
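
The handle-and-pool behavior described above can be sketched with a toy model. This is not the JDBC PooledConnection API: the classes PhysicalConnection, Handle, and TinyPool are hypothetical stand-ins that only show why closing a handle makes the underlying connection reusable.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TinyPoolDemo {

    /** Stand-in for an expensive physical connection (hypothetical, not a JDBC type). */
    static final class PhysicalConnection {
        final int id;
        PhysicalConnection(int id) { this.id = id; }
    }

    /** What the application sees: close() hands the physical connection back to the pool. */
    static final class Handle implements AutoCloseable {
        private final TinyPool pool;
        private PhysicalConnection physical;

        Handle(TinyPool pool, PhysicalConnection physical) {
            this.pool = pool;
            this.physical = physical;
        }

        int physicalId() {
            if (physical == null) {
                throw new IllegalStateException("handle is closed");
            }
            return physical.id;
        }

        @Override
        public void close() {
            if (physical != null) {       // closing twice is harmless
                pool.release(physical);
                physical = null;
            }
        }
    }

    /** Single-threaded toy pool; a real pool would also need synchronization and size limits. */
    static final class TinyPool {
        private final Deque<PhysicalConnection> idle = new ArrayDeque<>();
        private int opened = 0;

        Handle getConnection() {
            PhysicalConnection pc = idle.poll();         // reuse an idle connection if there is one
            if (pc == null) {
                pc = new PhysicalConnection(++opened);   // otherwise open a new one
            }
            return new Handle(this, pc);
        }

        void release(PhysicalConnection pc) { idle.push(pc); }
        int openedCount() { return opened; }
    }

    /** Runs sequential get/close cycles and reports how many physical connections were opened. */
    public static int physicalConnectionsFor(int requests) {
        TinyPool pool = new TinyPool();
        for (int i = 0; i < requests; i++) {
            try (Handle h = pool.getConnection()) {
                h.physicalId();  // application work would happen here
            }
        }
        return pool.openedCount();
    }

    public static void main(String[] args) {
        System.out.println("physical connections opened for 5 requests: "
                + physicalConnectionsFor(5));
    }
}
```

Five sequential requests open only one physical connection in this model, because each handle is closed before the next request arrives. If the application never closed its handles, every request would open a new connection, which is why explicit closing matters.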

A DataSource class can likewise be implemented to work with a distributed transaction environment. An EJB server, for example, supports distributed transactions and requires a DataSource class that is implemented to interact with it. In this case, the DataSource.getConnection method returns a Connection object that can be used in a distributed transaction. As a rule, EJB servers provide a DataSource class that supports connection pooling as well as distributed transactions. Like connection pooling, transaction management is handled internally, so using distributed transactions is easy. The only requirement is that when a transaction is distributed (involves two or more data sources), the application cannot call the methods commit or rollback. It also cannot put the connection in auto-commit mode. The reason for these restrictions is that a transaction manager begins and ends a distributed transaction under the covers, so an application cannot do anything that would affect when a transaction begins or ends.

4.1.6 Logging and Tracing

The DataSource interface provides methods that allow a user to get and set the character stream to which tracing and error logging will be written. A user can trace a specific data source on a given stream, or multiple data sources can write log messages to the same stream provided that the stream is set for each data source. Log messages that are written to a log stream specific to a DataSource object are not written to the log stream maintained by the DriverManager.

4.1.7 Advantages of Using JNDI

There are major advantages to connecting to a data source using a DataSource object registered with a JNDI naming service rather than using the DriverManager facility. The first is that it makes code more portable. With the DriverManager, the name of a JDBC driver class, which usually identifies a particular driver vendor, is included in application code. This makes the application specific to that vendor's driver product and thus non-portable.

Another advantage is that it makes code much easier to maintain. If any of the necessary information about the data source changes, only the relevant DataSource properties need to be modified, not every application that connects to that data source. For example, if a database is moved to a different server and uses a different port number, only the DataSource object's serverName and portNumber properties need to be updated. A system administrator could keep all existing code usable with the following code fragment. In practice, a system administrator would probably use a GUI tool to set the properties, so the following code fragment illustrates the code a tool might execute internally.

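Context ctx = new InitialContext();

// The DataSource interface declares no property setters, so cast to the vendor's class.
VendorDataSource vds = (VendorDataSource)ctx.lookup("jdbc/AcmeDB");
vds.setServerName("my_new_database_server");
vds.setPortNumber(940);     // portNumber is an int property

// Re-register the updated object; some naming services store a copy rather than a live reference.
ctx.rebind("jdbc/AcmeDB", vds);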

The application programmer would not need to do anything at all to keep all of the applications using the data source running smoothly.

Yet another advantage is that applications using a DataSource object to get a connection will automatically benefit from connection pooling if the DataSource class has been implemented to support connection pooling. Likewise, an application will automatically be able to use distributed transactions if the DataSource class has been implemented to support them.
