To improve server and network availability and conserve bandwidth, two
compression formats are available for deploying Java applications and
applets: gzip and Pack200.
With both techniques, the compressed JAR files are transmitted over the
network, and the receiving application decompresses and restores them.
Theory
The HTTP 1.1 protocol (RFC 2616) defines HTTP compression, which allows an
application's JAR files to be deployed as compressed JAR files. The
compression techniques it supports are gzip, compress, and deflate.
As of SDK/JRE version 5.0, HTTP compression is implemented in Java Web Start
and Java Plug-in in compliance with RFC 2616. The supported techniques are
gzip and pack200-gzip.
The requesting application sends an
HTTP request to the server. An HTTP request has multiple fields. The
Accept-Encoding (AE) field is set to pack200-gzip or
gzip, indicating to the server that the application can handle
pack200-gzip or gzip format.
The server implementation searches for the requested JAR file with a
.pack.gz or .gz file extension and responds with the located file. The
server sets the Content-Encoding (CE) response header to pack200-gzip,
gzip, or NULL, depending on the type of file being sent, and may optionally
set the Content-Type (CT) header to application/java-archive. By inspecting
the CE field, the requesting application can then apply the corresponding
transformation to restore the original JAR file.
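The lookup order described above can be sketched as a small pure function. This is only an illustration: the class EncodingNegotiator, its Choice type, and the set of file names standing in for the server's disk are hypothetical and not part of any server API. Quality values (;q=...) in the header are ignored in this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: decides which stored file to serve for a JAR request. */
final class EncodingNegotiator {

    /** The file to send and the Content-Encoding value to set (null means none). */
    static final class Choice {
        final String fileName;
        final String contentEncoding;

        Choice(String fileName, String contentEncoding) {
            this.fileName = fileName;
            this.contentEncoding = contentEncoding;
        }
    }

    /** Splits a value such as "pack200-gzip, gzip" into lower-case tokens. */
    static Set<String> parseAcceptEncoding(String header) {
        Set<String> tokens = new HashSet<String>();
        if (header == null) {
            return tokens;
        }
        for (String part : header.split(",")) {
            int semicolon = part.indexOf(';');   // drop any ";q=..." parameter
            String token = (semicolon >= 0 ? part.substring(0, semicolon) : part)
                    .trim().toLowerCase(Locale.ROOT);
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Prefers .pack.gz, then .gz, then the plain JAR. */
    static Choice chooseFile(String jarName, String acceptEncoding, Set<String> filesOnDisk) {
        Set<String> accepted = parseAcceptEncoding(acceptEncoding);
        if (accepted.contains("pack200-gzip") && filesOnDisk.contains(jarName + ".pack.gz")) {
            return new Choice(jarName + ".pack.gz", "pack200-gzip");
        }
        if (accepted.contains("gzip") && filesOnDisk.contains(jarName + ".gz")) {
            return new Choice(jarName + ".gz", "gzip");
        }
        return new Choice(jarName, null);
    }

    public static void main(String[] args) {
        Set<String> disk = new HashSet<String>(
                Arrays.asList("app.jar", "app.jar.gz", "app.jar.pack.gz"));
        Choice c = chooseFile("app.jar", "pack200-gzip, gzip", disk);
        System.out.println(c.fileName + " -> Content-Encoding: " + c.contentEncoding);
    }
}
```

Matching whole tokens rather than substrings means a client that advertises only pack200-gzip is never sent a plain .gz file.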
The above can be achieved using a simple servlet or server module with any
HTTP 1.1 compliant web server. Compressing files on the fly degrades server
performance, especially with Pack200, and is therefore not recommended.
Sample Tomcat Servlet:
/**
 * A simple HTTP Compression Servlet
 */
import java.util.*;
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import java.util.zip.*;
import java.net.*;

/**
 * The servlet class.
 */
public class ContentType extends HttpServlet {

    private static final String JNLP_MIME_TYPE = "application/x-java-jnlp-file";
    private static final String JAR_MIME_TYPE = "application/x-java-archive";
    private static final String PACK200_MIME_TYPE = "application/x-java-pack200";

    // HTTP Compression RFC 2616 : Standard headers
    public static final String ACCEPT_ENCODING = "accept-encoding";
    public static final String CONTENT_TYPE = "content-type";
    public static final String CONTENT_ENCODING = "content-encoding";

    // HTTP Compression RFC 2616 : Standard header for HTTP/Pack200 Compression
    public static final String GZIP_ENCODING = "gzip";
    public static final String PACK200_GZIP_ENCODING = "pack200-gzip";

    private void sendHtml(HttpServletResponse response, String s)
            throws IOException {
        PrintWriter out = response.getWriter();
        out.println("<html>");
        out.println("<head>");
        out.println("<title>ContentType</title>");
        out.println("</head>");
        out.println("<body>");
        out.println(s);
        out.println("</body>");
        out.println("</html>");
    }

    /*
     * Copy the input stream to the output stream, then close both.
     */
    private void sendOut(InputStream in, OutputStream ostream)
            throws IOException {
        byte buf[] = new byte[8192];
        int n = in.read(buf);
        while (n > 0) {
            ostream.write(buf, 0, n);
            n = in.read(buf);
        }
        ostream.close();
        in.close();
    }

    /*
     * If the named file exists, set the length and date headers for it.
     */
    boolean doFile(String name, HttpServletResponse response) {
        File f = new File(name);
        if (f.exists()) {
            getServletContext().log("Found file " + name);
            response.setContentLength((int) f.length());
            response.setDateHeader("Last-Modified", f.lastModified());
            return true;
        }
        getServletContext().log("File not found " + name);
        return false;
    }

    /** Called when someone accesses the servlet. */
    public void doGet(HttpServletRequest request,
                      HttpServletResponse response)
            throws IOException, ServletException {

        String encoding = request.getHeader(ACCEPT_ENCODING);
        String pathInfo = request.getPathInfo();
        String pathInfoEx = request.getPathTranslated();
        String contentType = request.getContentType();
        StringBuffer requestURL = request.getRequestURL();
        String requestName = pathInfo;

        ServletContext sc = getServletContext();
        sc.log("----------------------------");
        sc.log("pathInfo=" + pathInfo);
        sc.log("pathInfoEx=" + pathInfoEx);
        sc.log("Accept-Encoding=" + encoding);
        sc.log("Content-Type=" + contentType);
        sc.log("requestURL=" + requestURL);

        if (pathInfoEx == null) {
            response.sendError(HttpServletResponse.SC_NOT_FOUND);
            return;
        }

        String outFile = pathInfo;
        boolean found = false;
        String contentEncoding = null;

        // Pack200 Compression
        if (encoding != null && contentType != null &&
                contentType.equals(JAR_MIME_TYPE) &&
                encoding.toLowerCase().indexOf(PACK200_GZIP_ENCODING) > -1) {
            contentEncoding = PACK200_GZIP_ENCODING;
            if (doFile(pathInfoEx.concat(".pack.gz"), response)) {
                outFile = pathInfo.concat(".pack.gz");
                found = true;
            } else {
                // Pack/Compress and transmit, not very efficient.
                found = false;
            }
        }

        // HTTP Compression
        if (found == false && encoding != null &&
                contentType != null &&
                contentType.equals(JAR_MIME_TYPE) &&
                encoding.toLowerCase().indexOf("gzip") > -1) {
            contentEncoding = GZIP_ENCODING;
            if (doFile(pathInfoEx.concat(".gz"), response)) {
                outFile = pathInfo.concat(".gz");
                found = true;
            }
        }

        // No Compression
        if (found == false) { // just send the file
            contentEncoding = null;
            sc.log(CONTENT_ENCODING + "=" + "null");
            doFile(pathInfoEx, response);
            outFile = pathInfo;
        }

        // Only set Content-Encoding when a compressed file is being sent.
        if (contentEncoding != null) {
            response.setHeader(CONTENT_ENCODING, contentEncoding);
        }
        sc.log(CONTENT_ENCODING + "=" + contentEncoding +
                " : outFile=" + outFile);

        if (sc.getMimeType(pathInfo) != null) {
            response.setContentType(sc.getMimeType(pathInfo));
        }

        InputStream in = sc.getResourceAsStream(outFile);
        OutputStream out = response.getOutputStream();

        if (in != null) {
            try {
                sendOut(in, out);
            } catch (IOException ioe) {
                // getMessage() may be null, so compare from the constant side.
                if ("Broken pipe".equals(ioe.getMessage())) {
                    sc.log("Broken Pipe while writing");
                    return;
                } else throw ioe;
            }
        } else response.sendError(HttpServletResponse.SC_NOT_FOUND);
    }
}
GZIP Compression
GZIP is a freely available compressor, provided within the JRE and the SDK
as java.util.zip.GZIPInputStream and java.util.zip.GZIPOutputStream.
Command-line versions are available with most Unix operating systems and
Windows Unix toolkits (Cygwin and MKS), and can be downloaded for many other
operating systems at
http://www.gzip.org/.
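As a minimal sketch of the two stream classes named above, the following compresses a byte array in memory and restores it. The class and method names (GzipRoundTrip, compress, decompress) are illustrative only; a real deployment would stream a JAR file from disk instead of a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Illustrative helper: gzip a byte array and restore it again. */
final class GzipRoundTrip {

    static byte[] compress(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the stream writes the gzip trailer (CRC and length).
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] gzipped) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = new byte[10000];   // highly compressible sample data
        byte[] packed = compress(original);
        byte[] restored = decompress(packed);
        System.out.println(original.length + " bytes -> " + packed.length
                + " bytes, restored intact: " + Arrays.equals(original, restored));
    }
}
```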
The highest degree of compression is obtained by applying gzip to a jar
file whose entries are uncompressed ("stored") rather than to a jar file
whose entries are already compressed. The downside is that the file may be
stored uncompressed on the target systems.
Here is an example.
Compressing using gzip on a jar file containing individual deflated entries:
    Notepad.jar       46.25 kb
    Notepad.jar.gz    43.00 kb
Compressing using gzip on a jar file containing "stored" entries:
    Notepad.jar       987.47 kb
    Notepad.jar.gz    32.47 kb
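As you can see, relative to the 46.25 kb compressed jar, the download
shrinks to 32.47 kb (about 30% smaller) when gzip is applied to the
uncompressed jar, versus 43.00 kb (about 7% smaller) when gzip is applied
to the compressed jar file.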
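The savings can be worked out directly from the sizes quoted in the example. The sketch below assumes the 46.25 kb deflated Notepad.jar is the baseline, that is, the file that would otherwise be downloaded; the class and method names are illustrative.

```java
/** Illustrative arithmetic on the Notepad.jar sizes quoted in the text. */
final class SizeReduction {

    /** Percentage by which a download shrinks relative to a baseline size. */
    static double percentSaved(double baselineKb, double compressedKb) {
        return 100.0 * (baselineKb - compressedKb) / baselineKb;
    }

    public static void main(String[] args) {
        double baseline = 46.25;       // Notepad.jar with deflated entries
        double gzipOfDeflated = 43.00; // gzip applied to the deflated jar
        double gzipOfStored = 32.47;   // gzip applied to the "stored" jar
        System.out.printf("gzip of deflated jar saves %.1f%%%n",
                percentSaved(baseline, gzipOfDeflated));
        System.out.printf("gzip of stored jar saves   %.1f%%%n",
                percentSaved(baseline, gzipOfStored));
    }
}
```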
Pack200 Compression
Pack200 compresses large files very efficiently, depending on the density
and size of the class files in the JAR file. One can expect compression to
1/9 the size of the JAR file if it contains only class files and is on the
order of several MB.
Using the same jar as in the previous example:
    Notepad.jar            46.25 kb
    Notepad.jar.pack.gz    22.58 kb
In this case the same jar is reduced by about 50%.
Please note: when signing large jars, Step 5 of the example in "Steps to
Pack a file" may fail with a security error; a likely cause is bug
5078608.
Please use one of the workarounds detailed in the release notes.
Pack200 works most efficiently on Java class files.
It uses several techniques to efficiently reduce the size of JAR files:
It merges and sorts the constant-pool data in the class files and
co-locates them in the archive.
It removes redundant class attributes.
It stores internal data structures.
It uses delta and variable-length encoding.
It chooses optimum coding types for secondary compression.
Pack200 can be used through the command-line interfaces pack200(1) and
unpack200(1), found in the bin directory of your SDK or JRE. The Pack200
interfaces can also be invoked programmatically from Java; please refer to
the API and Javadoc references for
java.util.jar.Pack200.
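One way to drive the command-line tools from Java is through ProcessBuilder. The sketch below only assembles and prints the commands; calling run() additionally assumes a JDK that still ships pack200 and unpack200 and has them on the PATH. The class name Pack200Commands and its methods are hypothetical helpers, not part of any Pack200 API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that assembles pack200/unpack200 command lines. */
final class Pack200Commands {

    /** Builds "pack200 [--no-gzip] <output> <jar>". */
    static List<String> packCommand(String jar, boolean gzipOutput) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("pack200");
        if (!gzipOutput) {
            cmd.add("--no-gzip");   // leave compression to another tool
        }
        cmd.add(jar + (gzipOutput ? ".pack.gz" : ".pack"));
        cmd.add(jar);
        return cmd;
    }

    /** Builds "unpack200 <packed file> <output jar>". */
    static List<String> unpackCommand(String packedFile, String outputJar) {
        return new ArrayList<String>(Arrays.asList("unpack200", packedFile, outputJar));
    }

    /** Runs a command, forwarding its output, and returns its exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Print the commands rather than run them, so the sketch works
        // even where the tools are not installed.
        System.out.println(packCommand("HelloWorld.jar", true));
        System.out.println(unpackCommand("HelloWorld.jar.pack.gz", "HelloT1.jar"));
    }
}
```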
Steps to Pack a file
1. Consider the size of the JAR file, the contents of the JAR file, and the
bandwidth of your target audience.
All these factors play into choosing a compression technique. The
unpack200 tool is designed to be as efficient as possible and takes little
time to restore the original file. If you have large JAR files (2 MB or
more) composed mostly of class files, Pack200 is the preferred compression
technique. If you have large JAR files composed mostly of resource files
(JPEG, GIF, data, etc.), then gzip is the preferred compression technique.
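The rule of thumb in step 1 can be written down as a tiny decision function. This is a sketch under stated assumptions: the 2 MB threshold comes from the text, but reading "mostly" as "more than half", and falling back to gzip for small JAR files, are assumptions made here for illustration only.

```java
/** Illustrative decision rule for choosing a compression technique. */
final class CompressionAdvisor {

    static final long LARGE_JAR_BYTES = 2L * 1024 * 1024; // "2 MB or more" from the text
    static final double MOSTLY_CLASSES = 0.5;             // assumption: "mostly" = more than half

    /**
     * @param jarBytes      size of the JAR file in bytes
     * @param classFraction fraction of the JAR made up of class files (0.0 to 1.0)
     * @return "pack200-gzip" or "gzip"
     */
    static String recommend(long jarBytes, double classFraction) {
        boolean large = jarBytes >= LARGE_JAR_BYTES;
        boolean mostlyClasses = classFraction > MOSTLY_CLASSES;
        // Assumption: anything that is not a large, class-heavy JAR defaults to gzip.
        return (large && mostlyClasses) ? "pack200-gzip" : "gzip";
    }

    public static void main(String[] args) {
        System.out.println(recommend(5L * 1024 * 1024, 0.9)); // large, class-heavy
        System.out.println(recommend(5L * 1024 * 1024, 0.1)); // large, resource-heavy
    }
}
```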
2. Pack200 segmenting.
Pack200 loads the entire packed file into memory. When target systems are
memory and resource constrained, setting Pack200.Packer.SEGMENT_LIMIT to a
lower value reduces the memory requirements during packing and unpacking.
Setting Pack200.Packer.SEGMENT_LIMIT=-1 forces a single segment to be
generated, which is effective for size reduction but requires a much larger
Java heap on the packing and unpacking systems. Note that several of these
packed segments may be concatenated to produce a single packed file.
3. Signing the JAR files.
Pack200 rearranges the contents of the resultant JAR file. The jarsigner
hashes the contents of each class file and stores the hash in an encrypted
digest in the manifest. When the unpacker runs on a packed file, the
contents of the classes are rearranged, which invalidates the signature.
Therefore, the JAR file must first be normalized using pack200 and
unpack200, and only then signed.
(Here's why this works: Any reordering the packer
does of any classfile structures is idempotent, so the second packing does not
change the orderings produced by the first packing. Also, the unpacker is
guaranteed by the JSR 200 specification to produce a specific bytewise image
for any given transmission ordering of archive elements.)
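The idempotence argument can be illustrated with a toy model. The sketch below is an analogy, not Pack200 itself: sorting a list stands in for the reordering done by a pack/unpack round trip, and a SHA-256 digest stands in for the hash recorded by jarsigner. All names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model of "normalize first, then sign". */
final class NormalizeThenSign {

    /** Stand-in for a pack/unpack round trip: puts elements in a canonical order. */
    static List<String> roundTrip(List<String> elements) {
        List<String> copy = new ArrayList<String>(elements);
        Collections.sort(copy);
        return copy;
    }

    /** Stand-in for the digest a signer records over the content. */
    static String digest(List<String> elements) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            for (String element : elements) {
                sha.update(element.getBytes(StandardCharsets.UTF_8));
                sha.update((byte) 0); // separator so element boundaries matter
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : sha.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("methodB", "methodA", "fieldC");
        List<String> normalized = roundTrip(raw);

        // Signing the raw order breaks: the round trip changes the digest.
        System.out.println("raw digest survives round trip:        "
                + digest(raw).equals(digest(roundTrip(raw))));
        // Signing the normalized order holds: a second round trip changes nothing.
        System.out.println("normalized digest survives round trip: "
                + digest(normalized).equals(digest(roundTrip(normalized))));
    }
}
```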
An Example
Suppose you wish to use HelloWorld.jar.
Step 1: Repack the file to normalize the jar. Repacking calls the packer
and unpacks the file in one step.
% pack200 --repack HelloWorld.jar
Step 2: Sign the jar after normalizing it with repack.
Verify the just-signed jar to ensure the signing worked.
% jarsigner -verify HelloWorld.jar
jar verified.
Ensure the jar still works.
% java -jar HelloWorld.jar
HelloWorld
Step 3: Now pack the file.
% pack200 HelloWorld.jar.pack.gz HelloWorld.jar
Step 4: Unpack the file.
% unpack200 HelloWorld.jar.pack.gz HelloT1.jar
Step 5: Verify the jar.
% jarsigner -verify HelloT1.jar
jar verified.
// Test the jar ...
% java -jar HelloT1.jar
HelloWorld
After verification, the compressed pack file HelloWorld.jar.pack.gz can be
deployed.
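The verification step can also be approximated from Java with java.util.jar.JarFile, which checks signatures as entries are read when verification is enabled. The following is a sketch, not a replacement for jarsigner -verify: the class name and both methods are illustrative, and the demo jar it creates is deliberately unsigned, so the count is zero.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

/** Illustrative check: read every entry with signature verification enabled. */
final class JarSignatureCheck {

    /**
     * Returns the number of entries that carry code signers. Reading a signed
     * entry whose content no longer matches its digest throws SecurityException.
     */
    static int countSignedEntries(File jar) {
        try (JarFile jarFile = new JarFile(jar, true)) {
            int signed = 0;
            byte[] buf = new byte[8192];
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                try (InputStream in = jarFile.getInputStream(entry)) {
                    while (in.read(buf) != -1) {
                        // Drain the entry; signers are only known after a full read.
                    }
                }
                if (entry.getCodeSigners() != null) {
                    signed++;
                }
            }
            return signed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a small unsigned jar to a temporary file, for demonstration only. */
    static File createUnsignedJar() {
        try {
            File file = File.createTempFile("unsigned", ".jar");
            file.deleteOnExit();
            try (JarOutputStream out = new JarOutputStream(new FileOutputStream(file))) {
                out.putNextEntry(new JarEntry("hello.txt"));
                out.write("hello".getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File jar = args.length > 0 ? new File(args[0]) : createUnsignedJar();
        System.out.println("signed entries: " + countSignedEntries(jar));
    }
}
```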
4. Reduction techniques:
By default, Pack200 behaves in a High Fidelity (Hi-Fi) mode, meaning that
all the original attributes present in the classes, as well as the
attributes of each individual entry in a JAR file, are retained. These tend
to add to the packed file size. Here are some techniques for further
reducing the size of the download:
Modification times: If modification time of the individual
entries in a JAR file is not a concern, you can specify the option
Pack200.Packer.MODIFICATION_TIME="LATEST". This will
allow one modification time to be transmitted in the pack file for
each segment. The latest time will be the latest time of any entry within
that segment.
Deflation hint: Similarly, if the compression state of the individual
entries in the archive is not required, set
Pack200.Packer.DEFLATE_HINT="false". This fractionally reduces the
download size, because individual compression hints are not transmitted.
However, the recomposed jar will contain "stored" entries and hence may
consume more disk space on the target system.
Note: the above optimizations will yield better results with a JAR file
containing thousands of entries.
Attributes: Several class attributes are not required when deploying
JAR files. These attributes can be stripped out of class files,
significantly reducing download size. However, care must be taken to ensure
that required runtime attributes are maintained.
Debugging attributes: If debugging information, such as Line Numbers
and Source File (typically used in application stack traces), is not
required, then these attributes can be discarded by specifying
Pack200.Packer.STRIP_DEBUG=true. This typically reduces the
packed file by about 10%.
Other attributes:
Advanced users may use some of the other strip-related properties to strip
out additional attributes. However, extreme caution should be used when
doing so: the resultant JAR file must be tested on all possible Java
runtime systems to ensure that the runtime does not depend on the stripped
attributes.
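When the command-line tool is used instead of the API properties, the reductions above correspond to pack200 options. The sketch below assembles such a command line. The class ReductionOptions is a hypothetical helper, and the long option names (--modification-time, --deflate-hint, --strip-debug) are given as they are understood from the pack200 man page, so check them against the tool shipped with your JDK.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper mapping the size-reduction choices to pack200 options. */
final class ReductionOptions {

    boolean latestModificationTime; // one timestamp per segment
    boolean dropDeflateHints;       // recomposed jar will hold "stored" entries
    boolean stripDebug;             // removes debugging attributes from classes

    /** Builds "pack200 [options] <packed file> <jar>". */
    List<String> toCommand(String packedFile, String jar) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("pack200");
        if (latestModificationTime) {
            cmd.add("--modification-time=latest");
        }
        if (dropDeflateHints) {
            cmd.add("--deflate-hint=false");
        }
        if (stripDebug) {
            cmd.add("--strip-debug");
        }
        cmd.add(packedFile);
        cmd.add(jar);
        return cmd;
    }

    public static void main(String[] args) {
        ReductionOptions options = new ReductionOptions();
        options.latestModificationTime = true;
        options.stripDebug = true;
        System.out.println(options.toCommand("foo.jar.pack.gz", "foo.jar"));
    }
}
```

As the text warns, a jar packed with stripped attributes should be tested on every target runtime before it is deployed.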
5. Handling unknown attributes:
Pack200 deals with the standard attributes defined by the Java Virtual
Machine Specification; however, compilers are free to introduce custom
attributes. When such attributes are present, Pack200 by default passes
the class through, emitting a warning message. These "passed-through"
class files may contribute to bloating of packed files. If the unknown
attributes are prevalent in the classes of a JAR file, this may lead to a
very large bloat of the packed output. In such cases, consider the
following strategies:
Strip the attribute if it is deemed to be redundant at runtime. This can
be achieved by setting the property
Pack200.Packer.UNKNOWN_ATTRIBUTE=STRIP.
If the attributes are required at runtime and they do contribute to an
inflation, then identify the attribute from the warning message and apply a
suitable layout for it, as described in the Pack200 JSR 200
specification and the Java API reference section for
Pack200.Packer.
It is possible that a compiler could define an attribute not implemented in
the layout specification of Pack200, which may cause the Packer to
malfunction. In such cases, entire class files can be "passed through"
as if they were resources, by virtue of their names. For example, to pass
through an entire directory and its contents:
pack200 --pass-file="com/acme/foo/bar/" foo.pack.gz foo.jar
6. Installers:
You may wish to take advantage of the Pack200 technology in your
installation program, whereby a product's jars are compressed using Pack200
and decompressed during the installation. If the JRE or SDK is bundled in
the installation, you are free to use the unpack200 (Unix) or
unpack200.exe (Windows) in the distribution's 'bin' directory. This
implementation is a pure C++ application that requires no Java runtime to
be present for it to run.
Windows: Installers may use a better algorithm than the one in GZIP to
compress entries. In such cases, one will get better compression from the
installer's intrinsic compression by using pack200 as follows:
pack200 --no-gzip foo.jar.pack foo.jar
This prevents the output file from being gzip compressed.
unpack200 is a Windows console application, i.e., it will display an MS-DOS
window during the install. To suppress this window, you can use a launcher
with a WinMain, as shown below.
Sample Code:
#include "windows.h"
#include <stdio.h>

int APIENTRY WinMain(HINSTANCE hInstance,
                     HINSTANCE hPrevInstance,
                     LPSTR lpCmdLine,
                     int nCmdShow) {
    STARTUPINFO si;
    memset(&si, 0, sizeof(si));
    si.cb = sizeof(si);

    PROCESS_INFORMATION pi;
    memset(&pi, 0, sizeof(pi));

    //Test
    //lpCmdLine = "c:/build/windows-i586/bin/unpack200 -l c:/Temp/log c:/Temp/rt.pack c:/Temp/rt.jar";
    int ret = CreateProcess(NULL,       /* Exec. name */
                            lpCmdLine,  /* cmd line */
                            NULL,       /* proc. sec. attr. */
                            NULL,       /* thread sec. attr */
                            TRUE,       /* inherit file handle */
                            CREATE_NO_WINDOW | DETACHED_PROCESS, /* detach the process/suppress console */
                            NULL,       /* env block */
                            NULL,       /* inherit cwd */
                            &si,        /* startup info */
                            &pi);       /* process info */
    if (ret == 0) ExitProcess(255);

    // Wait until child process exits.
    WaitForSingleObject(pi.hProcess, INFINITE);

    DWORD exit_val;
    // Be conservative and return
    if (GetExitCodeProcess(pi.hProcess, &exit_val) == 0) ExitProcess(255);
    ExitProcess(exit_val); // Return the error code of the child process
    return -1;
}
Testing
It is required that all JAR files, packed and unpacked, be tested for
correctness with your application's test qualifiers.
When using the command-line interface pack200, the output file is
compressed using gzip with default values. Alternatively, a user may create
an uncompressed pack file and compress it using gzip with user-specified
options, or using some other compressor.
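As one example of user-specified options, the compression level can be chosen when gzipping from Java. GZIPOutputStream has no level parameter in its constructors, so this sketch uses a small subclass that sets the level on the protected Deflater field it inherits. The class names are illustrative, and in practice the input would be the bytes of an uncompressed pack file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.Deflater;
import java.util.zip.GZIPOutputStream;

/** Illustrative helper: gzip with an explicit compression level. */
final class TunedGzip {

    /** GZIPOutputStream variant that exposes the Deflater compression level. */
    static final class LevelGzipOutputStream extends GZIPOutputStream {
        LevelGzipOutputStream(OutputStream out, int level) throws IOException {
            super(out);
            def.setLevel(level); // 'def' is the protected Deflater of DeflaterOutputStream
        }
    }

    static byte[] gzip(byte[] data, int level) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new LevelGzipOutputStream(sink, level)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] packBytes = new byte[20000]; // stand-in for the contents of a .pack file
        System.out.println("best compression: "
                + gzip(packBytes, Deflater.BEST_COMPRESSION).length + " bytes");
        System.out.println("no compression:   "
                + gzip(packBytes, Deflater.NO_COMPRESSION).length + " bytes");
    }
}
```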
In Java SE 6, the Java class file format has been updated. For more
information, see JSR 202: Java Class File Specification Update. Because of
JSR 202, the Pack200 engine needs to be updated for the following reasons:
Align with the new class file format for Java SE 6.
Ensure that Java SE 6 class files are compressed effectively.
To keep the changes minimal and seamless for users, the packer generates appropriately versioned pack files based on the version of the input class files.
Also, to maintain backward compatibility, if the input JAR files consist solely of JDK 1.5 or older class files, a 1.5 compatible pack file is produced. Otherwise, a Java SE 6 compatible pack200 file is produced. For more information, refer to the Pack200 man page.
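The version rule can be illustrated by reading the class file header, which stores a magic number followed by the minor and major version. Major version 49 corresponds to J2SE 5.0 and 50 to Java SE 6. The sketch below only models the decision described above; it is not the packer's actual code, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative model of choosing a pack format from class file versions. */
final class ClassFileVersions {

    static final int JAVA_5_MAJOR = 49; // class file major version of J2SE 5.0

    /** Builds a minimal 8-byte class file header, for demonstration only. */
    static byte[] header(int major) {
        return new byte[] {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, // magic
            0, 0,                                               // minor version
            (byte) (major >> 8), (byte) major                   // major version
        };
    }

    /** Reads the major version from the first eight bytes of a class file. */
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8
                || (classBytes[0] & 0xff) != 0xCA || (classBytes[1] & 0xff) != 0xFE
                || (classBytes[2] & 0xff) != 0xBA || (classBytes[3] & 0xff) != 0xBE) {
            throw new IllegalArgumentException("not a class file");
        }
        return ((classBytes[6] & 0xff) << 8) | (classBytes[7] & 0xff);
    }

    /** True if any class is newer than 5.0, so a 1.5 compatible pack file will not do. */
    static boolean needsNewerPackFormat(List<byte[]> classFiles) {
        for (byte[] classBytes : classFiles) {
            if (majorVersion(classBytes) > JAVA_5_MAJOR) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<byte[]> mixed = Arrays.asList(header(49), header(50));
        System.out.println("needs newer pack format: " + needsNewerPackFormat(mixed));
    }
}
```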