To increase server and network availability and bandwidth, two
compression formats are available for deploying Java applications
and applets: gzip and Pack200.
With both techniques, the compressed JAR files are transmitted
over the network, and the receiving application decompresses and
restores them.
Theory
The HTTP 1.1 protocol (RFC 2616) discusses HTTP compression. HTTP
compression allows an application's JAR files to be deployed as
compressed JAR files. The compression techniques it covers are
gzip, compress, and deflate.
As of SDK/JRE version 5.0, HTTP compression is implemented in
Java Web Start and Java Plug-in in compliance with RFC 2616. The
supported techniques are gzip and pack200-gzip.
The requesting application sends an HTTP request to the server. An
HTTP request has multiple fields. The Accept-Encoding (AE) field is
set to pack200-gzip or gzip, indicating to the server that the
application can handle the pack200-gzip or gzip format.
The server implementation will search for the requested JAR file
with a .pack.gz or .gz file extension and respond with the located
file. The server will set the response header Content-Encoding (CE)
field to pack200-gzip, gzip, or NULL, depending on the type of file
that is being sent, and may optionally set the Content-Type (CT) to
application/java-archive. Therefore, by inspecting the CE field, the
requesting application can apply the corresponding transformation to
restore the original JAR file.
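The server-side choice described above can be sketched as a small pure function. This is only an illustration: the class name EncodingChooser, the Variant enum, and the two boolean parameters are hypothetical, and q-values in the Accept-Encoding header are ignored for brevity.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: picks which pre-compressed variant of a JAR to send.
class EncodingChooser {
    enum Variant {
        PACK200_GZIP(".pack.gz", "pack200-gzip"),
        GZIP(".gz", "gzip"),
        IDENTITY("", null); // plain JAR, no Content-Encoding header

        final String suffix;          // appended to the requested JAR path
        final String contentEncoding; // value for the CE header, or null

        Variant(String suffix, String contentEncoding) {
            this.suffix = suffix;
            this.contentEncoding = contentEncoding;
        }
    }

    // Splits an Accept-Encoding value into lower-case tokens, dropping ";q=..." parts.
    static Set<String> parse(String acceptEncoding) {
        Set<String> tokens = new HashSet<>();
        if (acceptEncoding == null) {
            return tokens;
        }
        for (String part : acceptEncoding.split(",")) {
            int semi = part.indexOf(';');
            String token = (semi >= 0 ? part.substring(0, semi) : part)
                    .trim().toLowerCase(Locale.ROOT);
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Prefers pack200-gzip, then gzip, then the uncompressed JAR.
    static Variant choose(String acceptEncoding, boolean packGzExists, boolean gzExists) {
        Set<String> accepted = parse(acceptEncoding);
        if (accepted.contains("pack200-gzip") && packGzExists) {
            return Variant.PACK200_GZIP;
        }
        if (accepted.contains("gzip") && gzExists) {
            return Variant.GZIP;
        }
        return Variant.IDENTITY;
    }

    public static void main(String[] args) {
        Variant v = choose("pack200-gzip, gzip", false, true);
        System.out.println(v + " suffix=" + v.suffix + " CE=" + v.contentEncoding);
    }
}
```

Unlike a substring search, this sketch matches whole tokens, so a client that sends only pack200-gzip is not treated as also accepting plain gzip.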
The above can be achieved using a simple servlet or server
module with any HTTP 1.1 compliant web server. Compressing files
on the fly will degrade server performance, especially with
Pack200, and is therefore not recommended.
Sample Tomcat Servlet:
/**
 * A simple HTTP Compression Servlet
 */
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

/**
 * The servlet class.
 */
public class ContentType extends HttpServlet {
    private static final String JNLP_MIME_TYPE = "application/x-java-jnlp-file";
    private static final String JAR_MIME_TYPE = "application/x-java-archive";
    private static final String PACK200_MIME_TYPE = "application/x-java-pack200";

    // HTTP Compression RFC 2616 : Standard headers
    public static final String ACCEPT_ENCODING = "accept-encoding";
    public static final String CONTENT_TYPE = "content-type";
    public static final String CONTENT_ENCODING = "content-encoding";

    // HTTP Compression RFC 2616 : Standard header for HTTP/Pack200 Compression
    public static final String GZIP_ENCODING = "gzip";
    public static final String PACK200_GZIP_ENCODING = "pack200-gzip";

    private void sendHtml(HttpServletResponse response, String s)
            throws IOException {
        PrintWriter out = response.getWriter();
        out.println("<html>");
        out.println("<head>");
        out.println("<title>ContentType</title>");
        out.println("</head>");
        out.println("<body>");
        out.println(s);
        out.println("</body>");
        out.println("</html>");
    }

    /*
     * Copy the input stream to the output stream, then close both.
     */
    private void sendOut(InputStream in, OutputStream ostream)
            throws IOException {
        byte buf[] = new byte[8192];
        int n = in.read(buf);
        while (n != -1) {
            ostream.write(buf, 0, n);
            n = in.read(buf);
        }
        ostream.close();
        in.close();
    }

    /*
     * If the named file exists, set the length and date headers for it.
     */
    boolean doFile(String name, HttpServletResponse response) {
        File f = new File(name);
        if (f.exists()) {
            getServletContext().log("Found file " + name);
            // Assumes the file is smaller than 2 GB.
            response.setContentLength((int) f.length());
            response.setDateHeader("Last-Modified", f.lastModified());
            return true;
        }
        getServletContext().log("File not found " + name);
        return false;
    }

    /** Called when someone accesses the servlet. */
    public void doGet(HttpServletRequest request,
                      HttpServletResponse response)
            throws IOException, ServletException {
        String encoding = request.getHeader(ACCEPT_ENCODING);
        String pathInfo = request.getPathInfo();
        String pathInfoEx = request.getPathTranslated();
        String contentType = request.getContentType();
        StringBuffer requestURL = request.getRequestURL();
        ServletContext sc = getServletContext();

        sc.log("----------------------------");
        sc.log("pathInfo=" + pathInfo);
        sc.log("pathInfoEx=" + pathInfoEx);
        sc.log("Accept-Encoding=" + encoding);
        sc.log("Content-Type=" + contentType);
        sc.log("requestURL=" + requestURL);

        if (pathInfoEx == null) {
            response.sendError(HttpServletResponse.SC_NOT_FOUND);
            return;
        }

        String outFile = pathInfo;
        boolean found = false;
        String contentEncoding = null;

        // Pack200 Compression
        if (encoding != null && contentType != null &&
                contentType.compareTo(JAR_MIME_TYPE) == 0 &&
                encoding.toLowerCase().indexOf(PACK200_GZIP_ENCODING) > -1) {
            contentEncoding = PACK200_GZIP_ENCODING;
            if (doFile(pathInfoEx.concat(".pack.gz"), response)) {
                outFile = pathInfo.concat(".pack.gz");
                found = true;
            } else {
                // Pack/Compress and transmit, not very efficient.
                found = false;
            }
        }

        // HTTP Compression
        if (found == false && encoding != null &&
                contentType != null &&
                contentType.compareTo(JAR_MIME_TYPE) == 0 &&
                encoding.toLowerCase().indexOf(GZIP_ENCODING) > -1) {
            contentEncoding = GZIP_ENCODING;
            if (doFile(pathInfoEx.concat(".gz"), response)) {
                outFile = pathInfo.concat(".gz");
                found = true;
            }
        }

        // No Compression
        if (found == false) { // just send the file
            contentEncoding = null;
            sc.log(CONTENT_ENCODING + "=" + "null");
            doFile(pathInfoEx, response);
            outFile = pathInfo;
        }

        // Only emit Content-Encoding when a compressed variant is sent.
        if (contentEncoding != null) {
            response.setHeader(CONTENT_ENCODING, contentEncoding);
        }
        sc.log(CONTENT_ENCODING + "=" + contentEncoding +
                " : outFile=" + outFile);

        if (sc.getMimeType(pathInfo) != null) {
            response.setContentType(sc.getMimeType(pathInfo));
        }

        InputStream in = sc.getResourceAsStream(outFile);
        if (in != null) {
            OutputStream out = response.getOutputStream();
            try {
                sendOut(in, out);
            } catch (IOException ioe) {
                // getMessage() may be null, so compare from the literal.
                if ("Broken pipe".equals(ioe.getMessage())) {
                    sc.log("Broken Pipe while writing");
                    return;
                } else throw ioe;
            }
        } else response.sendError(HttpServletResponse.SC_NOT_FOUND);
    }
}
GZIP Compression
GZIP is a freely available compressor, provided within the JRE and
the SDK as java.util.zip.GZIPInputStream and
java.util.zip.GZIPOutputStream.
Command-line versions are available with most Unix operating
systems and Windows Unix toolkits (Cygwin and MKS), or they can be
downloaded for a plethora of operating systems at
http://www.gzip.org/.
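As a minimal sketch of the java.util.zip route, the following writes a .gz file next to an existing JAR so that the server can serve it without compressing on the fly, and restores it again. The class name GzipJar and its method names are made up for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical build-time helper: pre-compresses a JAR with gzip.
class GzipJar {
    // Writes <name>.gz beside the source file and returns its path.
    static Path gzip(Path source) throws IOException {
        Path target = source.resolveSibling(source.getFileName() + ".gz");
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            copy(in, out);
        }
        return target;
    }

    // Restores the original bytes, as a receiving application would.
    static void gunzip(Path source, Path target) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(source));
             OutputStream out = Files.newOutputStream(target)) {
            copy(in, out);
        }
    }

    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GzipJar <file.jar>");
            return;
        }
        System.out.println("Wrote " + gzip(Paths.get(args[0])));
    }
}
```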
One gets the highest degree of compression by applying gzip to an
uncompressed jar file rather than to an already compressed jar
file. The downside is that the file may be stored uncompressed on
the target systems.
Here is an example.
Compressing with gzip a jar file containing individually deflated
entries:
Notepad.jar 46.25 KB
Notepad.jar.gz 43.00 KB
Compressing with gzip a jar file containing "stored" entries:
Notepad.jar 987.47 KB
Notepad.jar.gz 32.47 KB
As you can see, relative to the 46.25 KB jar with deflated entries,
the download shrinks by about 30% (to 32.47 KB) when gzip is applied
to the uncompressed jar, versus about 7% (to 43.00 KB) when it is
applied to the compressed jar file.
Pack200 Compression
Pack200 compresses large files very efficiently, depending on the
density and size of the class files in the JAR file. One can expect
compression to 1/9 the size of the JAR file, if it contains only
class files and is on the order of several MB.
Using the same jar as in the previous example:
Notepad.jar 46.25 KB
Notepad.jar.pack.gz 22.58 KB
In this case the same jar is reduced by about 50%.
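The size figures quoted in the examples above can be turned into percentages with a few lines of arithmetic. The sketch below assumes the 46.25 KB jar with deflated entries is the baseline for every comparison; the class and method names are illustrative only.

```java
import java.util.Locale;

// Illustrative arithmetic on the Notepad.jar figures quoted in the text.
class SizeReduction {
    // Percentage by which a download shrinks relative to the original size.
    static double reductionPercent(double originalKb, double compressedKb) {
        return 100.0 * (originalKb - compressedKb) / originalKb;
    }

    static String format(double percent) {
        return String.format(Locale.ROOT, "%.1f%%", percent);
    }

    public static void main(String[] args) {
        double baseline = 46.25; // Notepad.jar with deflated entries, in KB
        System.out.println("gzip of deflated jar: " + format(reductionPercent(baseline, 43.00))); // 7.0%
        System.out.println("gzip of stored jar:   " + format(reductionPercent(baseline, 32.47))); // 29.8%
        System.out.println("pack200 + gzip:       " + format(reductionPercent(baseline, 22.58))); // 51.2%
    }
}
```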
Please note: when signing large jars, Step 5 (verifying the
unpacked jar in the signing example below) may fail with a Security
Error. A likely cause is bug 5078608.
Please use one of the workarounds detailed in the release
notes.
Pack200 works most efficiently on Java
class files. It uses several techniques to efficiently reduce the
size of JAR files:
It merges and sorts the constant-pool data in the class files
and co-locates them in the archive.
It removes redundant class attributes.
It stores internal data structures.
It uses delta and variable-length encoding.
It chooses optimum coding types for secondary compression.
Pack200 can be used through the command-line interfaces
pack200(1) and unpack200(1), found in the bin directory of your SDK
or JRE.
The Pack200 interfaces can also be invoked programmatically from
Java; please refer to the API and JavaDoc references for
java.util.jar.Pack200.
Steps to Pack a file
1. Consider the size of the JAR file, the contents of the JAR
file, and the bandwidth of your target audience.
All these factors play into choosing a compression technique. The
unpack200 tool is designed to be as efficient as possible, and it
takes little time to restore the original file. If you have large
JAR files (2 MB or more) composed mostly of class files, Pack200 is
the preferred compression technique. If you have large JAR files
composed mostly of resource files (JPEG, GIF, data, etc.), then
gzip is the preferred compression technique.
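The rule of thumb above could be checked automatically along the following lines. This is a rough sketch: the 50% cut-off for "mostly class files" is an assumption of this example, not something the text specifies, and the names CompressionAdvisor and Technique are hypothetical. The text gives no rule for small JARs, so the sketch simply suggests measuring both.

```java
import java.io.File;
import java.io.IOException;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper applying the size/content rule of thumb to a JAR.
class CompressionAdvisor {
    enum Technique { PACK200, GZIP, MEASURE_BOTH }

    static final long LARGE_JAR_BYTES = 2L * 1024 * 1024; // "2 MB or more"
    static final double MOSTLY_CLASSES = 0.5;             // assumed threshold

    static Technique recommend(long jarFileBytes, long classBytes, long allEntryBytes) {
        if (jarFileBytes < LARGE_JAR_BYTES || allEntryBytes <= 0) {
            return Technique.MEASURE_BOTH; // no guidance for small JARs
        }
        double classShare = (double) classBytes / allEntryBytes;
        return classShare > MOSTLY_CLASSES ? Technique.PACK200 : Technique.GZIP;
    }

    // Sums uncompressed entry sizes, split into class files and everything else.
    static Technique recommend(File jar) throws IOException {
        long classBytes = 0;
        long allBytes = 0;
        try (JarFile jf = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jf.entries();
            while (entries.hasMoreElements()) {
                JarEntry e = entries.nextElement();
                long size = e.getSize();
                if (e.isDirectory() || size < 0) {
                    continue; // skip directories and entries of unknown size
                }
                allBytes += size;
                if (e.getName().endsWith(".class")) {
                    classBytes += size;
                }
            }
        }
        return recommend(jar.length(), classBytes, allBytes);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CompressionAdvisor <file.jar>");
            return;
        }
        System.out.println(recommend(new File(args[0])));
    }
}
```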
2. Pack200 segmenting.
Pack200 loads the entire packed file into memory. When target
systems are memory and resource constrained, setting
Pack200.Packer.SEGMENT_LIMIT to a lower value will reduce the
memory requirements during packing and unpacking. Setting
Pack200.Packer.SEGMENT_LIMIT=-1 will force one segment to be
generated, which is effective for size reduction but requires a
much larger Java heap on the packing and unpacking systems. Note
that several of these packed segments may be concatenated to
produce a single packed file.
3. Signing the JAR files.
Pack200 rearranges the contents of the resultant JAR file. The
jarsigner hashes the contents of each class file and stores the
hash in an encrypted digest in the manifest. When the unpacker runs
on a packed file, the contents of the classes will be rearranged,
which invalidates the signature. Therefore, the JAR file must be
normalized first using pack200 and unpack200, and signed
thereafter.
(Here's why this works: Any reordering
the packer does of any classfile structures is idempotent, so the
second packing does not change the orderings produced by the first
packing. Also, the unpacker is guaranteed by the JSR 200
specification to produce a specific bytewise image for any given
transmission ordering of archive elements.)
An Example
Suppose you wish to use HelloWorld.jar.
Step 1: Repack the file to normalize the jar. Repacking calls the
packer and then unpacks the file in one step.
% pack200 --repack HelloWorld.jar
Step 2: Sign the jar after normalizing it with repack (the keystore
and alias below are placeholders).
% jarsigner -keystore <keystore> HelloWorld.jar <alias>
Verify the just-signed jar to ensure the signing worked.
% jarsigner -verify HelloWorld.jar
jar verified.
Ensure the jar still works.
% java -jar HelloWorld.jar
HelloWorld
Step 3: Now we pack the file
% pack200 HelloWorld.jar.pack.gz HelloWorld.jar
Step 4: Unpack the file
% unpack200 HelloWorld.jar.pack.gz HelloT1.jar
Step 5: Verify the jar
% jarsigner -verify HelloT1.jar
jar verified.
Test the jar.
% java -jar HelloT1.jar
HelloWorld
After verification, the compressed pack file HelloWorld.jar.pack.gz
can be deployed.
4. Reduction techniques:
Pack200 by default behaves in a High Fidelity (Hi-Fi) mode, meaning
all the original attributes present in the classes, as well as the
attributes of each individual entry in a JAR file, are retained.
These tend to add to the packed file size. Here are some of the
techniques one can use to further reduce the size of the download:
Modification times: If the modification time of the individual
entries in a JAR file is not a concern, you can set the property
Pack200.Packer.MODIFICATION_TIME to Pack200.Packer.LATEST
("latest"). This will allow one modification time to be transmitted
in the pack file for each segment. The latest time will be the
latest time of any entry within that segment.
Deflation hint: Similar to the above, if the compression state of
the individual entries in the archive is not required, set the
property Pack200.Packer.DEFLATE_HINT to "false". This will
fractionally reduce the download size, as individual compression
hints will not be transmitted. However, the jar when recomposed
will contain "stored" entries and hence may consume more disk space
on the target system.
Note: the above optimizations will yield better results with a
JAR file containing thousands of entries.
Attributes: Several class attributes are not required when
deploying JAR files. These attributes can be stripped out of class
files, significantly reducing download size. However, care must be
taken to ensure that required runtime attributes are maintained.
Debugging attributes: If debugging information, such as Line
Numbers and Source File (typically used in application stack
traces), is not required, then these attributes can be discarded by
specifying Pack200.Packer.STRIP_DEBUG=true. This typically reduces
the packed file by about 10%.
Other attributes: Advanced users may use some of the other
strip-related properties to strip out additional attributes.
However, extreme caution should be used when doing so. The
resultant JAR file must be tested on all possible Java runtime
systems to ensure that the runtime does not depend on the stripped
attributes.
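When packing is driven from a build script rather than the Java API, the reduction settings above correspond to options of the pack200 command-line tool. The sketch below only assembles the argument list, for example for use with ProcessBuilder, and does not run anything. The class name Pack200Command is hypothetical, and the sketch assumes a JDK that still ships the pack200 tool (it was removed in JDK 14).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds a pack200 command line with reduction options.
class Pack200Command {
    static List<String> build(String jar, boolean latestModTime,
                              boolean dropDeflateHints, boolean stripDebug) {
        List<String> cmd = new ArrayList<>();
        cmd.add("pack200");
        if (latestModTime) {
            cmd.add("--modification-time=latest"); // one timestamp per segment
        }
        if (dropDeflateHints) {
            cmd.add("--deflate-hint=false");       // entries unpack as "stored"
        }
        if (stripDebug) {
            cmd.add("--strip-debug");              // drop line numbers, source file
        }
        cmd.add(jar + ".pack.gz"); // output file comes first
        cmd.add(jar);              // input JAR comes last
        return cmd;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", build("HelloWorld.jar", true, true, true)));
    }
}
```

Because stripping debug attributes changes the class files themselves, a signed JAR would presumably need this option applied during the repack step, before signing, rather than at the final pack.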
5. Handling unknown attributes:
Pack200 deals with standard attributes defined by the Java Virtual
Machine Specification; however, compilers are free to introduce
custom attributes. When such attributes are present, by default,
Pack200 passes through the class, emitting a warning message. These
"passed-through" class files may contribute to bloating of packed
files. If the unknown attributes are prevalent in the classes of a
JAR file, this may lead to a very large bloat of the packed output.
In such cases, consider the following strategies:
Strip the attribute if it is deemed to be redundant at runtime.
This can be achieved by setting the property
Pack200.Packer.UNKNOWN_ATTRIBUTE=STRIP.
If the attributes are required at runtime, and they do contribute
to an inflation, then identify the attribute from the warning
message and apply a suitable layout for it, as described in the
Pack200 JSR 200 specification and the Java API reference section
for Pack200.Packer.
It is possible that a compiler could define an attribute not
implemented in the layout specification of Pack200, which may cause
the Packer to malfunction. In such cases, an entire class file (or
several) can be "passed through", as if it were a resource, by
virtue of its name. The same can be done for an entire directory
and its contents, as follows:
pack200 --pass-file="com/acme/foo/bar/" foo.pack.gz foo.jar
6. Installers:
You may wish to take advantage of the Pack200 technology in your
installation program, whereby a product's jars are compressed using
Pack200 and decompressed during the installation. If the JRE or SDK
is bundled in the installation, you are free to use unpack200
(Unix) or unpack200.exe (Windows) in the distribution's 'bin'
directory. This implementation is a pure C++ application that
requires no Java runtime to be present for it to run.
Windows: Installers may use a better algorithm than the one in GZIP
to compress entries. In such cases, one will get better compression
from the installer's intrinsic compression by using pack200 as
follows:
pack200 --no-gzip foo.jar.pack foo.jar
This will prevent the output file from being gzip compressed.
unpack200 is a Windows console application, i.e., it will display
an MS-DOS window during the install. To suppress this window, you
can use a launcher with a WinMain, as shown below.
Sample Code:
#include <windows.h>
#include <stdio.h>
#include <string.h>   /* memset */

int APIENTRY WinMain(HINSTANCE hInstance,
                     HINSTANCE hPrevInstance,
                     LPSTR lpCmdLine,
                     int nCmdShow) {
    STARTUPINFO si;
    memset(&si, 0, sizeof(si));
    si.cb = sizeof(si);

    PROCESS_INFORMATION pi;
    memset(&pi, 0, sizeof(pi));

    //Test
    //lpCmdLine = "c:/build/windows-i586/bin/unpack200 -l c:/Temp/log c:/Temp/rt.pack c:/Temp/rt.jar";
    int ret = CreateProcess(NULL,       /* Exec. name */
                            lpCmdLine,  /* cmd line */
                            NULL,       /* proc. sec. attr. */
                            NULL,       /* thread sec. attr */
                            TRUE,       /* inherit file handle */
                            CREATE_NO_WINDOW | DETACHED_PROCESS, /* detach the process/suppress console */
                            NULL,       /* env block */
                            NULL,       /* inherit cwd */
                            &si,        /* startup info */
                            &pi);       /* process info */
    if (ret == 0) ExitProcess(255);

    // Wait until child process exits.
    WaitForSingleObject(pi.hProcess, INFINITE);

    DWORD exit_val;
    // Be conservative and return
    if (GetExitCodeProcess(pi.hProcess, &exit_val) == 0) ExitProcess(255);
    ExitProcess(exit_val); // Return the error code of the child process
    return -1;             // Not reached; ExitProcess does not return
}
Testing
It is required that all JAR files, packed and unpacked, be tested
for correctness with your application's test qualifiers. When using
the command-line interface pack200, the output file will be
compressed using gzip with default values. A user may instead
create a simple pack file and compress it using gzip with
user-specified options, or using some other compressor.
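One possible automated check, in addition to the application's own tests, is to confirm that a jar restored by unpack200 carries the same entries and contents as the normalized (repacked) jar it was produced from. The sketch below compares entry names and CRC-32 checksums of the uncompressed bytes. The class name JarContentCheck is hypothetical, and the comparison deliberately ignores entry order, timestamps, and compression method.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical check: do two JAR files hold the same entries and bytes?
class JarContentCheck {
    // Maps each non-directory entry name to the CRC-32 of its uncompressed bytes.
    static Map<String, Long> checksums(File jar) throws IOException {
        Map<String, Long> result = new TreeMap<>();
        try (ZipFile zip = new ZipFile(jar)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry e = entries.nextElement();
                if (e.isDirectory()) {
                    continue;
                }
                CRC32 crc = new CRC32();
                try (InputStream in = zip.getInputStream(e)) {
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        crc.update(buf, 0, n);
                    }
                }
                result.put(e.getName(), crc.getValue());
            }
        }
        return result;
    }

    static boolean sameContent(File a, File b) throws IOException {
        return checksums(a).equals(checksums(b));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarContentCheck <a.jar> <b.jar>");
            return;
        }
        System.out.println(sameContent(new File(args[0]), new File(args[1]))
                ? "same content" : "DIFFERENT content");
    }
}
```

Comparing against a jar that was never repacked would be expected to report differences, since packing rearranges class file contents.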
In Java SE 6, the Java class file format has been updated. For
more information, see JSR 202: Java Class File Specification
Update. Because of JSR 202, the Pack200 engine needs to be updated
accordingly, for the following reasons:
To align with the new class file format for Java SE 6.
To ensure that Java SE 6 class files are compressed effectively.
To keep the changes minimal and seamless for users, the packer
will generate appropriately versioned pack files based on the
version of the input class files.
Also, to maintain backward compatibility, if the input JAR files
consist solely of JDK 1.5 or older class files, a 1.5-compatible
pack file is produced. Otherwise, a Java SE 6 compatible pack200
file is produced. For more information, refer to the Pack200
man page.
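To anticipate which pack file version a JAR would yield under the rule above, one can read the version fields at the start of each class file. The sketch below relies on the class file layout (a 0xCAFEBABE magic number followed by minor and major version) and on the major version numbers 49 for J2SE 5.0 and 50 for Java SE 6. The class and method names are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative reader for the version header of a class file.
class ClassFileVersion {
    static final int JAVA_5_MAJOR = 49; // J2SE 5.0 class files
    static final int JAVA_6_MAJOR = 50; // Java SE 6 class files

    // Returns major_version from the first eight bytes of a class file.
    static int majorVersion(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort();        // minor_version, not needed here
        return in.readUnsignedShort(); // major_version
    }

    // True when every class is JDK 1.5 or older, i.e. a 1.5-compatible
    // pack file would be expected according to the text.
    static boolean allJava5OrOlder(int... majorVersions) {
        for (int major : majorVersions) {
            if (major > JAVA_5_MAJOR) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Synthetic header: magic, minor 0, major 50.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 50};
        int major = majorVersion(new ByteArrayInputStream(header));
        System.out.println("major=" + major + " java5Compatible=" + allJava5OrOlder(major));
    }
}
```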