Sorry about this: it was thrown together quickly, more as notes to myself. Maybe you can use it?
I needed this info for one of my websites. I wanted to put up a feature to allow downloading of PDFs from my Google App Engine service, so I had to dig around to find this material. First up are some references I found that helped me understand the complexity of the task. The core of it is setting the right HTTP headers on the response going to a client such as a web browser. The HTTP Content-Type header is like a ‘suggestion’ to the receiving client as to what the payload has in it, so that the client can properly render or otherwise deal with the payload. It carries the MIME type of the document and, even more importantly for text, the character encoding. If no content type is indicated, the browser will guess the type of content. Another important header is Content-Disposition, which tells the client whether to display the payload inline or treat it as an attachment to download, and which can suggest a file name. See the wiki reference below for more on MIME.
Servlet Tutorial on Session Tracking
Servlet-Tutorial-Session-Tracking
Servlet Response Headers Tutorial
http://www.apl.jhu.edu/~hall/java/Servlet-Tutorial/Servlet-Tutorial-Response-Headers.html
How to Serve A PDF From a Java Servlet
The Java sample in this article works but runs as slow as water uphill, most likely because it copies the file one byte at a time. Typically you would run it on a server somewhere on the internet, say at Amazon Web Services or perhaps Google, or at least on a service that supports running a Java JVM. how-do-i-serve-up-a-pdf-from-a-servlet
Tutorial on Servlet Content Types
Setting_content-type_utf-8
A must-read for understanding servlet content-type declarations. It’s not as easy as you think!
Unicode and Character Sets
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
Joel on Software http://www.joelonsoftware.com/articles/Unicode
RFC2045 (MIME) + Base64 Content-Transfer-Encoding
RFC 2045, 1996, the core MIME specification (its first version, RFC 1341, dates from 1992):
http://www.ietf.org/rfc/rfc2045.txt
How to Create a Custom Jquery Plugin
http://www.ibm.com/developerworks/library/wa-jqplugin/wa-jqplugin-pdf.pdf
– courtesy IBM developerworks
Wiki MIME Content Disposition
http://en.wikipedia.org/wiki/MIME#Content-Disposition
Oracle’s Servlet API Javadocs
http://docs.oracle.com/javaee/1.4/api/javax/servlet/ServletResponse.html#setContentType(java.lang.String)
http://docs.oracle.com/javaee/1.3/api/javax/servlet/ServletResponse.html
Streaming Large Files In A Java Servlet
http://stackoverflow.com/questions/55709/streaming-large-files-in-a-java-servlet
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ReaderServlet extends HttpServlet {
    private static final long serialVersionUID = 1L;

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        doIt(request, response);
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        doIt(request, response);
    }

    private void doIt(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // Fill in this bit with the PDF file to be read.
        String pdfFileName = "somefilename.pdf";
        // Resolve the file against the web application root ("/" is the virtual path separator).
        String contextPath = getServletContext().getRealPath("/");
        File pdfFile = new File(contextPath, pdfFileName);

        // Answer with a 404 instead of an exception if the file is missing.
        if (!pdfFile.isFile()) {
            response.sendError(HttpServletResponse.SC_NOT_FOUND);
            return;
        }

        response.setContentType("application/pdf");
        // Quote the file name so that names containing spaces survive.
        response.setHeader("Content-Disposition", "attachment; filename=\"" + pdfFileName + "\"");
        response.setContentLength((int) pdfFile.length());

        // Copy in 8 KB chunks rather than one byte at a time;
        // try-with-resources closes the file even if the client disconnects.
        try (InputStream fileInputStream = new FileInputStream(pdfFile)) {
            OutputStream responseOutputStream = response.getOutputStream();
            byte[] buffer = new byte[8192];
            int bytesRead;
            while ((bytesRead = fileInputStream.read(buffer)) != -1) {
                responseOutputStream.write(buffer, 0, bytesRead);
            }
        }
    }
}
It’s possible to have a servlet serve up PDF content by setting the content type of the servlet response to the ‘application/pdf’ MIME type via response.setContentType("application/pdf"). This tutorial demonstrates it as follows.
The servlet class (ReaderServlet in the listing above, TestServlet in the original article) is mapped to /test in your web.xml file. When the servlet is hit by a browser request, it locates the PDF file in the root of the web directory. It sets the response content type to ‘application/pdf’, specifies that the response is an attachment, and sets the response content length. Following that, it writes the contents of the PDF file to the response output stream.
If we hit the servlet, the browser may ask us whether we’d like to open or save the PDF file. Some browsers ask, others do not.
This technique can be useful in a variety of ways. For example, PDF content can be generated dynamically and returned to a user via the response output stream without ever needing to create an actual file in the file system. In addition, having a servlet serve up PDF content can restrict access to a PDF file in the file system since a servlet can determine who should have access to a particular PDF file.
The original article is courtesy of Deron Eriksson: http://www.avajava.com/tutorials/lessons/how-do-i-serve-up-a-pdf-from-a-servlet.html
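The access-restriction idea mentioned above can be sketched without any servlet classes. What follows is only an illustration and not part of the original article: the class PdfAccessPolicy, its methods, and the user and file names are all hypothetical. A servlet would run a check like this before writing any bytes and answer with a 403 when it fails.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: decides whether a user may download a given PDF.
class PdfAccessPolicy {
    private final Map<String, Set<String>> filesByUser = new HashMap<>();

    // Grant one user access to one file name.
    void grant(String user, String fileName) {
        filesByUser.computeIfAbsent(user, u -> new HashSet<>()).add(fileName);
    }

    // True only if the name is a plain file name and the user was granted it.
    boolean mayDownload(String user, String fileName) {
        if (user == null || fileName == null) {
            return false;
        }
        // Refuse anything that looks like a path, so "../secret.pdf" never matches.
        if (fileName.contains("/") || fileName.contains("\\") || fileName.contains("..")) {
            return false;
        }
        return filesByUser
                .getOrDefault(user, Collections.<String>emptySet())
                .contains(fileName);
    }

    public static void main(String[] args) {
        PdfAccessPolicy policy = new PdfAccessPolicy();
        policy.grant("alice", "invoice.pdf");
        System.out.println(policy.mayDownload("alice", "invoice.pdf"));
        System.out.println(policy.mayDownload("bob", "invoice.pdf"));
    }
}
```

Keeping the decision in a small class of its own means it can be tested without a servlet container.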
Servlet Javadocs
http://docs.oracle.com/javaee/1.4/api/javax/servlet/ServletResponse.html
Wiki for MIME
(Multipurpose Internet Mail Extensions) discusses these issues more fully: http://en.wikipedia.org/wiki/MIME. Here we can read up on the different MIME headers, versions, content IDs, content dispositions and transfer encodings. It also includes a more complex discussion of multi-part messages.
IANA manages the list of known MIME media types, see here: http://www.iana.org/assignments/media-types/index.html
Internet media types wiki: http://en.wikipedia.org/wiki/Internet_media_type
The Java servlet response’s setContentType accepts any MIME type, optionally followed by a charset parameter:
response.setContentType("application/json");
response.setContentType("text/html;charset=UTF-8");
response.setContentType("text/plain");
or, for typical images, the raw response headers look something like this:
Content-Type: image/jpeg
Content-Disposition: attachment; filename="santa.jpeg"
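The charset parameter in the text/html example above only helps if the bytes on the wire were really produced with that encoding. Here is a small sketch, using only the JDK, of how the same string takes up a different number of bytes under different encodings. That is also why a content length has to be computed from the encoded bytes and not from the string length. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the servlet API.
class CharsetDemo {
    // Builds a Content-Type value such as "text/html;charset=UTF-8".
    static String contentType(String mimeType, Charset charset) {
        return mimeType + ";charset=" + charset.name();
    }

    // Number of bytes the body occupies once encoded with the given charset.
    static int encodedLength(String body, Charset charset) {
        return body.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String body = "caf\u00e9"; // "café", four characters
        System.out.println(contentType("text/html", StandardCharsets.UTF_8));
        // The accented letter needs two bytes in UTF-8 ...
        System.out.println(encodedLength(body, StandardCharsets.UTF_8));
        // ... but only one byte in ISO-8859-1.
        System.out.println(encodedLength(body, StandardCharsets.ISO_8859_1));
    }
}
```

If the declared charset and the actual encoding disagree, the browser will show garbage for every non-ASCII character.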
Here is a short list
Type of Application
For Multipurpose Files
application/atom+xml: Atom feeds
application/ecmascript: ECMAScript/JavaScript; Defined in RFC 4329 (equivalent to application/javascript but with stricter processing rules)
application/EDI-X12: EDI X12 data; Defined in RFC 1767
application/EDIFACT: EDI EDIFACT data; Defined in RFC 1767
application/json: JavaScript Object Notation JSON; Defined in RFC 4627
application/javascript: ECMAScript/JavaScript; Defined in RFC 4329 (equivalent to application/ecmascript but with looser processing rules) It is not accepted in IE 8 or earlier – text/javascript is accepted but it is defined as obsolete in RFC 4329. The “type” attribute of the <script> tag in HTML5 is optional and in practice omitting the media type of JavaScript programs is the most interoperable solution since all browsers have always assumed the correct default even before HTML5.
application/octet-stream: Arbitrary binary data. Generally speaking this type identifies files that are not associated with a specific application. Contrary to past assumptions by software packages such as Apache this is not a type that should be applied to unknown files. In such a case, a server or application should not indicate a content type, as it may be incorrect, but rather, should omit the type in order to allow the recipient to guess the type.
application/ogg: Ogg, a multimedia bitstream container format; Defined in RFC 5334
application/pdf: Portable Document Format, PDF has been in use for document exchange on the Internet since 1993; Defined in RFC 3778
application/postscript: PostScript; Defined in RFC 2046
application/rdf+xml: Resource Description Framework; Defined by RFC 3870
application/rss+xml: RSS feeds
application/soap+xml: SOAP; Defined by RFC 3902
application/font-woff: Web Open Font Format; (candidate recommendation; use application/x-font-woff until standard is official)
application/xhtml+xml: XHTML; Defined by RFC 3236
application/xml-dtd: DTD files; Defined by RFC 3023
application/xop+xml: XOP
application/zip: ZIP archive files;
application/x-gzip: Gzip
Type Audio
For Audio
audio/basic: mulaw audio at 8 kHz, 1 channel; Defined in RFC 2046
audio/L24: 24bit Linear PCM audio at 8-48kHz, 1-N channels; Defined in RFC 3190
audio/mp4: MP4 audio
audio/mpeg: MP3 or other MPEG audio; Defined in RFC 3003
audio/ogg: Ogg Vorbis, Speex, Flac and other audio; Defined in RFC 5334
audio/vorbis: Vorbis encoded audio; Defined in RFC 5215
audio/x-ms-wma: Windows Media Audio; Documented in MS kb288102
audio/x-ms-wax: Windows Media Audio Redirector; Documented in MS kb288102
audio/vnd.rn-realaudio: RealAudio; Documented in RealPlayer Help
audio/vnd.wave: WAV audio; Defined in RFC 2361
audio/webm: WebM open media format
Type Image
image/gif: GIF image; Defined in RFC 2045 and RFC 2046
image/jpeg: JPEG JFIF image; Defined in RFC 2045 and RFC 2046
image/pjpeg: JPEG JFIF image; Internet Explorer; Listed in ms775147(v=vs.85) – Progressive JPEG, initiated before global browser support for progressive JPEGs (Microsoft and Firefox).
image/png: Portable Network Graphics; Defined in RFC 2083
image/svg+xml: SVG vector image; Defined in SVG Tiny 1.2 Specification Appendix M
image/tiff: Tag Image File Format (only for Baseline TIFF); Defined in RFC 3302
image/vnd.microsoft.icon: ICO image;
Type Multipart
For archives and other objects with more than one part.
multipart/mixed: MIME Email; Defined in RFC 2045 and RFC 2046
multipart/alternative: MIME Email; Defined in RFC 2045 and RFC 2046
multipart/related: MIME Email; Defined in RFC 2387 and used by MHTML (HTML mail)
multipart/form-data: MIME Webform; Defined in RFC 2388
multipart/signed: Defined in RFC 1847
multipart/encrypted: Defined in RFC 1847
Type Text
For Human-Readable Text and Source Code
text/cmd: commands; subtype resident in Gecko browsers like Firefox 3.5
text/css: Cascading Style Sheets; Defined in RFC 2318
text/csv: Comma-separated values; Defined in RFC 4180
text/html: HTML; Defined in RFC 2854
text/javascript (Obsolete): JavaScript; Defined in and obsoleted by RFC 4329 in order to discourage its usage in favor of application/javascript. However, text/javascript is allowed in HTML 4 and 5 and, unlike application/javascript, has cross-browser support. The “type” attribute of the <script> tag in HTML5 is optional and there is no need to use it at all since all browsers have always assumed the correct default (even in HTML 4 where it was required by the specification).
text/plain: Textual data; Defined in RFC 2046 and RFC 3676
text/vcard: vCard (contact information); Defined in RFC 6350
text/xml: Extensible Markup Language; Defined in RFC 3023
Type Video
For Video
video/mpeg: MPEG-1 video with multiplexed audio; Defined in RFC 2045 and RFC 2046
video/mp4: MP4 video; Defined in RFC 4337
video/ogg: Ogg Theora or other video (with audio); Defined in RFC 5334
video/quicktime: QuickTime video;
video/webm: WebM Matroska-based open media format
video/x-matroska: Matroska open media format
video/x-ms-wmv: Windows Media Video; Documented in Microsoft KB 288102
video/x-flv: Flash video (FLV files)
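In a servlet you usually need to get from a file name to one of the types listed above. A minimal sketch of such a lookup follows. The class MimeTypes, its small table and the method forFileName are hypothetical, and only a handful of entries from the list are included. Unknown extensions give an empty result so that the caller can leave the content type unset, as the application/octet-stream entry above suggests.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup table from file extension to MIME type.
class MimeTypes {
    private static final Map<String, String> BY_EXTENSION = new HashMap<>();

    static {
        BY_EXTENSION.put("pdf", "application/pdf");
        BY_EXTENSION.put("json", "application/json");
        BY_EXTENSION.put("zip", "application/zip");
        BY_EXTENSION.put("gif", "image/gif");
        BY_EXTENSION.put("jpg", "image/jpeg");
        BY_EXTENSION.put("jpeg", "image/jpeg");
        BY_EXTENSION.put("png", "image/png");
        BY_EXTENSION.put("css", "text/css");
        BY_EXTENSION.put("csv", "text/csv");
        BY_EXTENSION.put("html", "text/html");
        BY_EXTENSION.put("txt", "text/plain");
        BY_EXTENSION.put("mp4", "video/mp4");
    }

    // Returns the MIME type for the file's extension, or empty if it is unknown.
    static Optional<String> forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty(); // no extension at all
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(BY_EXTENSION.get(extension));
    }

    public static void main(String[] args) {
        System.out.println(forFileName("somefilename.pdf"));
        System.out.println(forFileName("santa.jpeg"));
        System.out.println(forFileName("README"));
    }
}
```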
XML Use On The Internet
XML is described in more detail on this wiki page: http://en.wikipedia.org/wiki/XML#Use_on_the_Internet
The design goals of XML emphasize simplicity, generality, portability and usability over the Internet.
It is a textual data format with Unicode support for several world languages. It is widely used for the representation of data structures and as a data transport layer between applications written in many programming languages.
XML dialects include RSS, Atom, SOAP and XHTML. Several office suites such as Microsoft Office, OpenOffice and LibreOffice use XML-based document formats, and XMPP, a communications protocol used for chat sessions, is built on XML as well.
Content-Disposition
The original MIME specifications only described the structure of mail messages. They did not address the issue of presentation styles. The content-disposition header field was added in RFC 2183 to specify the presentation style. A MIME part can have:
- an inline content-disposition, which means that it should be automatically displayed when the message is displayed, or
- an attachment content-disposition, in which case it is not displayed automatically and requires some form of action from the user to open it.
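The two dispositions can be produced with a tiny helper. This is a sketch only: the class ContentDisposition and its methods are hypothetical, and it copes just with plain ASCII file names, quoting them and escaping embedded quotes and backslashes. Non-ASCII names need the extended filename* parameter, which is left out here.

```java
// Hypothetical helper that builds Content-Disposition header values.
class ContentDisposition {
    // Value asking the client to display the content in place.
    static String inline() {
        return "inline";
    }

    // Value asking the client to offer a download under the given name.
    static String attachment(String fileName) {
        // Escape backslashes first, then double quotes, so the quoted string stays well formed.
        String escaped = fileName.replace("\\", "\\\\").replace("\"", "\\\"");
        return "attachment; filename=\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        System.out.println("Content-Disposition: " + inline());
        System.out.println("Content-Disposition: " + attachment("santa.jpeg"));
        System.out.println("Content-Disposition: " + attachment("my report.pdf"));
    }
}
```

In a servlet the returned string would be passed as the value of the Content-Disposition response header.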
Content-Transfer-Encoding
In June 1992, MIME (RFC 1341, since made obsolete by RFC 2045) defined a set of methods for representing binary data in ASCII text format. The Content-Transfer-Encoding MIME header has a two-sided significance:
It indicates whether or not a binary-to-text encoding scheme has been used on top of the original encoding as specified within the Content-Type header:
- If such a binary-to-text encoding method has been used, it states which one.
- If not, it provides a descriptive label for the format of content, with respect to the presence of 8 bit or binary content.
The RFC and the IANA’s list of transfer encodings define the values. See: http://www.iana.org/assignments/transfer-encodings/transfer-encodings.xml
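base64 is the most common of these binary-to-text encodings. As a small illustration, the JDK’s java.util.Base64 class offers a MIME flavour that wraps its output at 76 characters per line, which is the limit RFC 2045 sets. The class TransferEncodingDemo and the sample payload are hypothetical.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo of the base64 transfer encoding using the JDK's MIME encoder.
class TransferEncodingDemo {
    // Encodes binary data as base64 text, wrapped at 76 characters with CRLF line breaks.
    static String encodeBase64(byte[] binary) {
        return Base64.getMimeEncoder().encodeToString(binary);
    }

    // Decodes base64 text back to the original bytes, ignoring the line breaks.
    static byte[] decodeBase64(String text) {
        return Base64.getMimeDecoder().decode(text);
    }

    public static void main(String[] args) {
        // 100 bytes of sample binary data.
        byte[] payload = new byte[100];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) i;
        }

        String encoded = encodeBase64(payload);
        System.out.println("Content-Transfer-Encoding: base64");
        System.out.println();
        System.out.println(encoded);

        // The round trip gives back exactly the bytes we started with.
        System.out.println(Arrays.equals(payload, decodeBase64(encoded)));
    }
}
```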