http - compression

Web/HTTP 2013. 6. 27. 13:02


HTTP compression is a capability that can be built into web servers and web clients to make better use of available bandwidth, and provide greater transmission speeds between both.[1]

HTTP data is compressed before it is sent from the server: compliant browsers announce which methods they support before downloading the correct format; browsers that do not support a compliant compression method download uncompressed data. The most common compression schemes include gzip and deflate; a full list of available schemes is maintained by IANA.[2] Additionally, third parties develop new methods and include them in their products (e.g. the Google SDCH scheme implemented in the Google Chrome browser and used on certain Google servers).

Client/Server compression scheme negotiation

In most cases, excluding SDCH, the negotiation is done in two steps, described in RFC 2616:

1. The web client includes an Accept-Encoding field in the HTTP request, listing the names of the compression schemes it supports (called content-coding tokens), separated by commas.

GET /encrypted-area HTTP/1.1
Host: www.example.com
Accept-Encoding: gzip, deflate

2. If the server supports one or more compression schemes, the outgoing data may be compressed by one or more methods supported by both parties. If this is the case, the server will add a Content-Encoding field to the HTTP response naming the schemes used, separated by commas.

HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/1.3.3.7 (Unix)  (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
Etag: "3f80f-1b6-3e1cb03b"
Accept-Ranges: bytes
Content-Length: 438
Connection: close
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip

The web server is by no means obliged to use any compression method; whether it does depends on the internal settings of the web server and may also depend on the internal architecture of the website in question.
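
The two negotiation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular web server implements it: the names parse_accept_encoding, SERVER_CODINGS and negotiate are hypothetical, and the sketch ignores the optional quality values (";q=...") that RFC 2616 allows in Accept-Encoding.

```python
import gzip
import zlib


def parse_accept_encoding(header_value):
    """Split an Accept-Encoding value into lowercase content-coding tokens."""
    tokens = []
    for item in header_value.split(","):
        # Drop any ";q=..." parameter; this sketch ignores quality values.
        token = item.split(";", 1)[0].strip().lower()
        if token:
            tokens.append(token)
    return tokens


# Codings this hypothetical server can produce, in order of preference.
SERVER_CODINGS = {
    "gzip": gzip.compress,
    "deflate": zlib.compress,  # zlib format (RFC 1950), as RFC 2616 describes
}


def negotiate(accept_encoding, body):
    """Return (headers, payload) for a response body, given the request header."""
    client_tokens = parse_accept_encoding(accept_encoding or "")
    for coding, compress in SERVER_CODINGS.items():
        if coding in client_tokens:
            payload = compress(body)
            headers = {
                "Content-Encoding": coding,
                "Content-Length": str(len(payload)),
            }
            return headers, payload
    # No shared coding: send the data uncompressed and add no Content-Encoding.
    return {"Content-Length": str(len(body))}, body


if __name__ == "__main__":
    page = b"<html>" + b"hello " * 200 + b"</html>"
    headers, payload = negotiate("gzip, deflate", page)
    print(headers, "original size:", len(page))
```

When the client sends no Accept-Encoding field, or none of its tokens match, the sketch returns the body unchanged, which mirrors the point that the server is never obliged to compress.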

In the case of SDCH, a dictionary negotiation is also required, which may involve additional steps such as downloading the proper dictionary from an external server.

Problems preventing the use of HTTP compression

A 2009 article by Google engineers Arvind Jain and Jason Glasgow states that more than 99 person-years are wasted[3] daily due to increased page load times when users do not receive compressed content. This occurs where anti-virus software interferes with connections to force them to be uncompressed, where proxies are used (with overcautious web browsers), where servers are misconfigured, and where browser bugs stop compression from being used. Internet Explorer 6, which drops to HTTP 1.0 (without features like compression or pipelining) when behind a proxy, a common configuration in corporate environments, was the mainstream browser most prone to falling back to uncompressed HTTP.[4]

Content-coding tokens

  • compress - UNIX "compress" program method
  • deflate - despite its name, the zlib format (RFC 1950) should be used, wrapping deflate-compressed data (RFC 1951), as described in RFC 2616. Real-world implementations, however, vary between zlib-wrapped data and raw deflate data.[5][6] Because of this confusion, gzip has positioned itself as the more reliable default method (March 2011).
  • exi - W3C Efficient XML Interchange
  • gzip - GNU zip format (described in RFC 1952). This method is the most broadly supported as of March 2011.[7]
  • identity - No transformation is used. This is the default value for content coding.
  • pack200-gzip - Network Transfer Format for Java Archives [8]
  • sdch[citation needed] - Google Shared Dictionary Compression for HTTP
  • bzip2[citation needed] - free and open source lossless data compression algorithm
  • peerdist[citation needed] - Microsoft Peer Content Caching and Retrieval (described in MS-PCCRPT)
  • lzma[citation needed] - elinks supports LZMA via a compile-time option.[9] LZMA support has also been proposed for Firefox and Gecko (patch discussed in [1]); this is particularly interesting for smartphones and tablets, where bandwidth is limited, because LZMA has a very high compression ratio compared to gzip.
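
The difference between the gzip token and the two interpretations of deflate can be made concrete with Python's standard zlib module, which can emit all three containers around the same deflate algorithm. The following is a small sketch for illustration: the helper names compress_with and decode_deflate are hypothetical, and the fallback strategy shown is one tolerant way a client might cope with the ambiguity, not a behaviour mandated by any specification.

```python
import zlib

data = b"HTTP compression example. " * 50


def compress_with(wbits):
    # wbits selects the container: 15 = zlib (RFC 1950),
    # -15 = raw deflate (RFC 1951), 31 = gzip (RFC 1952).
    compressor = zlib.compressobj(wbits=wbits)
    return compressor.compress(data) + compressor.flush()


zlib_wrapped = compress_with(15)
raw_deflate = compress_with(-15)
gzip_wrapped = compress_with(31)


def decode_deflate(payload):
    """Decode a 'deflate' body, accepting zlib-wrapped or raw deflate data."""
    try:
        return zlib.decompress(payload)  # expects the zlib wrapper
    except zlib.error:
        return zlib.decompress(payload, -15)  # fall back to raw deflate


if __name__ == "__main__":
    for name, blob in [("zlib", zlib_wrapped),
                       ("raw deflate", raw_deflate),
                       ("gzip", gzip_wrapped)]:
        print(name, len(blob), "bytes, starts with", blob[:2].hex())
```

A gzip body is unambiguous because it always starts with the magic bytes 1f 8b, which is one reason it is the safer default.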

Servers that support HTTP compression

HTTP compression can also be implemented in the application itself, using server-side scripting languages such as PHP or programming languages such as Java.
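
As an illustration of compressing at the application level rather than in the web server, here is a minimal sketch of a WSGI application in Python (chosen only for the example; the text above names PHP and Java). The application name app and the page content are hypothetical, and quality values in Accept-Encoding are again ignored.

```python
import gzip


def app(environ, start_response):
    """A tiny WSGI application that gzips its output when the client allows it."""
    body = b"<html><body>" + b"<p>hello</p>" * 100 + b"</body></html>"
    headers = [("Content-Type", "text/html; charset=UTF-8")]

    # WSGI exposes the Accept-Encoding request field as HTTP_ACCEPT_ENCODING.
    accept = environ.get("HTTP_ACCEPT_ENCODING", "")
    accepted = [t.split(";", 1)[0].strip().lower() for t in accept.split(",")]

    if "gzip" in accepted:
        body = gzip.compress(body)
        headers.append(("Content-Encoding", "gzip"))

    headers.append(("Content-Length", str(len(body))))
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    # Serve on a local port for manual testing; the port number is arbitrary.
    from wsgiref.simple_server import make_server
    with make_server("127.0.0.1", 8000, app) as server:
        server.serve_forever()
```

In practice it is usually simpler to let the web server handle compression; doing it in the application is mainly useful when the server cannot be configured.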

References

  1. ^ "Using HTTP Compression (IIS 6.0)". Microsoft Corporation. Retrieved 9 February 2010.
  2. ^ RFC 2616, Section 3.5: "The Internet Assigned Numbers Authority (IANA) acts as a registry for content-coding value tokens."
  3. ^ "Use compression to make the web faster". Google Developers. Retrieved 22 May 2013.
  4. ^ http://code.google.com/speed/articles/use-compression.html
  5. ^ "Compression Tests". Verve Studios, Co. Retrieved 19 July 2012.
  6. ^ "Frequently Asked Questions about zlib - What's the difference between the "gzip" and "deflate" HTTP 1.1 encodings?". Greg Roelofs, Jean-loup Gailly and Mark Adler. Retrieved 23 March 2011.
  7. ^ "Compression Tests: Results". Verve Studios, Co. Retrieved 19 July 2012.
  8. ^ JSR 200: Network Transfer Format for Java Archives.
  9. ^ elinks LZMA decompression
  10. ^ "HOWTO: Use Apache mod_deflate To Compress Web Content (Accept-Encoding: gzip) - Mark S. Kolich". Mark S. Kolich. Retrieved 23 March 2011.
  11. ^ https://issues.apache.org/bugzilla/show_bug.cgi?id=53121
  12. ^ Extra part of Hiawatha webserver's manual

Source - http://en.wikipedia.org/wiki/HTTP_compression

Posted by linuxism