The UK National Web Cache

A State of the Art Report

Neil G. Smith,


The University of Kent at Canterbury.


Two years after its introduction at the First International World-Wide Web Conference at CERN, Geneva, the use of caching technology to improve the efficiency of network utilisation has become a hot topic. With relatively poor international connectivity, it was through necessity that UK academia became one of the first communities to make widespread use of this technology on a large scale. The implementation of a national strategy proposed by HENSA Unix in June 1995 has led an experimental project to become what is probably the most mature caching facility in the world today. In this paper we present a brief history of the project, discuss the evolution of the hardware, software and networking systems involved, and look to the future of the project within the framework of the UK's networking strategy. It is hoped that some of our experiences may be of use to other large bodies of users who are tired of waiting for their Web pages to arrive.

Contents

Introduction
System Evolution
Lagoon - CERN - Netscape
Alternative Servers
Hardware Demands
Hardware Resource Balancing
Networking and Machine Load Balancing
The Users
Future Developments
Proxy Auto-configuration
Cache co-operation
Networks for Caches
New HTTP Protocols
Conclusion
References

Introduction

The World-Wide Web has long suffered as a result of its own popularity. The ease with which large video and audio data types can be incorporated into documents, combined with the model of a single publisher serving countless clients, places great demands on bandwidth. While this may not present a problem on a local area network, or even within a national context, as soon as information passes across international networks the lack of bandwidth and the resulting congestion are immediately apparent. (This problem is very obvious in the UK. Nationally we have good connectivity with a 150Mbps backbone, but our international links are relatively slow: 4Mbps to the United States, 4Mbps to Europe and 2Mbps to Scandinavia.)

The problem was already recognised by the time of the First International World-Wide Web Conference [1] in May 1994 and has received a steady stream of attention since then. The consensus of opinion suggests that distributing the publication responsibility through the deployment of Web proxy caches gives us the quickest route to a medium term solution. In future, more sophisticated schemes may allow for more flexible publication mechanisms that avoid some of the problems that proxies introduce. However, for the moment proxies are all that we have, and their role is now central in many users’ access to the Web.

This means that all new protocol developments must take account of these intermediate servers and work with them. This makes protocol development more complicated, but the impact that proxy caches can have is so great that we cannot afford to ignore them.

System Evolution

The evolution of the systems in use at HENSA Unix has been forced by the great demand on the service. Until recently this demand always outstripped the resources available. At points in the service's history the demand has been so great that queues on the servers made using the cache slower than going direct. The graph below shows how the service has grown (December’s dip is, of course, the seasonal norm).

Lagoon - CERN - Netscape

In November 1993, after initial experiments confirmed that the wholesale mirroring of Web pages was not an effective way to reduce the latency seen on the networks, HENSA Unix adopted Lagoon [2] as an experimental proxy cache. At the time, the necessary protocol extensions to support proxying were not in place and early versions of Lagoon had to make use of a CGI script that rewrote HTML on the fly in order to direct clients back to the cache for each subsequent page that they retrieved. Despite some innovative features (cache co-operation was already being discussed), this HTML rewriting and other performance-related problems meant that the client base being supported by HENSA Unix was becoming too large for Lagoon.

At about the time of the First International Conference, a proxy mechanism was introduced into the CERN HTTP server [3]. New versions of Mosaic made the use of this facility transparent to the user and proxying started to become a viable proposition. HENSA Unix continued to use the CERN server for almost a year, but with the increasing popularity of the cache, the forking process model used by the CERN server started to place a higher and higher load on our hardware. At this point the caching service was still experimental and not receiving its own funding.

It was because the CERN server forked for each connection it received that the service eventually started to fail. The incoming connections could not be accepted fast enough and users were being turned away. Hacks to increase the priority of the parent process while decreasing that of the child processes only helped for a very short time.

At the beginning of 1995 Netscape started beta testing their own proxy server and HENSA Unix was asked to act as a test site. The Netscape server[4] relies on a non-forking process model and thereby places a significantly lower strain on the hardware. The Netscape proxy server is still used to provide the main caching service; a service currently responding to over 1,100,000 requests every day.

Alternative Servers

While the evolution from Lagoon to the CERN server and finally to the Netscape proxy server represents a considerable improvement in stability, configurability and performance, the fundamental principles involved have not changed a great deal. Each of these servers still merely acts as a simple proxy with a cache of pages to improve performance. Other projects have developed proxy servers that attempt to go further, the most notable being the Harvest Object Cache [5].

Harvest allows a single cache to interact with neighbour and parent caches in a co-operating hierarchy. These neighbours will normally be on networks that the cache has good access to. This model improves performance in the case of a cache miss by allowing other close-by caches to say whether they have the requested page. If another local cache has the page then it will be retrieved from that cache rather than the remote site. This means that any cache that is part of a large co-operative hierarchy benefits from the pages stored in all the other caches in that hierarchy.
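The lookup order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Harvest's implementation: the Cache class, its method names and the origin callback are hypothetical, peers are queried in-process rather than over the network, and the parent is treated simply as one more nearby cache to ask.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Cache:
    """Toy cache node (hypothetical; not Harvest's actual data structures)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}
        self.neighbours: List["Cache"] = []
        self.parent: Optional["Cache"] = None

    def has(self, url: str) -> bool:
        """Answer a peer's question: is this page held locally?"""
        return url in self.store

    def fetch(self, url: str, origin: Callable[[str], bytes]) -> Tuple[bytes, str]:
        """Return (page, source), where source says where the page came from."""
        # 1. Local hit: serve straight from our own store.
        if url in self.store:
            return self.store[url], "local"

        # 2. Local miss: ask nearby caches whether they hold the page.
        peers = list(self.neighbours)
        if self.parent is not None:
            peers.append(self.parent)
        for peer in peers:
            if peer.has(url):
                body = peer.store[url]
                self.store[url] = body  # keep a copy for next time
                return body, peer.name

        # 3. No nearby cache has it: fall back to the remote site.
        body = origin(url)
        self.store[url] = body
        return body, "origin"
```

In this sketch a page fetched once by any member of the hierarchy is afterwards served to its peers without a second trip to the remote site, which is the benefit the co-operative model is meant to deliver.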

While Harvest’s approach goes one better than other, simpler proxies, it unfortunately relies on a single process model. This process uses non-blocking I/O, which results in relatively good performance. However, the question remains as to whether this model can ever be fast enough to serve the size of community currently using HENSA Unix. This community currently averages 28 connections a second at peak times in the afternoons, with peaks in activity of more than 100 connections per second.

Hardware Demands

For the first eighteen months of service, the HENSA Unix cache was placed on the same single processor Sparc 10 that served the HENSA Unix FTP archive. As the popularity of both services increased an upgrade was required and a Silicon Graphics Challenge S was deployed. It was anticipated that the cache would remain on the Sparc 10 while the FTP archive moved to the Silicon Graphics machine. Unfortunately the demand on the service increased to fill all the spare capacity on the Sparc and, with the Challenge S being the fastest machine available, both the FTP archive and the Web caching service were moved to this machine.

The very high connection rate being experienced on this server, and on conventional HTTP servers at other sites, stressed operating systems in ways in which they had never been stressed before. Now it was not the hardware that was insufficient, but bugs in TCP implementations that made the service unstable. With the obvious demand for fixes from all quarters of the community, the vendors were quick to patch up the problems and demand could continue to grow.

At this point another problem struck the HENSA Unix service. The Silicon Graphics machine had always been intended to support the FTP archive and as a result it was ordered with a small number of very large disks (three 9GB disks). Even with the cached files spread across three disks, the I/O bottleneck was great enough to mean that in some cases going via the cache was actually slower than going to a site directly. The impact this had on the FTP archive (the service for which we received our funding) was that FTP users were unable to connect to the machine at all. At this point the cache was serving about 300,000 requests each day.

The solution to the problem came when an emergency equipment purchase expanded the service with a dual processor Silicon Graphics Challenge DM. This machine was ordered with six 2GB disks to ensure that bandwidth within the machine would not be a constraint on the service. The service was migrated to this machine in June 1995. The start of the UK academic year in October 1995 saw this machine responding to over 900,000 requests each day. Once again, the demand had expanded to fill the available capacity.

A proposal by HENSA Unix to take the experience gained through operating this experimental service and deploy a scalable and reliable national service was accepted in July 1995. For the first time, this enabled us to invest in equipment that would be capable of keeping up with the demand. In order to provide resilience in the case of hardware failure a number of machines would be used. Based on previous experience, each of these machines would have the optimum balance of processor power, disk bandwidth and system memory.

Hardware Resource Balancing

A busy Web cache tests all the sub-systems in a machine. Surprisingly, network bandwidth is not always the most important concern or the first bottleneck hit. This reflects the disparity between transfer rates on local and on international networks. Instead, a lack of disk bandwidth, processor speed or real memory can bring a cache server to a grinding halt.

In the case of the Netscape Proxy server it is the combination of the speed of the processor and the amount of real memory that determines how many concurrent users you may support. Each of our 175MHz R4400 based servers, with 128MB of memory, can support approximately 650 concurrent connections.

Disk bandwidth is a more serious concern than disk space once a minimum level has been passed. Simulations based on real cache activity show the hit-rates being achieved by larger and larger caches stabilising at approximately 55%. The growth in hit-rate is quite rapid, and with very large disk drives now available at a fraction of their cost even two years ago, there is no reason why all caches could not achieve this hit-rate. While the size of the disk determines the hit-rate, bandwidth to the disks is most likely to be the first bottleneck after the international networks. Making use of a large number of disks, and distributing the cache data across these disks, is a facility now offered by both Harvest and the Netscape Proxy server.
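A simulation of this kind can be sketched by replaying a request log against caches of increasing size and recording the hit-rate for each. The sketch below makes several assumptions that the text does not state: the log is reduced to (URL, size in bytes) pairs, the cache evicts the least recently used page first, and every response is cacheable. The function name is hypothetical.

```python
from collections import OrderedDict
from typing import Iterable, Tuple


def simulate_hit_rate(requests: Iterable[Tuple[str, int]], capacity_bytes: int) -> float:
    """Replay (url, size) requests through an LRU cache and return the hit-rate."""
    cache: "OrderedDict[str, int]" = OrderedDict()  # url -> size, oldest first
    used = 0
    hits = 0
    total = 0
    for url, size in requests:
        total += 1
        if url in cache:
            hits += 1
            cache.move_to_end(url)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # object can never fit; count it as a miss only
        # Evict least recently used pages until the new one fits.
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[url] = size
        used += size
    return hits / total if total else 0.0
```

Calling the function with the same trace and a range of capacities gives one point per cache size; plotting those points would show where the hit-rate levels off for a given workload.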

It remains to be seen whether the continual growth of both server and client populations on the Web makes a significant difference to the hit-rates attained by, and the disk space demanded by, caching proxies. It is true to say that with more servers there will be more potentially cachable data, but on the other hand, with more clients there are a greater number of hits on the popular pages. This may lead to caches that are as effective without any increase in disk capacity.

Networking and Machine Load Balancing

Throughout the first two years of service, the network structure surrounding the HENSA Unix cache did not change. It was only with the acceptance of the proposal for a national strategy that changes were made to the operation of the cache on the network.

Having multiple machines provides resilience in the case of hardware failure. If these machines are distributed across several sites then resilience against network failure is also gained. Currently the HENSA Unix cache is implemented with machines at two sites, the University of Kent and the University of Leeds. This distribution also ensures that the bandwidth into any particular site or, more importantly since the caches are bandwidth magnifiers, out of it does not become the bottleneck in the whole scheme.

In order to evenly distribute the load across the machines supporting the cache we anticipated having to modify a DNS name server to return the name of the most lightly loaded machine. In fact, this proved unnecessary as more recent versions of BIND provide a round-robin facility that rotates the list of addresses corresponding to a single name. With a five minute Time-To-Live on the name this is sufficient to ensure that, over a 24 hour period, the load across all six machines is even. It also gives us the ability to quickly reconfigure the group of machines supporting the service in the event of a hardware failure.
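The effect of rotating the address list can be illustrated with a small simulation. This is a sketch of the idea only and does not reproduce BIND's behaviour in detail: the class and function names are hypothetical, the addresses are taken from the range reserved for documentation, each lookup is assumed to reach the name server (resolver caching for the Time-To-Live is ignored), and each client is assumed to use the first address in the answer.

```python
from collections import Counter, deque
from typing import Deque, Iterable, List


class RoundRobinName:
    """One name mapped to several addresses, rotated after every answer."""

    def __init__(self, addresses: Iterable[str]) -> None:
        self._addresses: Deque[str] = deque(addresses)

    def resolve(self) -> List[str]:
        """Return the current address list, then rotate it by one place."""
        answer = list(self._addresses)
        self._addresses.rotate(-1)
        return answer

    def remove(self, address: str) -> None:
        """Take a failed machine out of the rotation."""
        self._addresses.remove(address)


def spread_load(name: RoundRobinName, lookups: int) -> Counter:
    """Count how many clients land on each machine over a number of lookups."""
    load: Counter = Counter()
    for _ in range(lookups):
        load[name.resolve()[0]] += 1
    return load
```

With six addresses and 600 lookups, each machine receives 100 clients in this model. Removing an address models the reconfiguration after a hardware failure; in practice, resolvers that have already cached an answer may keep using the old list until the Time-To-Live expires, which is why a short value such as five minutes keeps that window small.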

Further distribution of the caching facility is envisaged in the UK’s overall strategy. This distribution consists of local caches operated by an institution or even a department within an institution. We are encouraging these local caches to then make use of the national cache to minimise redundant transfers across the international network links. Simulations based on the log files collected at HENSA Unix show us that an institution, even with only a relatively small cache, 500MB of disk, can reduce the load placed on the national facility by as much as 40%. Institutions without the specialist knowledge to operate a WWW proxy cache are being encouraged to approach their closest Metropolitan Area Network to make use of a cache at this point.

Through the study of server log files from sites outside the UK, a number of institutions were found that were not making use of the national caching facility. When they were questioned, the most common response was that they intended to install a local cache and did not want to go through the user education procedure twice: first telling their users to make use of the national cache at HENSA, and shortly afterwards redirecting them to the local cache.
