Automatic detection of shared fragments in large collections of web pages and its applications

Authors: Gu, Zhimin; Ma, Junchang

Source: Journal of Algorithms & Computational Technology, Volume 1, Number 2, June 2007, pp. 215-250 (36)

Publisher: Multi-Science Publishing Co Ltd

Abstract:

Various approaches have been proposed to reduce network-related delays in serving dynamic web pages. A fundamental problem common to several of these approaches is how to automatically find shared fragments in large numbers of web pages. The same problem also arises in studies of web content characteristics at fragment granularity. This paper gives a formal definition of the problem, presents an efficient and scalable algorithm for it, and introduces applications of the algorithm. In the problem definition, we introduce the notion of a compound fragment, and our definition of a maximal shared fragment captures the real characteristics of fragments that are suitable for individual delivery and caching. Our algorithm has two distinguishing features: (1) it finds true maximal shared fragments, and (2) it handles large collections of web pages effectively by using database techniques. The algorithm has been implemented and applied to 16 large sets of web pages. The experiments show that the algorithm can effectively handle large numbers of web pages and can provide significant bandwidth savings and latency reduction when used in fragment-based web caching.
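
The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of the general idea of finding shared fragments, not the authors' method. It assumes a toy page model (nested tuples of the form `(tag, child, ...)` with plain strings as text leaves) and a simplified notion of maximality: a subtree is reported when at least two pages contain it and its enclosing subtree is not shared by exactly the same pages. The names `annotate` and `find_shared_fragments` are hypothetical, and the paper's compound fragments and database techniques are not modelled here.

```python
import hashlib
from collections import defaultdict


def _sha1(text):
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def annotate(node):
    """Turn a (tag, *children) tree into (hash, node, [annotated children])."""
    if isinstance(node, str):  # text leaf
        return (_sha1("text:" + node), node, [])
    tag, children = node[0], node[1:]
    kids = [annotate(child) for child in children]
    payload = "tag:" + tag + "(" + ",".join(k[0] for k in kids) + ")"
    return (_sha1(payload), node, kids)


def find_shared_fragments(pages):
    """Map fragment hash -> (subtree, frozenset of page ids that contain it).

    Assumed simplification: only element subtrees are reported, and a subtree
    is skipped when its parent is shared by exactly the same set of pages.
    """
    annotated = {pid: annotate(tree) for pid, tree in pages.items()}

    # Pass 1: record which pages contain each distinct subtree.
    owners = defaultdict(set)

    def collect(entry, pid):
        owners[entry[0]].add(pid)
        for kid in entry[2]:
            collect(kid, pid)

    for pid, entry in annotated.items():
        collect(entry, pid)

    # Pass 2: walk top-down and keep subtrees shared more widely than their parent.
    result = {}

    def walk(entry, parent_owners):
        digest, node, kids = entry
        mine = owners[digest]
        is_element = not isinstance(node, str)
        if is_element and len(mine) >= 2 and mine != parent_owners:
            result[digest] = (node, frozenset(mine))
        for kid in kids:
            walk(kid, mine)

    for entry in annotated.values():
        walk(entry, None)
    return result


# Hypothetical example pages: "a" and "b" share a header; "c" shares only the nav bar.
header = ("div", ("h1", "Site"), ("nav", "Home"))
pages = {
    "a": ("html", header, ("p", "Article A")),
    "b": ("html", header, ("p", "Article B")),
    "c": ("html", ("nav", "Home"), ("p", "Article C")),
}
found = find_shared_fragments(pages)
shared = {node: page_ids for node, page_ids in found.values()}
```

In this example the header is reported for pages "a" and "b", and the nav bar is reported separately because it is shared by all three pages. The `h1` element is not reported on its own, since it appears only inside the already shared header. A cache could then store each reported fragment once and assemble pages from the stored pieces.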

Document Type: Research article

DOI: 10.1260/174830107781389003

The full text article is not available for purchase.

The publisher only permits individual articles to be downloaded by subscribers.
