|

During the late 1990s,
a team of researchers led by Albert-Laszlo Barabasi started
working on an interesting question: what is the average
distance between two randomly selected web pages in the World
Wide Web? Since the World Wide Web is a huge collection of web
pages with links to each other, they wanted to measure the
average number of hops (or clicks) needed to navigate from a
randomly selected source page to a target page. Given that
mapping the complete Web is an impossible task, they started
off by writing a simple program (a web crawler) to map pages
in their university domain and proceeded to map small sections
of the Web. Through some mathematical manipulation,
researchers were able to extrapolate the results for the
estimated size of the World Wide Web (8 x 10^8 pages in the late
1990s) and concluded that two randomly chosen pages in the Web
are 19 links (or hops) apart on average. Although 19 links may
appear a bit large, this is a remarkably small number of hops
between two randomly chosen pages given the sheer size of the
World Wide Web.
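To make the idea of an "average number of hops" concrete, here is a minimal Python sketch that measures it on a tiny, made-up web using breadth-first search. The page names and the `toy_web` dictionary are hypothetical, and this is not the crawler or the extrapolation method the researchers used; it only illustrates the quantity they were measuring.

```python
from collections import deque

def hops_from(graph, source):
    """Breadth-first search: clicks needed from source to each reachable page."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, ()):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist

def average_hops(graph):
    """Mean shortest-path length over all ordered pairs that are connected."""
    total = pairs = 0
    for source in graph:
        for target, d in hops_from(graph, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else float("nan")

# Hypothetical five-page site: each key links to the pages in its list.
toy_web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["post1", "post2"],
    "post1": ["home"],
    "post2": ["post1"],
}
print(average_hops(toy_web))
```

On a graph this small the average can be computed exactly by visiting every page. For the real Web that is not possible, which is why the researchers had to work from small mapped sections and extrapolate.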
In order to obtain the above result,
researchers modelled the Web as a network of web pages (which
formed the nodes) connected by links between them. They
expected the resulting network to have a random distribution
of links. After all, there are no rules in the Web dictating
which page should link to which, and a random pattern is
naturally expected when a huge number of pages are considered.
In such a random network, the vast majority of the nodes will have
close to the average number of links, while there will be a few nodes
with either a very large or very small number of links (in
mathematical terms, we say the number of links follows a
Poisson distribution). However, the researchers were amazed to
discover that the vast majority of the web pages have very few
links pointed at them and there are a small number of "hubs"
which attract a large number of inward links. Barabasi and
his team coined the term "scale-free" for this type of
network (in mathematical terms, the distribution of the number of
links per node obeys a power law in such networks). Numerous
studies which followed this discovery have concluded that the
Internet, the infrastructure of networks on which the World
Wide Web operates, also has a scale-free structure.
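The random-network expectation can be illustrated with a short simulation. The sketch below is a toy under stated assumptions, not the researchers' model of the Web: it links every pair of 1000 nodes with probability 0.01 (both numbers are chosen only for illustration) and prints a rough histogram of the number of links per node.

```python
import random
from collections import Counter

def random_network_degrees(n, p, seed=0):
    """Link every pair of nodes with probability p; return each node's link count."""
    rng = random.Random(seed)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree

# Illustrative parameters: 1000 nodes, about 10 links per node on average.
degrees = random_network_degrees(n=1000, p=0.01)
mean = sum(degrees) / len(degrees)
histogram = Counter(degrees)
for k in sorted(histogram):
    # One '#' per five nodes that have exactly k links.
    print(f"{k:3d} links: {'#' * (histogram[k] // 5)}")
print(f"mean = {mean:.2f}, max = {max(degrees)}")
```

Almost every node ends up close to the average of about 10 links, and no node comes anywhere near being a hub. This bell-shaped pattern is what the researchers expected to see in the Web, and it is exactly what they did not find.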
But how can such organised, non-random
structures arise in systems such as the World Wide Web and the
Internet which do not undergo any centralised management or
control? Barabasi and his team have answered
this question as well. They observed two properties which
result in the formation of hubs and scale-free network
structures: growth and preferential attachment. Regarding
growth, both the Internet and the Web started off with a few
nodes and grew rapidly as more nodes were added. In such
systems, older nodes will tend to have more links than the
newer nodes simply because they had more time (and chances) to
attract inward links. According to preferential attachment, a
new node will favour linking to a hub rather than a node with
few existing links: it is a system in which the "rich get richer".
This makes sense in the Web since the majority of the pages would
have links to hubs, which are popular web pages, while few
would post a link to your personal web page, unless you are a
celebrity. But what is the significance of studying the
structure and dynamics of the World Wide Web and the Internet?
Well, let's leave that question for my next post.
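Going back to growth and preferential attachment for a moment, the two properties are simple enough to simulate. The following is a minimal sketch under my own assumptions: an undirected network, two links per new node, a three-node starting core, and the function name `grow_network` are all illustrative choices and are not taken from Barabasi's papers.

```python
import random

def grow_network(n, links_per_node=2, seed=0):
    """Grow a network one node at a time using preferential attachment."""
    rng = random.Random(seed)
    degree = [0] * n
    # Each node appears in `endpoints` once per link it has, so a uniform
    # draw from this list picks a node with probability proportional to
    # its current number of links ("rich get richer").
    endpoints = []
    # Growth needs something to attach to: start from a small, fully
    # connected core (an assumption of this sketch).
    core = links_per_node + 1
    for i in range(core):
        for j in range(i + 1, core):
            degree[i] += 1
            degree[j] += 1
            endpoints += [i, j]
    # Add the remaining nodes one by one, oldest first.
    for new in range(core, n):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for t in targets:
            degree[new] += 1
            degree[t] += 1
            endpoints += [new, t]
    return degree

degree = grow_network(5000)
ranked = sorted(degree, reverse=True)
print("five best-connected nodes:", ranked[:5])
print("median links per node:", ranked[len(ranked) // 2])
print("average links, oldest 500 nodes:", sum(degree[:500]) / 500)
print("average links, newest 500 nodes:", sum(degree[-500:]) / 500)
```

Running it should show a handful of nodes with far more links than the typical node, and older nodes holding more links on average than newer ones, which matches the two effects described above.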
(You can
download Barabasi's papers on the structure and dynamics of
the Internet and WWW from the following link:
http://www.barabasilab.com/pubs-www.php)
Hasala
Peiris (MSc.IT, CISSP) is a Doctoral Research Student and a
Sessional Academic at Curtin University, Perth, Australia. |