The Effect of Outbound Links
 
Since PageRank is based on the linking
structure of the whole web, it is inescapable that if the
inbound links of a page influence its PageRank, its outbound
links also have some impact. To illustrate the effects of
outbound links, we take a look at a simple example.

We regard a web consisting of two websites, each
having two web pages. One site consists of pages A and B, the
other consists of pages C and D. Initially, both pages of
each site solely link to each other. Each page then has a
PageRank of one. Now we add a link which
points from page A to page C. At a damping factor of 0.75, we
get the following equations for the single pages'
PageRank values:
 
PR(A) = 0.25 + 0.75 PR(B)
PR(B) = 0.25 + 0.375 PR(A)
PR(C) = 0.25 + 0.75 PR(D) + 0.375 PR(A)
PR(D) = 0.25 + 0.75 PR(C)

Solving the equations gives us the following PageRank values
for the first site:

PR(A) = 14/23
PR(B) = 11/23

We therefore get an accumulated PageRank of 25/23 for the
first site. The PageRank values of the second site are given
by

PR(C) = 35/23
PR(D) = 32/23

So, the accumulated PageRank of the second site is 67/23. The
total PageRank for both sites is 92/23 = 4. Hence, adding a
link has no effect on the total PageRank of the web.
Additionally, the PageRank benefit for one site equals the
PageRank loss of the other.
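The values of the two-site example can be checked mechanically. The following is a minimal sketch rather than part of the original derivation: it assumes the formula used throughout, PR(p) = (1 - d) + d × (sum of PR(q) / C(q) over all pages q linking to p), and solves the resulting linear system exactly with rational numbers. The function name pagerank_exact and the dictionary layout of the link graph are illustrative choices.

```python
from fractions import Fraction


def pagerank_exact(links, d):
    """Solve PR(p) = (1 - d) + d * sum(PR(q) / C(q)) exactly for every page.

    links maps each page to the list of pages it links to; every link
    target must itself be a key of links.
    """
    pages = sorted(links)
    n = len(pages)
    index = {page: i for i, page in enumerate(pages)}

    # Augmented matrix of the system (I - d * M) * PR = (1 - d).
    rows = [[Fraction(int(i == j)) for j in range(n)] + [1 - d]
            for i in range(n)]
    for source, targets in links.items():
        for target in targets:
            rows[index[target]][index[source]] -= d / len(targets)

    # Gauss-Jordan elimination on exact fractions.
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [value / rows[col][col] for value in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]

    return {page: rows[index[page]][n] for page in pages}


d = Fraction(3, 4)
# Pages A and B form the first site, C and D the second; A also links to C.
links = {"A": ["B", "C"], "B": ["A"], "C": ["D"], "D": ["C"]}
pr = pagerank_exact(links, d)

for page, value in pr.items():
    print(page, value)
print("total", sum(pr.values()))
```

Exact fractions are used here only so that the results can be compared directly with values such as 14/23; for larger graphs an iterative floating-point computation would be the more usual choice.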
 
The Actual Effect of Outbound Links
 
As has already been shown, the PageRank benefit that a closed
system of web pages gains from an additional inbound link is
given by

(d / (1-d)) × (PR(X) / C(X)),

where X is the linking page, PR(X) is its PageRank and C(X)
is the number of its outbound links. Hence, this value also
represents the PageRank loss of a formerly closed system of
web pages when a page X within this system now links to an
external page.
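As a quick plausibility check, the formula can be evaluated with the numbers from the two-site example in the previous section, where page A (two outbound links, PR(A) = 14/23 after the link was added) points to page C. This is only a sketch under those assumptions; the helper name link_transfer is hypothetical.

```python
from fractions import Fraction


def link_transfer(d, pr_x, c_x):
    """PageRank moved by one link from page X: (d / (1 - d)) * (PR(X) / C(X))."""
    return d / (1 - d) * pr_x / c_x


d = Fraction(3, 4)

# Page A has two outbound links and PR(A) = 14/23 once the link to C exists.
transfer = link_transfer(d, Fraction(14, 23), 2)

# Each site started with an accumulated PageRank of 2 (two pages of PageRank 1).
site_one_loss = 2 - (Fraction(14, 23) + Fraction(11, 23))
site_two_gain = (Fraction(35, 23) + Fraction(32, 23)) - 2

print(transfer, site_one_loss, site_two_gain)  # prints: 21/23 21/23 21/23
```

All three quantities agree, which matches the statement that the gain of the receiving site equals the loss of the linking site.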
 
The validity of the above formula requires that the page
which receives the link from the formerly closed system of
pages does not link back to that system, since the system
otherwise gains back some of the lost PageRank. Of course,
this effect may also occur when the link back does not come
directly from the page that receives the link, but from
another page which has an inbound link from that page.
However, because of the damping factor, this effect may be
disregarded if there are enough other web pages in between.
The validity of the formula also requires that the linking
site has no other external outbound links. If it has other
external outbound links, the loss of PageRank of the regarded
site diminishes and the pages already receiving a link from
that page lose PageRank accordingly.
 
Even if the actual PageRank values for the pages of an
existing web site were known, it would not be possible to
calculate to what extent an added outbound link diminishes
the PageRank of the site, since the formula presented above
refers to the status after the link has been added.
 
Intuitive Justification of the Effect of Outbound Links
 
According to the Random Surfer Model, the intuitive
justification for the loss of PageRank caused by an
additional external outbound link is that, once an external
outbound link has been added to a page, the surfer is less
likely to follow an internal link on that page. So, the
probability of the surfer reaching other pages within the
site diminishes. If those other pages of the site link back
to the page to which the external outbound link has been
added, this page's PageRank will decrease as well.
 
We can conclude that external outbound links diminish the
total PageRank of a site and probably also the PageRank of
each single page of a site. But, since links between web
sites are the foundation of PageRank and indispensable for
its functioning, there is the possibility that outbound links
have positive effects within other parts of Google's ranking
criteria. After all, relevant outbound links contribute to
the quality of a web page, and a webmaster who points to
other pages integrates their content in some way into his own
site.
 
Dangling Links
 
An important aspect of outbound links is the lack of 
them on web pages. When a web page has no outbound links, its 
PageRank cannot be distributed to other pages. Lawrence Page 
and Sergey Brin characterise links to those pages as dangling 
links.  
 
The effect of dangling links can be illustrated by a small
example website. We take a look at a site consisting of three
pages A, B and C. In our example, the pages A and B link to
each other. Additionally, page A links to page C. Page C
itself has no outbound links to other pages. At a damping
factor of 0.75, we get the following equations for the single
pages' PageRank values:
 
PR(A) = 0.25 + 0.75 PR(B)
PR(B) = 0.25 + 0.375 PR(A)
PR(C) = 0.25 + 0.375 PR(A)

Solving the equations gives us the following PageRank values:

PR(A) = 14/23
PR(B) = 11/23
PR(C) = 11/23
 
So, the accumulated PageRank of all three pages is 36/23,
which is just over half the value of 3 that we could have
expected if page C had a link to one of the other pages.
According to Page and Brin, the number of dangling links in
Google's index is fairly high. One reason for this is that
many linked pages are not indexed by Google, for example
because indexing is disallowed by a robots.txt file.
Additionally, Google now indexes several file types and not
HTML only. PDF or Word files do not really have outbound
links and, hence, dangling links could have major impacts on
PageRank.
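The three-page example can also be reproduced numerically by applying the equations over and over until the values settle. The following is a minimal sketch under that assumption; the function name pagerank_iterative, the starting value of 1 per page and the fixed number of 100 iterations are illustrative choices and not taken from Page and Brin.

```python
def pagerank_iterative(links, d=0.75, iterations=100):
    """Repeatedly apply PR(p) = (1 - d) + d * sum(PR(q) / C(q)).

    links maps each page to the list of pages it links to. A page with an
    empty list is a dangling page: it receives PageRank but passes none on.
    """
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = {page: 1 - d for page in links}
        for source, targets in links.items():
            for target in targets:
                new_pr[target] += d * pr[source] / len(targets)
        pr = new_pr
    return pr


# A and B link to each other, A also links to C, and C has no outbound links.
links = {"A": ["B", "C"], "B": ["A"], "C": []}
pr = pagerank_iterative(links)

for page, value in pr.items():
    print(page, round(value, 4))
print("total", round(sum(pr.values()), 4))
```

The total comes out at roughly 1.565 (36/23) instead of 3, because the share of PageRank that reaches page C is not handed on to any other page.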
 
In order to protect PageRank from the negative effects of
dangling links, pages without outbound links have to be
removed from the database until the PageRank values have been
computed. According to Page and Brin, the number of outbound
links on pages with dangling links is thereby normalised. As
shown in our illustration, removing one page can cause new
dangling links and, hence, removing pages has to be an
iterative process. After the PageRank calculation is
finished, PageRank can be assigned to the formerly removed
pages based on the PageRank algorithm. This requires as many
iterations as were needed to remove the pages. Regarding our
illustration, page C could be processed before page B. At
that point, page B has no PageRank yet, so page C will not
receive any either. Then page B receives PageRank from page
A, and during the second iteration page C gets its PageRank
as well.
 
 
Regarding our example website for dangling links, removing
page C from the database results in pages A and B each having
a PageRank of 1. After the calculations, page C is assigned a
PageRank of 0.25 + 0.375 PR(A) = 0.625. So, the accumulated
PageRank does not equal the number of pages, but at least the
pages which have outbound links are not harmed by the
dangling links problem.
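The removal procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not Google's actual implementation: dangling pages are stripped round by round, PageRank is computed for the remaining pages with their normalised link counts, and the removed pages are then re-added in reverse order. When a removed page is re-added, the linking page's original number of outbound links is used, which follows the arithmetic 0.25 + 0.375 PR(A) of the example. The function name and the second graph with page X are hypothetical.

```python
def pagerank_without_dangling(links, d=0.75, iterations=100):
    """PageRank with dangling pages removed first and re-added afterwards."""
    # Step 1: strip pages without outbound links until none are left.
    remaining = {page: list(targets) for page, targets in links.items()}
    removed_rounds = []
    while True:
        dangling = [page for page, targets in remaining.items() if not targets]
        if not dangling:
            break
        removed_rounds.append(dangling)
        for page in dangling:
            del remaining[page]
        # Dropping links to removed pages normalises the outbound link counts
        # and may turn further pages into dangling pages.
        remaining = {page: [t for t in targets if t in remaining]
                     for page, targets in remaining.items()}

    # Step 2: ordinary iterative PageRank on the remaining pages.
    pr = {page: 1.0 for page in remaining}
    for _ in range(iterations):
        new_pr = {page: 1 - d for page in remaining}
        for source, targets in remaining.items():
            for target in targets:
                new_pr[target] += d * pr[source] / len(targets)
        pr = new_pr

    # Step 3: re-add removed pages, last removed first, so that every linking
    # page already has a PageRank value when it is needed.
    for round_pages in reversed(removed_rounds):
        values = {}
        for page in round_pages:
            inbound = sum(pr[q] / len(links[q])
                          for q, targets in links.items()
                          if page in targets and q in pr)
            values[page] = (1 - d) + d * inbound
        pr.update(values)
    return pr


# Example site from the text: A and B link to each other, A also links to C.
pr = pagerank_without_dangling({"A": ["B", "C"], "B": ["A"], "C": []})
print(pr)

# Hypothetical chain: removing C turns B into a dangling page as well.
chain = pagerank_without_dangling(
    {"X": ["A"], "A": ["X", "B"], "B": ["C"], "C": []})
print(chain)
```

For the example site this yields a PageRank of 1 for pages A and B and 0.625 for page C. In the hypothetical chain, two removal rounds are needed, and accordingly page B is assigned its PageRank before page C.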
 
Because dangling links are removed from the database, they do
not have any negative effects on the PageRank of the rest of
the web. Since links to PDF files are dangling links, they do
not diminish the PageRank of the linking page or site. So,
PDF files can be a good means of search engine optimisation
for Google.
 
PageRank and Google are trademarks of Google Inc., Mountain
View CA, USA. PageRank is protected by US Patent 6,285,999.

The content of this document may be reproduced on the web
provided that a copyright notice is included and that there
is a straight HTML hyperlink to the corresponding page at
pr.efactory.de in direct context.