The Wikipedia Review: A forum for discussion and criticism of Wikipedia
Wikipedia Review Op-Ed Pages

> What's In The Brin That Links May Character?, And How Rank A Page Cdr PageRank Rank?
Jon Awbrey
post Wed 28th January 2009, 3:24pm
Post #1


τὰ δέ μοι παθήματα μαθήματα γέγονε
*********

Group: Moderators
Posts: 6,745
Joined: Sun 6th Apr 2008, 4:52am
From: Meat Puppet Nation
Member No.: 5,619




I've been meaning to discuss this in detail some day …

QUOTE

Sergey Brin and Lawrence Page : The Anatomy of a Large-Scale Hypertextual Web Search Engine

2.1.2. Intuitive Justification

PageRank can be thought of as a model of user behavior. We assume there is a "random surfer" who is given a web page at random and keeps clicking on links, never hitting "back" but eventually gets bored and starts on another random page. The probability that the random surfer visits a page is its PageRank. And, the d damping factor is the probability at each page the "random surfer" will get bored and request another random page. One important variation is to only add the damping factor d to a single page, or a group of pages. This allows for personalization and can make it nearly impossible to deliberately mislead the system in order to get a higher ranking. We have several other extensions to PageRank, again see (Page, 1998).

Another intuitive justification is that a page can have a high PageRank if there are many pages that point to it, or if there are some pages that point to it and have a high PageRank. Intuitively, pages that are well cited from many places around the web are worth looking at. Also, pages that have perhaps only one citation from something like the Yahoo! homepage are also generally worth looking at. If a page was not high quality, or was a broken link, it is quite likely that Yahoo's homepage would not link to it. PageRank handles both these cases and everything in between by recursively propagating weights through the link structure of the web.
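As a rough illustration of the "recursively propagating weights" idea in the passage above, here is a minimal Python sketch of PageRank by repeated propagation. It is an assumption-laden toy, not Google's implementation: the function name `pagerank`, the four-page `toy_web` graph, the value d = 0.85, the normalization that makes the ranks sum to 1, and the handling of pages with no outgoing links are all choices made for this example. Note that here d is the probability of following a link, which is how d enters the formula in the paper, even though the quoted wording describes it as the probability of getting bored.

```python
def pagerank(links, d=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the list of pages it links to.

    d is treated as the probability of following a link; with probability
    1 - d the surfer jumps to a page chosen uniformly at random.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - d) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among the pages it cites.
                share = d * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its rank over all pages.
                share = d * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: everything cites "hub", and "hub" cites everything.
toy_web = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "a"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph "hub" plays the role of the heavily cited page, and "a" ends up ahead of "b" only because it picks up one extra citation from "c".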


Jes not today …

Ja³
GlassBeadGame
post Wed 28th January 2009, 3:51pm
Post #2


Dharma Bum
*********

Group: Contributors
Posts: 7,919
Joined: Sat 17th Feb 2007, 12:55am
From: My name it means nothing. My age it means less. The country I come from is called the Mid-West.
Member No.: 981





I'm trying to understand the "damping factor." It seems to be a kind of threshold before clicks are counted, say clicks to one's own page to increase its rank, but I'm not sure. Does it subtract multiple hits from one IP? Or something else? Also, the "random surfer" probably describes someone from before search engines became popular, a kind of user who no longer exists.
Moulton
post Wed 28th January 2009, 4:01pm
Post #3


Anthropologist from Mars
*********

Group: Contributors
Posts: 10,220
Joined: Mon 29th Oct 2007, 9:56pm
From: Greater Boston
Member No.: 3,670





Say you log in fresh for the day, totally bored out of your mind, having grown tired of whatever subject you were drilling down before you fell asleep at the keyboard the night before.

You surf randomly for a while, and eventually land on something that intrigues you. You click, read, and drill down until you have had your fill.

What's the proportion of links you click on while drilling down, compared to the proportion of pages you visit "fresh" (i.e. via an initial Google search driven by your own internal agenda)?

The "damping factor" corresponds to that drill-down proportion.
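To make the drill-down picture concrete, here is a small simulation sketch of a random surfer. Everything in it is an assumption for illustration: the function name `simulate_surfer`, the four-page `toy_web`, the value d = 0.85, and the rule that a "fresh" start lands on a uniformly random page. It counts how often each page is visited and what share of steps were link clicks, which should come out close to d.

```python
import random


def simulate_surfer(links, d=0.85, steps=100_000, seed=0):
    """Simulate one random surfer on a toy web.

    `links` maps every page to the list of pages it links to (each page must
    appear as a key). With probability d the surfer clicks a random link on
    the current page; otherwise the surfer starts fresh on a random page.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pages = sorted(links)
    visits = {p: 0 for p in pages}
    clicks = 0

    page = rng.choice(pages)
    for _ in range(steps):
        visits[page] += 1
        targets = links[page]
        if targets and rng.random() < d:
            # Drill down: follow one of the links on the current page.
            page = rng.choice(targets)
            clicks += 1
        else:
            # Bored: abandon the trail and start fresh somewhere else.
            page = rng.choice(pages)

    visit_share = {p: count / steps for p, count in visits.items()}
    return visit_share, clicks / steps


# Hypothetical four-page web used only for this example.
toy_web = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "a"],
}

visit_share, click_share = simulate_surfer(toy_web)
print("share of steps that were link clicks:", round(click_share, 3))
for page, share in sorted(visit_share.items(), key=lambda item: -item[1]):
    print(f"{page}: {share:.3f}")
```

The visit shares are a Monte Carlo estimate of the "probability that the random surfer visits a page" from the quoted paper, so they only approximate what an exact calculation would give.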
GlassBeadGame
post Wed 28th January 2009, 4:29pm
Post #4







Thanks for that, Moulton. It would seem to me that length of visit, and possibly "link away and return," would be important in calculating "drill down." They would signal a degree of interest. Of course, I might go to a three-paragraph page and walk away for twenty minutes.

Today web use seems to me much more purpose-driven. What they describe seems more like using some Gopher to navigate Usenet. They really have changed things profoundly.
Jon Awbrey
post Wed 28th January 2009, 8:18pm
Post #5





If you really want a taste of just how rank Giggle's PageRank can be, try a search on Ampheck.

Ja³
EricBarbour
post Thu 29th January 2009, 10:06am
Post #6


blah
*********

Group: Regulars
Posts: 5,919
Joined: Mon 25th Feb 2008, 2:31am
Member No.: 5,066




I've got something even ranker:

Sergey has claimed since 1998 that he read all these books while in high school and at Stanford.

Possible, though I have my doubts.

(Also, can someone explain why Stanford keeps his personal website up? Bragging?)
Jon Awbrey
post Thu 29th January 2009, 11:54am
Post #7







A bit OCR (Obsessive Compulsive Reader), but not otherwise unusual.

I'm guessing that Stanford & Sons keeps a lot of old junk around as a kind of technophile* chronofile.

Folks who want a taste of what a link matrix looks like on a smaller scale might have a look at this page-ranking application that someone created for the entries in PlanetMath.

Myyn.Org's PlanetMath Browser

Ja³

* No, it's not what some of you are thinking.
GlassBeadGame
post Thu 29th January 2009, 1:39pm
Post #8







Note that Brin had read both Shakespeare's Othello and Othello, The Moor of Venice. He also read both Vonnegut's Slaughterhouse Five and Slaughterhouse-Five. Of course, these are just his favorite books; there were probably hundreds of books with slight variations of the title of the same work that he read but didn't list. It could be that he showed, from an early age, a precocious ability for generating lists with no concern for underlying meaning.