What's In The Brin That Links May Character?, And How Rank A Page Cdr PageRank Rank?

Jon Awbrey
Wed 28th January 2009, 3:24pm

τὰ δέ μοι παθήματα μαθήματα γέγονε
        
Group: Moderators
Posts: 6,745
Joined: Sun 6th Apr 2008, 4:52am
From: Meat Puppet Nation
Member No.: 5,619

I've been meaning to discuss this in detail some day …

QUOTE Sergey Brin and Lawrence Page: The Anatomy of a Large-Scale Hypertextual Web Search Engine
2.1.2. Intuitive Justification

PageRank can be thought of as a model of user behavior. We assume there is a "random surfer" who is given a web page at random and keeps clicking on links, never hitting "back" but eventually gets bored and starts on another random page. The probability that the random surfer visits a page is its PageRank. And, the d damping factor is the probability at each page the "random surfer" will get bored and request another random page. One important variation is to only add the damping factor d to a single page, or a group of pages. This allows for personalization and can make it nearly impossible to deliberately mislead the system in order to get a higher ranking. We have several other extensions to PageRank, again see (Page, 1998).

Another intuitive justification is that a page can have a high PageRank if there are many pages that point to it, or if there are some pages that point to it and have a high PageRank. Intuitively, pages that are well cited from many places around the web are worth looking at. Also, pages that have perhaps only one citation from something like the Yahoo! homepage are also generally worth looking at. If a page was not high quality, or was a broken link, it is quite likely that Yahoo's homepage would not link to it. PageRank handles both these cases and everything in between by recursively propagating weights through the link structure of the web.

Jes not today …

Ja³
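
For readers who want to see the "recursively propagating weights" idea in motion, here is a minimal sketch, not Google's implementation. It assumes the formula given elsewhere in the same paper, PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), with d = 0.85. Note that in that formula d behaves as the probability of following a link, so the surfer gets bored with probability 1 - d; the code takes that reading as an assumption. The four-page web and the names `pagerank` and `toy_web` are made up for illustration.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over the pages T linking to A.

    links maps each page to the list of pages it links to.
    d is treated as the probability of following a link (an assumption, see above).
    """
    pages = sorted(links)
    rank = {page: 1.0 for page in pages}  # arbitrary starting weights
    for _ in range(iterations):
        # Every page gets the baseline (1 - d) share from bored surfers.
        new_rank = {page: 1.0 - d for page in pages}
        for source, targets in links.items():
            if not targets:
                continue  # a page with no out-links passes nothing on in this sketch
            share = d * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: every other page links to "hub"; nothing links to "c".
toy_web = {
    "hub": ["a"],
    "a": ["hub", "b"],
    "b": ["hub"],
    "c": ["hub", "a"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web the page nobody links to ends up with only the baseline weight 1 - d, while the heavily cited page comes out on top, which is the "many citations, or a few citations from important pages" intuition in miniature.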

GlassBeadGame
Wed 28th January 2009, 3:51pm

Dharma Bum
        
Group: Contributors
Posts: 7,919
Joined: Sat 17th Feb 2007, 12:55am
From: My name it means nothing. My age it means less. The country I come from is called the Mid-West.
Member No.: 981

QUOTE(Jon Awbrey @ Wed 28th January 2009, 10:24am)
I've been meaning to discuss this in detail some day …

QUOTE Sergey Brin and Lawrence Page: The Anatomy of a Large-Scale Hypertextual Web Search Engine
2.1.2. Intuitive Justification

PageRank can be thought of as a model of user behavior. We assume there is a "random surfer" who is given a web page at random and keeps clicking on links, never hitting "back" but eventually gets bored and starts on another random page. The probability that the random surfer visits a page is its PageRank. And, the d damping factor is the probability at each page the "random surfer" will get bored and request another random page. One important variation is to only add the damping factor d to a single page, or a group of pages. This allows for personalization and can make it nearly impossible to deliberately mislead the system in order to get a higher ranking. We have several other extensions to PageRank, again see (Page, 1998).

Another intuitive justification is that a page can have a high PageRank if there are many pages that point to it, or if there are some pages that point to it and have a high PageRank. Intuitively, pages that are well cited from many places around the web are worth looking at. Also, pages that have perhaps only one citation from something like the Yahoo! homepage are also generally worth looking at. If a page was not high quality, or was a broken link, it is quite likely that Yahoo's homepage would not link to it. PageRank handles both these cases and everything in between by recursively propagating weights through the link structure of the web.

Jes not today …

Ja³

I'm trying to understand the "damping factor." It seems to be a kind of threshold before clicks are counted, say clicks to one's own page to increase its rank, but I'm not sure. Does it subtract multiple hits from one IP? Or something else? Also, the "random surfer" probably described someone from before search engines became popular, a kind of user who no longer exists.

Moulton
Wed 28th January 2009, 4:01pm

Anthropologist from Mars
        
Group: Contributors
Posts: 10,220
Joined: Mon 29th Oct 2007, 9:56pm
From: Greater Boston
Member No.: 3,670

QUOTE(GlassBeadGame @ Wed 28th January 2009, 10:51am)
I'm trying to understand the "damping factor." It seems to be a kind of threshold before clicks are counted, say clicks to one's own page to increase its rank, but I'm not sure. Does it subtract multiple hits from one IP? Or something else? Also, the "random surfer" probably described someone from before search engines became popular, a kind of user who no longer exists.

Say you log in fresh for the day, totally bored out of your mind, having grown tired of whatever subject you were drilling down before you fell asleep at the keyboard the night before. You surf randomly for a while, and eventually land on something that intrigues you. You click, read, and drill down until you have had your fill.

What's the proportion of links you click on while drilling down, compared to the proportion of pages you visit "fresh" (i.e. via an initial Google search driven by your own internal agenda)? The "damping factor" corresponds to that drill-down proportion.
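
The drill-down reading can be checked with a small simulation. This is only a sketch under stated assumptions: d is taken as the probability of clicking a link rather than starting fresh, a fresh start picks a page uniformly at random, and the four-page web and the name `surf` are invented for the example.

```python
import random


def surf(links, d=0.85, steps=100_000, seed=0):
    """Simulate a random surfer.

    Returns (fraction of moves that were link clicks, visit count per page).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pages = sorted(links)
    visits = {page: 0 for page in pages}
    followed = 0
    current = rng.choice(pages)  # the very first page is a fresh start
    for _ in range(steps):
        visits[current] += 1
        if links[current] and rng.random() < d:
            current = rng.choice(links[current])  # keep drilling down
            followed += 1
        else:
            current = rng.choice(pages)  # bored: start fresh on a random page
    return followed / steps, visits


# Hypothetical four-page web, made up for illustration.
toy_web = {
    "hub": ["a"],
    "a": ["hub", "b"],
    "b": ["hub"],
    "c": ["hub", "a"],
}

drill_down_fraction, visits = surf(toy_web)
print(f"share of moves that were link clicks: {drill_down_fraction:.3f}")
for page in sorted(visits, key=visits.get, reverse=True):
    print(f"{page}: {visits[page]} visits")
```

The share of moves that are link clicks should hover near d, and over a long run the visit counts should be roughly proportional to the PageRank scores for the same web. Nothing here looks at IP addresses or repeat hits; the damping factor is a property of the surfer model, not a filter on traffic.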

GlassBeadGame
Wed 28th January 2009, 4:29pm

QUOTE(Moulton @ Wed 28th January 2009, 11:01am)
QUOTE(GlassBeadGame @ Wed 28th January 2009, 10:51am)
I'm trying to understand the "damping factor." It seems to be a kind of threshold before clicks are counted, say clicks to one's own page to increase its rank, but I'm not sure. Does it subtract multiple hits from one IP? Or something else? Also, the "random surfer" probably described someone from before search engines became popular, a kind of user who no longer exists.

Say you log in fresh for the day, totally bored out of your mind, having grown tired of whatever subject you were drilling down before you fell asleep at the keyboard the night before. You surf randomly for a while, and eventually land on something that intrigues you. You click, read, and drill down until you have had your fill.

What's the proportion of links you click on while drilling down, compared to the proportion of pages you visit "fresh" (i.e. via an initial Google search driven by your own internal agenda)? The "damping factor" corresponds to that drill-down proportion.

Thanks for that, Moulton. It would seem to me that length of visit, and possibly "link away and return," would be important in calculating "drill down." They would signal a degree of interest. Of course, I might go to a three-paragraph page and walk away for twenty minutes.

Today web use seems to me much more purpose-driven. What they describe seems more like using some Gopher to navigate Usenet. They really have changed things profoundly.

Jon Awbrey
Thu 29th January 2009, 11:54am

QUOTE(EricBarbour @ Thu 29th January 2009, 5:06am)
I've got something even ranker: Sergey has claimed since 1998 that he read all these books while in high school and Stanford. Possible, though I have my doubts. (Also, can someone explain why Stanford keeps his personal website up? Bragging?)

A bit OCR (Obsessive Compulsive Reader), but not otherwise unusual. I'm guessing that Stanford & Sons keeps a lot of old junk around as a kind of technophile* chronofile.

Folks who want a taste of what a link matrix looks like on a smaller scale might have a look at this page-ranking application that someone created for the entries in PlanetMath.

Myyn.Org's PlanetMath Browser

Ja³

* No, it's not what some of you are thinking.
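
For a sense of what a link matrix looks like at the smallest possible scale, here is a hedged sketch. It is not taken from the PlanetMath application; the four-page web and the name `link_matrix` are invented, and the layout (one column per linking page, each column summing to 1) is just one common convention.

```python
def link_matrix(links):
    """Return (pages, M), where M[i][j] is the share of page j's links that point to page i."""
    pages = sorted(links)
    index = {page: i for i, page in enumerate(pages)}
    n = len(pages)
    matrix = [[0.0] * n for _ in range(n)]
    for source, targets in links.items():
        for target in targets:
            # Each out-link of `source` carries an equal share of its vote.
            matrix[index[target]][index[source]] += 1.0 / len(targets)
    return pages, matrix


# Hypothetical four-page web, made up for illustration.
toy_web = {
    "hub": ["a"],
    "a": ["hub", "b"],
    "b": ["hub"],
    "c": ["hub", "a"],
}

pages, matrix = link_matrix(toy_web)
print("      " + "  ".join(f"{page:>4}" for page in pages))
for page, row in zip(pages, matrix):
    print(f"{page:>4}  " + "  ".join(f"{value:4.2f}" for value in row))
```

Reading down a column shows where a page sends its votes; reading across a row shows who votes for a page. The row for a page nobody links to is all zeros.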

GlassBeadGame
Thu 29th January 2009, 1:39pm

QUOTE(Jon Awbrey @ Thu 29th January 2009, 6:54am)
QUOTE(EricBarbour @ Thu 29th January 2009, 5:06am)
I've got something even ranker: Sergey has claimed since 1998 that he read all these books while in high school and Stanford. Possible, though I have my doubts. (Also, can someone explain why Stanford keeps his personal website up? Bragging?)

A bit OCR (Obsessive Compulsive Reader), but not otherwise unusual. I'm guessing that Stanford & Sons keeps a lot of old junk around as a kind of technophile* chronofile.

Folks who want a taste of what a link matrix looks like on a smaller scale might have a look at this page-ranking application that someone created for the entries in PlanetMath.

Myyn.Org's PlanetMath Browser

Ja³

* No, it's not what some of you are thinking.

Note that Brin had read both Shakespeare's Othello and Othello, The Moor of Venice. He also read both Vonnegut's Slaughterhouse Five and Slaughterhouse-Five. Of course, these are just his favorite books; there were probably hundreds of books with slight variations of the title of the same work that he read but didn't list. It could be that he showed, from an early age, a precocious ability for generating lists with no concern for underlying meaning.