|
Content contributors, statistical analysis |
|
|
Peter Damian |
|
I have as much free time as a Wikipedia admin!
Group: Regulars
Posts: 4,400
Joined:
Member No.: 4,212
|
My blog post for today, http://ocham.blogspot.com/2011/10/repetiti...-wikipedia.html, asks whether there are statistically measurable properties that distinguish 'content contributors' from wiki-gnomes. Conclusion: the statistical difference is strongly indicative of a real difference, discussed in detail on the blog.
Remaining question: why do content contributors remain on the project, given that they have a lower status than those who perform repetitive and tedious work? Easily-learned repetitive labour is nearly always paid less in real life than labour which requires either specialised learning, or some innate but scarce skill. The simple reason for this is supply and demand. Rare or difficult-to-acquire skills are by definition in short supply, and will attract a higher price than common, easily acquired skills (at least, to my simple mind - I don't know any economics).
So why is the situation apparently reversed on Wikipedia? The statistics suggest that the majority of administrators use low-value skills such as vandal reversion, template adding, linking to the Estonian Wikipedia etc. Yet their status on Wikipedia is high, whereas that of 'content contributors' is low.
|
|
|
|
|
|
Replies
Kelly Martin |
|
Bring back the guttersnipes!
Group: Regulars
Posts: 3,270
Joined:
From: EN61bw
Member No.: 6,696
|
The problem I have with the proposed statistical "rules" that have been presented in this discussion is that they're all ad hoc, rather than empirical. That is, instead of taking a sample of edits or editors, categorizing them by inspection, and then doing an analysis of variance (or some other regression analysis) to identify metrics that are correlated with the already-determined categorizations, you instead identify metrics that you have a priori decided ought to correspond with categorizations. That's methodologically bankrupt; a decisional rule that uses a metric as proxy to categorize members of a population has to be empirically justified, and not just backdoored in by handwaving.
And it's not enough to generate a statistic and then look to see if the extremes fit your hypothesis (e.g. Peter's post giving the "top scorers" on some metric which I think is edits per page); such an analysis is vulnerable to confirmation bias. You need to look at a broad sample from the entire population, not just the three-sigma tail, if you want an actual predictive rule.
From where I sit the "statistical" evidence I've seen posted in this thread ranges from inadequate to farcical. Let's take radek's four-way categorization. Not hard to test it: Take a sample of about 50 editors, categorize them by inspection (not of their statistics, but of their apparent behavior based on examining their edits) into the categories provided. Then generate the statistics radek proposes, and run the numbers to see if the metrics really do predict the categorization, and with what degree of certainty. Until you actually do this, you're just pissing into the wind.
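The check described above (label a sample by inspection, apply the metric-based rule, then see how well the two agree) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `agreement` and the example labels are hypothetical, and Cohen's kappa is swapped in here as one common chance-corrected agreement measure, not something anyone in the thread proposed.

```python
from collections import Counter


def agreement(hand_labels, rule_labels):
    """Compare by-inspection labels with rule-based labels.

    Returns (confusion counts, observed agreement, Cohen's kappa).
    Both inputs are equal-length lists of category names.
    """
    if len(hand_labels) != len(rule_labels) or not hand_labels:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(hand_labels)
    # Count each (hand label, rule label) pair.
    confusion = Counter(zip(hand_labels, rule_labels))
    # Fraction of editors on which the two labelings agree.
    observed = sum(c for (h, r), c in confusion.items() if h == r) / n
    # Agreement expected by chance, from the marginal label frequencies.
    hand_freq = Counter(hand_labels)
    rule_freq = Counter(rule_labels)
    expected = sum(hand_freq[k] * rule_freq[k] for k in hand_freq) / (n * n)
    # Kappa rescales observed agreement so that chance level is 0.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    return confusion, observed, kappa


# Hypothetical four-editor example, purely for illustration.
hand = ["creator", "creator", "gnome", "gnome"]
rule = ["creator", "gnome", "gnome", "gnome"]
confusion, observed, kappa = agreement(hand, rule)
```

A kappa near zero would mean the rule does no better than chance against the by-inspection labels, however plausible the rule looks on the extreme cases.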
(This is on my mind at the moment because I've been reading some of the materials related to the dual-polarization radar that the NWS just put up here in Chicago. They've done a lot of research to try to come up with rules to translate the various additional metrics the new radar offers into actionable information such as "hail", "snow", "freezing rain", and "graupel". While they have some concepts of what they think will happen, the actual methodology used in the field is based on collecting the metrics and cross-correlating it with "ground truth" reports of what is actually going on in the field. They're doing it right, which is why they can predict the weather, and you can't.)
|
|
|
|
Kelly Martin |
|
Bring back the guttersnipes!
Group: Regulars
Posts: 3,270
Joined:
From: EN61bw
Member No.: 6,696
|
QUOTE(Peter Damian @ Mon 31st October 2011, 3:57pm) QUOTE(Kelly Martin @ Mon 31st October 2011, 8:49pm) The problem I have with the proposed statistical "rules" that have been presented in this discussion is that they're all ad hoc, rather than empirical.
I read no further than that sentence, as it is clear you have no idea what you are talking about. Radek clearly does.
Until I see a t-test or F-test score, I'm going to have to assume that you have no idea what you're talking about. Radek is at least making an effort. I'd like to see the actual analysis run; I rather doubt that the two axes in his proposal are truly orthogonal, for example. I suppose I should actually break down and look at the data set and see what, if anything, can be sucked out of it.
In any case, I've been around long enough to distrust sloppy stats. It's amazing how often people's intuitions are wrong about statistical measures, especially in populations that exhibit markedly unbalanced distributions.
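For readers wondering what a "t-test score" would look like here, the following is a minimal sketch, assuming two groups of editors have already been labeled by inspection and each editor has an edits-per-page (epp) value. The function name `welch_t` and the sample numbers are hypothetical. Welch's unequal-variance form is used because nothing in the thread suggests the groups have equal variance; for the markedly skewed distributions mentioned above, a rank-based test may well be the safer choice.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each sample needs at least two values")
    # Squared standard error contributed by each sample.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    if va + vb == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up epp values for two hand-labeled groups, for illustration only.
t, df = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```

The statistic still has to be compared against a t distribution with `df` degrees of freedom to get a p-value; that step is left out of this sketch.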
|
|
|
|
radek |
|
Über Member
Group: Regulars
Posts: 699
Joined:
Member No.: 15,651
|
QUOTE(Kelly Martin @ Mon 31st October 2011, 4:08pm) QUOTE(Peter Damian @ Mon 31st October 2011, 3:57pm) QUOTE(Kelly Martin @ Mon 31st October 2011, 8:49pm) The problem I have with the proposed statistical "rules" that have been presented in this discussion is that they're all ad hoc, rather than empirical.
I read no further than that sentence, as it is clear you have no idea what you are talking about. Radek clearly does.
Until I see a t-test or F-test score, I'm going to have to assume that you have no idea what you're talking about. Radek is at least making an effort. I'd like to see the actual analysis run; I rather doubt that the two axes in his proposal are truly orthogonal, for example. I suppose I should actually break down and look at the data set and see what, if anything, can be sucked out of it. In any case, I've been around long enough to distrust sloppy stats. It's amazing how often people's intuitions are wrong about statistical measures, especially in populations that exhibit markedly unbalanced distributions.
I think that what Kelly is talking about above is something like External Validity (there seem to be some other criticisms mixed in as well). One way to do it is to somehow generate a list of randomly selected editors, then pass this list out to people familiar with Wikipedia and ask them to categorize these people according to the criteria above. Then see to what extent the subjective categorizations match up with categorizations based on epp and % articles. This wouldn't be totally ideal, as people can have quite skewed and biased notions of themselves and others, in addition to having widely different definitions (a clear example is that guy calling Dr. Blofeld a "content creator" in that thread).
Another way would be to first define what a "gnomish" edit is, what a "content creating" edit is, what a "drama queen" post is etc. Then, with these pre-set definitions in hand, go out and get that list of randomly selected editors and, again, see if it matches up. This would be way too much work. (and in fact I'm somewhat ok with just DEFINING high % low epp editors as "Wiki gnomes" and high % high epp editors as "Content creators". Most of the trouble is with the low % folks)
Hmmm, so you want a t stat or an F test.
One thing I could do is to see if epp or % articles predict admin status (the logit or probit regression I mentioned before). Two problems would be the lack of randomness I mentioned above, and also that ideally we'd want to have the epp and % articles from BEFORE a person became an admin, so that we get the causality right. I don't think there's data on that, though I might email soxred and axe him 'bout it.
QUOTE(Kelly Martin @ Mon 31st October 2011, 4:19pm) QUOTE(radek @ Mon 31st October 2011, 3:56pm)
I'm actually sort of doing this. There are two difficulties however. First is how to sample these 50 editors. I can just pull people off the top of my head or what have you, but I'm wary of some kind of bias - basically, I'm not sure how to randomly select these 50 people (this isn't a problem - at least to first approx - with articles, since we have the Random Article feature). The second, as already mentioned, is that for the low epp editors there's no way to distinguish "Posting a Lot at AN/I" from "Running Featured Article Reviews" because soxred counts both as edits to WP. The only way I can think of separating it out is by manually looking at the last 1000 or so edits of a particular editor and counting up the proportion of times they posted to ANI (or similar). This is doable but time consuming.
Overall though, I'm not sure if even then I'd call this "scientific" - it's more like those "Political compass" tests if anything. One thing which WOULD BE interesting is if somehow I could get this data on ALL editors (say, with more than 1000 edits) and see which "cell" (or corner of the scatter plot) is "saturated", which one is "over saturated" and which one comes up empty.
(and if you look at that scatterplot above, the 4-way categorization does correctly predict for the 5 people I labeled on there. Malleus is regarded as "content creator". Dr.Blofeld is a "wiki gnome" (under this definition of gnome). Etc. But that's still a small and non-random sample so while encouraging it's not serious evidence at this point)
It looks like the soxred data generates around two dozen metrics per user, some of which are obviously interdependent (as the percentages necessarily add to 100%, so there's at least one degree of freedom eaten there). We can get random users by sampling the "All Users" list, but the problem with that is that most of them will be extremely low (that is, zero) edit count users; not terribly useful. However, the filtering process could be automated using the exposed API (http://en.wikipedia.org/w/api.php), and that API could also be used to automate gathering the "how much does this editor post to ANI" statistics you showed an interest in (although the API throttle might make that a slow process). So it's not unattainable, not in the least.
I have no idea how to do any of that.
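As a rough idea of what the automation suggested above might involve, here is a sketch that builds an "All Users" query for the API endpoint quoted in the thread, filters a response by edit count, and computes an ANI share from a list of contribution page titles. The helper names are hypothetical. It assumes the MediaWiki `list=allusers` module with `auprop=editcount` and `auwitheditsonly`, and that the titles would be collected separately (for example via `list=usercontribs`). No network request is made; the response shape used by the filter is an assumption based on the API's usual JSON layout.

```python
from urllib.parse import urlencode

API = "http://en.wikipedia.org/w/api.php"  # endpoint quoted in the thread


def allusers_url(limit=500, start_from=None):
    """Build a query URL for one batch of users with edit counts (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "allusers",
        "auprop": "editcount",
        "auwitheditsonly": 1,  # skip accounts that have never edited
        "aulimit": limit,
        "format": "json",
    }
    if start_from:
        params["aufrom"] = start_from  # continue from this user name
    return API + "?" + urlencode(params)


def filter_by_editcount(response, min_edits=1000):
    """Keep names of users with more than min_edits edits from a parsed response."""
    users = response.get("query", {}).get("allusers", [])
    return [u["name"] for u in users if u.get("editcount", 0) > min_edits]


def ani_share(contrib_titles,
              prefix="Wikipedia:Administrators' noticeboard/Incidents"):
    """Fraction of an editor's contributions made to pages under the given prefix."""
    if not contrib_titles:
        return 0.0
    hits = sum(1 for title in contrib_titles if title.startswith(prefix))
    return hits / len(contrib_titles)
```

Note that the user list comes back in alphabetical order, so a random sample still has to be drawn from the filtered names afterwards, and the 1000-edit cutoff is only the figure floated earlier in the thread.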
|
|
|
|
Malleus |
|
Fat Cat
Group: Contributors
Posts: 1,682
Joined:
From: United Kingdom
Member No.: 8,716
|
QUOTE(Peter Damian @ Mon 31st October 2011, 9:34pm) QUOTE(radek @ Mon 31st October 2011, 9:23pm)
(and in fact I'm somewhat ok with just DEFINING high % low epp editors as "Wiki gnomes" and high % high epp editors as "Content creators". Most of the trouble is with the low % folks)
That would actually be perfect. We start with the behavioural assumption first. Anyone who is editing 3 times a minute or more on different articles is unlikely to be contributing what we call 'content'. That's behind our whole idea of what 'content' is. Namely, stuff you have to study the whole article carefully in order to add. I think the harder one is the high epp. E.g. FT2 famously spends a huge amount of time editing and re-editing the same sentence, sometimes 100 edits just for one paragraph. But it still reads like the long-winded verbose nonsense that it was in the first place. But even there, does it matter? Let's just define 'content' as what is added by high epp'ers. Then we have the logical deduction that a very high proportion of admins are not content-producers. What is actually much more difficult is choosing another population to compare with. Non-admins are too large. Anything else risks selection bias.
What about those users like me who failed at RfA?
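The definitional approach quoted above amounts to a simple two-threshold rule on edits per page (epp) and the percentage of edits made to articles. A minimal sketch follows. The function name and both cutoff values are hypothetical placeholders, since the thread never fixes numeric thresholds, and the two low-% quadrants are left unnamed because the thread does not settle what to call them.

```python
def quadrant(epp, pct_article, epp_cut=2.0, pct_cut=50.0):
    """Place an editor in one of four cells by epp and % of edits to articles.

    epp_cut and pct_cut are illustrative assumptions, not values from the thread.
    """
    high_pct = pct_article >= pct_cut
    high_epp = epp >= epp_cut
    if high_pct and high_epp:
        return "content creator"
    if high_pct:
        return "wiki gnome"
    # The thread leaves the low-% cells undefined, so only describe them.
    return "low % articles, high epp" if high_epp else "low % articles, low epp"


# Example with made-up numbers.
label = quadrant(epp=5.0, pct_article=80.0)
```

Because the labels are true by definition under this rule, any claim built on them (for instance about admins) is only as good as the choice of cutoffs.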
|
|
|
|
Ceoil |
|
Junior Member
Group: Contributors
Posts: 56
Joined:
Member No.: 8,131
|
QUOTE(Malleus @ Mon 31st October 2011, 10:38pm) QUOTE(Peter Damian @ Mon 31st October 2011, 9:34pm) QUOTE(radek @ Mon 31st October 2011, 9:23pm)
(and in fact I'm somewhat ok with just DEFINING high % low epp editors as "Wiki gnomes" and high % high epp editors as "Content creators". Most of the trouble is with the low % folks)
That would actually be perfect. We start with the behavioural assumption first. Anyone who is editing 3 times a minute or more on different articles is unlikely to be contributing what we call 'content'. That's behind our whole idea of what 'content' is. Namely, stuff you have to study the whole article carefully in order to add. I think the harder one is the high epp. E.g. FT2 famously spends a huge amount of time editing and re-editing the same sentence, sometimes 100 edits just for one paragraph. But it still reads like the long-winded verbose nonsense that it was in the first place. But even there, does it matter? Let's just define 'content' as what is added by high epp'ers. Then we have the logical deduction that a very high proportion of admins are not content-producers. What is actually much more difficult is choosing another population to compare with. Non-admins are too large. Anything else risks selection bias.
What about those users like me who failed at RfA?
Your RfA was not so much a failure as an assassination. I'm sure you're savvy enough to realise the orchestration.
|
|
|
|