Milton Roe
Tue 27th October 2009, 8:49pm
QUOTE(Somey @ Tue 27th October 2009, 12:41pm)

Meanwhile, the idea that people who want "maintainability" are really "deletionists" in disguise is probably valid in many cases, but let's face it, that's only because AfD's are often the only feasible avenue for getting rid of anything. Wikipedia doesn't do mass-deletions, hardly ever does mass page-protections, removal of significant amounts of content within an article is usually reverted (no matter how obscene, politically damaging, culturally-insensitive, etc. it is), and the "notability" standards are ridiculously low compared to a real published encyclopedia. So what else are you gonna do?
You mean if you were scientific and honest, and had integrity?
It's not like some of these questions could not be addressed epidemiologically. For one thing, there's a category "living people" which has 400,000 entries in a list,
http://en.wikipedia.org/wiki/Category:Living_people
As we've noted, these are still-breathing people who at any time were major and minor sports figures (anybody who ever played cricket at the national level anywhere on earth), major and minor entertainers (including actors who have had only bit parts, DJs, and the like), major and minor politicians (Uganda, etc.). Major and minor business figures. Major and minor royalty. Basically if you've ever stood in front of an audience in your life and done anything that was reported in the press anywhere, that's good enough to make you notable. Never mind that this describes most people of normal intelligence who have reached the age of 40.
What we'd like to know is: who watches these bios, including the stubs? The problem with the page-watcher tool is it has a cut-off of "below 30 watchers":
http://toolserver.org/~mzmcbride/cgi-bin/watcher.py
Otherwise we could use it to find pages not watched by anybody (or watched by only one or two people, who are probably inactive). Gee, WP is constructed to hide its own vulnerabilities.

Okay, so we need to fix this to see how bad the problem is. Some bot needs to run this tool and get a real number for that entire 400,000 group of articles with this cat tag.
You can also use the "pageviews" tool on the people in cat:living people to see how many pageviews each one gets.
http://stats.grok.se/
When you do, you find that there are plenty of BLPs which haven't been seen (pulled up) for months. But this tells you nothing about how many active people are watching them on their watchlist.
Okay, so now what?
Well, a scientist would then do an active intervention-- a stress test. We'd use a compuuuuter to add a tag line to some fraction of all BLPs, which says, in effect, "THIS HERE IS A TAG TO SEE HOW CAREFULLY THIS PAGE IS WATCHED. WHEN YOU SEE IT, PLEASE REVERT IT." The same program collects statistics to see how fast all these changes are fixed. The line has to be something that Cluebot and the like do not "see", and you have to make sure of all this, so the reversion bots don't foul up your stats (since it's quite possible for vandals to spoof them).
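The bookkeeping side of such a stress test can be sketched in a few lines of Python. This is only an illustration: the function names (pick_sample, hours_to_revert, median_hours) and the fixed seed are my own assumptions, and it deliberately does not touch any wiki. It only shows how you'd choose the fraction of pages at random and time the reversions once you had the timestamps.

```python
import random
from datetime import datetime
from statistics import median

def pick_sample(titles, fraction, seed=2009):
    """Choose a random fraction of page titles (hypothetical helper).

    A fixed seed makes the choice reproducible, so anyone can check
    afterwards which pages were in the test.
    """
    rng = random.Random(seed)
    k = max(1, round(len(titles) * fraction))
    return rng.sample(titles, k)

def hours_to_revert(tagged_at, reverted_at):
    """Hours between adding the tag and its removal; None if never removed."""
    if reverted_at is None:
        return None
    return (reverted_at - tagged_at).total_seconds() / 3600

def median_hours(delays):
    """Median time-to-revert over the pages that were actually fixed."""
    fixed = [d for d in delays if d is not None]
    return median(fixed) if fixed else None
```

For example, hours_to_revert(datetime(2009, 10, 27, 12, 0), datetime(2009, 10, 27, 18, 30)) gives 6.5. Pages that are never fixed stay as None rather than being dropped, since those are exactly the ones we care about.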
This would give you immediate feedback on how well BLPs are protected against vandalism. It could be repeated a number of times (especially if you've only done 4000 BLPs at a time, so that the spread between runs gives you the standard statistical error) to find out how "true" and reproducible your figure is.
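To put a rough number on that "standard statistical error": if you treat each tagged page as an independent yes/no trial (fixed or not fixed by some deadline), the usual binomial standard error applies. A small sketch, where the function name and the example counts are made up for illustration and are not results from any real run:

```python
import math

def proportion_with_error(fixed, total):
    """Fraction of pages fixed, with its binomial standard error.

    Assumes the pages were sampled at random and behave independently,
    which is an idealization.
    """
    p = fixed / total
    se = math.sqrt(p * (1 - p) / total)
    return p, se

# Hypothetical counts: 2000 of 4000 tagged pages fixed by the deadline.
p, se = proportion_with_error(2000, 4000)
print(f"fixed: {p:.1%} +/- {se:.1%}")
```

With 4000 pages the error on a figure near 50% is under one percentage point, so a single batch of that size already pins the number down fairly well; repeating the run mostly tells you whether it drifts over time.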
Now, here's the hard part. No matter what your results are, some fraction of Wiki-inclusion warriors will say (after the fact) that they're "good enough." So, in order to prevent this, we need a PRIOR debate on how bad such results need to be, to be "unacceptable." For that to happen, the BLP inclusionists have to commit themselves beforehand to some number (like 50% fixed within 24 hours, and no more than 10% left after 3 days, or something). This weeds out the people who obviously will not pick a line of unacceptability, since they intend to accept anything. This smokes those people out.
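The point of committing beforehand is that the pass/fail rule can be written down before any data exists. A sketch using the example numbers above (50% within 24 hours, no more than 10% left after 3 days); the function name, the argument names and the sample delays are hypothetical:

```python
def evaluate(revert_hours, min_fixed_24h=0.50, max_left_72h=0.10):
    """Check results against thresholds agreed on before the test.

    revert_hours: one entry per tagged page, giving hours until the tag
    was removed, or None if it was never removed.
    """
    n = len(revert_hours)
    fixed_24h = sum(1 for h in revert_hours if h is not None and h <= 24) / n
    left_72h = sum(1 for h in revert_hours if h is None or h > 72) / n
    return {
        "fixed_24h": fixed_24h,
        "left_72h": left_72h,
        "acceptable": fixed_24h >= min_fixed_24h and left_72h <= max_left_72h,
    }

# Made-up delays for four pages: two fixed the same day, one after
# 30 hours, one never fixed.
print(evaluate([1, 5, 30, None]))
```

Here half the pages are fixed within a day, but a quarter are still tagged after three days, so the batch fails the pre-agreed line. The thresholds are arguments with defaults precisely so they get fixed in writing before the run, not tuned afterwards.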
Even that is instructive.
Once the people who really don't care have revealed themselves, and everybody ELSE has committed themselves to some quality-level, the line may be drawn. THEN we run the stress test. No ex post facto positions. Science is about making a priori stands and predictions. Place your bets on the table and nobody touches them after the roulette wheel of nature starts.
Now, all this is NOT a difficult thing to do, leaving aside the social conflict. Any developer with access to the MediaWiki software could probably code it up in a couple of days. But the important part is not to run the test first and have the argument later. You must have the argument first and then run the test. That's how good science works, and it's always shocking.
My guess is that the social resistance would far outweigh the technical difficulty of the project. Else it would have been done already.