I just wanted to suggest that we do more to weed out fake usernames: those meant to be used as read-only accounts (to get access to the pit and other places that are now, or may in the future be, available only to members), and those used for write access but only to post things that are all but trolling. I know the blatant troll accounts are blocked immediately, as I have seen happen to admins who tried to infiltrate this site just to spam us. But over IRC (#wikipedia on irc.freenode.net, for those of you who are interested) I heard a discussion about creating fake usernames specifically to spam this site and, if nothing else, to crowd the userlist with useless names with no posts.
It's already being dealt with. And who are you, anyway?
Fair enough, but you do sound like you have been a long-time lurker here in the first place.
We do tend to keep an eye out for suspicious accounts. On average, I tend to suspend anywhere between two and five accounts a week. Obvious trolls are usually picked out with ease, as are accounts created with temporary email addresses (a la mailinator.com). I may eventually go through the member list and suspend accounts older than two weeks with no posts, but I don't see the need at this point.
Legal action will be taken against those who deliberately disrupt this website through spam, denial-of-service attacks, or otherwise.
I'm sure Daniel Brandt would love to help us match up IPs to names (although I can probably do most or all of it myself)
Deliberate vandalism to a website (forum or not) is punishable under international law and/or United States law, including Title 18 of the U.S. Code, which includes the Computer Fraud and Abuse Act of 1986 and the National Information Infrastructure Protection Act.
Changed the topic title to something a bit more descriptive.
"http://en.wikiquote.org/wiki/Leon_Trotsky#Attributed", eh?
It would be very much appreciated if you or anyone else could supply me or any other administrator with a log of that conversation, along with the /whois nickname output (which shows the IP) for the IRC usernames that said the incriminating things. Thanks to anyone for any help.
http://meta.wikimedia.org/wiki/IRC_Group_Contacts
I am very doubtful that Wikipedia members will have anything close to the tools needed for fast-paced flooding.
Wget will do recursive crawling too. Looking at the site, I don't think the recursive crawl would work well. Even if it works, you'd end up with too much garbage.
I'd write a program that shells out to lynx -dump URL > outputfile
The first step is to make a text file listing each of the 28 days, plus how many pages there are for each day.
Use this file to drive the program. This info is all you need to construct the URL for each fetch. By using Lynx, you already avoid about 35 percent of the characters in each file, because Lynx strips out the HTML.
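Here is a minimal sketch of that driver in Python, just to show the shape of it. Everything specific is my assumption: the driver file format (one line per day, "YYYY-MM-DD pagecount"), the output file naming, and above all the URL template, which is a made-up placeholder and not the real site's layout. The only Lynx option used is -dump.

```python
import subprocess
import sys

# Hypothetical URL layout; replace with the real pattern for the log site.
URL_TEMPLATE = "http://example.org/logs/{day}/{page}"


def parse_driver(text):
    """Parse driver lines of the assumed form 'YYYY-MM-DD pagecount'."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        day, pages = line.split()
        entries.append((day, int(pages)))
    return entries


def build_jobs(entries, template=URL_TEMPLATE):
    """Return one (url, output filename) pair per page of every day."""
    jobs = []
    for day, pages in entries:
        for page in range(1, pages + 1):
            url = template.format(day=day, page=page)
            jobs.append((url, f"{day}_p{page:03d}.txt"))
    return jobs


def fetch(url, outfile):
    """Equivalent of: lynx -dump URL > outfile"""
    with open(outfile, "wb") as out:
        subprocess.run(["lynx", "-dump", url], stdout=out, check=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as f:
        for url, outfile in build_jobs(parse_driver(f.read())):
            fetch(url, outfile)
```

Run it with the driver file as the only argument. Keeping the URL construction separate from the fetch makes it easy to print the job list and eyeball it before hitting the site.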
After you get all your files, you can write a routine to parse out more noise at the top and bottom of each file that Lynx didn't delete. Make each line flush left by deleting any leading tabs or white space. Add the date to the time if it isn't there already. (The latest files include the date, but the early ones I looked at have the time only on each line.) Concatenate the pages for each of the 28 days into a single file for that day.
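The cleanup pass could look something like the sketch below. It rests on guesses about the line format that I have not checked against the real dumps: that log lines start with a time like [12:01] or 12:01, that dated lines start with YYYY-MM-DD, and that anything before the first stamped line or after the last one is page noise. The function names are mine.

```python
import re

# Assumed stamp formats; adjust after looking at the real dumps.
TIME_ONLY = re.compile(r"^\[?\d{1,2}:\d{2}")
DATED = re.compile(r"^\[?\d{4}-\d{2}-\d{2}")


def clean_page(text, day):
    """Flush lines left, trim head/tail noise, and add the date where missing."""
    lines = [ln.lstrip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln]  # drop blank lines
    stamped = [i for i, ln in enumerate(lines)
               if TIME_ONLY.match(ln) or DATED.match(ln)]
    if not stamped:
        return []
    # Keep everything between the first and last stamped line.
    body = lines[stamped[0]:stamped[-1] + 1]
    return [f"{day} {ln}" if TIME_ONLY.match(ln) else ln for ln in body]


def concat_day(pages, day):
    """Join the cleaned pages of one day, in order, into a single text."""
    out = []
    for text in pages:
        out.extend(clean_page(text, day))
    return "\n".join(out) + "\n" if out else ""
```

Unstamped lines in the middle of a page are kept as they are, on the theory that they are wrapped continuations of the line before.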
It would take a day of work, but it's probably worth doing. I just saw some of the stuff they were saying about me today on #wikipedia, and it's not very kind. Keeping a log like this could be evidence of their intent.
I know that Wikipedia considers these logs private, but as far as I know, there is no legal standing behind this policy. Is anyone aware of any legal problems with logging this stuff and making it searchable?
Finally, does anyone have a Linux program that can run stand-alone (no browser) that will log the channel? I don't know much about IRC -- have only played with it for a few weeks, and only with ChatZilla.
Better PM me on this; I'm sure they're listening.