The Wikipedia Review: A forum for discussion and criticism of Wikipedia
Wikipedia Review Op-Ed Pages


> Wikipedia-sponsored denial of service attack on this site, organized from the #wikipedia IRC channel: info on an illegal attack planned by Wikipedia users against this site
Locke85
Post #1

Neophyte

Group: Contributors
Posts: 12
Joined:
Member No.: 210

I just wanted to suggest that we do more to weed out fake usernames that are either meant to be used as read-only accounts (to get access to the pit and other places that are now, or may in the future be, available only to members) or as accounts used for write access but only to post things that are all but trolling. I know the blatant troll accounts are immediately blocked, as I have seen happen to admins who have tried to infiltrate this site just to spam us, but over IRC (#wikipedia on irc.freenode.net, for those of you who are interested) I heard a discussion about creating fake usernames specifically to spam this site and, if nothing else, to crowd the userlist with useless names with no posts.

This post has been edited by Locke85:

Replies
Daniel Brandt
Post #2

Postmaster

Group: Regulars
Posts: 2,473
Joined:
Member No.: 77

Wget will do recursive crawling too, but looking at the site, I don't think a recursive crawl would work well. Even if it did, you'd end up with too much garbage.

I'd write a program that shells out to: lynx -dump URL > outputfile
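
The shell-out itself is only a couple of lines in something like Python (just a sketch; it assumes lynx is installed and on the PATH, and the function name is my own):

import subprocess

def fetch_page(url, outputfile):
    # lynx -dump renders the page as plain text, stripping the HTML
    with open(outputfile, "w") as out:
        subprocess.run(["lynx", "-dump", url], stdout=out, check=True)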

The first step is to make a text file listing each of the 28 days, plus how many pages there are for each day.

Use this file to drive the program. This info is all you need to construct the URL for each fetch. By using Lynx, you already avoid about 35 percent of the characters in each file, because Lynx strips out the HTML.
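
As a rough sketch of the driver in Python (the days-file format and the URL pattern below are placeholders I made up for illustration; the real log URLs go in their place):

import subprocess

BASE = "http://example.org/logs"  # placeholder, not the real log location

# days.txt: one line per day, e.g. "20060101 14" meaning that day has 14 pages
with open("days.txt") as days:
    for entry in days:
        if not entry.strip():
            continue
        day, pagecount = entry.split()
        for page in range(1, int(pagecount) + 1):
            url = f"{BASE}/{day}/page{page}.html"        # placeholder pattern
            outputfile = f"{day}-{page:03d}.txt"
            with open(outputfile, "w") as out:
                # lynx -dump strips the HTML, as noted above
                subprocess.run(["lynx", "-dump", url], stdout=out, check=True)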

After you get all your files, you can write a routine to parse out more noise at the top and bottom of each file that Lynx didn't delete. Make each line flush left by deleting any leading tabs or whitespace. Add the date to the time if it isn't there already. (The latest files include the date, but the early ones I looked at have only the time on each line.) Then concatenate all the pages for each of the 28 days into a single file for that day.
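
In rough Python terms, the cleanup for one day might look like this (a sketch only: the timestamp pattern and how much header/footer junk to trim are guesses until you see what Lynx actually leaves behind):

import glob
import re

# guess at a time-only line, e.g. "14:05 <somebody> ..." -- adjust to the real format
TIME_ONLY = re.compile(r"^\d{1,2}:\d{2}")

def merge_day(day):
    # merge every page dump for one day (day-001.txt, day-002.txt, ...) into day.log
    with open(f"{day}.log", "w") as merged:
        for path in sorted(glob.glob(f"{day}-*.txt")):
            with open(path) as page:
                for raw in page:
                    line = raw.strip()            # flush left: drop tabs and spaces
                    if not line:
                        continue                  # stand-in for trimming leftover noise
                    if TIME_ONLY.match(line):     # time only, so prepend the date
                        line = f"{day} {line}"
                    merged.write(line + "\n")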

It would take a day of work, but it's probably worth doing. I just saw some of the stuff they were saying about me today on #wikipedia, and it's not very kind. Keeping a log like this could be evidence of their intent.

I know that Wikipedia considers these logs private, but as far as I know, there is no legal standing behind this policy. Is anyone aware of any legal problems with logging this stuff and making it searchable?

Finally, does anyone have a Linux program that can run stand-alone (no browser) that will log the channel? I don't know much about IRC -- have only played with it for a few weeks, and only with ChatZilla.

Better PM me on this; I'm sure they're listening.