My upcoming plagiarism report -
     
 
The Wikipedia Review: A forum for discussion and criticism of Wikipedia
Wikipedia Review Op-Ed Pages


> My upcoming plagiarism report, How should I present it?
Daniel Brandt
Post #1


Postmaster
*******

Group: Regulars
Posts: 2,473
Member No.: 77



I need suggestions on how to present my plagiarism report at wikipedia-watch.org. I still have several weeks of work to do, even though I've been working on it a few hours a day for the last three weeks.

I'm far enough along in separating the signal from the noise that I can now predict the report will end up with between 100 and 300 examples. Here's a throwaway example that will probably get corrected as soon as someone from Wikipedia sees this post:

Wikipedia version as of mid-September, 2006

Source that was plagiarized

Most of my examples are similar to this -- except they're not from Britannica, but rather from everywhere imaginable. Almost all of the original sources carry clear copyright notices, the source is not acknowledged in the Wikipedia article, and anywhere from several sentences to several paragraphs are plagiarized.

My question is, "How can I format the report so that anyone looking at it will get the picture, within a few clicks, that Wikipedia has a plagiarism problem?"

So far my best idea is to have a doorway page explaining that my examples were culled from a sampling of slightly less than one percent of the 1.4 million English-language Wikipedia articles. If I find 200 examples in that sample, then scaling up by a factor of 100 lets us presume that there are about 20,000 plagiarized articles in Wikipedia that no one has yet discovered. No one has made any attempt to discover them, and no one ever will. It's just too hard. Even for programmers with a pipeline into automated Google inquiries, it's still too hard. An amazing amount of manual checking is required to reduce the noise without throwing out the signal.
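The scaling arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `extrapolate` is made up here, the sample is rounded to exactly one percent (the post says "slightly less"), and the estimate assumes the sample is representative of the whole encyclopedia.

```python
TOTAL_ARTICLES = 1_400_000   # English-language articles, per the post
SAMPLE_FRACTION = 0.01       # assumption: rounded up from "slightly less than one percent"


def extrapolate(found: int, sample_fraction: float) -> int:
    """Scale a count found in a sample up to the whole population.

    Assumes the sample is representative; `found` counts only the
    cases that survived manual checking, so the result is a rough
    figure rather than a precise measurement.
    """
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")
    return round(found / sample_fraction)


# Number of articles actually examined under the one-percent assumption.
sample_size = round(TOTAL_ARTICLES * SAMPLE_FRACTION)

if __name__ == "__main__":
    print(f"articles sampled: {sample_size}")
    # The post predicts between 100 and 300 examples in the final report.
    for found in (100, 200, 300):
        print(f"{found} examples in sample -> about {extrapolate(found, SAMPLE_FRACTION)} overall")
```

Because the real sample is a little under one percent, the same count of examples would extrapolate to slightly more than these figures.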

This doorway page will link to 200 subpages (Example 001, Example 002, ... Example 200). Each of the subpages will be titled "Plagiarism on Wikipedia - Example 001" (numbered to match) and will have a link to the source, plus a link to the version on Wikipedia as of mid-September when I grabbed the page. Below this, the text portion of that page (easy to strip out of the XML versions of the articles that I already have) will be reproduced, and the sections that are plagiarized from the source will be highlighted with a yellow background.
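A minimal sketch of how one such subpage might be generated, assuming the article text has already been stripped to plain text and the plagiarized passages have been identified by hand as exact substrings. The function names (`find_spans`, `highlight`, `example_page`) and the HTML layout are hypothetical, not the actual wikipedia-watch.org code.

```python
import html


def find_spans(text: str, passages: list[str]) -> list[tuple[int, int]]:
    """Return merged (start, end) character ranges where any passage occurs."""
    spans = []
    for passage in passages:
        if not passage:
            continue  # an empty passage would match everywhere
        start = text.find(passage)
        while start != -1:
            spans.append((start, start + len(passage)))
            start = text.find(passage, start + len(passage))
    spans.sort()
    merged: list[tuple[int, int]] = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching ranges become one highlighted run.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def highlight(text: str, passages: list[str]) -> str:
    """Escape the text for HTML and wrap matched passages in a yellow span."""
    out = []
    pos = 0
    for start, end in find_spans(text, passages):
        out.append(html.escape(text[pos:start]))
        out.append('<span style="background-color: yellow">')
        out.append(html.escape(text[start:end]))
        out.append("</span>")
        pos = end
    out.append(html.escape(text[pos:]))
    return "".join(out)


def example_page(number: int, source_url: str, wikipedia_url: str,
                 text: str, passages: list[str]) -> str:
    """Build one subpage: title, two links, then the highlighted article text."""
    title = f"Plagiarism on Wikipedia - Example {number:03d}"
    return (
        f"<html><head><title>{title}</title></head><body>\n"
        f"<h1>{title}</h1>\n"
        f'<p><a href="{html.escape(source_url)}">Source</a> | '
        f'<a href="{html.escape(wikipedia_url)}">Wikipedia version</a></p>\n'
        f"<p>{highlight(text, passages)}</p>\n"
        "</body></html>\n"
    )
```

Keeping the article text inside the page itself, rather than relying only on the link, means the highlighted comparison still works if the linked version later disappears.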

The effect will be that the visitor to the doorway page is given some information on how the examples were found, and is invited to click randomly on any of the 200 examples to see for themselves. I'm linking to the mid-September version, since it's possible that many editors will start cleaning up these 200 examples. One way they will try to clean them up is to acknowledge the source, but that still doesn't solve the problem that entire paragraphs were copied verbatim. They'll have to change sentences around too.

Therefore, I predict that Jimmy will claim that Wikipedia is amazingly free from plagiarism, because Wikipedia has always had a zero-tolerance policy. (This will be a lie -- there have been no efforts to identify plagiarism at Wikipedia.) Then he will completely zap all 200 articles (no history, no nothing) so that the links to the September version on my subpages won't work. That's why I have to reproduce the text from the article and highlight the plagiarized material. If I don't, my report will not be convincing after Jimmy zaps the 200 articles.

Any other ideas?
Replies
Somey
Post #2


Can't actually moderate (or even post)
*********

Group: Moderators
Posts: 11,816
From: Dreamland
Member No.: 275



One more thing...

Another way they'll probably attack this will be to delete any mention of it in Wikipedia itself, under the pretense that Brandt "self-publishes" and is therefore not a "reliable source." As it turns out, plans are already underway to replace WP:NOR and WP:V with a new, "more cohesive" policy that will help them squelch any dissent over the plagiarism issue, ensuring that they can sweep the whole thing under the rug as soon as possible:

http://en.wikipedia.org/wiki/Wikipedia:Att...blished_sources

And I'll give you one guess as to who's the author of this new proposed policy:

QUOTE(Slimmy @ Right about now)
A self-published source is a published source, online or on paper, that has not been subject to any form of independent fact-checking, or where no one stands between the writer and the act of publication. Anyone can create a website or pay to have a book published and then claim to be an expert in a certain field. For that reason, self-published books, personal websites, and blogs are usually not acceptable as reliable sources.

They're awfully predictable sometimes, aren't they? It's kind of sad, in a way.
Placeholder
Post #3


Member
***

Group: On Vacation
Posts: 204
Member No.: 287



/

This post has been edited by Joey:
Daniel Brandt
Post #4





QUOTE(Joey @ Wed 11th October 2006, 11:33am)
The approach here might be to submit the matter for publication. You could either write it yourself for some reputable publication, or offer your facts to a reputable publication. Since it is a well-organized and documented study, some professional journal might take it.

It was Seigenthaler's idea to do a plagiarism study of Wikipedia in the first place. He asked a couple of journalism professors to assign a couple of students to it, and mentioned this to me in an email. I thought about this, and soon realized that no student would have a snowball's chance in hell of separating the signal from the noise. You need experience with Wikipedia, you need programming skills, you need a pipeline into automated Google queries, and you need a huge amount of time -- much more time than a student has for a single course.

I decided that Seigenthaler's idea had merit, and realized that I was one of the few people who could do this. He knows I'm doing it, and he's delighted that I'm doing it. You can be sure that before the report is made live on my website, he and I will work together to find an interested journalist from mainstream media (maybe the AP or the NYT) who would like a scoop. When the scoop is published, at that instant the study goes live. Until then, there's nothing Jimmy and Brad and Danny can do, because they don't know which 200 articles will end up as my examples.


