The Wikipedia Review: A forum for discussion and criticism of Wikipedia
Wikipedia Review Op-Ed Pages

> Your two rights on Wikipedia
thekohser
post Mon 18th February 2008, 7:05pm
Post #1


Member
*********

Group: Regulars
Posts: 10,274
Joined: Thu 1st Feb 2007, 10:21pm
Member No.: 911



I've heard it said that users have two rights on Wikipedia:

1. The right to "fork" the database
2. The right to leave the project

So, I'm thinking about forking the English Wikipedia. How exactly does one go about doing that? I thought that the Wikimedia Foundation had given up about 18 months ago with trying to produce regularly-available data dumps of the entire project, presumably because their servers were choking on the process.

Is it now incumbent on a forker of the mother database (the "mother forker") to execute the entire process from "outside" Wikipedia?

And another question -- how might one fork the Simple English Wikipedia, which has a much more manageable 25,704 articles?

Greg
Nathan
post Mon 18th February 2008, 7:43pm
Post #2


Retired
******

Group: Inactive
Posts: 1,609
Joined: Mon 27th Feb 2006, 6:35pm
From: Ottawa, Ontario, Canada
Member No.: 17




Find the data dumps then import them into your database?
thekohser
post Mon 18th February 2008, 7:53pm
Post #3


Member
*********

Group: Regulars
Posts: 10,274
Joined: Thu 1st Feb 2007, 10:21pm
Member No.: 911



QUOTE(Nathan @ Mon 18th February 2008, 2:43pm) *

Find the data dumps then import them into your database?


Sure, but where are these elusive data dumps? I thought the last stable, successful one was back at the end of 2006!?

Greg
GlassBeadGame
post Mon 18th February 2008, 9:00pm
Post #4


Dharma Bum
*********

Group: Contributors
Posts: 7,919
Joined: Sat 17th Feb 2007, 12:55am
From: My name it means nothing. My age it means less. The country I come from is called the Mid-West.
Member No.: 981



QUOTE(thekohser @ Mon 18th February 2008, 2:53pm) *

QUOTE(Nathan @ Mon 18th February 2008, 2:43pm) *

Find the data dumps then import them into your database?


Sure, but where are these elusive data dumps? I thought the last stable, successful one was back at the end of 2006!?

Greg


I don't think there is any way to execute a dump from outside, as you would need sufficient permissions on the database, so you would have to rely on an existing publicly available dump. It would still be an interesting project even if the dump were rather old. After all, it's not like the project is improving anymore. You could insist on IRL identities of editors, respect experts, treat businesses with respect, exercise editorial restraint and implement BLP reform. I think the approach would be like marble sculpture: cut away everything that doesn't look like an encyclopedia. You would have a much better product within a year, even with only a modest number of committed editors.
gomi
post Mon 18th February 2008, 9:01pm
Post #5


Member
********

Group: Members
Posts: 3,022
Joined: Fri 17th Nov 2006, 6:38pm
Member No.: 565



This is one of the big lies of Wikipedia -- that you can fork it. There have been successful backups during 2007 -- as recently as December, but they get removed as soon as they are complete. There is a very small window in which to pick one up. Wordbomb has some, but I think they are old.

EternalIdealist
post Wed 20th February 2008, 5:33am
Post #6


New Member
*

Group: Contributors
Posts: 22
Joined: Wed 2nd Jan 2008, 2:25pm
From: In-patient Wikipedia recovery clinic
Member No.: 4,330



The misconception that database dumps are somehow rare or difficult to come by is one of the most persistent falsehoods. People really should bother to Google. :rolleyes:

Wikimedia db dumps

Wikipedia's page about acquiring the db
Somey
post Wed 20th February 2008, 5:41am
Post #7


Can't actually moderate (or even post)
*********

Group: Moderators
Posts: 11,815
Joined: Sat 17th Jun 2006, 7:47pm
From: Dreamland
Member No.: 275



Yeah! I even took a photo of one, just the other day:

FORUM Image


I'm not sure how you'd fork something like that, though. Maybe a pitchfork...
thekohser
post Wed 20th February 2008, 5:52am
Post #8


Member
*********

Group: Regulars
Posts: 10,274
Joined: Thu 1st Feb 2007, 10:21pm
Member No.: 911



QUOTE(EternalIdealist @ Wed 20th February 2008, 12:33am) *

The misconception that database dumps are somehow rare or difficult to come by is one of the most persistent falsehoods. People really should bother to Google. :rolleyes:

Wikimedia db dumps

Wikipedia's page about acquiring the db


LOL. Try clicking the English Wikipedia HTML dump. (Doesn't work.)

Try grabbing the XML dump of just the most recent edited pages of the English Wikipedia. (Doesn't work.)

So, you were rolling your eyes, because...?
Pumpkin Muffins
post Wed 20th February 2008, 6:19am
Post #9


Über Member
*****

Group: Regulars
Posts: 656
Joined: Wed 28th Nov 2007, 4:48pm
Member No.: 3,972



QUOTE(thekohser @ Mon 18th February 2008, 7:05pm) *

I've heard it said that users have two rights on Wikipedia:

1. The right to "fork" the database
2. The right to leave the project

So, I'm thinking about forking the English Wikipedia. How exactly does one go about doing that? I thought that the Wikimedia Foundation had given up about 18 months ago with trying to produce regularly-available data dumps of the entire project, presumably because their servers were choking on the process.

Is it now incumbent on a forker of the mother database (the "mother forker") to execute the entire process from "outside" Wikipedia?

And another question -- how might one fork the Simple English Wikipedia, which has a much more manageable 25,704 articles?

Greg


To fork, you'd want "All pages, current versions only", not "All pages with complete edit history". The latter is the one that crashes all the time before completing.



thekohser
post Wed 20th February 2008, 6:24am
Post #10


Member
*********

Group: Regulars
Posts: 10,274
Joined: Thu 1st Feb 2007, 10:21pm
Member No.: 911



QUOTE(Pumpkin Muffins @ Wed 20th February 2008, 1:19am) *

To fork, you'd want "All pages, current versions only", not "All pages with complete edit history". The latter is the one that crashes all the time before completing.


Pumpkin, I realize that. Show me where I can get a working copy of the 6 GB file of "All pages, current versions only". Please!
Pumpkin Muffins
post Wed 20th February 2008, 6:57am
Post #11


Über Member
*****

Group: Regulars
Posts: 656
Joined: Wed 28th Nov 2007, 4:48pm
Member No.: 3,972



QUOTE(thekohser @ Wed 20th February 2008, 6:24am) *

QUOTE(Pumpkin Muffins @ Wed 20th February 2008, 1:19am) *

To fork, you'd want "All pages, current versions only", not "All pages with complete edit history". The latter is the one that crashes all the time before completing.


Pumpkin, I realize that. Show me where I can get a working copy of the 6 GB file of "All pages, current versions only". Please!


Here? Or here? ... I don't know if these files are functional, though. The XML dumps need to be converted.

This post has been edited by Pumpkin Muffins: Wed 20th February 2008, 7:22am
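
For anyone following along, here is a rough sketch of what "converting" an XML dump might start with: reading the compressed file page by page rather than loading it whole. It assumes a "current versions only" dump (one revision per page) in the usual MediaWiki XML export layout of page, title, revision and text elements; the function names and the file name in the usage note are made up for illustration, and loading the result into a MediaWiki database is a separate step not shown here.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export.

    The stream is parsed incrementally, so a multi-gigabyte dump is
    never held in memory all at once. With a full-history dump the
    text of the last revision listed for each page would be returned.
    """
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # drop the finished page's children to save memory


def iter_dump(path):
    """Open a .xml.bz2 dump file (hypothetical path) and iterate its pages."""
    with bz2.open(path, "rb") as stream:
        yield from iter_pages(stream)
```

Something like `for title, text in iter_dump("dump.xml.bz2"): ...` (file name hypothetical) would then walk the whole dump; whether the downloaded file is actually complete and functional is a separate question.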
Nathan
post Wed 20th February 2008, 7:15am
Post #12


Retired
******

Group: Inactive
Posts: 1,609
Joined: Mon 27th Feb 2006, 6:35pm
From: Ottawa, Ontario, Canada
Member No.: 17




QUOTE(thekohser @ Wed 20th February 2008, 12:52am) *

QUOTE(EternalIdealist @ Wed 20th February 2008, 12:33am) *

The misconception that database dumps are somehow rare or difficult to come by is one of the most persistent falsehoods. People really should bother to Google. :rolleyes:

Wikimedia db dumps

Wikipedia's page about acquiring the db


LOL. Try clicking the English Wikipedia HTML dump. (Doesn't work.)

Try grabbing the XML dump of just the most recent edited pages of the English Wikipedia. (Doesn't work.)

So, you were rolling your eyes, because...?


There's a dump right there, though. Oops, not what you want.
dtobias
post Wed 20th February 2008, 1:27pm
Post #13


Obsessive trolling idiot [per JzG]
*******

Group: Regulars
Posts: 2,213
Joined: Sun 11th Feb 2007, 2:45pm
From: Boca Raton, FL, USA
Member No.: 962




When you gotta take a dump, you gotta take a dump!

To the tune of the William Tell Overture / Lone Ranger theme:

Take a dump, take a dump, take a dump dump dump
Take a dump, take a dump, take a dump dump dump
Take a dump, take a dump, take a dump dump dump
Every day, take a dump dump dump!

Error59
post Wed 20th February 2008, 1:33pm
Post #14


Junior Member
**

Group: Contributors
Posts: 54
Joined: Thu 4th Oct 2007, 12:10am
Member No.: 3,363



Dtobias - you may enjoy The Diarrhea Song happy.gif
thekohser
post Wed 20th February 2008, 2:03pm
Post #15


Member
*********

Group: Regulars
Posts: 10,274
Joined: Thu 1st Feb 2007, 10:21pm
Member No.: 911



Reminds me of a song a co-worker of mine would sing from the Men's room when I worked in a carpet warehouse as a teenager --

Stranded! Stranded! Stranded on the bathroom bowl...

What do you do, when you just had a poo...

And you gotta have a roll?!
JohnA
post Wed 20th February 2008, 2:38pm
Post #16


Looking over Winston Smith's shoulder
******

Group: Regulars
Posts: 1,171
Joined: Sun 30th Jul 2006, 9:56pm
Member No.: 313



I assume that Wikipedia has told you to go fork yourself?

Greg, this is probably what you want: http://download.wikimedia.org/enwiki/20080...rticles.xml.bz2
thekohser
post Wed 20th February 2008, 2:47pm
Post #17


Member
*********

Group: Regulars
Posts: 10,274
Joined: Thu 1st Feb 2007, 10:21pm
Member No.: 911



QUOTE(JohnA @ Wed 20th February 2008, 9:38am) *

I assume that Wikipedia has told you to go fork yourself?

Greg, this is probably what you want: http://download.wikimedia.org/enwiki/20080...rticles.xml.bz2


Perhaps. We'll see -- I'm 66% downloaded now.
JohnA
post Fri 22nd February 2008, 8:11am
Post #18


Looking over Winston Smith's shoulder
******

Group: Regulars
Posts: 1,171
Joined: Sun 30th Jul 2006, 9:56pm
Member No.: 313



Now that you've got it, what are you going to do with it?
thekohser
post Fri 22nd February 2008, 1:54pm
Post #19


Member
*********

Group: Regulars
Posts: 10,274
Joined: Thu 1st Feb 2007, 10:21pm
Member No.: 911



QUOTE(JohnA @ Fri 22nd February 2008, 3:11am) *

Now that you've got it, what are you going to do with it?


Stay tuned. I'm assembling a strategy team and will likely be incorporating, either with or without venture capital. We've already discussed what I might do with it, elsewhere on here.

To discuss any more would just be sabotaging my own first-mover advantage.

Greg