_ MediaWiki Software _ How to fork a wiki

Posted by: emesee

Step 1

Sign up for hosting somewhere. You need to have PHP and MySQL installed, at least.

Dreamhost will work nicely; quite often you can sign up, with a unique domain name, for just 10 dollars for the first year.

There are many other hosts to sign up with. Hosting will often run about 5-15+ dollars a month. If you are not getting mega amounts of traffic, a host somewhere within this range should suffice.

Step 2

Install MediaWiki. This can have a bit of a learning curve, but once you get the hang of it, it is not especially difficult. See:
http://www.mediawiki.org/wiki/Manual:Installing_MediaWiki

Step 3

Get the XML dump of the wiki that you wish to fork.

Many can be found at:
http://download.wikimedia.org/backup-index.html

If you want to fork Commons, then see:
http://yousefourabi.com/blog/semantic-web/download-all-wikipedia-images-with-wikix

If the wiki you wish to fork does not have an XML dump available, you could still fork it, but it could be a pain. You would have to get the page list at [[Special:Allpages]] and then either export all pages manually at [[Special:Export]] (with a large wiki, this could take a long, long time) or else set up some sort of script to do it (find one using Google, if it exists, or write it yourself).
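If you end up writing the script yourself, here is a rough sketch in Python of one way it might look. It is not a tested tool. It assumes the wiki exposes the standard api.php (list=allpages) and accepts the usual Special:Export form fields (pages, curonly); the function names, the User-Agent string and the output file names are made up for illustration. Older MediaWiki versions return the continuation token under a different key, so the parsing may need adjusting.

```python
import json
import urllib.parse
import urllib.request


def allpages_params(continue_from=None):
    """Query parameters for one batch of list=allpages (main namespace by default)."""
    params = {
        "action": "query",
        "list": "allpages",
        "aplimit": "max",
        "format": "json",
    }
    if continue_from:
        params["apcontinue"] = continue_from
    return params


def parse_allpages(response_text):
    """Return (titles, next_token) from one API response; next_token is None at the end."""
    data = json.loads(response_text)
    titles = [page["title"] for page in data.get("query", {}).get("allpages", [])]
    next_token = data.get("continue", {}).get("apcontinue")
    return titles, next_token


def export_form(titles, current_only=True):
    """POST fields for Special:Export; titles go in 'pages', one per line."""
    form = {"pages": "\n".join(titles), "action": "submit"}
    if current_only:
        form["curonly"] = "1"  # latest revision only, no page history
    return form


def chunks(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch(url, form=None):
    """GET the URL, or POST it when form fields are given; return the body as text."""
    data = urllib.parse.urlencode(form).encode("utf-8") if form else None
    request = urllib.request.Request(
        url, data=data, headers={"User-Agent": "wiki-fork-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def export_wiki(api_url, export_url, batch_size=50):
    """Collect every main-namespace title, then save Special:Export output in batches."""
    titles, token = [], None
    while True:
        query = urllib.parse.urlencode(allpages_params(token))
        batch, token = parse_allpages(fetch(api_url + "?" + query))
        titles.extend(batch)
        if token is None:
            break
    for number, group in enumerate(chunks(titles, batch_size)):
        with open("export-%04d.xml" % number, "w", encoding="utf-8") as out:
            out.write(fetch(export_url, export_form(group)))
```

You would call export_wiki() with the wiki's api.php address and the address of its Special:Export page. Other namespaces (templates, categories and so on) would need extra passes, and on a big wiki it is polite to pause between requests.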

Step 4

Import the dump. See:
http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps
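For what it is worth, the manual above describes a command-line route using the importDump.php maintenance script. A minimal sketch of driving it from Python, assuming you have shell access on your host and PHP on the path; the wiki directory and dump file name you pass in are placeholders for your own.

```python
import subprocess
from pathlib import Path


def import_commands(wiki_root, dump_path):
    """Build the maintenance commands: import the dump, then rebuild recent changes."""
    maintenance = Path(wiki_root) / "maintenance"
    return [
        ["php", str(maintenance / "importDump.php"), str(dump_path)],
        ["php", str(maintenance / "rebuildrecentchanges.php")],
    ]


def run_import(wiki_root, dump_path):
    """Run each command in order, stopping at the first one that fails."""
    for command in import_commands(wiki_root, dump_path):
        subprocess.run(command, check=True)
```

A big dump can take hours this way, and some shared hosts kill long-running processes, so it may be worth trying a small dump first.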

Appendix

Configure the Wiki

Then, of course, configure MediaWiki at some point along the way. You may want to import [[MediaWiki:common.css]] from the big wiki.

You can enable the use of images from Commons. See:
http://www.mediawiki.org/wiki/Manual:$wgForeignFileRepos

A pretty decent overview of other good-to-know configuration settings is at:
http://wiki.dreamhost.com/MediaWiki

You may need to tweak the configuration over time. See:
http://www.mediawiki.org/wiki/MediaWiki

The MediaWiki wiki has a lot of information on it, and is organized perhaps pretty reasonably.

:)

Posted by: CharlotteWebb

You left out the all-important detail of how to get other people to contribute to your site rather than to Wikipedia, and how to keep it from degenerating into a grudge site being colonized primarily by unpersons banned from TOW, etc.

If two trees diverge in a yellow wood, but not even Robert Frost visits your wiki, is it still a fork? :P

Posted by: Eva Destruction

QUOTE(CharlotteWebb @ Sun 3rd May 2009, 7:39pm) *

You left out the all-important detail of how to get other people to contribute to your site rather than to Wikipedia, and how to keep it from degenerating into a grudge site being colonized primarily by unpersons banned from TOW, etc.

If two trees diverge in a yellow wood, but not even Robert Frost visits your wiki, is it still a fork? :P

You also forgot step 5: realise that the moment you start hosting it (and consequently become the "publisher") you become liable for any libels, errors etc and for taking steps to fix them (Jimbo & pals can at least point to the fact that they try to fix errors), and also become subject to the laws of your country as opposed to Florida's comparatively lax libel laws. And while IANAL, I suspect that because you'd be actively importing any offending content – as opposed to just being the host on which other people post libels – you'd be on very shaky ground if you tried to claim s230, should anyone take exception to any of the offending material.

Posted by: emesee

Thank you for the feedback. :D

Posted by: CharlotteWebb

QUOTE(Eva Destruction @ Sun 3rd May 2009, 6:45pm) *

You also forgot step 5: realise that the moment you start hosting it (and consequently become the "publisher") you become liable for any libels, errors etc and for taking steps to fix them (Jimbo & pals can at least point to the fact that they try to fix errors), and also become subject to the laws of your country as opposed to Florida's comparatively lax libel laws. And while IANAL, I suspect that because you'd be actively importing any offending content – as opposed to just being the host on which other people post libels – you'd be on very shaky ground if you tried to claim s230, should anyone take exception to any of the offending material.

Hmm, do the sites listed at http://en.wikipedia.org/wiki/Wikipedia:Mirrors_and_forks/All claim (or need to claim) any kind of immunity? That's probably news to them.

How about the "scraper" sites which just load the current version of a Wikipedia page and pass it along to you (substituting their own site's name and logo, etc.)?

Posted by: Tex

Forks can work if there's a substantial amount of people who edit in a non-malicious manner and keep pages up to date. This happens if the wiki is associated with a topic which can interest a number of people from a highly populated forum (WoW wiki, Star Wars wiki and Star Trek wiki are certainly successful, and more accurate than the wikiprojects on Wikipedia related to those topics).

Posted by: CharlotteWebb

QUOTE(Tex @ Sun 3rd May 2009, 7:45pm) *

Forks can work if there's a substantial amount of people who edit in a non-malicious manner and keep pages up to date. This happens if the wiki is associated with a topic which can interest a number of people from a highly populated forum (WoW wiki, Star Wars wiki and Star Trek wiki are certainly successful, and more accurate than the wikiprojects on Wikipedia related to those topics).

So would it be possible to start a wiki based on a narrow topic (to lure the Right Kind Of People onto the ground floor) then gradually broaden its scope to include, eh... potentially everything? Has anyone ever taken this approach?

Posted by: Eva Destruction

QUOTE(CharlotteWebb @ Sun 3rd May 2009, 8:04pm) *

QUOTE(Eva Destruction @ Sun 3rd May 2009, 6:45pm) *

You also forgot step 5: realise that the moment you start hosting it (and consequently become the "publisher") you become liable for any libels, errors etc and for taking steps to fix them (Jimbo & pals can at least point to the fact that they try to fix errors), and also become subject to the laws of your country as opposed to Florida's comparatively lax libel laws. And while IANAL, I suspect that because you'd be actively importing any offending content – as opposed to just being the host on which other people post libels – you'd be on very shaky ground if you tried to claim s230, should anyone take exception to any of the offending material.

Hmm, do the sites listed at http://en.wikipedia.org/wiki/Wikipedia:Mirrors_and_forks/All claim (or need to claim) any kind of immunity? That's probably news to them.

How about the "scraper" sites which just load the current version of a Wikipedia page and pass it along to you (substituting their own site's name and logo, etc.)?

Those listed at Wikipedia:Mirrors and forks/All are mostly specialist projects that have just forked a few pages and have checked them before hosting their own versions. The mirror sites should, at any rate, not be editable and purely mirror the Wikipedia pages, and are essentially just an alternative means of accessing the main site's content. Major scrapers like http://www.bbc.co.uk/music/artists/ that mirror Wikipedia pages (as opposed to using Wikipedia articles as a base on which to write their own articles) are http://www.bbc.co.uk/music/faqs#what_happens_if_wikipedias_vandalised and tend to have a policy of suspending particular articles on request if issues are raised regarding accuracy. As I understand it, what Emesee is talking about (correct me if I'm wrong) is the mechanics of duplicating the whole of Wikipedia and using it as a basis – it would be virtually impossible to even attempt to check the 2 million+ articles to ensure an accurate starting point, without Wikipedia's high user base and equally high (but often ignored) casual-browser-who-quietly-fixes-an-error count. As I understand it, this was one of the main reasons Larry ditched the original plan to import-and-clean-up the whole of Wikipedia as a starting point for Citizendium.

QUOTE(CharlotteWebb @ Sun 3rd May 2009, 8:55pm) *

QUOTE(Tex @ Sun 3rd May 2009, 7:45pm) *

Forks can work if there's a substantial amount of people who edit in a non-malicious manner and keep pages up to date. This happens if the wiki is associated with a topic which can interest a number of people from a highly populated forum (WoW wiki, Star Wars wiki and Star Trek wiki are certainly successful, and more accurate than the wikiprojects on Wikipedia related to those topics).

So would it be possible to start a wiki based on a narrow topic (to lure the Right Kind Of People onto the ground floor) then gradually broaden its scope to include, eh... potentially everything? Has anyone ever taken this approach?

Conservapedia – which started out as a host for teaching materials for Eagle Forum University – is one that springs to mind.

Posted by: UseOnceAndDestroy

QUOTE(CharlotteWebb @ Sun 3rd May 2009, 8:04pm) *

Hmm, do the sites listed at http://en.wikipedia.org/wiki/Wikipedia:Mirrors_and_forks/All claim (or need to claim) any kind of immunity? That's probably news to them.

Most of the made-for-advertising sites will not give a toss - they're either outside of practical legal reach, or will fold at the first sniff of legal threat and move on to the next spamvertised domain.

QUOTE
How about the "scraper" sites which just load the current version of a Wikipedia page and pass it along to you (substituting their own site's name and logo, etc.)?

How many scrapers "load the current version"? My observation is the majority use an off-the-shelf script to make a static copy of wikipedia pages, slap their own css and adsense code on it, then never update it again - propagating wikipedia's errors indefinitely.

Posted by: CharlotteWebb

QUOTE(UseOnceAndDestroy @ Mon 4th May 2009, 11:11am) *

How many scrapers "load the current version"? My observation is the majority use an off-the-shelf script to make a static copy of wikipedia pages, slap their own css and adsense code on it, then never update it again - propagating wikipedia's errors indefinitely.

http://meta.wikimedia.org/wiki/Live_mirrors, however their access tends to be blocked as they are identified.

See also http://en.wikipedia.org/wiki/Wikipedia:Mirrors_and_forks#Remote_loading

Posted by: thekohser

QUOTE(CharlotteWebb @ Sun 3rd May 2009, 2:39pm) *

You left out the all-important detail of how to get other people to contribute to your site rather than to Wikipedia, and how to keep it from degenerating into a grudge site being colonized primarily by unpersons banned from TOW, etc.

If two trees diverge in a yellow wood, but not even Robert Frost visits your wiki, is it still a fork? :P


A successful fork would also have to show how, at least sometimes, it can http://search.msn.com/results.aspx?q=defective%20gypsum%20board&FORM=MSNH11 for the #1 spot in a search result page.

Posted by: anthony

QUOTE(Eva Destruction @ Sun 3rd May 2009, 6:45pm) *

You also forgot step 5: realise that the moment you start hosting it (and consequently become the "publisher") you become liable for any libels, errors etc and for taking steps to fix them (Jimbo & pals can at least point to the fact that they try to fix errors), and also become subject to the laws of your country as opposed to Florida's comparatively lax libel laws. And while IANAL, I suspect that because you'd be actively importing any offending content – as opposed to just being the host on which other people post libels – you'd be on very shaky ground if you tried to claim s230, should anyone take exception to any of the offending material.


Based on what? The court cases I've read have suggested that you'd be protected by Section 230.

Posted by: Eva Destruction

QUOTE(anthony @ Tue 5th May 2009, 4:50am) *

QUOTE(Eva Destruction @ Sun 3rd May 2009, 6:45pm) *

You also forgot step 5: realise that the moment you start hosting it (and consequently become the "publisher") you become liable for any libels, errors etc and for taking steps to fix them (Jimbo & pals can at least point to the fact that they try to fix errors), and also become subject to the laws of your country as opposed to Florida's comparatively lax libel laws. And while IANAL, I suspect that because you'd be actively importing any offending content – as opposed to just being the host on which other people post libels – you'd be on very shaky ground if you tried to claim s230, should anyone take exception to any of the offending material.


Based on what? The court cases I've read have suggested that you'd be protected by Section 230.

It would depend on interpretation. The "Protector of Wikipedia" line of s 230 is "No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider" – liability would come down to the interpretation of "another information content provider". As has been often remarked, Wikipedia itself is on fairly shaky ground should anyone bring a full legal challenge – and all this is dependent on the site being hosted in the US with its lax libel laws (a UK or French-hosted Wikipedia would last about 10 minutes in the libel courts, for example).

(adding to clarify what I mean) – Wikipedia's defense against libel actions (as with Myspace, Facebook, Livejournal etc – and WR itself for that matter) is that it's a "passive" host on which other people post comments and that while it can moderate comments after they're posted, it has no control over what people initially post. You, on the other hand, would be actively choosing to import the potentially defamatory material.

Posted by: CharlotteWebb

QUOTE(thekohser @ Tue 5th May 2009, 3:40am) *

A successful fork would also have to show how, at least sometimes, it can http://search.msn.com/results.aspx?q=defective%20gypsum%20board&FORM=MSNH11 for the #1 spot in a search result page.

I was talking about how to recruit a critical mass of contributors (ones interested in the goal of a particular project). You can google-bomb and keyword-stuff to your heart's content without having that, or a wiki for that matter.

One might argue that aggressive SEO is the great recruiter because it makes people more likely to find your site. It will noticeably attract more readers for sure, but that doesn't necessarily translate to more editors. I know I regularly read over a dozen wiki sites in which I may never develop any interest in editing.

The notion that my contributions (if I contributed) could be on the front page of Google, or even ahead of Wikipedia (just like John Lennon claiming to be "higher than Jesus right now"—so what?), doesn't mean much if anything to me. There's no way that alone would convince somebody to contribute unless their ultimate goal is self-promotion. The rationale for joining must come from somewhere within the site. Personally I'd look for ways in which a project's modus operandi favorably differs from WP, and frankly the front page of your site makes me more than a little bit queasy.

Posted by: anthony

QUOTE(Eva Destruction @ Tue 5th May 2009, 1:56pm) *

QUOTE(anthony @ Tue 5th May 2009, 4:50am) *

QUOTE(Eva Destruction @ Sun 3rd May 2009, 6:45pm) *

You also forgot step 5: realise that the moment you start hosting it (and consequently become the "publisher") you become liable for any libels, errors etc and for taking steps to fix them (Jimbo & pals can at least point to the fact that they try to fix errors), and also become subject to the laws of your country as opposed to Florida's comparatively lax libel laws. And while IANAL, I suspect that because you'd be actively importing any offending content – as opposed to just being the host on which other people post libels – you'd be on very shaky ground if you tried to claim s230, should anyone take exception to any of the offending material.


Based on what? The court cases I've read have suggested that you'd be protected by Section 230.

It would depend on interpretation.


Well, yeah, but what have you seen to suggest that the courts wouldn't interpret Section 230 to be applicable to a mirror or fork? I'd point to Batzel v. Smith to argue that the courts would find Section 230 applicable, but I haven't been following all that closely, maybe there's a more recent precedent which went the other way?

QUOTE(Eva Destruction @ Tue 5th May 2009, 1:56pm) *

The "Protector of Wikipedia" line of s 230 is "No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider" – liability would come down to the interpretation of "another information content provider".


And how could that be interpreted to protect the WMF but not a fork? It seems to me that if anything a fork would be *better* protected than the WMF - even if the original contributor isn't "another information content provider", the WMF is. But courts have consistently interpreted "another information content provider" to be the initiator of the statement. (Roommates.com was found to have initiated certain statements by providing the questionnaire.)

QUOTE(Eva Destruction @ Tue 5th May 2009, 1:56pm) *

As has been often remarked, Wikipedia itself is on fairly shaky ground should anyone bring a full legal challenge – and all this is dependent on the site being hosted in the US with its lax libel laws (a UK or French-hosted Wikipedia would last about 10 minutes in the libel courts, for example).


Being often remarked doesn't make it true. Maybe "Wikipedia" is on fairly shaky ground, but I haven't seen any evidence that this is the case. Do you have any evidence of this other than it being "often remarked"? A legal argument? A precedent which can't be easily distinguished?

If not, that's fine, you're entitled to your opinion. But I'm just pointing out that it's only an opinion, and that it goes against much of the case law.

QUOTE(UseOnceAndDestroy @ Mon 4th May 2009, 11:11am) *

QUOTE(CharlotteWebb @ Sun 3rd May 2009, 8:04pm) *

Hmm, do the sites listed at http://en.wikipedia.org/wiki/Wikipedia:Mirrors_and_forks/All claim (or need to claim) any kind of immunity? That's probably news to them.

Most of the made-for-advertising sites will not give a toss - they're either outside of practical legal reach, or will fold at the first sniff of legal threat and move on to the next spamvertised domain.


That gave me an idea. Why not see what Answers Corporation has to say in its latest 10-K?

QUOTE

We may be subject to liability for online services, which may not be limited by the safe harbors in The Digital Millennium Copyright Act, or DMCA, The Communications Decency Act, or CDA, or the U.S. Children’s Online Privacy Protection Act, or COPPA. If we do not meet the safe harbor requirements, or if it is otherwise determined that our Web properties contain actionable content, we could be subject to claims, which could be costly and time-consuming to defend.

We host certain services that enable individuals to generate content and engage in various online activities. The law relating to the liability of providers of these online services for activities of their users is currently unsettled both within the United States and internationally. Claims have been threatened and may in the future be brought against us for defamation, invasion of privacy, negligence, copyright or trademark infringement, unlawful activity, tort, including personal injury, fraud, or other theories based on the nature and content of information to which we provide links, or that may be posted online or generated by the users of our Web properties. Our defense of any of these actions could be costly and involve significant time and attention of our management and other resources.

The DMCA is intended, among other things, to reduce the liability of online service providers for listing or linking to third party Web properties that include materials that infringe copyrights or rights of others. Additionally, portions of the CDA are intended to provide statutory protections to online service providers who distribute third party content. A safe harbor for copyright infringement is also available under the DMCA to certain online service providers that provide specific services, if the providers take certain affirmative steps as set forth in the DMCA. Important questions regarding the safe harbor under the DMCA and the CDA have yet to be litigated, and we can not guarantee that we will meet the safe harbor requirements of the DMCA or of the CDA. If we are not covered by a safe harbor, for any reason, we could be exposed to claims, which could be costly and time-consuming to defend.

In addition, COPPA was enacted in October 1998. COPPA imposes civil and criminal penalties on persons distributing material harmful to minors over the Internet to persons under the age of 17 or collecting personal information from children under the age of 13. We do not knowingly collect and disclose personal information from minors. The manner in which COPPA may be interpreted and enforced cannot yet be determined. Moreover, the applicability to the Internet of existing laws governing issues such as property ownership, copyright, defamation, obscenity and personal privacy is uncertain. We may be subject to claims that our content violates such laws, which could damage our business and cause our stock price to decline.

We also periodically enter into arrangements to offer third party products, services or content under the Answers brand or through our Web properties. We may be subject to claims concerning these products, services or content by virtue of our involvement in marketing, branding, broadcasting or providing access to them, even if we do not ourselves host, operate, provide, or provide access to them.

It is also possible that, if any information provided directly by us contains errors or is otherwise negligently provided to users, third parties could make claims against us. While it is our belief that the Terms of Use governing the use of our Web properties covers us against these types of claims, there are no assurances as to the final determination of these types of claims by any court of law. Furthermore, investigating and defending any of these types of claims is expensive, even to the extent that the claims are without merit or do not ultimately result in liability.

Posted by: thekohser

QUOTE(CharlotteWebb @ Tue 5th May 2009, 10:45am) *

QUOTE(thekohser @ Tue 5th May 2009, 3:40am) *

A successful fork would also have to show how, at least sometimes, it can http://search.msn.com/results.aspx?q=defective%20gypsum%20board&FORM=MSNH11 for the #1 spot in a search result page.

Personally I'd look for ways in which a project's modus operandi favorably differs from WP, and frankly the front page of your site makes me more than a little bit queasy.


Will you help me "sofixit"? Only 0.00105% of global Internet users visited Wikipedia Review.com last month, so obviously I could use some help.

Posted by: CharlotteWebb

QUOTE(emesee @ Wed 6th May 2009, 5:45am) *

I suppose if you are interested and feeling magnanimously oriented, send me a PM and i have a link you can go off of...

Well naturally I wouldn't want to start paying for something like that until I knew for sure I'd be able to get somebody, anybody, to join.

Hopefully it's easy enough to migrate content to dreamhost from some free wiki-farm.

Posted by: emesee

QUOTE(CharlotteWebb @ Wed 6th May 2009, 8:21am) *

QUOTE(emesee @ Wed 6th May 2009, 5:45am) *

I suppose if you are interested and feeling magnanimously oriented, send me a PM and i have a link you can go off of...

Well naturally I wouldn't want to start paying for something like that until I knew for sure I'd be able to get somebody, anybody, to join.

Hopefully it's easy enough to migrate content to dreamhost from some free wiki-farm.


I don't know how easy it would be. It probably depends on which farm it is and if they allow XML dumps.

Alternatively, if you had a page list you could just add every page to [[Special:Export]] and then do a dump that way.

Also, you could put every page in the same category (perhaps a hidden category), and then just add them all to [[Special:Export]] using the add by category feature.
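If the farm leaves the standard api.php switched on (an assumption, not every farm does), the category idea can also be scripted. A small sketch in Python that only builds the query and turns a response into the one-title-per-line text that [[Special:Export]] takes; the function names and the category name in the usage note are hypothetical, and the actual fetching is left out.

```python
import json


def category_members_params(category, continue_from=None):
    """Query parameters for one batch of list=categorymembers."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": "Category:" + category,
        "cmlimit": "max",
        "format": "json",
    }
    if continue_from:
        params["cmcontinue"] = continue_from
    return params


def parse_category_members(response_text):
    """Return (titles, next_token) from one API response; next_token is None at the end."""
    data = json.loads(response_text)
    members = data.get("query", {}).get("categorymembers", [])
    titles = [member["title"] for member in members]
    return titles, data.get("continue", {}).get("cmcontinue")


def export_textarea(titles):
    """Format titles the way the Special:Export text box expects: one per line."""
    return "\n".join(titles)
```

For example, category_members_params("Everything") gives the query for a made-up category called Everything, and the result of export_textarea() can be pasted straight into the export box.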

Then again, when I read about people starting websites I have heard successful individuals say they wish they had started with a unique domain name if they were to do it over again.

So for 10 dollars (often what you can sign up for on Dreamhost for the first year, with a free domain) what do you have to lose if you have few to no contributors... ten dollars? You can spend more than that on just one meal at a pretty standard restaurant.


:)

Posted by: gomi

I hadn't seen this very old thread. When it popped up, the first thing that came to mind was: "Well, first you take it out for a really nice dinner, maybe buy flowers .....".

Posted by: Cock-up-over-conspiracy

I am not interested in forking the Wikipedia ...


but is anyone here capturing dumps on a regular basis in order that they can analyze all the oversights and page removals that are going on?

Posted by: Alison

QUOTE(Cock-up-over-conspiracy @ Fri 8th January 2010, 6:28am) *

... is anyone here capturing dumps on a regular basis in order that they can analyze all the oversights and page removals that are going on?

Why?? 99.9% of them are boring, boring ... in decreasing order of magnitude.

Posted by: Cock-up-over-conspiracy

QUOTE(Alison @ Fri 8th January 2010, 5:04pm) *
Why??

Why ... a bit like most cults, the mystique only works if extensive historical revision goes on.

The leaders are able to constantly hide their sins, and incompetence. Newcomers keep coming along and getting sucked in because they know nothing of all the shit in the past and cannot believe it.

People need to see it to believe it.

I want an older revision which has more of the pedophile stuff in it to make a presentation in order to send it to some of the funding trusts.

Pedophilia ... pornography ... freaky sex ... nationalism ... cultists ... especially where there is admin involvement. That kind of stuff, e.g. perhaps some of the "deleted from public view" hard core porno images that kid admins still have access to.


I have realised stupidly (it has taken this long but I was not too committed before) that what we need are snapshots going back 6, 12, 18 ... months etc because they are becoming too smart at "disappearing" stuff in order to keep up appearances. And a lot of you guys here are doing too good a job clearing it off. Doing them the favor.

I am sorry. They allow or empower kids to defend cult followers whose leaders have covered up and done nothing about child sex abuse within their midst ... the conversation has to be opened up and discussed more widely.

It is pointless trying to address or converse with the rabble, so I would rather go around the back and discuss with the funders.