Wikipedia and Information Theory
The Wikipedia Review: A forum for discussion and criticism of Wikipedia
Wikipedia Review Op-Ed Pages


> Wikipedia and Information Theory
anthony
Post #1


The English Wikipedia database, uncompressed: 5.34 terabytes
The English Wikipedia database, compressed: 32 gigabytes
Replies
Milton Roe
Post #2


QUOTE(anthony @ Fri 2nd April 2010, 8:09pm)

The English Wikipedia database, uncompressed: 5.34 terabytes
The English Wikipedia database, compressed: 32 gigabytes

Weird. 167 to 1.

The weirdness is that the best English text compression is roughly 9 to 1 (without any external dictionary for common words like "the").

So what the heck compresses down 167 times? I would assume the database is mostly English text, no? It's too small to include any images. So what's the rest of the crud on WP that takes up so much space but is just... space?

And no, this is not a snarky comment on content. Even crappy content and vandalism should take up the same space as the best and finest prose, provided it's not run-on letter vandalism à la zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz.
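
For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes decimal units (1 TB = 1000 GB), which is what gives 167 to 1; with binary units the figure comes out nearer 171 to 1. The second half uses the standard zlib module as a stand-in compressor (an assumption, nobody here knows what the actual dump was compressed with) to show how far out of line a run of repeated letters is compared with ordinary text. The names compression_ratio, wiki_ratio and run_on_ratio are just made up for the illustration.

```python
import zlib


def compression_ratio(uncompressed_bytes, compressed_bytes):
    """Return how many times smaller the compressed form is."""
    return uncompressed_bytes / compressed_bytes


# Figures from the first post, assuming decimal units.
TB = 10**12
GB = 10**9
wiki_ratio = compression_ratio(5.34 * TB, 32 * GB)
print(f"Database: {wiki_ratio:.1f} to 1")

# Run-on letter vandalism: one byte repeated many times.
# zlib is only a stand-in here, not the compressor used for the dump.
run_on = b"z" * 10_000
run_on_ratio = compression_ratio(len(run_on), len(zlib.compress(run_on)))
print(f"Run of z's: {run_on_ratio:.0f} to 1")
```

The point of the second number is only that highly repetitive data can beat the usual figure for English prose by a wide margin, so a 167 to 1 ratio hints at a lot of repetition somewhere in the database.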
