_ General Discussion _ Wikipedia Editing Statistics

Posted by: MBisanz

Throwing some statistics out at http://en.wikipedia.org/wiki/User:MBisanz/edit_velocity. Shows that generally the time between edit point intervals has stayed constant for the last couple of years.

[Images: two graphs]

Posted by: Somey

Interesting, but not unexpected really. The maintenance phase doesn't necessarily imply that fewer (or more) edits will be made, or that the rate will increase or decrease. I'm afraid what's needed is a metric for "topic development," new topic development in particular - and I doubt that's going to be easy to come up with, particularly in terms of hard numbers.

This isn't necessarily related, but one thing that struck me recently is that the "threat" of lockdown - spread mainly by the open-source/internet-must-be-free folks, in reaction to things like the flagged revisions feature - could be rather cynically used by the WP/WMF bigwigs to artificially create an activity spike, as in, "you'd better get those anonymous attack-edits in now before we make it much more difficult for you in a few weeks/months." Sort of like a reverse-FUD campaign, if there could actually be such a thing. That still probably wouldn't make much of a dent in the edit rates though, given the kind of volume they're dealing with.

Also, the word you want is likely to be something more like "pace," rather than "velocity," the latter referring specifically to the physical movement of objects through space or some other medium.

Posted by: Jon Awbrey

QUOTE(Somey @ Wed 18th March 2009, 3:57am) *

Also, the word you want is likely to be something more like "pace," rather than "velocity," the latter referring specifically to the physical movement of objects through space or some other medium.


I think "rate" is a suitably generic term.

Jon

Posted by: GlassBeadGame

QUOTE(MBisanz @ Wed 18th March 2009, 12:56am) *

Throwing some statistics out at http://en.wikipedia.org/wiki/User:MBisanz/edit_velocity. Shows that generally the time between edit point intervals has stayed constant for the last couple of years.

[Images: two graphs]


I believe the usual custom is to put the axis representing "time" across the bottom running horizontally, and then the variable you are counting on the vertical axis, with the bottom line representing 0. In this case, that would give a nice graphical representation of the activity initially increasing ("going up") and then flattening out. The current arrangement is a bit counter-intuitive.

QUOTE(Jon Awbrey @ Wed 18th March 2009, 3:51am) *

QUOTE(Somey @ Wed 18th March 2009, 3:57am) *

Also, the word you want is likely to be something more like "pace," rather than "velocity," the latter referring specifically to the physical movement of objects through space or some other medium.


I think "rate" is a suitably generic term.

Jon


Yes.

Posted by: Milton Roe

QUOTE(Somey @ Wed 18th March 2009, 12:57am) *

Interesting, but not unexpected really. The maintenance phase doesn't necessarily imply that fewer (or more) edits will be made, or that the rate will increase or decrease. I'm afraid what's needed is a metric for "topic development," new topic development in particular - and I doubt that's going to be easy to come up with, particularly in terms of hard numbers.

This isn't necessarily related, but one thing that struck me recently is that the "threat" of lockdown - spread mainly by the open-source/internet-must-be-free folks, in reaction to things like the flagged revisions feature - could be rather cynically used by the WP/WMF bigwigs to artificially create an activity spike, as in, "you'd better get those anonymous attack-edits in now before we make it much more difficult for you in a few weeks/months." Sort of like a reverse-FUD campaign, if there could actually be such a thing. That still probably wouldn't make much of a dent in the edit rates though, given the kind of volume they're dealing with.

Also, the word you want is likely to be something more like "pace," rather than "velocity," the latter referring specifically to the physical movement of objects through space or some other medium.

"Velocity" is also occasionally used for the time-rate of change (dY/dt) of any quantity Y, not just spatial position. Here the quantity is number of edits.

The bottom graph is one of edits/time, vs. time, or a time graph of what is already a sort of velocity. Thus, the slopes at any given point in that graph, as you see it, correspond with a sort of "acceleration" in editing at that time.

Helpfully,

Milton




Posted by: Random832

QUOTE(Milton Roe @ Wed 18th March 2009, 4:49pm) *
The bottom graph is one of edits/time, vs. time, or a time graph of what is already a sort of velocity. Thus, the slopes at any given point in that graph, as you see it, correspond with a sort of "acceleration" in editing at that time.


That's the problem - it's _not_ edits/time - it's time/edits. A downward slope is an acceleration, an upward slope is a deceleration (yeah, yeah, negative acceleration, never mind that editing is not a vector nor is it relative), and you can't flip it upside down because it's not linear the other way.
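
To make the inversion concrete, here is a minimal sketch using made-up numbers (the edit rates are hypothetical and are not read off the graphs). It suggests that when edits/time rises in even steps, time/edits falls, and falls in uneven steps, which is why the plot can't simply be flipped.

```python
# Hypothetical edit rates (edits per second) at successive sample points.
# These values are illustrative only; they are not taken from the graphs.
edits_per_second = [1.0, 2.0, 3.0, 4.0]

# The quantity actually plotted is the reciprocal: seconds per edit.
seconds_per_edit = [1.0 / rate for rate in edits_per_second]


def steps(values):
    """Differences between consecutive samples (a crude slope)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


rate_steps = steps(edits_per_second)      # even steps upward: steady speed-up
interval_steps = steps(seconds_per_edit)  # downward steps that keep shrinking

print("edits/time steps:", rate_steps)
print("time/edits steps:", interval_steps)
```

The same steady speed-up shows up as a straight rising line in one plot and a flattening falling curve in the other.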

Posted by: Milton Roe

QUOTE(Random832 @ Wed 18th March 2009, 9:51am) *

QUOTE(Milton Roe @ Wed 18th March 2009, 4:49pm) *
The bottom graph is one of edits/time, vs. time, or a time graph of what is already a sort of velocity. Thus, the slopes at any given point in that graph, as you see it, correspond with a sort of "acceleration" in editing at that time.


That's the problem - it's _not_ edits/time - it's time/edits. A downward slope is an acceleration, an upward slope is a deceleration (yeah, yeah, negative acceleration, never mind that editing is not a vector nor is it relative), and you can't flip it upside down because it's not linear the other way.

Thank you, I totally missed that. Yes, it has to be edits/time against time to mean anything, since only the second time derivative of the edit count is the acceleration. It is indeed screwy and nonlinear if you graph time/edits against time.

(Oh, and yes, in these cases we should have used the word "speed", but the people who use "velocity" as a stand-in for rates-of-change-with-respect-to-time, of scalar stuff, don't really understand the difference between vectors and scalars, and usually use "velocity." I suppose because it sounds cooler.)

Posted by: One

I like the velocity analogy, since it allows the comparable analogy to acceleration.

If you take this graph and divide it by the number-of-articles graph, it would indicate fewer edits per article on average as the base expands.
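
A rough sketch of that division, with invented figures standing in for the real edit and article counts (the `snapshots` numbers and the `edits_per_article` helper are assumptions for illustration, not Wikipedia statistics):

```python
# Hypothetical yearly snapshots: (edits per month, number of articles).
# The figures are invented for illustration, not real Wikipedia statistics.
snapshots = {
    2006: (3_000_000, 1_000_000),
    2007: (4_000_000, 2_000_000),
    2008: (4_000_000, 2_500_000),
}


def edits_per_article(edits, articles):
    """Average edits per article for one snapshot."""
    return edits / articles


per_article = {
    year: edits_per_article(edits, articles)
    for year, (edits, articles) in snapshots.items()
}

for year in sorted(per_article):
    print(year, round(per_article[year], 2))
```

With numbers shaped like these, a flat or slowly rising edit rate over a growing article base gives a falling per-article figure.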

Posted by: gomi

QUOTE(One @ Wed 18th March 2009, 10:59am) *
I like the velocity analogy, since it allows the comparable analogy to acceleration.

If you take this graph and divide it by the number-of-articles graph, it would indicate fewer edits per article on average as the base expands.

I think this is an interesting point. In fact, what might be interesting is a min/max/mean/average of edits/time/article (mean or average edits per unit time for the mean or average articles).

I suspect that Wikipedia expends many of its edits accomplishing a whole lot of nothin' on a small number of controversial articles, more of its edits accomplishing nothing on sundry fancruft articles, and comparatively little time (low mean edits/time) for the vast majority of articles. One's suggested analysis, or something similar, would point this out.
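
Something like the following could produce that summary. It is only a sketch: the article names and edit counts are made up, and I've used the median alongside the mean on the assumption that a second measure of the "typical" article is what's wanted.

```python
from statistics import mean, median

# Hypothetical edits per month for a handful of articles.
# Names and counts are invented to illustrate a skewed distribution.
edits_per_month = {
    "controversial_article": 400,
    "fancruft_article": 60,
    "stable_article_a": 2,
    "stable_article_b": 1,
    "stable_article_c": 1,
    "stub_article": 0,
}


def summarize(rates):
    """Return min, max, mean and median of per-article edit rates."""
    values = sorted(rates)
    return {
        "min": values[0],
        "max": values[-1],
        "mean": mean(values),
        "median": median(values),
    }


summary = summarize(edits_per_month.values())

# Share of all edits that land on the single busiest article.
top_share = summary["max"] / sum(edits_per_month.values())

print(summary)
print("share of edits on busiest article:", round(top_share, 2))
```

If the real distribution looks anything like this, the mean would be dragged far above the median by a few heavily fought-over articles.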

One way to articulate this is that in a Real Encyclopedia™, unlike Wikipedia, all articles will be maintained on some schedule. The article on (e.g.) Abbeyhill railway station may need less attention than Palestine, but it will nonetheless get some attention regularly. In Wikipedia, an article that one might expect to be relatively stable such as (e.g.) Anti-Defamation League gets the snot pounded out of it, while after 9 months (and 500 revisions of mostly vandalism and reversion), Central processing unit (http://en.wikipedia.org/w/index.php?title=Central_processing_unit&diff=278073646&oldid=222994392) is essentially unchanged.

Posted by: One

QUOTE(gomi @ Wed 18th March 2009, 7:09pm) *

I suspect that Wikipedia expends many of its edits accomplishing a whole lot of nothin' on a small number of controversial articles, more of its edits accomplishing nothing on sundry fancruft articles, and comparatively little time (low mean edits/time) for the vast majority of articles. One's suggested analysis, or something similar, would point this out.

Actually, I think the charts that most elegantly illustrate that are below.
http://en.wikipedia.org/wiki/User:Dragons_flight/Log_analysis

This is from 2007, but the logarithmic scale shows how many edits are consumed by articles on the long tail. Meanwhile, most of the stuff hasn't progressed far beyond stub. Hitting "random page" repeatedly confirms that there are a whole lot of sports-event-by-year articles and bot-generated locations. I suspect most of these are unwatched.

Posted by: gomi

[Moderator's note: I changed the topic title to be more descriptive. -- gomi]

Posted by: east.718

QUOTE(One @ Wed 18th March 2009, 4:59pm) *

QUOTE(gomi @ Wed 18th March 2009, 7:09pm) *

I suspect that Wikipedia expends many of its edits accomplishing a whole lot of nothin' on a small number of controversial articles, more of its edits accomplishing nothing on sundry fancruft articles, and comparatively little time (low mean edits/time) for the vast majority of articles. One's suggested analysis, or something similar, would point this out.

Actually, I think the charts that most elegantly illustrate that are below.
http://en.wikipedia.org/wiki/User:Dragons_flight/Log_analysis

This is from 2007, but the logarithmic scale shows how many edits are consumed by articles on the long tail. Meanwhile, most of the stuff hasn't progressed far beyond stub. Hitting "random page" repeatedly confirms that there are a whole lot of sports-event-by-year articles and bot-generated locations. I suspect most of these are unwatched.

I've been playing around with attempting to generate lists of "lonely" BLP articles in the past couple of days, with arbitrary criteria of fewer than ten revisions, fewer than five incoming links, and no edits in the past three months (it's reasonable to guess that these types of articles are the ones where libel festers). You don't even want to know how many results I got.
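
For illustration, the three criteria could be expressed roughly like this. The `Article` record, the `is_lonely` helper, and the sample data are all hypothetical, and "three months" is approximated as 90 days; this is not the actual query used.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Article:
    """Hypothetical per-article record; field names are assumptions."""
    title: str
    revisions: int
    incoming_links: int
    last_edit: date


def is_lonely(article, today, max_revisions=10, max_links=5, idle_days=90):
    """True if the article meets all three 'lonely' criteria."""
    idle = today - article.last_edit
    return (
        article.revisions < max_revisions
        and article.incoming_links < max_links
        and idle > timedelta(days=idle_days)  # ~three months without edits
    )


# Invented sample data, checked against an arbitrary reference date.
today = date(2009, 3, 18)
articles = [
    Article("Example Person A", 4, 1, date(2008, 6, 1)),     # meets all three
    Article("Example Person B", 4, 1, date(2009, 3, 1)),     # edited recently
    Article("Example Person C", 250, 40, date(2008, 1, 1)),  # many revisions and links
]

lonely = [a.title for a in articles if is_lonely(a, today)]
print(lonely)
```

The thresholds are keyword arguments so the arbitrary cut-offs can be loosened or tightened without touching the filter itself.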