Good news, everyone!
– Professor Farnsworth, Futurama (recurring line)
It has long been claimed that the earliest surviving Wikipedia article is the article UuU. I wrote as much myself in my book, Så fungerar Wikipedia. And until a few days ago that was true. Not anymore!
Just over 20 hours ago, Tim Starling, one of the Wikimedia Foundation's ridiculously few developers (and one of only three people to have a Wikipedia day of their own), announced the following:
I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001!
This is exciting, because there is lots of article history in here which was assumed to be lost forever.
I’ve long been interested in Wikipedia’s history, and I’ve tried in the past to locate such backups. I asked various people who might have had one. I had given up hope.
The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod’s policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn’t been purged yet.
I didn’t want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001.
I’ve put the two log files up on the web, at:
http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z
The 7-zip archive is only 8.4MB — much more manageable than today’s backups.
rclog contains IP addresses. The Usemod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files.
– Tim Starling (source: the Foundation mailing list)
This comes at just the right time, with only a month to go until Wikipedia turns 10! And some researchers will certainly get more material to work with.
So what does Wikipedia's beginning look like? Until now you could turn to the nostalgia version of Wikipedia, but there is now "new" old material that has not yet been added there. Instead, this is what the first surviving sentences look like:
”This is the new WikiPedia! The idea here is to write a complete encyclopedia from scratch, without peer review process, etc. Some people think that this may be a hopeless endeavor, that the result will necessarily suck. We aren’t so sure. So, let’s get to work!”
And the debate about Larry Sanger's role as a founder of Wikipedia will surely be revived by pages like this one.
Or by his faith in the wiki concept:
None of this is to say that the Nupedia wiki will ”replace” the main encyclopedia; of course it won’t. But it will be an interesting ancillary endeavor!
For anyone interested in Wikipedia's development, a fairly large list is starting to take shape here.
In any case, Tim Starling has earned his day this week.