X-Git-Url: https://sipb.mit.edu/gitweb.cgi/ikiwiki.git/blobdiff_plain/86484c109d30c0062c5a5056f4adf41c0581b5e9..6b3cf85ea088d62dfec8b19965b1c7538a7785d1:/doc/tips/convert_mediawiki_to_ikiwiki.mdwn

diff --git a/doc/tips/convert_mediawiki_to_ikiwiki.mdwn b/doc/tips/convert_mediawiki_to_ikiwiki.mdwn
index db1a1745c..a90c6144a 100644
--- a/doc/tips/convert_mediawiki_to_ikiwiki.mdwn
+++ b/doc/tips/convert_mediawiki_to_ikiwiki.mdwn
@@ -48,10 +48,7 @@ in HTML, you may need to add further processing to the last line.
 
 Note that by default, `Special:Allpages` will only list pages in the main
 namespace. You need to add a `&namespace=XX` argument to get pages in a
-different namespace. The following numbers correspond to common namespaces:
-
- * 10 - templates (`Template:foo`)
- * 14 - categories (`Category:bar`)
+different namespace. (See below for the default list of namespaces)
 
 Note that the page names obtained this way will not include any namespace
 specific prefix: e.g. `Category:` will be stripped off.
@@ -62,11 +59,26 @@ If you have access to the relational database in which your mediawiki data is
 stored, it is possible to derive a list of page names from this. With mediawiki's
 MySQL backend, the page table is, appropriately enough, called `table`:
 
-    SELECT page_namespace, page_title FROM page;
+    SELECT page_namespace, page_title FROM page;
 
 As with the previous method, you will need to do some filtering based on the
 namespace.
 
+### namespaces
+
+The list of default namespaces in mediawiki is available from . Here are reproduced the ones you are most likely to encounter if you are running a small mediawiki install for your own purposes:
+
+[[!table data="""
+Index | Name | Example
+0 | Main | Foo
+1 | Talk | Talk:Foo
+2 | User | User:Jon
+3 | User talk | User_talk:Jon
+6 | File | File:Barack_Obama_signature.svg
+10 | Template | Template:Prettytable
+14 | Category | Category:Pages_needing_review
+"""]]
+
 ## Step 2: fetching the page data
 
 Once you have a list of page names, you can fetch the data for each page.
@@ -137,11 +149,33 @@ into an ikiwiki tag name using a script such as
 The [[plugins/contrib/mediawiki]] plugin can be used by ikiwiki to interpret
 most of the Mediawiki syntax.
 
-## External links
+The following things are not working:
+
+* templates
+* tables
+* spaces and other funky characters ("?") in page names
+
+## Scripts
 
 [[sabr]] used to explain how to [import MediaWiki content into
 git](http://u32.net/Mediawiki_Conversion/index.html?updated), including full
 edit history, but as of 2009/10/16 that site is not available. A copy of the
-information found on this website is stored at
+information found on this website is stored at .
+
+[[Albert]] wrote a ruby script to convert from mediawiki's database to ikiwiki at
+
+[[Anarcat]] wrote a python script to convert from a mediawiki website to ikiwiki at . The script doesn't need any special access or privileges and communicates with the documented API (so it's a bit slower, but allows you to mirror sites you are not managing, like parts of Wikipedia). The script can also incrementally import new changes from a running site, through RecentChanges inspection. It also supports mithro's new Mediawiki2markdown converter.
+
+> Some assembly is required to get Mediawiki2markdown and its mwlib
+> gitmodule available in the right place for it to use.. perhaps you could
+> automate that? --[[Joey]]
+
+> > You mean a debian package? :) media2iki is actually a submodule, so you need to go through extra steps to install it. mwlib being the most annoying part... I have fixed my script so it looks for media2iki directly in the submodule and improved the install instructions in the README file, but I'm not sure I can do much more short of starting to package the whole thing... --[[anarcat]]
+
+> Also, when I try to run it with -t on www.amateur-radio-wiki.net, it
+> fails on some html in the page named "4_metres". On archiveteam.org,
+> it fails trying to write to a page filename starting with "/", --[[Joey]]
+> > can you show me exactly which commandline arguments you're using? also, I have made improvements over the converter too, also available here: -- [[anarcat]]
+[[scy]] wrote a python script to convert from mediawiki XML dumps to git repositories at .