Mediawiki subversion (SVN) to git migration

From Biowikifarm Metawiki
Revision as of 13:50, 11 January 2013 by Gregor Hagedorn (Talk | contribs)


This is a planning document

How-to is here: https://www.mediawiki.org/wiki/Download_from_Git . Checking extensions: which extensions previously obtained from WMF Subversion (i.e. excluding those we have in local SVN) are not in the following list of git-migrated MediaWiki extensions? https://gerrit.wikimedia.org/r/#/admin/projects/
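The check described above is a set difference, which can be sketched with comm. The two lists below are illustrative samples only (they are not the real inventory); in practice they would be filled from the old SVN checkout and from the Gerrit project list.

```shell
# Minimal sketch: which formerly-SVN extensions are missing from the git list?
# Both lists are made-up samples, not the actual biowikifarm inventory.
LC_ALL=C; export LC_ALL          # keep sort and comm on the same collation
workdir=$(mktemp -d)
printf '%s\n' AddMediaWizard ArrayExtension Arrays Cite SyntaxHighlight \
  | sort > "$workdir/formerly-svn.txt"
printf '%s\n' Arrays Cite ParserFunctions SyntaxHighlight_GeSHi \
  | sort > "$workdir/git-migrated.txt"

# comm -23 prints the lines that occur only in the first (sorted) file.
missing=$(comm -23 "$workdir/formerly-svn.txt" "$workdir/git-migrated.txt")
printf '%s\n' "$missing"
rm -r "$workdir"
```

Each name printed this way then needs a manual decision: replaced, no longer used, or to be kept in local SVN.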

Useful resources for understanding git checkout, head, detached head, --track:

HINT: to see available branches/tags:

 git branch -r | sort --version-sort
 git tag -l | sort --version-sort
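Why version sorting matters here can be seen with a few sample tag names (illustrative, not taken from the actual repository): a plain lexical sort would place 1.9.0 after 1.20.2.

```shell
# Sample tag names; a lexical sort would order them 1.10.0, 1.20.0, 1.20.2, 1.9.0.
tags='1.9.0
1.20.2
1.10.0
1.20.0'

# GNU sort compares the numeric components when --version-sort is given.
sorted=$(printf '%s\n' "$tags" | sort --version-sort)
printf '%s\n' "$sorted"

# The last line is then the newest version.
newest=$(printf '%s\n' "$sorted" | tail -n 1)
echo "newest: $newest"
```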


Preparations: Comparing 1.20 git-extensions with 1.18 and above, Nov. 2012

Local SVN extensions are now in the same folder as the git checkout = OK

  • check what mediawiki svn extensions (e.g. still SMWAskAPI) have to be merged into our biowikifarm-local-svn management, to be updated later on along with all former svn extensions (e.g. MobileKeyV1, XMLTranslation).

Manually copied i.e. now available in the 1.20git extension folder:

  • SMWAskAPI - only installed on offene-naturfuehrer = OK
    (subversion trunk (revision 19) was merged into our local-biowikifarm-svn)

Missing but no longer used = OK

  • AddMediaWizard (no longer found by find/grep)
  • ArrayExtension replaced by extension: Arrays (all seem to be updated)
  • SyntaxHighlight (no longer used, all occurrences refer to SyntaxHighlight_GeSHi)

Missing, now disabled in common/local settings = OK

  • Disabling MassEditRegex - Installed on all wikis, but not used because the currently installed version (r4) is not compatible with MediaWiki 1.18. We have tested this extension on 1.19 where it works. For some wikis this extension would be very useful.
  • Disabling MathStatFunctions - probably never used.
  • Disabling DynamicPageList - only installed on 4 wikis and deprecated for a long time; all pages were checked and the function removed.
  • Disabling FileAttach: Used only as testing in testwikis, now removed there.


THE FOLLOWING probably may have to be disabled when migrating:

Installing mediawiki from git (1.20 and later)

Main source: mediawiki.org: Download from Git. See also overview of wikimedia projects in Git.

IMPORTANT CHANGE in comparison to the earlier subversion setup: we directly place all extensions into the /extensions folder, instead of cloning elsewhere and then symlinking.

Memo: use "git tag -l | sort -V" to display the tags in sorted order.

First installation only: Create git-clone on biowikifarm

A biowikifarm git-clone (normal, not bare) of the WMF core and extension repo is created.

# 1. FIRST create a biowikifarm git-clone (non-bare) for mediawiki-core and mw-extensions (submodules):
  cd /usr/share; sudo mkdir mw-wmf-clone; cd /usr/share/mw-wmf-clone
  sudo git clone https://gerrit.wikimedia.org/r/p/mediawiki/core.git core
  sudo git clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions.git extensions
  cd /usr/share/mw-wmf-clone/extensions; sudo git submodule update --init --recursive
  # Note: after init, one can use "git submodule foreach". Example:
  #   git submodule foreach 'echo $path `git rev-parse HEAD`' 
  #   will show the path and currently checked out commit for each submodule.

Update git and export version to folder

Core and extensions are exported from a git clone as a tar archive (git archive). This works for creating the first version as well as for updating. (HINT: before starting the script, run any sudo command, e.g. sudo ls, so that sudo caches your credentials and you do not have to enter your password over and over again.)
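The export mechanism can be tried out safely in a throwaway repository before touching the real installation. The following is a minimal sketch under stated assumptions: the repository, the file names and the tags 1.0 and 1.1 are made up for the demonstration, and no sudo is involved.

```shell
# Throwaway repository standing in for the WMF clone (illustrative only).
work=$(mktemp -d)
git init -q "$work/clone"
echo 'v1' > "$work/clone/index.php"
git -C "$work/clone" add index.php
git -C "$work/clone" -c user.name=demo -c user.email=demo@example.org \
  commit -q -m 'release 1.0'
git -C "$work/clone" tag 1.0

# Target folder that already holds a locally managed file.
mkdir "$work/target"
echo 'local addition' > "$work/target/LocalExtra.txt"

# First export: the tagged tree is unpacked into the target folder.
git -C "$work/clone" archive 1.0 | tar --extract --overwrite --directory="$work/target"

# Update: a newer tag overwrites the exported files in place.
echo 'v2' > "$work/clone/index.php"
git -C "$work/clone" add index.php
git -C "$work/clone" -c user.name=demo -c user.email=demo@example.org \
  commit -q -m 'release 1.1'
git -C "$work/clone" tag 1.1
git -C "$work/clone" archive 1.1 | tar --extract --overwrite --directory="$work/target"

ls "$work/target"
cat "$work/target/index.php"
# Remove "$work" by hand when done.
```

Files that are not part of the archive (here LocalExtra.txt) are left alone, which is the property the real installation relies on for its additional, non-git files.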

 # 1. Update biowikifarm clone:
 cd /usr/share/mw-wmf-clone/core; sudo git fetch; cd ../extensions; sudo git pull; sudo git submodule update --init --recursive;

 # 2. Now export (git archive) a specific MediaWiki version into other folders:
 wikipath="/usr/share/mediawiki20/"

 if ! [ -d "${wikipath}" ]; then sudo mkdir "${wikipath}"; fi
 cd /usr/share/mw-wmf-clone/core; sudo git archive 1.20.2 | sudo tar --extract --overwrite --directory="${wikipath}"
  # Note: tar provides a "--recursive-unlink" option for extract, which removes the content of the target folder
  #  However, our mediawiki installations require many additional files (svn, widget-right changes, smarty, commons-bot)
  #  which would have to be manually restored. Therefore, do not use --recursive-unlink
 # 3. Same for extensions.
 # In the list of extensions, the desired version is specified as in the following examples:
 #  "SemanticMediaWiki"           → version HEAD + extension SemanticMediaWiki
 #  "tags/1.8, SemanticMediaWiki" → version  1.8 + extension SemanticMediaWiki
 EXTENSIONS="AccessControl
AdminLinks
APC
Arrays
ArticleFeedback
ArticleFeedbackv5
AssertEdit
CategoryTree
CentralAuth
CentralNotice
CharInsert
CheckUser
Cite
cldr
ClickTracking
CodeEditor
Collection
ConditionalShowSection
Configure
ConfirmAccount
ConfirmEdit
ContactPage
Contest
ContributionReporting
DataTransfer
DiscussionThreading
DismissableSiteNotice
DoubleWiki
DumpHTML
DynamicPageList
EmailCapture
ExpandTemplates
ExternalData
FlaggedRevs
FormPreloadPostCache
Gadgets
GlobalBlocking
GlobalUsage
GlobalUserGroups
GraphViz
HeaderTabs
ImageMap
InputBox
Interlanguage
intersection
Interwiki
LabeledSectionTransclusion
LiquidThreads
Lockdown
Loops
Maintenance
tags/2.0.1, Maps
Math
MediaFunctions
MobileFrontend
MoodBar
MwEmbedSupport
MWSearch
Nuke
OAI
OggHandler
OpenID
OpenSearchXml
PagedTiffHandler
ParserFunctions
PdfHandler
Persona
Poem
PrefSwitch
ProofreadPage
Quiz
Renameuser
ReplaceText
RSS
Scribunto
tags/0.3.4, SemanticCompoundQueries
tags/1.2.4, SemanticDrilldown
tags/2.5.1, SemanticForms
SemanticFormsInputs
SemanticImageInput
tags/0.7, SemanticInternalObjects
tags/2.0, SemanticMaps
tags/1.8, SemanticMediaWiki
tags/1.8, SemanticResultFormats
SemanticSignup
SemanticWatchlist
SimpleSurvey
SocialProfile
SpamBlacklist
Sudo
Suhosin
SwiftCloudFiles
SyntaxHighlight_GeSHi
TimedMediaHandler
TitleKey
UploadLocal
UploadWizard
tags/0.5.1, Validator
Variables
Vector
VipsScaler
VisualEditor
WebFonts
Widgets
Wikibase
WikibaseSolr
WikiEditor
WikiLove
WikimediaMessages"

 extPathTo=${wikipath}"extensions/"
 gitExt="/usr/share/mw-wmf-clone/extensions/"
 extUpdateStatus=""
 IFS=$'\n' # set bash's Internal Field Separator to newline only (default is space, tab, newline)
 for ext in $EXTENSIONS; do
  # split line on ", ". If only one value, export HEAD, else first value is branch/tag. Example:
  #   "tags/1.8, SemanticMediaWiki"
  #      ↓             ↓
  #   $cloneVersion  $extension
  #   will export tag 1.8.
  # --------------------------------------
  # NOTE: check the line (similar to ternary operator: if condition ? dothis : dootherthings)
  # $( … ) substitutes/returns the command's output
  # → if ext-check-with-comma == ext then (&&) "HEAD" else (||) version-string
  # ${ext%,*}    Deletes shortest match of substring ",*"   from back  of $ext
  # ${ext##*,* } Deletes longest  match of substring "*,* " from front of $ext
  # --------------------------------------
  cloneVersion=$([ "${ext%,*}" == "$ext" ] && echo "HEAD" || echo "${ext%,*}")
  extension=$([ "${ext%,*}" == "$ext" ] && echo "$ext" || echo "${ext##*,* }")
  if [ ! -d "${gitExt}${extension}" ]; then
    echo "${gitExt}${extension} DOES NOT EXIST, check the extension's name or path!! (extension was skipped)"
  else
    if [ ! -d "${extPathTo}${extension}" ]; then
      sudo mkdir --parents "${extPathTo}${extension}"; extUpdateStatus="+"
    else
      extUpdateStatus="u"
    fi
    printf "git archive export: %1s %-30s %s\n" "$extUpdateStatus" "$extension" "$cloneVersion"
    # EXAMPLE: sudo git archive --prefix=AdminLinks/ HEAD | sudo tar --extract --overwrite --directory=/usr/share/mediawiki20/extensions
    cd "${gitExt}${extension}" && sudo git archive --prefix="${extension}/" $cloneVersion | sudo tar --extract --overwrite --directory="${extPathTo}"
  fi
 done
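The entry-splitting step of the loop above can be isolated and tested on its own. This is a sketch of an equivalent formulation using case instead of the two test-and-echo lines; the function name parse_entry is hypothetical and not part of the original script.

```shell
# parse_entry LINE: prints "<version> <extension>" for one entry of the list.
# Hypothetical helper; the script above does the same thing inline.
parse_entry() {
  entry=$1
  case $entry in
    *,*)
      # "tags/1.8, SemanticMediaWiki": version before the comma, name after it
      version=${entry%%,*}
      name=${entry#*,}
      name=${name#" "}   # drop the single space that follows the comma
      ;;
    *)
      # plain extension name: export HEAD
      version=HEAD
      name=$entry
      ;;
  esac
  printf '%s %s\n' "$version" "$name"
}

parse_entry "SemanticMediaWiki"
parse_entry "tags/1.8, SemanticMediaWiki"
```

Keeping the parsing in a function makes it possible to check new list entries without running any export.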

To update single extensions use (following some examples from recent uses):

cd /usr/share/mw-wmf-clone/extensions; sudo git pull; sudo git submodule update --init --recursive;
cd /usr/share/mw-wmf-clone/extensions/MwEmbedSupport; sudo git archive --prefix=MwEmbedSupport/ HEAD | sudo tar --extract --overwrite --directory=/usr/share/mediawiki20/extensions
cd /usr/share/mw-wmf-clone/extensions/Maps; sudo git archive --prefix=Maps/   tags/2.0.1 | sudo tar --extract --overwrite --directory=/usr/share/mediawiki20/extensions
cd /usr/share/mw-wmf-clone/extensions/ConfirmAccount; sudo git archive --prefix=ConfirmAccount/   HEAD | sudo tar --extract --overwrite --directory=/usr/share/mediawiki20/extensions
cd /usr/share/mw-wmf-clone/extensions/PrefSwitch; sudo git archive --prefix=PrefSwitch/   HEAD | sudo tar --extract --overwrite --directory=/usr/share/mediawiki20/extensions
cd /usr/share/mw-wmf-clone/extensions/Vector; sudo git archive --prefix=Vector/   HEAD | sudo tar --extract --overwrite --directory=/usr/share/mediawiki20/extensions
cd /usr/share/mw-wmf-clone/extensions/SemanticDrilldown; sudo git archive  --prefix=SemanticDrilldown/  HEAD | sudo tar --extract --overwrite --directory=/usr/share/mediawiki20/extensions
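Since the single-extension commands above differ only in the extension name and the version, they can be generated rather than retyped. The helper below is a hypothetical convenience (export_cmd is not part of the original setup); it only prints the command for review and executes nothing.

```shell
# export_cmd EXTENSION [VERSION]: print the export command, defaulting to HEAD.
# Hypothetical helper; paths are the ones used on this page.
export_cmd() {
  ext=$1
  ver=${2:-HEAD}
  src=/usr/share/mw-wmf-clone/extensions
  dest=/usr/share/mediawiki20/extensions
  printf 'cd %s/%s; sudo git archive --prefix=%s/ %s | sudo tar --extract --overwrite --directory=%s\n' \
    "$src" "$ext" "$ext" "$ver" "$dest"
}

export_cmd Vector
export_cmd Maps tags/2.0.1
```

The printed line can be inspected and then pasted into the shell.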

#######################################
# In most cases it is possible to delete the extension first. HOWEVER, SOME EXTENSIONS need special additions or configurations beyond git and will NOT WORK if deleted and checked out again. Therefore, use the dangerous "delete-all" option "--recursive-unlink" only if you are 100% sure that this is not the case for a given extension!
#######################################
extensionDestPath="/usr/share/mediawiki20/extensions"
extensionSrcPath="/usr/share/mw-wmf-clone/extensions"
extension="Vector"
if ! [ -d "${extensionDestPath}/${extension}" ]; then
  sudo mkdir -p "${extensionDestPath}/${extension}"
fi
cd "${extensionSrcPath}/${extension}"; sudo git archive HEAD | sudo tar --extract --overwrite --recursive-unlink --directory="${extensionDestPath}/${extension}"


# 4.a ONLY FIRST TIME CREATION of SVN-Extensions:
# Insert the local-svn-managed extensions into the same folder:
# Note: www-data cannot create the folders, but must be able to write the updates (svn-post-commit hook) -> temp folder
# Remaining problem: when a new svn folder is added to the local subversion repo after this initial setup,
#   that folder must be chown'ed manually to www-data; otherwise svn co post-commit errors will occur.
cd ${wikipath}
sudo mkdir TEMP-LOCAL-svn; ## temp folder
sudo svn checkout file:///var/lib/svn/LocalSVNextensions ./TEMP-LOCAL-svn
sudo chown -R  www-data:www-data ./TEMP-LOCAL-svn
sudo mv ./TEMP-LOCAL-svn/* ./extensions; sudo mv ./TEMP-LOCAL-svn/.svn ./extensions
sudo rm -r ./TEMP-LOCAL-svn
# 4.b LATER SVN-Extension UPDATING (required ALTERNATIVE to 4.a)
# Update from LOCAL-svn with:
cd ${wikipath}extensions; sudo /usr/bin/svn update;
# 5. Finishing touches. Widgets needs a writable folder:
cd ${wikipath}
sudo chown -R  www-data:www-data ./extensions/Widgets/compiled_templates;
sudo chmod g+w -R  ./extensions/Widgets/compiled_templates;
# eg. sudo chown -R  www-data:www-data /usr/share/mediawiki20/extensions/Widgets/compiled_templates; sudo chmod g+w -R  /usr/share/mediawiki20/extensions/Widgets/compiled_templates
# Widgets does not contain all necessary code (-> Warning: require_once(/usr/share/mediawiki20/phase3/extensions/Widgets/smarty/Smarty.class.php): failed to open stream: No such file or directory...)
# In the absence of installation instructions, using previously saved code:
sudo cp -r /usr/share/mw-missing-code/Widgets/smarty ./extensions/Widgets
sudo cp -r /usr/share/mw-missing-code/Widgets/googlecode ./extensions/Widgets

NEW: Testing new wikis

 cd /var/www/testwiki_v20;        sudo -u www-data php ./maintenance/update.php  --quick --conf ./LocalSettings.php 
 cd /var/www/testwiki_v20;        sudo -u www-data php ./maintenance/runJobs.php -v --procs 4 --conf ./LocalSettings.php
 ## cd /var/www/v-practicalplants/w; sudo -u www-data php ./maintenance/update.php  --quick --conf ./LocalSettings.php
 sudo /usr/sbin/apache2ctl -k graceful && sudo service nginx restart && sudo service php5-fpm restart

Test location is http://biowikifarm.net/test_v20/Main_Page (temporarily also http://biowikifarm.net/test_v18/Main_Page = test with the normal staging version, just for an identical web server setup)

Possible errors that could be resolved (notes for migration of production wikis):

  • Database returned error "1170: BLOB/TEXT column 'acr_email' used in key specification without a key length (localhost)" occurred in an earlier test. This may or may not recur when migrating the production wikis!
    • Creating a new db with all shared tables locally and a setting in LocalSettings: $wgSharedTables = array("interwiki"); did not help.
    • Changed the field type, see [1]; afterwards ./maintenance/update.php ran through.
  • Note: as of 2012-12-12, SemanticInternalObjects is working.
  • SemanticForms was still breaking on the main page and Special:Version with "Fatal error: Class 'SMWSQLStore3' not found in /usr/share/mediawiki20/extensions/SemanticMediaWiki/includes/SMW_GlobalFunctions.php on line 286". This was caused by enableSemantics not being properly called for this specific test wiki (the other wikis were ok, so it was a config error).
  • A database query syntax error has occurred. This may indicate a bug in the software. The last attempted database query was:

SELECT smw_id,smw_sortkey FROM `smw_object_ids` WHERE smw_title = 'Ns:widget' AND smw_namespace = '10' AND smw_iw = AND smw_subobject = LIMIT 1 from within function "SMWSql3SmwIds::getDatabaseIdAndSort". Database returned error "1146: Table 'testwiki20.smw_object_ids' doesn't exist (localhost)". FIXED WITH:

cd /var/www/testwiki_v20
sudo -u www-data php ./maintenance/runJobs.php -v --procs 4 --conf ./LocalSettings.php
sudo -u www-data php ./extensions/SemanticMediaWiki/maintenance/SMW_refreshData.php -ftpv --conf ./LocalSettings.php
# the next step can be time-consuming:
sudo -u www-data php ./extensions/SemanticMediaWiki/maintenance/SMW_refreshData.php -v --conf ./LocalSettings.php
sudo -u www-data php ./maintenance/runJobs.php -v --procs 4 --conf ./LocalSettings.php

Testwiki 1.20 is working now.