Copying or merging wiki user accounts

From Biowikifarm Metawiki

Revision as of 14:38, 2 January 2015

There are several use cases where user accounts need to be copied from one wiki to another, e.g.

Merging two wikifarms
Wikifarms store user accounts in the metawiki database, which is shared by all wikis in the wikifarm. If the wikis of a wikifarm (or a selection thereof) need to be moved to another wikifarm, then a merge of the user accounts becomes necessary.
Moving a wikifarm to another server
Strictly speaking, moving a wikifarm to another server can be done by copying the whole database. In practice, however, the new wikifarm will often be a separate branch of the original wiki, so a merge becomes necessary.

Overview

The process consists of a sequence of activities, described in more detail below:

  1. Back up the user database
  2. Resolve conflicts
  3. Merge the user accounts
  4. Anonymize user accounts which should be discarded
  5. Clean-up
  • The source database server shall be referred to as "FROM"
  • The sink database server shall be referred to as "TO"

1. Backup

Make a backup of the tables in TO that will be modified:

  • user table in metawiki database
  • the tables user_groups, user_newtalk, user_openid, user_properties, recentchanges, revision, watchlist in all individual wiki databases.
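
As a rough sketch, the backup could be taken with mysqldump, one dump per database. The helper below only builds the command lines; the use of mysqldump, the database names (metawiki, wiki_a, wiki_b) and the function name are assumptions for illustration, not taken from the wikifarm's actual setup.

```python
# Per-wiki tables named in the list above.
PER_WIKI_TABLES = [
    "user_groups", "user_newtalk", "user_openid", "user_properties",
    "recentchanges", "revision", "watchlist",
]


def backup_commands(metawiki_db, wiki_dbs):
    """Return one mysqldump argument list per database on TO.

    The database names are hypothetical placeholders supplied by the caller.
    """
    # The shared user table lives in the metawiki database.
    commands = [["mysqldump", "--single-transaction", metawiki_db, "user"]]
    # Every individual wiki database contributes the same set of tables.
    for db in wiki_dbs:
        commands.append(["mysqldump", "--single-transaction", db] + PER_WIKI_TABLES)
    return commands


if __name__ == "__main__":
    for cmd in backup_commands("metawiki", ["wiki_a", "wiki_b"]):
        print(" ".join(cmd))
```

mysqldump writes to standard output, so each command would normally be redirected to its own dump file.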

2. Resolve conflicts

User accounts can conflict if FROM and TO have accounts with the same ids but with different data. This can happen if:

  • accounts in FROM have been updated after TO was copied.
  • accounts have been created in TO, but not in FROM.
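
A minimal sketch of how such conflicts might be detected, assuming both user tables have been loaded side by side as from_user and to_user (hypothetical names) and that a differing user_name for the same user_id is what marks a conflict. An in-memory SQLite database stands in for the real database servers here.

```python
import sqlite3


def find_conflicts(conn):
    """Return (user_id, name_in_from, name_in_to) for ids whose names differ."""
    return conn.execute(
        """
        SELECT f.user_id, f.user_name, t.user_name
        FROM from_user AS f
        JOIN to_user AS t ON t.user_id = f.user_id
        WHERE f.user_name <> t.user_name
        ORDER BY f.user_id
        """
    ).fetchall()


def demo():
    # Illustrative data only: id 3 belongs to different people on each side.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE from_user (user_id INTEGER PRIMARY KEY, user_name TEXT);
        CREATE TABLE to_user   (user_id INTEGER PRIMARY KEY, user_name TEXT);
        INSERT INTO from_user VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
        INSERT INTO to_user   VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Dave');
        """
    )
    return find_conflicts(conn)


if __name__ == "__main__":
    print(demo())
```

Comparing further columns (for example the e-mail address) would also catch accounts that were only updated, not replaced.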

Resolving conflicts involves the following actions:

  1. Search for incomplete accounts in FROM (e.g. without an e-mail address). If incomplete accounts are found, stop.
  2. List all conflicting accounts in TO, i.e. those which have no equivalent in FROM because they were created after TO was copied.
  3. Add a buffer of e.g. 2000 accounts in TO, to accommodate future account merges.
  4. Move conflicting accounts after the buffer.
  5. Update user ids in the individual wiki databases, in the tables user_groups, user_newtalk, user_openid, user_properties, recentchanges, revision, watchlist.
  6. Check that conflicts are resolved, else stop.
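
The renumbering in steps 3 to 5 can be sketched as follows. This is one possible reading of the procedure, not the script mentioned at the end of this page: new ids start after the highest id on either side plus the buffer. The function name, the single SQLite connection holding both the user table and the wiki tables, and the column names (rev_user, wl_user, following the MediaWiki schema of the time) are assumptions to be checked against the real databases.

```python
import sqlite3


def move_conflicts(conn, conflicting_ids, max_from_id, id_columns, buffer_size=2000):
    """Renumber conflicting TO accounts to ids beyond the buffer.

    Returns the {old_id: new_id} mapping that was applied.
    """
    max_to_id = conn.execute("SELECT MAX(user_id) FROM user").fetchone()[0]
    next_id = max(max_from_id, max_to_id) + buffer_size + 1
    mapping = {}
    with conn:  # one transaction: commit on success, roll back on error
        for old_id in sorted(conflicting_ids):
            mapping[old_id] = next_id
            conn.execute(
                "UPDATE user SET user_id = ? WHERE user_id = ?", (next_id, old_id)
            )
            for table, column in id_columns:
                # Identifiers come from a fixed list, never from user input.
                conn.execute(
                    f"UPDATE {table} SET {column} = ? WHERE {column} = ?",
                    (next_id, old_id),
                )
            next_id += 1
    return mapping


def demo():
    # Illustrative data: account 3 in TO conflicts and must be moved.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT);
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_user INTEGER);
        CREATE TABLE watchlist (wl_user INTEGER, wl_title TEXT);
        INSERT INTO user VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Dave');
        INSERT INTO revision VALUES (10, 3), (11, 1);
        INSERT INTO watchlist VALUES (3, 'Main_Page');
        """
    )
    mapping = move_conflicts(
        conn, [3], max_from_id=3,
        id_columns=[("revision", "rev_user"), ("watchlist", "wl_user")],
    )
    rows = conn.execute(
        "SELECT rev_id, rev_user FROM revision ORDER BY rev_id"
    ).fetchall()
    return mapping, rows


if __name__ == "__main__":
    print(demo())
```

In a real run the id_columns list would cover all seven tables named in step 5, and the update would be repeated for every individual wiki database.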

3. Merge

Merge the user accounts in TO with accounts in FROM.

  1. Copy the table user_groups from all individual wiki databases that will be merged.
  2. Update user data in the tables user_newtalk, user_openid, user_properties, recentchanges, revision, watchlist from all individual wiki databases that will be merged.
  3. Copy accounts which exist in FROM but not in TO, or which were updated in FROM after TO was copied.
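
Step 3 might look like the sketch below, again using SQLite as a stand-in. It assumes the two user tables are available as from_user and to_user (hypothetical names), and that user_touched, a sortable timestamp string in MediaWiki, is a usable indicator of which copy is newer. Only a few columns are shown; steps 1 and 2 are not covered.

```python
import sqlite3


def merge_accounts(conn):
    """Copy missing accounts into to_user and refresh stale ones.

    Returns (number_inserted, number_updated).
    """
    with conn:  # single transaction for both statements
        # Accounts that exist in FROM but not in TO.
        inserted = conn.execute(
            """
            INSERT INTO to_user (user_id, user_name, user_email, user_touched)
            SELECT f.user_id, f.user_name, f.user_email, f.user_touched
            FROM from_user AS f
            WHERE NOT EXISTS (
                SELECT 1 FROM to_user AS t WHERE t.user_id = f.user_id
            )
            """
        ).rowcount
        # Accounts touched in FROM more recently than the copy held in TO.
        updated = conn.execute(
            """
            UPDATE to_user
            SET user_email = (SELECT f.user_email FROM from_user AS f
                              WHERE f.user_id = to_user.user_id),
                user_touched = (SELECT f.user_touched FROM from_user AS f
                                WHERE f.user_id = to_user.user_id)
            WHERE EXISTS (
                SELECT 1 FROM from_user AS f
                WHERE f.user_id = to_user.user_id
                  AND f.user_touched > to_user.user_touched
            )
            """
        ).rowcount
    return inserted, updated


def demo():
    # Illustrative data: user 1 was updated in FROM, user 4 is new in FROM.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE from_user (user_id INTEGER PRIMARY KEY, user_name TEXT,
                                user_email TEXT, user_touched TEXT);
        CREATE TABLE to_user   (user_id INTEGER PRIMARY KEY, user_name TEXT,
                                user_email TEXT, user_touched TEXT);
        INSERT INTO from_user VALUES
            (1, 'Alice', 'alice@new.example', '20150101000000'),
            (2, 'Bob',   'bob@example.org',   '20140101000000'),
            (4, 'Erin',  'erin@example.org',  '20141201000000');
        INSERT INTO to_user VALUES
            (1, 'Alice', 'alice@old.example', '20140601000000'),
            (2, 'Bob',   'bob@example.org',   '20140101000000');
        """
    )
    counts = merge_accounts(conn)
    rows = conn.execute(
        "SELECT user_id, user_email FROM to_user ORDER BY user_id"
    ).fetchall()
    return counts, rows


if __name__ == "__main__":
    print(demo())
```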

4. Anonymize

In some cases, not all accounts shall be merged. However, to keep the page revisions, discussions etc., user accounts cannot simply be deleted, so they have to be anonymized:

  1. Make a list of users who will be kept, e.g.
    1. Users who belong to a specific user group
    2. Users who are registered with a specific wiki in TO
    3. Users with certain e-mail addresses
    4. Administrators, bureaucrats
    5. Bots
  2. Anonymize all users who are not in the list obtained in the previous step:
    1. Erase e-mail address and related columns
    2. Replace login with some string, e.g. id
    3. Replace password and related columns by empty string or null
    4. Replace real name by id
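
A minimal sketch of the anonymization step, assuming the list of users to keep is already available as a collection of user ids. The column names follow the MediaWiki user table as far as known, but the exact set of "related columns" depends on the MediaWiki version and should be verified; the function name and the temporary keep table are illustrative only.

```python
import sqlite3


def anonymize(conn, keep_ids):
    """Blank the personal data of every account whose id is not in keep_ids.

    Returns the number of anonymized accounts.
    """
    # A temporary table avoids building a long IN (...) list by hand.
    conn.execute("CREATE TEMP TABLE keep (user_id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO keep VALUES (?)", [(i,) for i in keep_ids])
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            """
            UPDATE user
            SET user_name = CAST(user_id AS TEXT),
                user_real_name = CAST(user_id AS TEXT),
                user_email = '',
                user_email_authenticated = NULL,
                user_password = '',
                user_newpassword = '',
                user_token = ''
            WHERE user_id NOT IN (SELECT user_id FROM keep)
            """
        )
    conn.execute("DROP TABLE keep")
    return cur.rowcount


def demo():
    # Illustrative data: only the administrator (id 1) is kept.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user (
            user_id INTEGER PRIMARY KEY, user_name TEXT, user_real_name TEXT,
            user_email TEXT, user_email_authenticated TEXT,
            user_password TEXT, user_newpassword TEXT, user_token TEXT);
        INSERT INTO user VALUES
            (1, 'Admin', 'Ann Admin', 'ann@example.org', '20140101000000', 'hash1', '', 'tok1'),
            (2, 'Bob',   'Bob B',     'bob@example.org',   NULL, 'hash2', '', 'tok2'),
            (3, 'Carol', 'Carol C',   'carol@example.org', NULL, 'hash3', '', 'tok3');
        """
    )
    count = anonymize(conn, [1])
    rows = conn.execute(
        "SELECT user_id, user_name, user_real_name, user_email, user_password "
        "FROM user ORDER BY user_id"
    ).fetchall()
    return count, rows


if __name__ == "__main__":
    print(demo())
```

Using the id as the replacement login keeps the names unique, which the user table requires.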

Script

A script that automates this process has been written and will be available on SVN once the SVN layout has been improved.