User:Magill-bot

From Lotro-Wiki.com


User Magill's Bot

It uses the Python bot framework Pywikipediabot, running under OS X 10.8.4 (as of 3 July 2013).
Python Wikipedia Robot Framework [1] -> Pywikipediabot [2] -> Pywikipedia [3] -> Pywikibot [4] -> pwb



See also: RTC's Bot Sandbox on the TestWiki.

MediaWiki Botting

This page is an attempt to document bot usage on lotro-wiki.com.
Pywikipediabot can be run on any system with a Python installation.
(Just for the record, the Python entry above is an "InterWiki" link, which does not work from lotro-wiki.com. Lotro-wiki.com requires a full HTML link: Python)

Note that Pywikipediabot was written from a Unix (Linux) point of view, but it will run on any system with a working Python installation. See the specific download and installation instructions in the MediaWiki manual below.

I am running it on an iMac under OSX 10.8.4, which is a Unix-like system.

References:

Installation information

  • Download the current distribution via SVN or GIT (the source is changing in the summer of 2013), see the manual above for current information.

Before you begin

There are a number of files in the SVN download to which one needs to pay attention. While they can be read from the Finder, you cannot perform the necessary actions on them from there.

  1. README
  2. CONTENTS
  3. docs/README
Note that the Finder displays the directory (folder) in case-insensitive alphabetical order (upper and lower case are intermixed), while "ls" in Terminal lists it in case-sensitive order, i.e. the capitalized names appear at the top.
  • The top-level README file is a standard inclusion in such software distributions; in this case, however, its contents are simply cosmetic and provide no useful information.
  • The CONTENTS file contains important information describing the individual files in the distribution.

The first step in installing pywikipedia is to run the python code: generate_user_files.py. This cannot be done from the Finder.

Note that if you have Xcode installed, double clicking on this script from the Finder window will launch Xcode.
  • In a Terminal window:
$ cd pywikipedia
$ python generate_user_files.py

Running the script produces the dialogue shown after these notes; the notes explain the marked responses.

Note 1: For your first attempt, generate both files.
Note 2: The list of wikis consists of those advertised to Mediawiki.org; for a private wiki, select "27 - test".
Note 3: The default language is English.
Note 4: For "Username", enter the username associated with your bot; typically <yourWikiUserid-bot>.
Note 5: Choose the "Small" (S) version of the config file for your first attempt. The Extended version contains many options beyond the scope of this article.
1: Create user_config.py file
2: Create user_fixes.py file
3: The two files
What do you do? 3 <-- Note 1: generate both files 
1: anarchopedia
2: battlestarwiki
3: botwiki
4: celtic
5: commons
6: fon
7: gentoo
8: i18n
9: incubator
10: krefeldwiki
11: lockwiki
12: loveto
13: lyricwiki
14: mediawiki
15: memoryalpha
16: meta
17: mozilla
18: oldwikivoyage
19: omegawiki
20: openttd
21: osm
22: piratenwiki
23: southernapproach
24: species
25: strategy
26: supertux
27: test
28: twcareer
29: ubuntutw
30: uncyclopedia
31: vikidia
32: wekey
33: wesolve
34: wikia
35: wikibond
36: wikibooks
37: wikidata
38: wikimediachapter
39: wikinews
40: wikipedia
41: wikiquote
42: wikisource
43: wikitech
44: wikitravel
45: wikitravel_shared
46: wikiversity
47: wikivoyage
48: wiktionary
49: wowwiki
Select family of sites we are working on (default: wikipedia): 27 <-- Note 2:  use Test to start
The language code of the site we're working on (default: 'en'):  <-- Note 3: (hit return to accept the default)
Username (en test): <wiki-bot username> <-- Note 4: your bot's username -- typically <yourWikiUserid-bot>
Which variant of user_config.py:
[S]mall or [E]xtended (with further information)? S <-- Note 5: Choose the Small version
'user-config.py' written.
'user-fixes.py' written.

Now you can begin configuring your bot.

Configuration files

  • user-config.py

You will have a very simple user-config.py file. Edit it to point at lotro-wiki and to use your bot's name.

This example includes entries for both the main lotro-wiki and the testwiki.

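# -*- coding: utf-8  -*-
mylang = 'en'

# Bot account on the main wiki.
family = 'lotro-wiki'
usernames['lotro-wiki']['en'] = u'magill-bot'

# Bot account on the test wiki. As ordinary Python, this second assignment
# to "family" replaces the one above, so 'testwiki' is the family in effect.
family = 'testwiki'
usernames['testwiki']['en'] = u'magill-bot'

use_api_login = True
console_encoding = 'utf-8'
log = ['*']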
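
Since user-config.py is plain Python, the order of its assignments matters. The following is a minimal sketch, assuming the file is executed top to bottom like any Python module; load_config and the defaultdict-based usernames are illustrative stand-ins, not the framework's own loader. It is written for Python 3 so that it runs on its own.

```python
from collections import defaultdict

# A cut-down copy of the configuration shown above.
CONFIG_TEXT = """
mylang = 'en'

family = 'lotro-wiki'
usernames['lotro-wiki']['en'] = u'magill-bot'

family = 'testwiki'
usernames['testwiki']['en'] = u'magill-bot'
"""


def load_config(text):
    """Execute config text in a fresh namespace and return that namespace.

    Hypothetical helper for illustration only.
    """
    namespace = {"usernames": defaultdict(dict)}
    exec(text, namespace)
    return namespace


config = load_config(CONFIG_TEXT)
print(config["family"])             # the last assignment wins
print(sorted(config["usernames"]))  # both username entries are kept
```

Both username entries survive, but only one family is active at a time; to work on the other wiki, change the last "family =" line or override it on the command line with -family:.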


Generating a Family File

Family is the project name.[1]

  • We selected "27 test" as the Family of wikis when we ran the setup script generate_user_files.py.

However, if the bot’s main wiki is "mywiki", edit this line in user-config.py:

family = 'mywiki'
  • To run a bot on your local wiki (mywiki) you will need to create a "family file" named "families/xxx_family.py", where xxx is the name of your local wiki, for example "families/mywiki_family.py". This is the same name you enter in the family variable, replacing "test". See: families/README-family.txt
Usage: generate_family_file.py <url> <short name>
Example: generate_family_file.py http://www.mywiki.bogus/wiki/Main_Page mywiki
This will create the file families/mywiki_family.py
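
The naming rule above can be sketched in a few lines. family_file_path is a hypothetical helper, not part of the distribution; it only shows how the short name maps to the file name.

```python
def family_file_path(short_name, base_dir="families"):
    """Return the family file path for a wiki short name.

    Hypothetical helper: mirrors the families/<name>_family.py convention.
    """
    return "{}/{}_family.py".format(base_dir, short_name)


print(family_file_path("mywiki"))
print(family_file_path("lotro-wiki"))
```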

Example: lotro-wiki.com

> python generate_family_file.py http://lotro-wiki.com/ lotro-wiki
Generating family file from http://lotro-wiki.com/

==================================
api url: http://lotro-wiki.com/api.php
MediaWiki version: 1.20.6
==================================

Determining other languages... fr

There are 2 languages available.
Do you want to generate interwiki links? This might take a long time. ([y]es/[N]o/[e]dit)N
Loading wikis... 
  * en...  in cache
Retrieving namespaces...  en 
Writing families/lotro-wiki_family.py... 

This creates the basic file: pywikipedia/families/lotro-wiki_family.py

  • If you did so, you can now rename user-config-hold.py to user-config.py and make certain that the "family=" parameter is correct.
  • Consulting: README-family.txt again, you can edit the family file to include any necessary additions or corrections.
  1. Meta uses 'meta' for both language code and wiki family, Commons uses 'commons' for both, and Testwiki uses 'test' for both, the multilingual wikisource uses '-' for the language. You can override this on the command line by using -family:wikibooks.
  • families/lotro-wiki_family.py
# -*- coding: utf-8 -*-
"""
This family file was auto-generated by $Id: generate_family_file.py 11542 2013-05-17 19:58:40Z drtrigon $
Configuration parameters:
  url = http://lotro-wiki.com/
  name = lotro-wiki

Please do not commit this to the SVN repository!
"""

import family

class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)
        self.name = 'lotro-wiki'
        self.langs = {
            'en': u'lotro-wiki.com',
        }

        self.namespaces[4] = self.namespaces.get(4, {})
        self.namespaces[4][u'en'] = [u'Lotro-Wiki.com']
        self.namespaces[5] = self.namespaces.get(5, {})
        self.namespaces[5][u'en'] = [u'Lotro-Wiki.com talk']
        self.namespaces[6] = self.namespaces.get(6, {})
        self.namespaces[6][u'en'] = [u'Image']
        self.namespaces[7] = self.namespaces.get(7, {})
        self.namespaces[7][u'en'] = [u'Image talk']
        self.namespaces[152] = self.namespaces.get(152, {})
        self.namespaces[152][u'en'] = [u'Property']
        self.namespaces[153] = self.namespaces.get(153, {})
        self.namespaces[153][u'en'] = [u'Property talk']
        self.namespaces[158] = self.namespaces.get(158, {})
        self.namespaces[158][u'en'] = [u'Concept']
        self.namespaces[159] = self.namespaces.get(159, {})
        self.namespaces[159][u'en'] = [u'Concept talk']
        self.namespaces[90] = self.namespaces.get(90, {})
        self.namespaces[90][u'en'] = [u'Thread']
        self.namespaces[91] = self.namespaces.get(91, {})
        self.namespaces[91][u'en'] = [u'Thread talk']
        self.namespaces[92] = self.namespaces.get(92, {})
        self.namespaces[92][u'en'] = [u'Summary']
        self.namespaces[93] = self.namespaces.get(93, {})
        self.namespaces[93][u'en'] = [u'Summary talk']
        self.namespaces[100] = self.namespaces.get(100, {})
        self.namespaces[100][u'en'] = [u'Item']
        self.namespaces[101] = self.namespaces.get(101, {})
        self.namespaces[101][u'en'] = [u'Item Talk']
        self.namespaces[102] = self.namespaces.get(102, {})
        self.namespaces[102][u'en'] = [u'Quest']
        self.namespaces[103] = self.namespaces.get(103, {})
        self.namespaces[103][u'en'] = [u'Quest Talk']
        self.namespaces[104] = self.namespaces.get(104, {})
        self.namespaces[104][u'en'] = [u'Portal']
        self.namespaces[105] = self.namespaces.get(105, {})
        self.namespaces[105][u'en'] = [u'Portal Talk']
        self.namespaces[106] = self.namespaces.get(106, {})
        self.namespaces[106][u'en'] = [u'Region']
        self.namespaces[107] = self.namespaces.get(107, {})
        self.namespaces[107][u'en'] = [u'Region Talk']
        self.namespaces[108] = self.namespaces.get(108, {})
        self.namespaces[108][u'en'] = [u'Social']
        self.namespaces[109] = self.namespaces.get(109, {})
        self.namespaces[109][u'en'] = [u'Social Talk']

    # Insert "index.php" for the correct URL to use at lotro-wiki.com
    def nicepath(self, code):
        return '/index.php/'

    def scriptpath(self, code):
        return {
            'en': u'',
        }[code]

    def version(self, code):
        return {
            'en': u'1.20.6',
        }[code]
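
To see what the nicepath and scriptpath values are for, here is a rough sketch of how a page URL and the API URL could be assembled from them. SketchFamily, page_url and api_url are illustrative names, and the "http://" scheme and the way the pieces are joined are assumptions; the framework's own URL handling may differ. It is written for Python 3 so that it runs without the pywikipedia "family" module.

```python
from urllib.parse import quote


class SketchFamily:
    """Standalone stand-in for the generated Family class (not the real base class)."""

    def __init__(self):
        self.name = "lotro-wiki"
        self.langs = {"en": "lotro-wiki.com"}

    def nicepath(self, code):
        return "/index.php/"

    def scriptpath(self, code):
        return {"en": ""}[code]


def page_url(fam, code, title):
    """Join host, nicepath and title into a readable page URL (assumed layout)."""
    # MediaWiki page titles use underscores in place of spaces.
    return "http://" + fam.langs[code] + fam.nicepath(code) + quote(title.replace(" ", "_"))


def api_url(fam, code):
    """Join host and scriptpath into the api.php URL (assumed layout)."""
    return "http://" + fam.langs[code] + fam.scriptpath(code) + "/api.php"


fam = SketchFamily()
print(page_url(fam, "en", "Main Page"))
print(api_url(fam, "en"))
```

With an empty scriptpath, the API URL matches the "api url" line that generate_family_file.py reported for lotro-wiki.com.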

Usage Explanation

Many thanks to RTC for his work to make the botting easy ... His explanation:

  • The problem with the stock replace.py is that it tries to use a preload feature which fetches batches of pages. The MediaWiki feature that supports this is disabled on our wiki (by lotroadmin) because we had some lazy person downloading the entirety of the wiki and posting it on their own site.
See: a copy of User:magill-bot/scripts/replace_lotro.py. Search for the string "nopreload" for all the changes I made to create this version. Basically, I added a new option which suppresses use of the unsupported feature.
I add the -nopreload option to the command line to suppress using the "Special:Export" feature of MediaWiki.
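
The idea behind the option can be sketched as follows. This is not the code from replace_lotro.py: parse_args, iter_pages, the two fetch functions, the batch size and the fix name "example" are all hypothetical, and the wiki is replaced by an in-memory dictionary so the sketch runs on its own (Python 3).

```python
def parse_args(args):
    """Split a command line into a preload switch and the remaining arguments."""
    preload = True
    remaining = []
    for arg in args:
        if arg == "-nopreload":
            preload = False
        else:
            remaining.append(arg)
    return preload, remaining


def iter_pages(titles, fetch_one, fetch_batch, preload=True, batch_size=60):
    """Yield (title, text) pairs, fetching in batches only when preload is on."""
    if not preload:
        # One request per page; never touches the batch feature.
        for title in titles:
            yield title, fetch_one(title)
        return
    for start in range(0, len(titles), batch_size):
        for title, text in fetch_batch(titles[start:start + batch_size]):
            yield title, text


# Stand-in "wiki" and fetchers that count how often each path is used.
WIKI = {"Esteldín": "text A", "Gath Forthnír": "text B"}
calls = {"one": 0, "batch": 0}


def fetch_one(title):
    calls["one"] += 1
    return WIKI[title]


def fetch_batch(titles):
    calls["batch"] += 1
    return [(t, WIKI[t]) for t in titles]


preload, rest = parse_args(["-fix:example", "-nopreload"])
pages = list(iter_pages(list(WIKI), fetch_one, fetch_batch, preload=preload))
print(preload, calls, rest)
```

With -nopreload on the command line, only the per-page fetcher is called, which is slower but avoids the disabled batch export.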

user-fixes.py

See a copy of User:magill-bot/scripts/user-fixes.py
The last entry, around line 620, has not been run against the wiki.
Typically, I include an example of a test command line and a production command line for each fix. I used the test command line while I was developing the replacement regular expressions, and the production command line is typical of the one I ran to apply the fix to the wiki.
When you run a fix, it gives you a confirmation prompt with several options. The "b" option opens the current unchanged target page in a browser. If that option doesn't work "out of the box" for you, it is really worthwhile to figure out how to make it work.
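
For readers who have not seen a fix before, this is roughly what an entry looks like and what applying it amounts to. The entry name "example-esteldin", its edit summary and its pattern are made up for illustration and are not taken from the linked user-fixes.py; apply_fix is a hypothetical helper, and the "fixes" dictionary is created locally here only so the sketch runs on its own (Python 3).

```python
# -*- coding: utf-8 -*-
import re

fixes = {}  # stand-in; defined locally so the sketch is self-contained

# Hypothetical fix entry: name, summary and pattern are illustrative only.
fixes["example-esteldin"] = {
    "regex": True,
    "msg": {"en": "Bot: correcting spelling of Esteldín"},
    "replacements": [
        (r"\bEsteldin\b", "Esteldín"),
    ],
}


def apply_fix(text, fix):
    """Apply every (old, new) pair of a fix entry to the text."""
    for old, new in fix["replacements"]:
        if fix.get("regex"):
            text = re.sub(old, new, text)
        else:
            text = text.replace(old, new)
    return text


print(apply_fix("Travel to Esteldin.", fixes["example-esteldin"]))
```

Trying a pattern on a sample string like this before running it against a test page is a cheap way to catch a regular expression that matches more than intended.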

Work Flow

My work flow is to run the replace_lotro.py command line for the test page and review the listed changes in the command window. I then press "b" to open the page in the browser, which shows the page as it is before the change. I press "y" to make the change to the wiki page, and refresh the page in the browser to make sure the results are what I expect. If they are not, I can very easily undo or roll back the bot change.
When I start doing production changes, I manually confirm quite a few changes to gain confidence that the change is working properly for many pages before I auto-confirm.
Note that these pages are encoded using UTF-8 without BOM. Make sure any editor you use with them preserves the proper character codes. See around lines 61 and 74 where I have fixes for Gath Forthnír and Esteldín to check that these names are handled correctly. The Python code has the "# -*- coding: utf-8 -*-" comment at the beginning to tell Python to treat the source as UTF-8 encoded text.
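
A quick way to check a file after editing is to look at its raw bytes. The sketch below is one possible check, not part of the bot scripts; has_utf8_bom and check_source are hypothetical helper names, and the sample bytes stand in for the contents of a real file (Python 3).

```python
# -*- coding: utf-8 -*-
import codecs

NAMES = ["Gath Forthnír", "Esteldín"]


def has_utf8_bom(data):
    """Return True if the raw bytes start with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def check_source(data, expected_names):
    """Return (bom_present, missing_names) for raw file bytes.

    Decoding raises UnicodeDecodeError if the bytes are not valid UTF-8.
    """
    text = data.decode("utf-8")
    missing = [name for name in expected_names if name not in text]
    return has_utf8_bom(data), missing


# Sample bytes standing in for a file read with open(path, "rb").read().
good = "# -*- coding: utf-8 -*-\n# Gath Forthnír, Esteldín\n".encode("utf-8")
with_bom = codecs.BOM_UTF8 + good

print(check_source(good, NAMES))
print(check_source(with_bom, NAMES))
```

A result of (False, []) means the file has no BOM and both names survived the round trip through the editor.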