Uiser:BZPN/Bot

BZPN bot is a maintenance bot designed to support content quality and language integrity on the Scots Wikipedia (sco.wiki). The bot focuses on tagging, reporting, and light normalization and does not delete pages, revert edits, or take administrative actions.

Its primary goal is to reduce the maintenance burden on human editors and sysops by automating routine, low-risk tasks.

BZPN bot operates only within clearly defined limits and avoids any disruptive or irreversible actions.

Task category | Description | Type
Language tagging | Tags new articles as English-only or English-heavy based on lexical analysis | Static
Maintenance tagging | Adds {{Stub}} and {{Unsoorced}} templates where appropriate | Static
New user welcoming | Welcomes new users after their first content edit | Static
Language reports | Generates language reports | Static
Reporting | Generates periodic maintenance reports (e.g. Wikidata, unsourced) | Dynamic
Lexical normalization | Applies approved header replacements from an on-wiki whitelist | Dynamic
Lexical normalization | Applies approved word replacements from an on-wiki whitelist | Dynamic
User notifications | Sends user warnings for non-Scots content creators | Dynamic

What the bot does NOT do

Warning: BZPN bot does not create language content of any kind. It does not generate Scots, translate text, rewrite sentences, correct grammar, or “Scotsify” articles. The bot replaces a word or header only if that specific replacement has been explicitly authorised on-wiki in User:BZPN/Whitelist.json. If a replacement is not listed and approved there, the bot does not modify the text.

  • Does not revert edits
  • Does not sanction users
  • Does not modify article meaning or factual content
  • Does not apply language changes outside approved scopes

What the bot DOES do

  • Adds maintenance templates to articles that clearly meet established criteria
  • Adds tags to revisions that clearly meet established criteria
  • Tags language issues based on transparent, dictionary-based lexical analysis
  • Welcomes new contributors with an informational message
  • Generates maintenance reports to assist human editors
  • Applies approved lexical and header normalization strictly according to the on-wiki whitelist
  • Operates only in low-risk, reversible maintenance areas
  • Logs all actions for review

Language detection

The bot classifies article language using dictionary-based lexical comparison. The system does not attempt to judge dialects or stylistic variation and is intended only as a maintenance aid.

Classification | Threshold | Action
English-only | ≥ 90% English | {{No Scots}}
English-heavy | 75–90% English | {{Fix Scots}}
English-mixed | 50% English | language reports mode only
Scots | ≥ 25% Scots | Eligible for normalization

Thresholds are configurable but conservative by default.
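
As an illustration only, the thresholds in the table could be applied as in the Python sketch below. This is not the bot's actual code. The function name, the constant names, and the word-list arguments are hypothetical. Two points are assumptions: the "50% English" row is read as "at least 50%", and a word found in both word lists counts towards both percentages.

```python
# Thresholds taken from the table above; the constant names are hypothetical.
ENGLISH_ONLY = 0.90
ENGLISH_HEAVY = 0.75
ENGLISH_MIXED = 0.50  # assumption: the table's "50% English" is read as ">= 50%"
SCOTS_MIN = 0.25


def classify(words, english_words, scots_words):
    """Return (classification, action) for a list of lowercase word tokens."""
    if not words:
        return ("Unknown", None)
    total = len(words)
    # Share of tokens found in each dictionary (a shared word counts for both).
    english = sum(1 for w in words if w in english_words) / total
    scots = sum(1 for w in words if w in scots_words) / total
    if english >= ENGLISH_ONLY:
        return ("English-only", "{{No Scots}}")
    if english >= ENGLISH_HEAVY:
        return ("English-heavy", "{{Fix Scots}}")
    if english >= ENGLISH_MIXED:
        return ("English-mixed", "language report only")
    if scots >= SCOTS_MIN:
        return ("Scots", "eligible for normalization")
    return ("Unknown", None)
```

Checking the English bands from the highest threshold down means each article receives at most one tag.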

Whitelist configuration

All lexical replacements and header normalizations are controlled via an on-wiki JSON file, User:BZPN/Whitelist.json.

This ensures full transparency and community oversight.

Whitelist structure

{
  "lexical_replacements": [
    { "find": "old_word", "replace": "new_word" }
  ],
  "header_replacements": {
    "See also": "See forbye"
  },
  "allowed_templates": {
    "Infobox": ["name"]
  },
  "features": {
    "replace_in_captions": false
  }
}
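
A minimal Python sketch of how a whitelist with this structure might be applied to wikitext is given below. It is not the bot's implementation. The function name `apply_whitelist` is hypothetical. Whole-word matching and the `== Title ==` header form are assumptions. The `allowed_templates` and `features` keys are ignored because their exact semantics are not described here.

```python
import json
import re

# Trimmed copy of the structure above, used only as sample input.
WHITELIST_JSON = """
{
  "lexical_replacements": [
    { "find": "old_word", "replace": "new_word" }
  ],
  "header_replacements": {
    "See also": "See forbye"
  }
}
"""


def apply_whitelist(wikitext, whitelist):
    """Apply only the replacements that are listed in the whitelist."""
    # Assumption: whole-word matching, so a longer word that merely
    # contains the "find" string is left untouched.
    for rule in whitelist.get("lexical_replacements", []):
        pattern = r"\b" + re.escape(rule["find"]) + r"\b"
        wikitext = re.sub(pattern, lambda m, r=rule["replace"]: r, wikitext)
    # Assumption: headers are whole lines of the form "== Title ==".
    for old, new in whitelist.get("header_replacements", {}).items():
        pattern = r"^(=+)[ \t]*" + re.escape(old) + r"[ \t]*(=+)[ \t]*$"
        wikitext = re.sub(
            pattern,
            lambda m, n=new: f"{m.group(1)} {n} {m.group(2)}",
            wikitext,
            flags=re.MULTILINE,
        )
    return wikitext


whitelist = json.loads(WHITELIST_JSON)
sample = "Intro old_word here.\n== See also ==\n* link\n"
result = apply_whitelist(sample, whitelist)
```

Text that matches no listed rule passes through unchanged, which mirrors the rule that unlisted replacements are never applied.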

Automated reports

The BZPN bot generates a set of automated maintenance reports at regular intervals. These reports are intended to assist editors in identifying structural and content-related issues that are not handled automatically by the bot.

Reports are updated cyclically and published on subpages of the bot user space.

Report schedule

  • Reports are generated every 12 hours.
  • Each reporting cycle appends new entries up to a predefined limit.
  • Older entries are archived automatically once archive thresholds are reached.

Types of reports

Wikidata report

  • Page: User:BZPN/Wikidata report
  • Purpose: Lists articles that do not have an associated Wikidata item.
  • Inclusion criteria:
    • Main namespace articles only
    • No existing Wikidata sitelink

Unsourced articles report

  • Page: User:BZPN/Unsourced report
  • Purpose: Identifies articles that lack reliable sourcing.
  • Inclusion criteria:
    • Article size greater than or equal to UNSOURCED_MIN_SIZE
    • No <ref> tags detected
    • No recognized citation templates present
  • Exclusions:
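
The inclusion criteria above could be checked roughly as in the following Python sketch, which is illustrative rather than the bot's own code. `UNSOURCED_MIN_SIZE` is named in the criteria, but the value used here is a placeholder. The citation template list, the choice to measure size in bytes, and the function name `is_unsourced` are assumptions. No exclusions are applied because none are listed.

```python
import re

UNSOURCED_MIN_SIZE = 500  # placeholder value; the real setting is not stated
# Assumed list of citation templates; the bot's actual list is not given.
CITATION_TEMPLATES = ("cite web", "cite book", "cite news", "cite journal")


def is_unsourced(wikitext):
    """Return True if the wikitext meets all three inclusion criteria."""
    # Criterion 1: article size (assumed to be measured in UTF-8 bytes).
    if len(wikitext.encode("utf-8")) < UNSOURCED_MIN_SIZE:
        return False
    # Criterion 2: no <ref> tags, including <ref name="..."> and <ref/>.
    if re.search(r"<ref[\s>/]", wikitext, flags=re.IGNORECASE):
        return False
    # Criterion 3: no recognized citation templates.
    for match in re.finditer(r"\{\{\s*([^|{}]+?)\s*[|}]", wikitext):
        name = match.group(1).lower().replace("_", " ")
        if name in CITATION_TEMPLATES:
            return False
    return True
```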

Uncategorized articles report

  • Page: User:BZPN/Uncategorized report
  • Purpose: Tracks articles that are missing category assignments.
  • Inclusion criteria:
    • Main namespace articles
    • No [[Category:…]] links present
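
One way to test the second criterion is a plain wikitext search, sketched below in Python. The function name is hypothetical. A check like this sees only category links written directly in the page text, so it would miss categories added through templates. Whether the bot accounts for those is not stated here.

```python
import re

# Matches "[[Category:" but not a plain link such as "[[:Category:Foo]]".
CATEGORY_LINK = re.compile(r"\[\[\s*Category\s*:", re.IGNORECASE)


def is_uncategorized(wikitext):
    """Return True if no [[Category:...]] link appears in the wikitext."""
    return CATEGORY_LINK.search(wikitext) is None
```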

Language report

  • Page: User:BZPN/Language report
  • Purpose: Tracks articles that are classified as English-heavy or English-only.
  • Inclusion criteria:

General behaviour

  • Reports are additive: new entries are appended and not removed unless archived.
  • Duplicate entries are avoided within the same report page.
  • The bot does not perform content edits while generating reports.
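
The additive, duplicate-free update described above can be sketched in Python as follows. The function and parameter names are hypothetical, and treating the limit as a per-cycle cap on new entries is an assumption.

```python
def update_report(existing, candidates, limit):
    """Return the report entries after one cycle.

    Existing entries are kept in order, and at most `limit` new,
    non-duplicate titles from `candidates` are appended.
    """
    seen = set(existing)
    updated = list(existing)
    added = 0
    for title in candidates:
        if added >= limit:
            break
        if title in seen:
            continue  # skip duplicates within the same report page
        seen.add(title)
        updated.append(title)
        added += 1
    return updated
```

For example, `update_report(["A", "B"], ["B", "C", "C", "D", "E"], 2)` keeps `A` and `B`, skips the duplicates, and appends only `C` and `D`.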

Archiving

Archived report pages follow the naming pattern: User:BZPN/<Report_name>/Archive_<number>
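
A short Python sketch of the naming pattern, together with one possible way of splitting off older entries, is shown below. Both function names are hypothetical. Replacing spaces with underscores and moving the oldest entries first are assumptions, since the archive thresholds and the exact archiving rule are not specified here.

```python
def archive_title(report_name, number):
    """Build an archive page title following the pattern above."""
    # Assumption: spaces in the report name become underscores.
    return f"User:BZPN/{report_name.replace(' ', '_')}/Archive_{number}"


def split_for_archive(entries, threshold):
    """Return (to_archive, kept), moving the oldest entries out first.

    Nothing is archived until the report holds more than `threshold`
    entries; after a split, `kept` holds the newest `threshold` entries.
    """
    if len(entries) <= threshold:
        return [], list(entries)
    overflow = len(entries) - threshold
    return list(entries[:overflow]), list(entries[overflow:])
```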

Templates

The bot uses the following templates, as named elsewhere on this page: {{Stub}}, {{Unsoorced}}, {{No Scots}} and {{Fix Scots}}.

Source code

The bot's source code is available on GitHub. The bot runs on Wikimedia Cloud Services.