Uiser:BZPN/Bot
BZPN bot is a maintenance bot designed to support content quality and language integrity on the Scots Wikipedia (sco.wiki). The bot focuses on tagging, reporting, and light normalization and does not delete pages, revert edits, or take administrative actions.
Its primary goal is to reduce the maintenance burden on human editors and sysops by automating routine, low-risk tasks.
Tasks
BZPN bot operates only within clearly defined limits and avoids any disruptive or irreversible actions.
| Task category | Description | Type |
|---|---|---|
| Language tagging | Tags new articles as English-only or English-heavy based on lexical analysis | |
| Maintenance tagging | Adds {{Stub}} and {{Unsoorced}} templates where appropriate | |
| New user welcoming | Welcomes new users after their first content edit | |
| Language reports | Generates language reports | |
| Reporting | Generates periodic maintenance reports (e.g. Wikidata, unsourced) | |
| Lexical normalization | Applies approved header replacements from an on-wiki whitelist | |
| Lexical normalization | Applies approved word replacements from an on-wiki whitelist | |
| User notifications | Sends user warnings for non-Scots content creators | |
What the bot does NOT do
BZPN bot does not create language content of any kind.
It does not generate Scots, translate text, rewrite sentences, correct grammar, or “Scotsify” articles. The bot replaces words or headers if, and only if, a specific replacement has been explicitly authorised on-wiki in User:BZPN/Whitelist.json.
If a replacement is not listed and approved, the bot will never modify the text.
What the bot DOES do
- Adds maintenance templates to articles that clearly meet established criteria
- Adds tags to revisions that clearly meet established criteria
- Tags language issues based on transparent, dictionary-based lexical analysis
- Welcomes new contributors with an informational message
- Generates maintenance reports to assist human editors
- Applies approved lexical and header normalization strictly according to the on-wiki whitelist
- Operates only in low-risk, reversible maintenance areas
- Logs all actions for review
Language detection
The bot classifies article language using dictionary-based lexical comparison. The system does not attempt to judge dialects or stylistic variation and is intended only as a maintenance aid.
| Classification | Threshold | Action |
|---|---|---|
| English-only | ≥ 90% English | {{No Scots}} |
| English-heavy | 75-90% English | {{Fix Scots}} |
| English-mixed | 50% English | language reports mode only |
| Scots | ≥ 25% Scots | Eligible for normalization |
Thresholds are configurable but conservative by default.
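The threshold table can be read as a simple decision function. The following is a minimal sketch, not the bot's actual code: the function name `classify`, its two ratio inputs, and the fallback label `unclassified` are hypothetical. The English-mixed band is assumed to run from 50% up to 75% English, because the table gives only a single figure for it. How the bot derives the ratios from its dictionaries is not covered here.

```python
def classify(english_ratio: float, scots_ratio: float) -> str:
    """Map lexical ratios (0.0 to 1.0) to a classification label.

    Assumption: bands are checked from most to least English, so the
    first matching band wins.
    """
    if english_ratio >= 0.90:
        return "English-only"    # table action: {{No Scots}}
    if english_ratio >= 0.75:
        return "English-heavy"   # table action: {{Fix Scots}}
    if english_ratio >= 0.50:
        return "English-mixed"   # assumed band: 50% to 75%; reports only
    if scots_ratio >= 0.25:
        return "Scots"           # eligible for normalization
    return "unclassified"        # hypothetical fallback, not in the table
```

For example, under these assumptions `classify(0.80, 0.10)` falls in the English-heavy band, while `classify(0.30, 0.40)` is treated as Scots.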
Whitelist configuration
All lexical replacements and header normalizations are controlled via an on-wiki JSON file:
User:BZPN/Whitelist.json
This ensures full transparency and community oversight.
Whitelist structure
{
  "lexical_replacements": [
    { "find": "old_word", "replace": "new_word" }
  ],
  "header_replacements": {
    "See also": "See forbye"
  },
  "allowed_templates": {
    "Infobox": ["name"]
  },
  "features": {
    "replace_in_captions": false
  }
}
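To illustrate how a whitelist with this structure could drive replacements, here is a minimal sketch. It is not the bot's implementation: `apply_whitelist`, `HEADER_RE`, and the sample text are hypothetical, whole-word case-sensitive matching is an assumption, and the `allowed_templates` and `features` keys are left out because their exact semantics are not described here.

```python
import json
import re

# Sample data mirroring the structure above; the real file lives on-wiki.
WHITELIST_JSON = """
{
  "lexical_replacements": [{"find": "old_word", "replace": "new_word"}],
  "header_replacements": {"See also": "See forbye"}
}
"""

# Matches a wikitext section header line such as "== See also ==".
HEADER_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def apply_whitelist(text: str, whitelist: dict) -> str:
    """Apply only the replacements listed in the whitelist (hypothetical helper)."""
    headers = whitelist.get("header_replacements", {})

    def swap_header(match: re.Match) -> str:
        marks, title = match.group(1), match.group(2)
        if title in headers:
            return f"{marks} {headers[title]} {marks}"
        return match.group(0)  # unlisted headers stay untouched

    text = HEADER_RE.sub(swap_header, text)

    for entry in whitelist.get("lexical_replacements", []):
        # Assumption: whole-word, case-sensitive matching.
        pattern = r"\b" + re.escape(entry["find"]) + r"\b"
        text = re.sub(pattern, lambda m, new=entry["replace"]: new, text)
    return text


whitelist = json.loads(WHITELIST_JSON)
sample = "An old_word here.\n\n== See also ==\n* Link\n\n== Notes =="
result = apply_whitelist(sample, whitelist)
```

In this sketch anything not listed in the whitelist, such as the `Notes` header, passes through unchanged, which matches the rule that unlisted replacements are never applied.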
Automated reports
The BZPN bot generates a set of automated maintenance reports at regular intervals. These reports are intended to assist editors in identifying structural and content-related issues that are not handled automatically by the bot.
Reports are updated cyclically and published on subpages of the bot user space.
Report schedule
- Reports are generated every 12 hours.
- Each reporting cycle appends new entries up to a predefined limit.
- Older entries are archived automatically once archive thresholds are reached.
Types of reports
Wikidata report
- Page: User:BZPN/Wikidata report
- Purpose: Lists articles that do not have an associated Wikidata item.
- Inclusion criteria:
  - Main namespace articles only
  - No existing Wikidata sitelink
Unsourced articles report
- Page: User:BZPN/Unsourced report
- Purpose: Identifies articles that lack reliable sourcing.
- Inclusion criteria:
  - Article size greater than or equal to UNSOURCED_MIN_SIZE
  - No <ref> tags detected
  - No recognized citation templates present
- Exclusions:
  - Articles already tagged with {{Unsoorced}}
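The inclusion and exclusion criteria for the unsourced report can be sketched as a single predicate over an article's wikitext. This is an illustration under stated assumptions, not the bot's code: the value of `UNSOURCED_MIN_SIZE`, the choice to measure size in bytes, the `CITATION_TEMPLATES` list, and the helper names are all hypothetical.

```python
import re

UNSOURCED_MIN_SIZE = 500  # hypothetical value; the real setting is not given here
CITATION_TEMPLATES = ("cite web", "cite book", "cite news", "citation")  # assumed list

# "<ref>", "<ref name=...>" or "<ref/>", but not "<references />".
REF_RE = re.compile(r"<ref[\s>/]", re.IGNORECASE)
# Captures the template name in "{{Name}}" or "{{Name|...}}".
TEMPLATE_NAME_RE = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def template_names(wikitext: str) -> set[str]:
    """Return the lower-cased names of all templates used in the text."""
    return {name.lower() for name in TEMPLATE_NAME_RE.findall(wikitext)}


def is_unsourced_candidate(wikitext: str) -> bool:
    """True if the text meets the inclusion criteria and no exclusion applies."""
    names = template_names(wikitext)
    if "unsoorced" in names:          # exclusion: already tagged
        return False
    if len(wikitext.encode("utf-8")) < UNSOURCED_MIN_SIZE:  # assumed: size in bytes
        return False
    if REF_RE.search(wikitext):       # has at least one <ref> tag
        return False
    return not any(name in names for name in CITATION_TEMPLATES)
```

Checking the exclusion first keeps already-tagged articles out of the report regardless of their size or content.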
Uncategorized articles report
- Page: User:BZPN/Uncategorized report
- Purpose: Tracks articles that are missing category assignments.
- Inclusion criteria:
  - Main namespace articles
  - No [[Category:…]] links present
Language report
- Page: User:BZPN/Language report
- Purpose: Tracks articles that are written in heavy English or in English only.
- Inclusion criteria:
  - Main namespace articles
  - Language criteria
General behaviour
- Reports are additive: new entries are appended and not removed unless archived.
- Duplicate entries are avoided within the same report page.
- The bot does not perform content edits while generating reports.
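The additive, duplicate-free behaviour described above can be sketched as follows. This is a hypothetical illustration: the function name `append_entries`, the use of page titles as entries, and the interpretation of the limit as a per-cycle cap on newly appended entries are assumptions.

```python
def append_entries(existing: list[str], new: list[str], limit: int) -> list[str]:
    """Return `existing` plus up to `limit` new titles, skipping duplicates."""
    seen = set(existing)
    result = list(existing)  # existing entries are never removed here
    added = 0
    for title in new:
        if added >= limit:
            break            # per-cycle cap reached
        if title in seen:
            continue         # already listed on the report page
        seen.add(title)
        result.append(title)
        added += 1
    return result
```

With `existing = ["A", "B"]`, `new = ["B", "C", "C", "D", "E"]` and `limit = 2`, the sketch keeps `A` and `B`, skips the repeated titles, and appends only `C` and `D`.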
Archiving
Archived report pages follow the naming pattern:
User:BZPN/<Report_name>/Archive_<number>
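As a small sketch of the naming pattern, the helpers below build an archive title and pick the next archive number. Both function names are hypothetical, and numbering from 1 is an assumption. Spaces are converted to underscores to match the pattern; MediaWiki treats the two as equivalent in page titles.

```python
def archive_title(report_name: str, number: int) -> str:
    """Build an archive page title following the documented pattern."""
    return f"User:BZPN/{report_name.replace(' ', '_')}/Archive_{number}"


def next_archive_number(existing_titles: list[str], report_name: str) -> int:
    """Return one more than the highest archive number found (assumed to start at 1)."""
    prefix = archive_title(report_name, 0)[:-1]  # drop the trailing "0"
    numbers = [
        int(title[len(prefix):])
        for title in existing_titles
        if title.startswith(prefix) and title[len(prefix):].isdigit()
    ]
    return max(numbers, default=0) + 1
```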
Templates
The bot uses the following templates:
Source code
The bot's source code is available on GitHub. The bot runs on Wikimedia Cloud Services.