“Wazuh Static Analysis” series:
- Part 1: Decoders - decoder XML validation
- Part 2: Rules (you are here) - rule validation and cross-type checking
In Part 1 we built a linter for Wazuh decoder XML files - a tool that validates structure, regex/order consistency, and parent-child decoder chains. But decoders are only half of the event processing pipeline. Decoders extract fields from raw logs, while rules decide what to do with those fields: generate an alert, escalate a threat level, or trigger an automated response. An error in a rule - a missed alert or a false positive - can be more dangerous than a decoder misconfiguration.
The tool has grown. It is now wazuh-linter - a static analysis platform that validates decoders, rules, and the relationships between them. This article covers the architectural evolution of the tool, its 24 validation rules for Wazuh rule XML files, and the cross-type checking mechanism.
Common Rule Configuration Errors
Wazuh rule XML files are located in /var/ossec/etc/rules/ (custom) and /var/ossec/ruleset/rules/ (default). Each file contains <rule> elements inside a root <group> element. Analysis of real-world configurations reveals consistent error patterns.
timeframe without frequency. A rule with timeframe="120" but no frequency attribute defines a time window with no firing threshold. Wazuh silently ignores the timeframe in this case. The exception is rules with <if_matched_sid> or <if_matched_group>, which inherit frequency context from the referenced rule. Note that frequency without timeframe is valid - Wazuh applies a default timeframe.
<!-- Error: timeframe without frequency -->
<rule id="100001" level="10" timeframe="120">
  <description>Missing frequency</description>
</rule>
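A check of this kind can be sketched in a few lines of Python. This is a minimal illustration using the standard library, not the actual wazuh-linter code; the function name `check_timeframe` and the message text are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def check_timeframe(rule: ET.Element) -> list[str]:
    """Return a message if a <rule> sets timeframe without frequency."""
    has_timeframe = "timeframe" in rule.attrib
    has_frequency = "frequency" in rule.attrib
    # Rules chained through if_matched_* are exempt from the check
    has_matched = any(
        rule.find(tag) is not None
        for tag in ("if_matched_sid", "if_matched_group")
    )
    if has_timeframe and not has_frequency and not has_matched:
        return [f"Rule {rule.get('id')}: timeframe without frequency"]
    return []


BAD = (
    '<rule id="100001" level="10" timeframe="120">'
    "<description>Missing frequency</description></rule>"
)
EXEMPT = (
    '<rule id="100003" level="10" timeframe="120">'
    "<if_matched_sid>5710</if_matched_sid></rule>"
)

print(check_timeframe(ET.fromstring(BAD)))
# ['Rule 100001: timeframe without frequency']
print(check_timeframe(ET.fromstring(EXEMPT)))
# []
```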
if_sid referencing a non-existent ID. A rule declares <if_sid>100500</if_sid>, but rule 100500 does not exist in any loaded file. The rule will never fire because its activation condition cannot be satisfied.
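The idea behind this check can be sketched as a lookup against a set of known rule IDs. The helper `missing_if_sid` and the `known_ids` set are hypothetical names for this example; how wazuh-linter builds its own ID registry is not shown here.

```python
import xml.etree.ElementTree as ET


def missing_if_sid(root: ET.Element, known_ids: set[int]) -> list[tuple[str, int]]:
    """Return (rule id, missing parent id) pairs for unresolved <if_sid> values."""
    problems = []
    for rule in root.iter("rule"):
        for node in rule.findall("if_sid"):
            for part in (node.text or "").split(","):
                part = part.strip()
                # Non-numeric values belong to a separate format check
                if part.isdigit() and int(part) not in known_ids:
                    problems.append((rule.get("id"), int(part)))
    return problems


RULES = """
<group name="local,">
  <rule id="100501" level="5">
    <if_sid>5710, 100500</if_sid>
    <description>Child rule</description>
  </rule>
</group>
"""

known = {5710, 100501}  # IDs collected from all scanned files (assumed)
print(missing_if_sid(ET.fromstring(RULES), known))
# [('100501', 100500)]
```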
Duplicate rule IDs. Two rules with the same id without overwrite="yes" produce undefined behavior. Wazuh loads one of them, but which one depends on file processing order.
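A rough sketch of duplicate detection across files, assuming file contents are already loaded into a dictionary. The function name and the in-memory `files` mapping are illustrative assumptions, not the tool's API.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_duplicate_ids(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each duplicated rule id to the files that define it."""
    seen: dict[str, list[str]] = defaultdict(list)
    for name, xml_text in files.items():
        for rule in ET.fromstring(xml_text).iter("rule"):
            # A rule with overwrite="yes" intentionally redefines an ID
            if rule.get("overwrite") == "yes":
                continue
            seen[rule.get("id")].append(name)
    return {rid: names for rid, names in seen.items() if len(names) > 1}


files = {
    "a.xml": '<group name="a,"><rule id="100001" level="3"/></group>',
    "b.xml": (
        '<group name="b,"><rule id="100001" level="5"/>'
        '<rule id="100002" level="5" overwrite="yes"/></group>'
    ),
}
print(find_duplicate_ids(files))
# {'100001': ['a.xml', 'b.xml']}
```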
Invalid MITRE ATT&CK format. MITRE identifiers must follow the format Tnnnn or Tnnnn.nnn (for example, T1078 or T1078.001). Arbitrary strings like brute_force in the <id> element break MITRE framework integration.
osmatch in regex elements. The type="osmatch" attribute is valid for <match> and <prematch>, but not for <regex> - osmatch does not support capturing groups, and <regex> exists specifically for field capture.
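As a small illustration, the check reduces to inspecting the type attribute on `<regex>` children only. The function name and message wording below are assumptions for the sketch.

```python
import xml.etree.ElementTree as ET


def osmatch_on_regex(rule: ET.Element) -> list[str]:
    """Flag <regex> elements that declare type="osmatch"."""
    return [
        f"Rule {rule.get('id')}: <regex> must not use type='osmatch'"
        for node in rule.findall("regex")
        if node.get("type") == "osmatch"
    ]


RULE = (
    '<rule id="100004" level="3">'
    '<match type="osmatch">sshd</match>'        # allowed on <match>
    '<regex type="osmatch">Failed password</regex>'  # flagged
    "</rule>"
)
print(osmatch_on_regex(ET.fromstring(RULE)))
# ["Rule 100004: <regex> must not use type='osmatch'"]
```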
Invalid time/weekday formats. The <time> element accepts ranges in 24-hour or 12-hour am/pm format (for example, 6 pm - 8:30 am), while <weekday> accepts day names or the special values weekdays/weekends. Typos like Mnday or out-of-range values like 25:00 - 26:00 cause the time condition to be silently ignored.
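The following is a simplified sketch of such format validation. It assumes that only full day names are accepted, that hours run 0-23 without a suffix and 1-12 with am/pm, and that a leading `!` negates the range; the exact grammar Wazuh accepts may be broader, and all names here are hypothetical.

```python
import re

# hour, optional :minutes, optional am/pm suffix
_TIME_POINT = re.compile(r"^(\d{1,2})(?::(\d{2}))?\s*(am|pm)?$", re.IGNORECASE)

DAYS = {
    "monday", "tuesday", "wednesday", "thursday",
    "friday", "saturday", "sunday", "weekdays", "weekends",
}


def valid_time_point(text: str) -> bool:
    m = _TIME_POINT.match(text.strip())
    if not m:
        return False
    hour, minute, suffix = int(m.group(1)), int(m.group(2) or 0), m.group(3)
    if minute > 59:
        return False
    # 12-hour clock when a suffix is present, 24-hour clock otherwise
    return 1 <= hour <= 12 if suffix else 0 <= hour <= 23


def valid_time_range(text: str) -> bool:
    text = text.strip()
    if text.startswith("!"):  # negated range
        text = text[1:]
    parts = text.split("-")
    return len(parts) == 2 and all(valid_time_point(p) for p in parts)


def valid_weekday(text: str) -> bool:
    return all(p.strip().lower() in DAYS for p in text.split(","))


print(valid_time_range("6 pm - 8:30 am"))  # True
print(valid_time_range("25:00 - 26:00"))   # False
print(valid_weekday("Mnday"))              # False
```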
Architectural Evolution: From Linter to Platform
The first version of the tool - wazuh-decoder-linter - was a monolith: a single WazuhDecoderLinter class handling XML parsing, sanitization, block extraction, and all validation checks. When it came time to add rule validation, it became clear that copying XML parsing logic was a dead end. Decoders and rules share the same mechanisms: file reading, malformed XML recovery, Wazuh-specific character escaping, and individual block extraction on parse failure.
The solution was extracting shared logic into a base class BaseXmlLinter:
BaseXmlLinter
  - File reading (UTF-8 / latin-1 fallback)
  - XML sanitization (unescaped &, \<, bare <)
  - Two-pass parsing strategy
  - Individual block extraction on failure
  - Line context formatting
  |
  +-- WazuhDecoderLinter
  |     14 decoder checks
  |     Decoder name registry
  |
  +-- WazuhRuleLinter
        24 rule checks
        Rule ID and group registry
Each specialized linter inherits from BaseXmlLinter and implements only domain logic. WazuhDecoderLinter checks regex/order, parent chains, plugin_decoder. WazuhRuleLinter checks frequency/timeframe, if_sid chains, MITRE IDs, time formats.
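To make the split between shared and domain logic concrete, here is a heavily simplified sketch of the pattern. The class names carry a `Sketch` suffix to mark them as illustrations; the real BaseXmlLinter has a richer interface (two-pass parsing, block extraction, line context) that is not reproduced here. Wrapping the content in a synthetic root element is also an assumption of this example.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path


class BaseXmlLinterSketch:
    block_tag = ""  # element name checked by the subclass

    # '&' that does not start an entity such as &amp; or &#38;
    _BARE_AMP = re.compile(r"&(?!(?:[A-Za-z]+|#\d+|#x[0-9A-Fa-f]+);)")

    def read(self, path: Path) -> str:
        raw = path.read_bytes()
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            return raw.decode("latin-1")  # fallback never fails

    def sanitize(self, text: str) -> str:
        return self._BARE_AMP.sub("&amp;", text)

    def lint_text(self, text: str) -> list[str]:
        # Synthetic root so several top-level elements still parse
        root = ET.fromstring(f"<root>{self.sanitize(text)}</root>")
        messages: list[str] = []
        for block in root.iter(self.block_tag):
            messages.extend(self.check(block))
        return messages

    def check(self, block: ET.Element) -> list[str]:
        raise NotImplementedError


class RuleLinterSketch(BaseXmlLinterSketch):
    block_tag = "rule"

    def check(self, block: ET.Element) -> list[str]:
        # Domain logic only: parsing and sanitizing live in the base class
        return [
            f"<rule> missing attribute: {attr}"
            for attr in ("id", "level")
            if attr not in block.attrib
        ]


SAMPLE = '<group name="x,"><rule id="1"><description>a & b</description></rule></group>'
print(RuleLinterSketch().lint_text(SAMPLE))
# ['<rule> missing attribute: level']
```

The sample contains a bare `&`, which would make a strict XML parser fail; the base class repairs it before the subclass ever sees the rule.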
The second architectural decision was LintSession. This is a shared state object that ties decoder and rule linters into a single validation pass. When the decoder linter processes files, it registers all discovered decoder names in the session. When the rule linter encounters <decoded_as>sshd</decoded_as>, it checks the session to verify that the sshd decoder exists. Without LintSession, this cross-type validation would be impossible.
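Conceptually, such a session can be as small as a container of registries. The sketch below is an assumption about its shape: only the `decoder_names` attribute is mentioned in this article, while `rule_ids` and the helper methods are hypothetical additions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LintSessionSketch:
    # Filled by the decoder linter, read by the rule linter
    decoder_names: set[str] = field(default_factory=set)
    # Hypothetical: a shared registry of rule IDs for cross-file checks
    rule_ids: set[int] = field(default_factory=set)

    def register_decoder(self, name: str) -> None:
        self.decoder_names.add(name)

    def has_decoder(self, name: str) -> bool:
        return name in self.decoder_names


session = LintSessionSketch()
session.register_decoder("sshd")        # decoder pass
print(session.has_decoder("sshd"))          # True
print(session.has_decoder("custom-nginx"))  # False
```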
The third change was automatic file type detection. The CLI analyzes XML content to determine whether a file contains decoders (<decoder> elements) or rules (<group> with child <rule> elements). This allows running the unified wazuh-lint command on a directory with mixed file types.
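Content-based detection along these lines might look as follows. This is a sketch under the assumption that the presence of `<decoder>` elements or of a `<group>` with direct `<rule>` children is sufficient to classify a file; the function name and the "unknown" fallback are invented for the example.

```python
import xml.etree.ElementTree as ET


def detect_file_type(xml_text: str) -> str:
    """Classify XML content as 'decoder', 'rule', or 'unknown'."""
    # Synthetic root so files with several top-level elements parse
    root = ET.fromstring(f"<root>{xml_text}</root>")
    if root.find(".//decoder") is not None:
        return "decoder"
    for group in root.iter("group"):
        if group.find("rule") is not None:  # direct <rule> child
            return "rule"
    return "unknown"


DECODER = '<decoder name="sshd"><program_name>sshd</program_name></decoder>'
RULES = '<group name="local,"><rule id="100001" level="3"/></group>'

print(detect_file_type(DECODER))  # decoder
print(detect_file_type(RULES))    # rule
```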
24 Validation Rules for Wazuh Rule XML Files
The rule linter implements 24 checks, grouped by type.
Structural Checks
| Rule | Severity | Description |
|---|---|---|
| Required attributes | ERROR | <rule> must have id and level |
| ID range | ERROR | id must be an integer from 1 to 999999 |
| Level range | ERROR | level must be an integer from 0 to 16 |
| ID uniqueness | ERROR | Duplicate IDs within a file are forbidden |
| Unknown elements | WARNING | Child elements outside the set of 82 valid elements |
| Description | WARNING | Rule should contain <description> |
| Rule attributes | WARNING | Unknown attributes on <rule> |
Logical Checks
| Rule | Severity | Description |
|---|---|---|
| frequency/timeframe | ERROR | Co-dependency (relaxed for if_matched_*) |
| if_sid format | ERROR | Valid comma-separated integers |
| if_level format | ERROR | Integer within 0-16 range |
| Overwrite value | ERROR | Only yes or no |
| Correlation context | WARNING | same_*/different_* require frequency or if_matched_* |
Format Checks
| Rule | Severity | Description |
|---|---|---|
| Regex types | ERROR | type must be osmatch, osregex, or pcre2 |
| Negate attribute | ERROR | Only yes/no; valid only on matching elements |
| OS_Regex syntax | WARNING | Unsupported constructs in osregex patterns |
| Options values | ERROR | Only valid option values accepted |
| Time format | ERROR | Valid time range (24h, 12h am/pm, ! for negation) |
| Weekday format | ERROR | Valid day names or weekdays/weekends |
| MITRE ID format | WARNING | <id> must match Tnnnn or Tnnnn.nnn |
| List attributes | ERROR | <list> requires field=; valid lookup= values |
Cross-File Checks
| Rule | Severity | Description |
|---|---|---|
| if_sid chain | WARNING | <if_sid> must reference existing IDs |
| if_matched_sid chain | WARNING | <if_matched_sid> must reference existing IDs |
| Cross-file duplicate IDs | ERROR | No duplicates without overwrite="yes" |
| decoded_as | INFO | <decoded_as> must reference an existing decoder |
Let us examine several checks in more detail.
Correlation context. Elements such as <same_source_ip> and <different_source_ip> (40 correlation elements in total) only make sense in an aggregation context - when the rule sets the frequency attribute or contains <if_matched_sid> or <if_matched_group>. Without that context, the correlation element is silently ignored:
<!-- Warning: same_source_ip without frequency -->
<rule id="100002" level="8">
  <if_sid>5710</if_sid>
  <same_source_ip />
  <description>Should correlate but cannot</description>
</rule>
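A sketch of the correlation-context check is shown below. It matches correlation elements by the `same_` and `different_` prefixes, which is a simplification of the explicit list of 40 elements; the function name and message format are assumptions.

```python
import xml.etree.ElementTree as ET


def check_correlation(rule: ET.Element) -> list[str]:
    """Warn when same_*/different_* elements lack an aggregation context."""
    correlated = [
        child.tag for child in rule
        if child.tag.startswith(("same_", "different_"))
    ]
    has_context = "frequency" in rule.attrib or any(
        rule.find(tag) is not None
        for tag in ("if_matched_sid", "if_matched_group")
    )
    if correlated and not has_context:
        names = ", ".join(correlated)
        return [f"Rule {rule.get('id')}: {names} without frequency or if_matched_*"]
    return []


RULE = """
<rule id="100002" level="8">
  <if_sid>5710</if_sid>
  <same_source_ip />
  <description>Should correlate but cannot</description>
</rule>
"""
print(check_correlation(ET.fromstring(RULE)))
# ['Rule 100002: same_source_ip without frequency or if_matched_*']
```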
MITRE ATT&CK validation. The linter verifies that identifiers inside <mitre><id> conform to the Tnnnn or Tnnnn.nnn format. This is not full validation against the MITRE registry, but it catches obvious errors like text descriptions instead of identifiers.
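The format check maps naturally onto a regular expression. The sketch below assumes exactly four digits after `T` and exactly three digits for the sub-technique, as the Tnnnn / Tnnnn.nnn notation suggests; the helper name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

MITRE_ID = re.compile(r"T\d{4}(\.\d{3})?")


def invalid_mitre_ids(rule: ET.Element) -> list[str]:
    """Return the <mitre><id> values that do not look like technique IDs."""
    values = [(node.text or "").strip() for node in rule.findall("./mitre/id")]
    return [value for value in values if not MITRE_ID.fullmatch(value)]


RULE = """
<rule id="100010" level="10">
  <mitre>
    <id>T1078</id>
    <id>T1078.001</id>
    <id>brute_force</id>
  </mitre>
</rule>
"""
print(invalid_mitre_ids(ET.fromstring(RULE)))
# ['brute_force']
```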
List attributes. The <list> element for CDB lookups requires a mandatory field attribute and accepts lookup with values match_key, not_match_key, match_key_value, address_match_key, not_address_match_key, address_match_key_value. A missing field is an error, and so is an invalid lookup value.
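The two conditions above can be sketched as follows. The set of lookup values is copied from the paragraph above; the function name and message text are assumptions of this example.

```python
import xml.etree.ElementTree as ET

VALID_LOOKUPS = {
    "match_key", "not_match_key", "match_key_value",
    "address_match_key", "not_address_match_key", "address_match_key_value",
}


def check_list(rule: ET.Element) -> list[str]:
    """Validate the field and lookup attributes of <list> elements."""
    messages = []
    for node in rule.findall("list"):
        if "field" not in node.attrib:
            messages.append("<list> is missing the field attribute")
        lookup = node.get("lookup")
        # lookup is optional, but when present it must be a known value
        if lookup is not None and lookup not in VALID_LOOKUPS:
            messages.append(f"<list> has invalid lookup value: {lookup}")
    return messages


RULE = """
<rule id="100020" level="5">
  <list lookup="contains_key">etc/lists/bad-ips</list>
  <list field="srcip" lookup="address_match_key">etc/lists/bad-ips</list>
</rule>
"""
print(check_list(ET.fromstring(RULE)))
# ['<list> is missing the field attribute',
#  '<list> has invalid lookup value: contains_key']
```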
Cross-Type Validation: decoded_as
The most interesting capability of the new architecture is checking relationships between rules and decoders. The <decoded_as> element in a rule filters events by decoder name. If the specified decoder does not exist, the rule will never fire.
Consider this example. In a rules file:
<rule id="100100" level="5">
  <decoded_as>custom-nginx</decoded_as>
  <description>Custom nginx event detected</description>
</rule>
But no decoder named custom-nginx exists in the decoder files. The rule is formally valid XML, but functionally useless.
LintSession solves this problem. When running through the unified wazuh-lint entry point, a session object is created. The decoder linter populates the name registry (session.decoder_names). The rule linter then receives this registry and checks every <decoded_as> against it.
from wazuh_linter import WazuhDecoderLinter, WazuhRuleLinter, LintSession

session = LintSession()

decoder_linter = WazuhDecoderLinter()
decoder_report = decoder_linter.lint_paths(
    ["decoders/"], session=session
)

rule_linter = WazuhRuleLinter()
rule_report = rule_linter.lint_paths(
    ["rules/"], session=session
)

# session.decoder_names populated by decoder linter
# rule_linter used it to validate decoded_as
for result in rule_report.results:
    print(f"[{result.severity}] {result.file}:{result.line} - {result.message}")
Output when an issue is detected:
[INFO] local_rules.xml:12 - Rule '100100': <decoded_as> references
decoder 'custom-nginx' which was not found in scanned decoder files
The severity is INFO rather than ERROR because the decoder may exist in files not included in the current scan (for example, Wazuh default decoders).
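Stripped of the surrounding machinery, the cross-type check is a set-membership test between two passes. The sketch below uses only the standard library and hypothetical helper names; it mirrors the message format shown above but is not the tool's implementation.

```python
import xml.etree.ElementTree as ET


def collect_decoder_names(decoder_xml: str) -> set[str]:
    """First pass: gather the names of all <decoder> elements."""
    root = ET.fromstring(f"<root>{decoder_xml}</root>")
    return {d.get("name") for d in root.iter("decoder") if d.get("name")}


def check_decoded_as(rules_xml: str, decoder_names: set[str]) -> list[tuple[str, str]]:
    """Second pass: report <decoded_as> values with no matching decoder."""
    root = ET.fromstring(f"<root>{rules_xml}</root>")
    findings = []
    for rule in root.iter("rule"):
        node = rule.find("decoded_as")
        name = (node.text or "").strip() if node is not None else ""
        if name and name not in decoder_names:
            # INFO, not ERROR: the decoder may live outside the scan
            findings.append((
                "INFO",
                f"Rule '{rule.get('id')}': <decoded_as> references decoder "
                f"'{name}' which was not found in scanned decoder files",
            ))
    return findings


DECODERS = '<decoder name="sshd"><program_name>sshd</program_name></decoder>'
RULES = """
<group name="local,">
  <rule id="100100" level="5"><decoded_as>custom-nginx</decoded_as></rule>
  <rule id="100101" level="5"><decoded_as>sshd</decoded_as></rule>
</group>
"""

for severity, message in check_decoded_as(RULES, collect_decoder_names(DECODERS)):
    print(f"[{severity}] {message}")
```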
CLI Usage and CI/CD Integration
The updated tool provides three entry points:
# Auto-detect file type (recommended)
wazuh-lint /var/ossec/etc/
# Force specific type
wazuh-lint --type rule /var/ossec/etc/rules/
wazuh-lint --type decoder /var/ossec/etc/decoders/
# Legacy aliases (identical to wazuh-lint, kept for backwards compatibility)
wazuh-rule-lint /var/ossec/etc/rules/
wazuh-decoder-lint /var/ossec/etc/decoders/
The wazuh-lint command automatically detects the type of each XML file and creates a LintSession for cross-type validation. Use --type to force a specific mode. The wazuh-rule-lint and wazuh-decoder-lint commands are aliases for wazuh-lint - they do not force a type. Options --strict, --format json, and --show-info work across all modes.
Updated GitHub Actions example for validating the entire configuration:
name: Lint Wazuh Configuration
on:
  push:
    paths:
      - 'decoders/**'
      - 'rules/**'
  pull_request:
    paths:
      - 'decoders/**'
      - 'rules/**'
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install wazuh-linter
        run: pip install git+https://github.com/pyToshka/wazuh-linter.git
      - name: Lint decoders and rules
        run: wazuh-lint --strict --format json decoders/ rules/ > lint-results.json
      - name: Upload results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: lint-results
          path: lint-results.json
Pre-commit hook for both file types:
repos:
  - repo: local
    hooks:
      - id: wazuh-lint
        name: Wazuh Lint
        entry: wazuh-lint --strict
        language: python
        files: '\.xml$'
        types: [file]
For a detailed walkthrough of the decoder linter and all 14 decoder checks, refer to Part 1 of this series.
Conclusion and Next Steps
wazuh-linter now covers both sides of the Wazuh event processing pipeline: decoders (14 checks) and rules (24 checks), connected through the LintSession cross-type validation mechanism. The BaseXmlLinter architecture makes adding new analysis types straightforward.
The tool is open source under the BSD 3-Clause license at github.com/pyToshka/wazuh-linter. The next part of the series will explore further extensions to the tool’s capabilities.
Related Reading
- Part 1: Static Analysis Tool for Wazuh Decoder XML Files - decoder XML validation
- Boosting Container Image Security Using Wazuh and Trivy - automated security validation
- RAG for Wazuh Documentation: Part 1 - building retrieval systems over Wazuh knowledge base
- Wazuh LLM: Fine-Tuned Llama 3.1 for Security Analysis - AI model for security event analysis