RPM Plugin

The RPM plugin provides support for DNF/YUM-based repositories (RHEL, CentOS, Fedora, Rocky Linux, AlmaLinux).

Overview

Status: ✅ Available

The RPM plugin consists of:

  • RpmSyncPlugin - Syncs packages from upstream RPM repositories

  • RpmPublisher - Publishes RPM repositories with metadata

Features

Repository Modes:

  • Mirror Mode - Full metadata mirroring (all repomd.xml types)

  • Filtered Mode - Smart metadata regeneration for filtered repos

  • Hosted Mode - For self-hosted packages (future)

Package Management:

  • ✅ Repomd.xml/primary.xml.gz parsing

  • ✅ RPM package downloading

  • ✅ SHA256 checksum verification

  • ✅ Architecture filtering

  • ✅ Pattern-based package filtering

  • ✅ Source RPM exclusion

  • ✅ Version filtering (only latest)

Metadata Support:

  • ✅ Full metadata mirroring (updateinfo, filelists, other, comps, modules, etc.)

  • ✅ Updateinfo/errata parsing and filtering

  • ✅ Metadata regeneration for filtered repositories

  • ✅ Compression: Gzip (.gz), Zstandard (.zst), XZ (.xz), Bzip2 (.bz2), read and write

  • ✅ Configurable compression for generated metadata

  • ✅ Magic byte detection for auto-format detection

  • ✅ RHEL CDN support (client certificates)

Quality Assurance:

  • ✅ 39 comprehensive tests for compression support

  • ✅ Roundtrip tests (compress + decompress)

  • ✅ Compatibility tests with stdlib and zstandard library

  • ✅ Compression level tests

  • ✅ Large data handling tests

Planned:

  • 🚧 Delta RPMs

  • 🚧 GPG signature verification

Configuration

Basic RPM Repository

repositories:
  - id: epel9-latest
    name: EPEL 9 - Latest
    type: rpm
    feed: https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/
    enabled: true

With Filters

repositories:
  - id: epel9-webservers
    name: EPEL 9 - Web Servers
    type: rpm
    feed: https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/
    enabled: true
    filters:
      patterns:
        include: ["^nginx-.*", "^httpd-.*"]
        exclude: [".*-debug.*"]
      metadata:
        architectures:
          include: ["x86_64", "noarch"]
      rpm:
        exclude_source_rpms: true
      post_processing:
        only_latest_version: true

With Custom Compression

Configure the metadata compression format (useful for openSUSE Tumbleweed, which uses .zst):

repositories:
  - id: opensuse-tumbleweed
    name: openSUSE Tumbleweed
    type: rpm
    feed: https://download.opensuse.org/tumbleweed/repo/oss/
    enabled: true
    metadata:
      compression: auto  # auto | gzip | zstandard | bzip2 | none

Compression Options:

  • auto (default): Use same compression as upstream repository

  • gzip: Always use gzip for regenerated metadata (.gz files)

  • zstandard: Always use Zstandard (.zst files) - Required for openSUSE Tumbleweed

  • bzip2: Always use bzip2 (.bz2 files)

  • none: No compression (not recommended)

Note: This setting only affects metadata generated by Chantal (primary.xml in all modes, updateinfo/filelists/other in filtered mode). In mirror mode, original upstream metadata files are hardlinked unchanged.

RHEL with Client Certificates

repositories:
  - id: rhel9-baseos
    name: RHEL 9 BaseOS
    type: rpm
    feed: https://cdn.redhat.com/content/dist/rhel9/9/x86_64/baseos/os/
    enabled: true
    ssl:
      ca_bundle: /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
      client_cert: /etc/pki/entitlement/1234567890.pem
      client_key: /etc/pki/entitlement/1234567890-key.pem
      verify: true
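
As a rough illustration of how the `ssl:` block above could map onto an HTTP client, the sketch below translates it into keyword arguments for the third-party `requests` library (`cert` and `verify` are real `requests` parameters). The helper names `requests_tls_kwargs` and `fetch_repomd` are hypothetical and are not taken from Chantal's code base.

```python
def requests_tls_kwargs(ssl_cfg: dict) -> dict:
    """Translate an `ssl:` config mapping into kwargs for requests.get()."""
    kwargs = {}
    cert = ssl_cfg.get("client_cert")
    key = ssl_cfg.get("client_key")
    if cert and key:
        kwargs["cert"] = (cert, key)   # separate certificate and key files
    elif cert:
        kwargs["cert"] = cert          # single PEM holding both
    if not ssl_cfg.get("verify", True):
        kwargs["verify"] = False       # disables server verification entirely
    elif ssl_cfg.get("ca_bundle"):
        kwargs["verify"] = ssl_cfg["ca_bundle"]
    else:
        kwargs["verify"] = True        # use the client's default CA store
    return kwargs


def fetch_repomd(feed: str, ssl_cfg: dict) -> bytes:
    """Fetch repodata/repomd.xml using the client certificate settings."""
    import requests  # third-party; imported lazily so the helper above has no dependency

    url = feed.rstrip("/") + "/repodata/repomd.xml"
    response = requests.get(url, timeout=30, **requests_tls_kwargs(ssl_cfg))
    response.raise_for_status()
    return response.content
```

With the configuration above, `requests_tls_kwargs` would yield the entitlement certificate/key pair as `cert` and the CA bundle path as `verify`.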

Repository Modes

Chantal defines three operation modes for RPM repositories. Mirror and filtered modes are available today; hosted mode is planned.

Mirror Mode (Default)

Full metadata mirroring - Downloads and publishes ALL metadata types from upstream repository unchanged.

repositories:
  - id: rhel9-baseos-mirror
    name: RHEL 9 BaseOS (Full Mirror)
    type: rpm
    feed: https://cdn.redhat.com/content/dist/rhel9/9/x86_64/baseos/os/
    enabled: true
    mode: mirror  # Default

Behavior:

  • ✅ All metadata files downloaded: updateinfo, filelists, other, comps, modules, etc.

  • ✅ Metadata published unchanged (no filtering)

  • ✅ Perfect 1:1 mirror of upstream repository

  • ✅ Ideal for: Complete repository mirrors, compliance requirements

Metadata types mirrored:

  • primary - Package metadata (name, version, arch, dependencies)

  • filelists - File listings for each package

  • other - Changelog data

  • updateinfo - Errata/security advisories

  • comps - Package groups and categories

  • modules - Modular metadata (RHEL 8+)

  • And any other metadata types present in repomd.xml

Filtered Mode

Smart filtering with metadata regeneration - Filters packages and regenerates metadata to match.

repositories:
  - id: epel9-webservers
    name: EPEL 9 - Web Servers Only
    type: rpm
    feed: https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/
    enabled: true
    mode: filtered
    filters:
      patterns:
        include: ["^nginx-.*", "^httpd-.*", "^php-.*"]
      post_processing:
        only_latest_version: true

Behavior:

  • ✅ Packages filtered based on patterns/filters

  • ✅ Metadata regenerated to match available packages

  • ✅ Updateinfo filtered to include only relevant errata

  • ✅ Filelists, other metadata filtered accordingly

  • ✅ Ideal for: Custom repositories, filtered mirrors, disk space optimization

Metadata regeneration:

  • primary.xml - Regenerated with filtered package list

  • filelists.xml - Regenerated with filtered packages

  • other.xml - Regenerated with filtered packages

  • updateinfo.xml - Filtered to include only errata for available packages

  • comps.xml - Copied unchanged (groups still valid)

  • modules.yaml - Copied unchanged (if present)

Updateinfo Filtering Example:

Upstream has 1000 security advisories, but you only mirror nginx packages:

  • Mirror mode: All 1000 advisories published (irrelevant for your packages)

  • Filtered mode: Only nginx-related advisories published (smart filtering)

# Filtered updateinfo only includes errata matching your packages
# RHSA-2024:1234 for nginx-1.20.1-1.el9 → INCLUDED
# RHSA-2024:5678 for kernel-5.14.0-362.el9 → EXCLUDED (kernel not mirrored)

Hosted Mode

Self-hosted packages - For future use (uploading custom RPMs).

repositories:
  - id: custom-rpms
    name: Custom Internal RPMs
    type: rpm
    mode: hosted
    enabled: true

Status: Planned feature for uploading custom-built RPMs.

How It Works

Sync Process

1. Fetch repomd.xml

GET https://example.com/repo/repodata/repomd.xml

Parse repomd.xml to discover all metadata types:

  • primary - Package metadata (required)

  • filelists - File listings

  • other - Changelog data

  • updateinfo - Errata/advisories

  • comps - Package groups

  • modules - Modular metadata

  • … and any other types
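
The discovery step can be sketched with the standard library alone. This is a minimal example, assuming the usual `http://linux.duke.edu/metadata/repo` namespace; the function name `parse_repomd` and the shape of the returned dictionary are illustrative choices, not Chantal's actual API.

```python
import xml.etree.ElementTree as ET

REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}


def parse_repomd(xml_bytes: bytes) -> dict:
    """Return {metadata type: {"href", "checksum_type", "checksum"}}."""
    root = ET.fromstring(xml_bytes)
    entries = {}
    for data in root.findall("repo:data", REPO_NS):
        location = data.find("repo:location", REPO_NS)
        checksum = data.find("repo:checksum", REPO_NS)
        entries[data.get("type")] = {
            "href": location.get("href") if location is not None else None,
            "checksum_type": checksum.get("type") if checksum is not None else None,
            "checksum": (checksum.text or "").strip() if checksum is not None else None,
        }
    return entries


SAMPLE_REPOMD = b"""<?xml version="1.0"?>
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <checksum type="sha256">abc123</checksum>
    <location href="repodata/abc123-primary.xml.gz"/>
  </data>
  <data type="updateinfo">
    <checksum type="sha256">jkl012</checksum>
    <location href="repodata/jkl012-updateinfo.xml.gz"/>
  </data>
</repomd>"""

for md_type, info in parse_repomd(SAMPLE_REPOMD).items():
    print(md_type, info["href"])
```

Unknown metadata types need no special handling here: every `<data>` element is returned under its own `type` value.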

2. Download Metadata

Mirror Mode:

  • Downloads ALL metadata types from repomd.xml

  • Stores in pool: /var/lib/chantal/pool/files/

  • Metadata tracked in RepositoryFile model

Filtered Mode:

  • Downloads primary.xml (required for package discovery)

  • Downloads updateinfo.xml (for errata filtering)

  • Other metadata downloaded as needed

3. Parse Packages

Fetch and parse primary.xml.gz:

GET https://example.com/repo/repodata/abc123-primary.xml.gz

Extract package list with metadata:

  • Name, version, release, epoch, architecture

  • Dependencies, provides, requires

  • SHA256 checksum

  • File location
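
A minimal sketch of this extraction, assuming a gzip-compressed primary.xml in the standard `http://linux.duke.edu/metadata/common` namespace. The function name `parse_primary` and the dictionary keys are hypothetical, and dependency data is left out to keep the example short.

```python
import gzip
import xml.etree.ElementTree as ET

COMMON_NS = {"c": "http://linux.duke.edu/metadata/common"}


def parse_primary(gz_bytes: bytes) -> list:
    """Extract name, EVR, arch, checksum and location for each package."""
    root = ET.fromstring(gzip.decompress(gz_bytes))
    packages = []
    for pkg in root.findall("c:package", COMMON_NS):
        version = pkg.find("c:version", COMMON_NS)
        location = pkg.find("c:location", COMMON_NS)
        packages.append({
            "name": pkg.findtext("c:name", namespaces=COMMON_NS),
            "arch": pkg.findtext("c:arch", namespaces=COMMON_NS),
            "epoch": version.get("epoch"),
            "version": version.get("ver"),
            "release": version.get("rel"),
            "sha256": pkg.findtext("c:checksum", namespaces=COMMON_NS),
            "location": location.get("href"),
        })
    return packages


SAMPLE_PRIMARY = gzip.compress(b"""<?xml version="1.0"?>
<metadata xmlns="http://linux.duke.edu/metadata/common" packages="1">
  <package type="rpm">
    <name>nginx</name>
    <arch>x86_64</arch>
    <version epoch="0" ver="1.20.2" rel="1.el9"/>
    <checksum type="sha256" pkgid="YES">f256abc</checksum>
    <location href="Packages/nginx-1.20.2-1.el9.x86_64.rpm"/>
  </package>
</metadata>""")

print(parse_primary(SAMPLE_PRIMARY))
```

For large repositories a streaming parser such as `ET.iterparse` would keep memory use lower; the in-memory version is used here only for brevity.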

4. Apply Filters (Filtered Mode Only)

  • Pattern matching (include/exclude regex)

  • Architecture filtering

  • Size/build time filtering

  • RPM-specific filters (exclude source RPMs, etc.)

  • Post-processing (only latest version)

Mirror Mode: No filtering applied.
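
The pattern, architecture and source-RPM filters could be combined roughly as follows. This sketch assumes that patterns are matched against the package file name (which is what makes a pattern like `^nginx-.*` meaningful); the function name and the package dictionaries are hypothetical. Size, build-time, group and license filters are not covered.

```python
import re


def apply_filters(packages, include=(), exclude=(), arches=(), exclude_source=False):
    """Return the packages that survive all configured filters."""
    include_res = [re.compile(p) for p in include]
    exclude_res = [re.compile(p) for p in exclude]
    kept = []
    for pkg in packages:
        filename, arch = pkg["filename"], pkg["arch"]
        if exclude_source and filename.endswith(".src.rpm"):
            continue  # source RPM
        if arches and arch not in arches:
            continue  # architecture not wanted
        if include_res and not any(r.search(filename) for r in include_res):
            continue  # matches no include pattern
        if any(r.search(filename) for r in exclude_res):
            continue  # matches an exclude pattern
        kept.append(pkg)
    return kept


candidates = [
    {"filename": "nginx-1.20.2-1.el9.x86_64.rpm", "arch": "x86_64"},
    {"filename": "nginx-debuginfo-1.20.2-1.el9.x86_64.rpm", "arch": "x86_64"},
    {"filename": "httpd-2.4.51-1.el9.aarch64.rpm", "arch": "aarch64"},
    {"filename": "nginx-1.20.2-1.el9.src.rpm", "arch": "src"},
    {"filename": "kernel-5.14.0-362.el9.x86_64.rpm", "arch": "x86_64"},
]

selected = apply_filters(
    candidates,
    include=["^nginx-.*", "^httpd-.*"],
    exclude=[".*-debug.*"],
    arches=["x86_64", "noarch"],
    exclude_source=True,
)
print([p["filename"] for p in selected])
```

Exclusions are applied after inclusions, so a package that matches both an include and an exclude pattern is dropped.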

5. Download Packages

For each package:
  - Read the expected SHA256 from primary.xml
  - Check whether the file already exists in the pool
  - If not, download it to the pool
  - Verify the checksum of the downloaded file

Pool structure:

/var/lib/chantal/pool/content/{sha256[0:2]}/{sha256[2:4]}/{sha256}.rpm
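
The pool layout and the verify-before-store step can be sketched as below. The helper names `pool_path` and `store_verified` are hypothetical, and the download itself is replaced by an in-memory `bytes` payload so that the example stays self-contained.

```python
import hashlib
from pathlib import Path

POOL_ROOT = Path("/var/lib/chantal/pool/content")


def pool_path(sha256: str, root: Path = POOL_ROOT) -> Path:
    """Map a checksum to {root}/{sha256[0:2]}/{sha256[2:4]}/{sha256}.rpm."""
    return root / sha256[0:2] / sha256[2:4] / f"{sha256}.rpm"


def store_verified(data: bytes, expected_sha256: str, root: Path = POOL_ROOT) -> Path:
    """Store a package payload in the pool after verifying its checksum."""
    target = pool_path(expected_sha256, root)
    if target.exists():
        return target  # already pooled, nothing to download
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"SHA256 mismatch: expected {expected_sha256}, got {actual}")
    target.parent.mkdir(parents=True, exist_ok=True)
    partial = target.with_suffix(".part")
    partial.write_bytes(data)
    partial.replace(target)  # rename so readers never see a half-written file
    return target
```

Because the path is derived from the checksum, identical packages from different repositories end up at the same pool location and are stored once.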

6. Update Database

  • Add packages to database (ContentItem model)

  • Add metadata files to database (RepositoryFile model)

  • Associate with repository

  • Record sync history

Publish Process

1. Query Packages

packages = repository.content_items
metadata_files = repository.repository_files  # Mirror mode only

2. Create Directory Structure

/var/www/repos/repo-id/latest/
├── Packages/
└── repodata/

3. Publish Packages

Each package from the pool is made available under Packages/.

4. Publish Metadata

Mirror Mode:

  • Copy ALL metadata files from pool to repodata/

  • Hardlinks: pool/files/{sha256}.xml.gz → repodata/{type}.xml.gz

  • Copy repomd.xml unchanged

  • Perfect 1:1 mirror
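
Hardlinking a pooled metadata file into repodata/ for mirror mode might look like the sketch below. `link_from_pool` is a hypothetical helper; hardlinks only work when the pool and the publish directory are on the same filesystem.

```python
import os
from pathlib import Path


def link_from_pool(pool_file: Path, repodata_dir: Path, published_name: str) -> Path:
    """Hardlink a pooled file into the published repodata/ directory."""
    repodata_dir.mkdir(parents=True, exist_ok=True)
    target = repodata_dir / published_name
    if target.exists():
        target.unlink()  # replace a stale link from an earlier publish
    os.link(pool_file, target)  # both names now refer to the same inode
    return target
```

A hardlink costs no extra disk space, and the published file stays byte-identical to the pooled upstream copy.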

Filtered Mode:

  • Generate primary.xml with filtered package list

  • Generate filelists.xml with filtered packages

  • Generate other.xml with filtered packages

  • Filter updateinfo.xml to include only relevant errata

  • Copy comps.xml unchanged (if present)

  • Generate new repomd.xml with checksums

5. Updateinfo Filtering (Filtered Mode)

Parse upstream updateinfo.xml:

<updates>
  <update type="security" id="RHSA-2024:1234">
    <title>nginx security update</title>
    <pkglist>
      <package name="nginx" version="1.20.1" release="1.el9" arch="x86_64"/>
    </pkglist>
  </update>
  <update type="security" id="RHSA-2024:5678">
    <title>kernel security update</title>
    <pkglist>
      <package name="kernel" version="5.14.0" release="362.el9" arch="x86_64"/>
    </pkglist>
  </update>
</updates>

Filter logic:

  • Extract the package NVRAs (name, version, release, architecture) from each advisory

  • Check if ANY package in advisory is in your filtered repository

  • If yes: Include advisory in filtered updateinfo.xml

  • If no: Exclude advisory

Result:

<updates>
  <update type="security" id="RHSA-2024:1234">
    <!-- nginx advisory INCLUDED (nginx is in filtered repo) -->
  </update>
  <!-- kernel advisory EXCLUDED (kernel not in filtered repo) -->
</updates>
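
The filter logic can be sketched as follows, using the simplified updateinfo layout from the example above (advisory ID as an attribute, packages directly under `pkglist`). The function name `filter_updateinfo` is hypothetical; `update.iter("package")` is used so that packages nested one level deeper would also be found.

```python
import xml.etree.ElementTree as ET


def filter_updateinfo(xml_bytes: bytes, available: set) -> bytes:
    """Keep only advisories that reference at least one available package.

    `available` is a set of (name, version, release, arch) tuples.
    """
    root = ET.fromstring(xml_bytes)
    for update in list(root.findall("update")):
        nvras = {
            (p.get("name"), p.get("version"), p.get("release"), p.get("arch"))
            for p in update.iter("package")
        }
        if not nvras & available:
            root.remove(update)  # no package of this advisory is in the repo
    return ET.tostring(root, encoding="utf-8")


SAMPLE_UPDATEINFO = b"""<updates>
  <update type="security" id="RHSA-2024:1234">
    <title>nginx security update</title>
    <pkglist>
      <package name="nginx" version="1.20.1" release="1.el9" arch="x86_64"/>
    </pkglist>
  </update>
  <update type="security" id="RHSA-2024:5678">
    <title>kernel security update</title>
    <pkglist>
      <package name="kernel" version="5.14.0" release="362.el9" arch="x86_64"/>
    </pkglist>
  </update>
</updates>"""

filtered = filter_updateinfo(SAMPLE_UPDATEINFO, {("nginx", "1.20.1", "1.el9", "x86_64")})
print(filtered.decode("utf-8"))
```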

6. Result

Mirror Mode:

/var/www/repos/rhel9-baseos-mirror/latest/
├── Packages/
│   ├── nginx-1.20.2-1.el9.x86_64.rpm
│   ├── kernel-5.14.0-362.el9.x86_64.rpm
│   └── ... (all packages)
└── repodata/
    ├── repomd.xml
    ├── abc123-primary.xml.gz
    ├── def456-filelists.xml.gz
    ├── ghi789-other.xml.gz
    ├── jkl012-updateinfo.xml.gz
    ├── mno345-comps.xml.gz
    └── ... (all metadata types)

Filtered Mode:

/var/www/repos/epel9-webservers/latest/
├── Packages/
│   ├── nginx-1.20.2-1.el9.x86_64.rpm
│   └── httpd-2.4.51-1.el9.x86_64.rpm
└── repodata/
    ├── repomd.xml (regenerated)
    ├── abc123-primary.xml.gz (regenerated)
    ├── def456-filelists.xml.gz (regenerated)
    ├── ghi789-other.xml.gz (regenerated)
    └── jkl012-updateinfo.xml.gz (filtered)

Metadata Files

repomd.xml

Root metadata file:

<?xml version="1.0"?>
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <revision>1641816000</revision>
  <data type="primary">
    <checksum type="sha256">abc123...</checksum>
    <location href="repodata/abc123-primary.xml.gz"/>
    <timestamp>1641816000</timestamp>
    <size>12345</size>
    <open-checksum type="sha256">def456...</open-checksum>
    <open-size>67890</open-size>
  </data>
</repomd>
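
To show where the fields of a `<data>` entry come from, the sketch below builds one for a regenerated metadata file. `checksum` and `size` describe the compressed file, while `open-checksum` and `open-size` describe the uncompressed XML. The function name `build_data_entry` is hypothetical and gzip is assumed as the output format.

```python
import gzip
import hashlib
import time
import xml.etree.ElementTree as ET


def build_data_entry(md_type: str, xml_bytes: bytes, timestamp=None):
    """Return (<data> element, compressed bytes) for one metadata file."""
    compressed = gzip.compress(xml_bytes, mtime=0)  # mtime=0 keeps output reproducible
    checksum = hashlib.sha256(compressed).hexdigest()
    open_checksum = hashlib.sha256(xml_bytes).hexdigest()
    stamp = int(time.time() if timestamp is None else timestamp)

    data = ET.Element("data", type=md_type)
    ET.SubElement(data, "checksum", type="sha256").text = checksum
    ET.SubElement(data, "location", href=f"repodata/{checksum}-{md_type}.xml.gz")
    ET.SubElement(data, "timestamp").text = str(stamp)
    ET.SubElement(data, "size").text = str(len(compressed))
    ET.SubElement(data, "open-checksum", type="sha256").text = open_checksum
    ET.SubElement(data, "open-size").text = str(len(xml_bytes))
    return data, compressed


entry, blob = build_data_entry("primary", b"<metadata/>", timestamp=1641816000)
print(ET.tostring(entry, encoding="unicode"))
```

Prefixing the file name with its checksum, as in the examples in this document, means a changed file always gets a new URL, which plays well with HTTP caches.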

primary.xml.gz

Package list (gzip-compressed):

<?xml version="1.0"?>
<metadata xmlns="http://linux.duke.edu/metadata/common" xmlns:rpm="http://linux.duke.edu/metadata/rpm" packages="1">
  <package type="rpm">
    <name>nginx</name>
    <arch>x86_64</arch>
    <version epoch="0" ver="1.20.2" rel="1.el9"/>
    <checksum type="sha256" pkgid="YES">f256abc...</checksum>
    <summary>High performance web server</summary>
    <description>...</description>
    <packager>...</packager>
    <url>...</url>
    <time file="1641816000" build="1641815000"/>
    <size package="1234567" installed="4567890" archive="1234000"/>
    <location href="Packages/nginx-1.20.2-1.el9.x86_64.rpm"/>
    <format>
      <rpm:license>BSD</rpm:license>
      <rpm:vendor>EPEL</rpm:vendor>
      <rpm:group>System Environment/Daemons</rpm:group>
      <rpm:buildhost>buildvm.example.com</rpm:buildhost>
      <rpm:sourcerpm>nginx-1.20.2-1.el9.src.rpm</rpm:sourcerpm>
      <rpm:provides>...</rpm:provides>
      <rpm:requires>...</rpm:requires>
    </format>
  </package>
</metadata>

Compression Support

Chantal supports all compression formats used by RPM repositories with full read and write capabilities.

Supported Formats

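| Format | Extension | Read | Write | Magic Bytes | Use Case |
| --- | --- | --- | --- | --- | --- |
| Gzip | .gz | ✅ | ✅ | 1f 8b | Most common (RHEL, CentOS, Fedora) |
| Zstandard | .zst | ✅ | ✅ | 28 b5 2f fd | Modern (openSUSE Tumbleweed) |
| XZ | .xz | ✅ | ✅ | fd 37 7a 58 5a 00 | High compression (some repos) |
| Bzip2 | .bz2 | ✅ | ✅ | 42 5a 68 | Legacy (older repos) |
| None | - | ✅ | ✅ | - | Testing/debugging |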

Auto-Detection

Chantal automatically detects compression format using two methods:

  1. Extension-based (primary): Checks file extension (.gz, .zst, .xz, .bz2)

  2. Magic byte detection (fallback): Reads first bytes of file to identify format

This ensures compatibility even with misnamed files or when extension is unknown.
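
The two-step detection can be sketched as below, using the extensions and magic bytes from the table above. The function name `detect_compression` and the returned format names are illustrative, not Chantal's actual API.

```python
from pathlib import Path

EXTENSIONS = {".gz": "gzip", ".zst": "zstandard", ".xz": "xz", ".bz2": "bzip2"}

MAGIC_BYTES = [
    (b"\x1f\x8b", "gzip"),
    (b"\x28\xb5\x2f\xfd", "zstandard"),
    (b"\xfd\x37\x7a\x58\x5a\x00", "xz"),
    (b"\x42\x5a\x68", "bzip2"),
]


def detect_compression(filename: str, head: bytes) -> str:
    """Detect the format from the extension first, then from the leading bytes."""
    extension = Path(filename).suffix.lower()
    if extension in EXTENSIONS:
        return EXTENSIONS[extension]
    for magic, fmt in MAGIC_BYTES:
        if head.startswith(magic):
            return fmt
    return "none"  # no known extension or signature: treat as uncompressed
```

Passing the first six bytes of a file as `head` is enough, since the longest signature (XZ) is six bytes long.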

Configuration

Control compression format for generated metadata:

repositories:
  - id: my-repo
    type: rpm
    feed: https://example.com/repo/
    metadata:
      compression: auto  # Default - detects from upstream

Options:

  • auto: Detect from upstream repository (default, recommended)

  • gzip: Force gzip compression (universal compatibility)

  • zstandard: Force Zstandard (fast, with a good compression ratio; modern)

  • bzip2: Force bzip2 (legacy compatibility)

  • none: No compression (not recommended, large files)

Compression Behavior

Mirror Mode:

  • Original metadata files hardlinked from pool (unchanged)

  • No recompression occurs

  • Preserves upstream compression format

Filtered Mode:

  • Metadata regenerated based on filtered packages

  • Uses configured compression format

  • auto mode mirrors upstream compression

  • Primary.xml always regenerated

  • Updateinfo, filelists, other regenerated if filtered

Performance Characteristics

Compression speeds (relative to gzip=100%):

| Format | Compression Speed | Decompression Speed | Ratio | Best For |
| --- | --- | --- | --- | --- |
| None | - | - | 1.0× | Testing only |
| Gzip (level 6) | 100% | 100% | ~3-5× | Universal compatibility |
| Zstandard (level 3) | 200% | 150% | ~3-6× | Modern repos (recommended) |
| Bzip2 (level 9) | 30% | 80% | ~4-6× | Legacy repos only |
| XZ (level 6) | 15% | 90% | ~5-8× | Slow compression, use sparingly |

Note: Zstandard offers the best balance of speed and compression ratio for modern repositories.

Testing

Compression support is thoroughly tested with 39 unit tests:

  • Roundtrip tests: Compress → decompress → verify

  • Compatibility tests: Interop with stdlib and native libraries

  • Compression levels: Test levels 1-22 for zstandard

  • Large data: Test with 1000+ package metadata

  • Magic byte detection: Verify all formats detected correctly

  • Error handling: Invalid formats raise appropriate errors
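
A roundtrip check of the kind listed above can be sketched with the standard library. This is not taken from the actual test suite: the `CODECS` table and `roundtrip` helper are hypothetical, and Zstandard is left out because it needs the third-party `zstandard` package.

```python
import bz2
import gzip
import lzma

# Format name -> (compress, decompress); stdlib formats only.
CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "xz": (lzma.compress, lzma.decompress),
    "none": (lambda data: data, lambda data: data),
}


def roundtrip(fmt: str, payload: bytes) -> bytes:
    """Compress and decompress `payload`, raising KeyError for unknown formats."""
    compress, decompress = CODECS[fmt]
    return decompress(compress(payload))


# Stand-in for "large" metadata: 1000 repeated package stubs.
payload = b"<package type='rpm'><name>nginx</name></package>\n" * 1000
for fmt in CODECS:
    assert roundtrip(fmt, payload) == payload
```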

Run tests:

PYTHONPATH=src python3.12 -m pytest tests/test_rpm_compression.py tests/test_rpm_parsers_zstd.py -v

Examples

openSUSE Tumbleweed (Zstandard):

repositories:
  - id: opensuse-tumbleweed
    name: openSUSE Tumbleweed OSS
    type: rpm
    feed: https://download.opensuse.org/tumbleweed/repo/oss/
    metadata:
      compression: auto  # Detects .zst from upstream

Force Gzip for consistency:

repositories:
  - id: mixed-sources
    name: Multiple Upstream Sources
    type: rpm
    feed: https://example.com/repo/
    metadata:
      compression: gzip  # Force gzip regardless of upstream

RPM-Specific Filters

Exclude Source RPMs

filters:
  rpm:
    exclude_source_rpms: true

Excludes packages ending with .src.rpm.

Group Filtering

filters:
  rpm:
    groups:
      include:
        - "System Environment/Base"
        - "Applications/Internet"

Filter by RPM group metadata.

License Filtering

filters:
  rpm:
    licenses:
      include: ["GPL", "MIT", "Apache"]

Filter by package license.

Supported Distributions

Red Hat Enterprise Linux (RHEL)

  • RHEL 8, 9

  • Requires subscription and client certificates

Example:

repositories:
  - id: rhel9-baseos
    type: rpm
    feed: https://cdn.redhat.com/content/dist/rhel9/9/x86_64/baseos/os/
    ssl:
      client_cert: /etc/pki/entitlement/xxx.pem
      client_key: /etc/pki/entitlement/xxx-key.pem

CentOS Stream

  • CentOS Stream 8, 9

Example:

repositories:
  - id: centos-stream-9-baseos
    type: rpm
    feed: https://mirror.stream.centos.org/9-stream/BaseOS/x86_64/os/

Fedora

  • Fedora 38, 39, 40+

Example:

repositories:
  - id: fedora-40-everything
    type: rpm
    feed: https://download.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/

EPEL (Extra Packages for Enterprise Linux)

  • EPEL 8, 9

Example:

repositories:
  - id: epel9-everything
    type: rpm
    feed: https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/

Rocky Linux

  • Rocky Linux 8, 9

Example:

repositories:
  - id: rocky9-baseos
    type: rpm
    feed: https://dl.rockylinux.org/pub/rocky/9/BaseOS/x86_64/os/

AlmaLinux

  • AlmaLinux 8, 9

Example:

repositories:
  - id: alma9-baseos
    type: rpm
    feed: https://repo.almalinux.org/almalinux/9/BaseOS/x86_64/os/

Troubleshooting

Metadata Not Found

Error: Failed to fetch repomd.xml

Solutions:

  • Check feed URL is correct

  • Ensure URL ends with / (e.g., .../os/ not .../os)

  • Verify network connectivity

  • Check SSL certificates if using HTTPS
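
One plausible reason the trailing slash matters, assuming the client resolves `repodata/repomd.xml` relative to the feed with standard URL joining (an assumption about the implementation): without the slash, the last path segment is replaced rather than extended. The helper `repomd_url` is hypothetical.

```python
from urllib.parse import urljoin


def repomd_url(feed: str) -> str:
    """Resolve repodata/repomd.xml relative to the feed URL."""
    return urljoin(feed, "repodata/repomd.xml")


print(repomd_url("https://example.com/repo/os/"))
# https://example.com/repo/os/repodata/repomd.xml

print(repomd_url("https://example.com/repo/os"))
# https://example.com/repo/repodata/repomd.xml  (the "os" segment is lost)
```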

Checksum Mismatch

Error: SHA256 checksum mismatch for package

Solutions:

  • Corrupted download - retry sync

  • Upstream changed package without updating metadata

  • Network issue - check connection

No Packages After Filtering

Warning: Filtered out all packages, 0 remaining

Solutions:

  • Check filter patterns are correct

  • Verify architecture filter includes needed architectures

  • Review only_latest_version setting

Future Enhancements

  • Delta RPMs - Download only package deltas

  • GPG verification - Verify package signatures

  • Hosted mode - Upload and publish custom-built RPMs