Sanitizes reference links for Beancount

Ensures that user-provided reference strings for expense,
receivable, and revenue entries are sanitized before being
included as Beancount links. Beancount links may contain only
A-Z, a-z, 0-9, -, _, / and ., so any other character is replaced
with a hyphen, runs of hyphens are collapsed, and leading and
trailing hyphens are stripped. A new utility function,
sanitize_link, handles the sanitization.
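
A minimal sketch of how a sanitized reference could end up as a
Beancount link token (the `^name` form). `sanitize_link` mirrors the
function added in this commit; `reference_to_link` and its `fallback`
parameter are hypothetical names used only for illustration and are
not part of the change.

```python
import re


def sanitize_link(text: str) -> str:
    """Same logic as the sanitize_link function added in this commit."""
    # Replace anything outside the allowed link alphabet with a hyphen.
    sanitized = re.sub(r"[^A-Za-z0-9\-_/.]", "-", text)
    # Collapse runs of hyphens into a single hyphen.
    sanitized = re.sub(r"-+", "-", sanitized)
    # Drop hyphens left at either end.
    return sanitized.strip("-")


def reference_to_link(reference: str, fallback: str = "unknown") -> str:
    """Hypothetical helper: turn a user-supplied reference into a ^link token."""
    cleaned = sanitize_link(reference)
    # Assumption: a reference made only of invalid characters (e.g. "###")
    # sanitizes to an empty string, which would yield a bare "^".
    # Substitute a placeholder in that case.
    return f"^{cleaned or fallback}"


print(reference_to_link("Invoice #123"))    # ^Invoice-123
print(reference_to_link("Test (pending)"))  # ^Test-pending
print(reference_to_link("###"))             # ^unknown
```

Whether an empty result needs a fallback depends on how the callers
build their links; the sketch only shows that the case can occur.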
padreug 2025-11-10 15:04:27 +01:00
parent a6b67b7416
commit 51ae2e8e47
2 changed files with 34 additions and 9 deletions


@@ -14,6 +14,31 @@ Key concepts:
from datetime import date, datetime
from decimal import Decimal
from typing import Any, Dict, List, Optional
import re


def sanitize_link(text: str) -> str:
    """
    Sanitize a string to make it valid for Beancount links.

    Beancount links can only contain: A-Z, a-z, 0-9, -, _, /, .
    All other characters are replaced with hyphens.

    Examples:
        >>> sanitize_link("Test (pending)")
        'Test-pending'
        >>> sanitize_link("Invoice #123")
        'Invoice-123'
        >>> sanitize_link("castle-abc123")
        'castle-abc123'
    """
    # Replace any character that's not alphanumeric, dash, underscore, slash, or period with a hyphen
    sanitized = re.sub(r'[^A-Za-z0-9\-_/.]', '-', text)
    # Remove consecutive hyphens
    sanitized = re.sub(r'-+', '-', sanitized)
    # Remove leading/trailing hyphens
    sanitized = sanitized.strip('-')
    return sanitized


def format_transaction(