feat(editor): try to have ckeditor not crash when handling stranger tags #6867

perfectra1n · 2025-09-02T03:39:29Z

this definitely isn't being merged, but keeping it open for Elian. Here's what the annoying AI said the fix was for below lol:

CKEditor Generic Type Syntax Fix

The Problem

We discovered a critical bug where CKEditor would crash when switching from read-only to edit mode on notes containing programming language generic type syntax. Here's what was happening:

The Crash

When a note contained code like this:

HashMap<String, PromptTemplate> templates = new HashMap<>();
List<String, Value> items = getItems();

Or Rust code like:

fn process(error: Box<dyn Error>) -> Result<()> {
    // ...
}

CKEditor would crash with this error:

InvalidCharacterError: Failed to execute 'createElement' on 'Document': 
The tag name provided ('string,') is not a valid name.

Why It Crashed

CKEditor was trying to parse the generic type syntax as HTML tags! When it saw <String, PromptTemplate>, it thought:

"Oh, this must be an HTML tag called String,"
"Let me create that element..."
💥 Crash! Because String, isn't a valid HTML tag name (it has a comma!)

The same happened with:

Box<dyn → tried to create a box<dyn element
<OpenAIClient> → tried to create an openaiclient element

Why Existing Solutions Didn't Work

Server-side HTML Sanitizer

Trilium already has an HTML sanitizer that uses an allowedHtmlTags setting. You might think, "Why not just use that?" Here's why that wouldn't work:

The sanitizer removes non-allowed tags entirely:

// What the sanitizer does:
sanitizeHtml("Here is <String, Type> code", { allowedTags: ['div', 'p'] })
// Result: "Here is  code"  
// ❌ The generic type syntax is DELETED!

We don't want to delete the code - we want to preserve it as text!

The Key Insight

Sanitizing: Removes unwanted HTML for security (data loss)
Escaping: Converts angle brackets to HTML entities (data preserved)

We need escaping, not sanitizing!

Our Solution

We implemented a smart escaping mechanism that:

Identifies what's actually HTML using the user's allowedHtmlTags setting
Escapes everything else so it displays as text instead of being parsed as HTML

Here's how it works:

private escapeGenericTypeSyntax(content: string): string {
    // Get allowed HTML tags from user settings
    const allowedTags = JSON.parse(options.get("allowedHtmlTags"));
    const htmlTags = new Set(allowedTags.map(tag => tag.toLowerCase()));
    
    // Step 1: Protect real HTML tags with placeholders
    const htmlProtectionMap = new Map<string, string>();
    let protectedContent = content.replace(/<\/?([a-zA-Z][a-zA-Z0-9-]*)\b[^>]*>/gi, (match, tagName) => {
        if (htmlTags.has(tagName.toLowerCase())) {
            const placeholder = `__HTML_PROTECTED_${counter++}__`;
            htmlProtectionMap.set(placeholder, match);
            return placeholder;
        }
        return match;
    });
    
    // Step 2: Escape ALL remaining angle brackets (these are code, not HTML!)
    protectedContent = protectedContent
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
    
    // Step 3: Restore the real HTML tags
    htmlProtectionMap.forEach((original, placeholder) => {
        protectedContent = protectedContent.replace(placeholder, original);
    });
    
    return protectedContent;
}

Example Transformation

Input:

<p>Here's a Java class:</p>
<pre>
HashMap<String, PromptTemplate> templates;
Box<dyn Error> error;
</pre>
<div>More content</div>

After Step 1 (Protect HTML):

__HTML_0__Here's a Java class:__HTML_1__
__HTML_2__
HashMap<String, PromptTemplate> templates;
Box<dyn Error> error;
__HTML_3__
__HTML_4__More content__HTML_5__

After Step 2 (Escape remaining brackets):

__HTML_0__Here's a Java class:__HTML_1__
__HTML_2__
HashMap&lt;String, PromptTemplate&gt; templates;
Box&lt;dyn Error&gt; error;
__HTML_3__
__HTML_4__More content__HTML_5__

Final Output:

<p>Here's a Java class:</p>
<pre>
HashMap&lt;String, PromptTemplate&gt; templates;
Box&lt;dyn Error&gt; error;
</pre>
<div>More content</div>

CKEditor receives this and:

✅ Parses <p>, <pre>, <div> as HTML tags
✅ Displays <String, PromptTemplate> as visible text: <String, PromptTemplate>
✅ No crash!

Why Our Fix is Correct

Preserves Data: Unlike sanitization which deletes unknown tags, escaping preserves the code content
Respects User Settings: Uses the allowedHtmlTags option so users can customize what's treated as HTML
No False Positives: Only escapes things that aren't in the allowed tags list
Bidirectional: We unescape when saving, so the database content remains unchanged
Fail-Safe: Has fallback error handling if escaping fails

Configuration

Users can customize which tags are treated as HTML through:
Settings → Other → Allowed HTML Tags

The default list includes common HTML tags (div, span, p, etc.) and Trilium-specific elements (includenote).

Testing

The fix handles all these patterns correctly:

✅ Java/C#: HashMap<String, List<Item>>
✅ TypeScript: Record<string, unknown>
✅ Rust: Box<dyn Error + Send>
✅ C++: std::vector<int>
✅ Incomplete: <String, (missing closing bracket)
✅ Mixed: HTML and code in the same note

Summary

The bug occurred because CKEditor tried to parse programming language generic syntax as HTML tags. Our fix intelligently escapes non-HTML angle brackets while preserving real HTML tags, preventing crashes while maintaining all content. The solution respects user preferences through the configurable allowedHtmlTags setting and ensures no data loss occurs.

feat(editor): try to have ckeditor not crash when handling stranger tags

ba1c6ba

dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Sep 2, 2025

perfectra1n mentioned this pull request Sep 18, 2025

Note content replaced by Excalidraw JSON when switching notes #6788

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat(editor): try to have ckeditor not crash when handling stranger tags #6867

feat(editor): try to have ckeditor not crash when handling stranger tags #6867

Uh oh!

perfectra1n commented Sep 2, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

feat(editor): try to have ckeditor not crash when handling stranger tags #6867

Are you sure you want to change the base?

feat(editor): try to have ckeditor not crash when handling stranger tags #6867

Uh oh!

Conversation

perfectra1n commented Sep 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

The Problem

The Crash

Why Existing Solutions Didn't Work

Server-side HTML Sanitizer

The Key Insight

Our Solution

Example Transformation

Why Our Fix is Correct

Configuration

Testing

Summary

Uh oh!

Uh oh!

perfectra1n commented Sep 2, 2025 •

edited

Loading