
HTML Tag Validation Regex

CronOS Team
regex, html, validation, tutorial, web

Need to generate a regex pattern?

Use CronOS to generate any regex pattern you wish with natural language. Simply describe what you need, and we'll create the perfect regex pattern for you. It's completely free!

Generate Regex Pattern

Validate a single HTML tag, either an opening/closing pair or a self-closing tag, using a regex pattern with a backreference.

Pattern Breakdown

regex
^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$

Components

Component                  Description                Matches
^                          Start anchor               Ensures match from string start
<                          Opening bracket            Literal < character
([a-z]+)                   Tag name                   One or more lowercase letters (captured)
([^<]+)*                   Attributes                 Zero or more attribute characters
(?:>(.*)<\/\1>|\s+\/>)     Tag content or self-close  Alternation group
>                          Opening tag end            Literal > for opening tag
(.*)                       Tag content                Any content between tags
<\/\1>                     Closing tag                Closing tag using backreference \1
|\s+\/>                    OR self-closing            Self-closing tag format
$                          End anchor                 Ensures match to string end

Detailed Breakdown

  • ([a-z]+) - Captures tag name (e.g., div, span, p)
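  • ([^<]+)* - Optional attributes (any run of characters other than <, which also admits >). The nested quantifier can backtrack exponentially on long non-matching input; ([^<]*) accepts the same strings without that risk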
  • (?:>(.*)<\/\1>|\s+\/>) - Alternation:
    • >(.*)<\/\1> - Opening tag with content and matching closing tag
    • \s+\/> - Self-closing tag (e.g., <br />, <img />); at least one whitespace character is required before />, so <br/> does not match
  • \1 - Backreference to first capture group (tag name)
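
To make the group numbering concrete, here is a minimal Python sketch that returns what each capture group holds for a given input. The helper name describe is hypothetical and exists only for this illustration; the pattern is the one from this article.

```python
import re

# The pattern from this article, unchanged.
HTML_TAG = re.compile(r'^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$')

def describe(tag: str):
    """Hypothetical helper: return (name, attributes, content), or None if no match."""
    m = HTML_TAG.match(tag)
    if m is None:
        return None
    # group(1): tag name, group(2): raw attribute text, group(3): inner content
    return m.group(1), m.group(2), m.group(3)

print(describe('<span class="test">text</span>'))  # ('span', ' class="test"', 'text')
print(describe('<br />')[0])                       # br
print(describe('<div>content</span>'))             # None
```

Note that the attribute group keeps its leading space, because the pattern does not separate the tag name from the attribute text.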

Examples

Valid:

  • <div>content</div>
  • <span class="test">text</span>
  • <p id="para">paragraph</p>
  • <br />
  • <img src="image.jpg" />
  • <input type="text" />

Invalid:

  • <div>content</span> (mismatched tags)
  • <DIV>content</DIV> (uppercase not supported)
  • <div> (unclosed tag)
  • </div> (closing tag without opening)
  • <div><span></div></span> (crossed closing tags: the string must end with </div>)
  • <div content</div> (missing >)
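
The two lists above can be checked mechanically. The following is a small Python sketch, with the hypothetical helper is_valid_tag, that runs every listed example through the pattern and fails loudly if any of them behaves differently from what is claimed.

```python
import re

HTML_TAG = re.compile(r'^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$')

VALID = [
    '<div>content</div>',
    '<span class="test">text</span>',
    '<p id="para">paragraph</p>',
    '<br />',
    '<img src="image.jpg" />',
    '<input type="text" />',
]

INVALID = [
    '<div>content</span>',       # mismatched tags
    '<DIV>content</DIV>',        # uppercase
    '<div>',                     # unclosed
    '</div>',                    # closing tag only
    '<div><span></div></span>',  # crossed closing tags
    '<div content</div>',        # missing >
]

def is_valid_tag(text: str) -> bool:
    """Hypothetical helper: True when the whole string matches the pattern."""
    return HTML_TAG.match(text) is not None

for example in VALID:
    assert is_valid_tag(example), example
for example in INVALID:
    assert not is_valid_tag(example), example
print("all listed examples behave as described")
```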

Implementation

JavaScript

javascript
const htmlTagRegex = /^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$/;
htmlTagRegex.test('<div>content</div>'); // true
htmlTagRegex.test('<span class="test">text</span>'); // true
htmlTagRegex.test('<br />'); // true
htmlTagRegex.test('<div>content</span>'); // false (mismatched)
htmlTagRegex.test('<DIV>content</DIV>'); // false (uppercase)

Python

python
import re
html_tag_regex = r'^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$'
bool(re.match(html_tag_regex, '<div>content</div>'))  # True
bool(re.match(html_tag_regex, '<span class="test">text</span>'))  # True
bool(re.match(html_tag_regex, '<br />'))  # True
bool(re.match(html_tag_regex, '<div>content</span>'))  # False (mismatched)
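
A slightly stricter variant may be worth considering in Python. This is a sketch under two assumptions that go beyond the original pattern: it swaps ([^<]+)* for ([^<]*), which accepts the same strings but avoids exponential backtracking, and it uses re.fullmatch instead of ^ and $, because $ in Python also matches just before a trailing newline. The names HTML_TAG_LINEAR and is_valid_tag are illustrative.

```python
import re

# ([^<]*) accepts the same strings as ([^<]+)* but has no nested quantifier.
HTML_TAG_LINEAR = re.compile(r'<([a-z]+)([^<]*)(?:>(.*)<\/\1>|\s+\/>)')

def is_valid_tag(text: str) -> bool:
    # fullmatch requires the entire string to match, trailing newline included.
    return HTML_TAG_LINEAR.fullmatch(text) is not None

print(is_valid_tag('<div>content</div>'))   # True
print(is_valid_tag('<br />'))               # True
print(is_valid_tag('<div>content</span>'))  # False
print(is_valid_tag('<br />\n'))             # False (re.match with $ would accept this)

# A long non-matching input returns promptly with this variant.
print(is_valid_tag('<a ' + 'x' * 200))      # False
```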

Go

go
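// Go's regexp (RE2) has no backreferences (\1 panics in MustCompile); compare names in code.
htmlTagRegex := regexp.MustCompile(`^<([a-z]+)([^<]+)*(?:>(.*)<\/([a-z]+)>|\s+\/>)$`)
isValidTag := func(s string) bool {
	m := htmlTagRegex.FindStringSubmatch(s)
	return m != nil && (m[4] == "" || m[1] == m[4]) // m[4] is empty for self-closing tags
}
isValidTag("<div>content</div>") // true
isValidTag("<span class=\"test\">text</span>") // true
isValidTag("<br />") // true
isValidTag("<div>content</span>") // false (mismatched)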

Limitations

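  1. No nesting validation: Inner tags are treated as opaque content, so <div></span></div> passes
  2. Lowercase letters only: Rejects uppercase tag names and names containing digits, such as h1
  3. No attribute validation: Doesn't validate attribute syntax
  4. Simple content matching: .* matches any content, including other tags, but not line breaks unless the dot-all flag is set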
  5. Not for full HTML parsing: Use HTML parsers for complex documents
  6. No DOCTYPE or comments: Doesn't handle HTML comments or DOCTYPE
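
The limitations above are easy to reproduce. The sketch below uses Python and the article's pattern; the helper is_valid_tag and the inputs are chosen for illustration only. It also shows re.IGNORECASE as one possible way to accept uppercase names, since the flag applies to the backreference as well.

```python
import re

PATTERN = r'^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$'
HTML_TAG = re.compile(PATTERN)

def is_valid_tag(text: str) -> bool:
    """Hypothetical helper: True when the whole string matches the pattern."""
    return HTML_TAG.match(text) is not None

# Limitations 1 and 4: inner tags are just content, balanced or not.
print(is_valid_tag('<div><span></span></div>'))  # True
print(is_valid_tag('<div></span></div>'))        # True

# Limitation 2: uppercase names are rejected, and so are names with digits.
print(is_valid_tag('<DIV>content</DIV>'))        # False
print(is_valid_tag('<h1>Title</h1>'))            # False

# One possible workaround for case: compile with re.IGNORECASE.
HTML_TAG_ANY_CASE = re.compile(PATTERN, re.IGNORECASE)
print(HTML_TAG_ANY_CASE.match('<DIV>content</DIV>') is not None)  # True

# Limitation 3: attribute text is not checked at all.
print(is_valid_tag('<div !!!>content</div>'))    # True

# Limitation 6: comments and DOCTYPE never match.
print(is_valid_tag('<!-- note -->'))             # False
print(is_valid_tag('<!DOCTYPE html>'))           # False
```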

When to Use

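  • Checking that a single tag string is well-formed
  • Validating individual tags in isolation rather than whole documents
  • Confirming that an opening tag has a matching closing tag or is self-closed
  • Basic HTML tag pattern matching on short, single-line input
  • Quick format checks where a full parser would be overkill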

For production, consider:

  • Using proper HTML parsers (e.g., DOMParser, BeautifulSoup)
  • Supporting uppercase tag names
  • Validating attribute syntax properly
  • Handling nested tags correctly
  • Using specialized HTML validation libraries
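
As a rough illustration of the parser-based route, here is a minimal sketch built on Python's standard html.parser module. The class NestingChecker, the function is_well_nested, and the VOID_TAGS subset are assumptions made for this example, not part of any library. It checks only that tags are balanced and properly nested, which is far less than a full HTML validator does.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag (a subset, for illustration only).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}

class NestingChecker(HTMLParser):
    """Hypothetical helper: tracks open tags on a stack and records mismatches."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <DIV> and <div> are treated alike.
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")

def is_well_nested(html: str) -> bool:
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    # Valid only if nothing mismatched and nothing was left open.
    return not checker.errors and not checker.stack

print(is_well_nested('<div><span>text</span></div>'))  # True
print(is_well_nested('<div><span></div></span>'))      # False
print(is_well_nested('<DIV>content</DIV>'))            # True
print(is_well_nested('<div>'))                         # False
```

Unlike the regex, this approach handles arbitrary nesting depth, uppercase names, and tag names with digits, at the cost of a little more code.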
