name_parser
Name parser for parsing human names into components.
This module parses Western-style names (including Latin-style names common in the USA) into their component parts: title, first name, middle name(s), last name, suffix, and nickname.
Architecture:
The parser is organized into logical sections:
- Configuration Constants: Sets of known titles, suffixes, prefixes, etc.
- Utility Functions: Helper functions for classification (is_title, is_suffix, etc.)
- Name Component Extraction: Functions to extract titles, suffixes, nicknames
- Format-Specific Parsing:
  - Comma format: "Last [Suffix], Title First Middle [Suffix]"
  - Standard format: "Title First Middle Last Suffix"
- Public API: Main parse_name() function
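As an illustration of how the configuration constants and classification helpers might fit together, here is a minimal sketch. The set contents, the `_normalize` helper, and the exact signatures of `is_title` and `is_suffix` are assumptions made for this example; the module's real constants are not shown in this documentation.

```python
# Hypothetical constants: the real module's sets are likely larger.
TITLES = {"dr", "mr", "mrs", "ms", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "phd"}


def _normalize(token: str) -> str:
    """Lowercase a token and drop periods and commas, so 'Ph.D.' becomes 'phd'."""
    return token.lower().replace(".", "").replace(",", "")


def is_title(token: str) -> bool:
    """Return True if the token is a known title (assumed signature)."""
    return _normalize(token) in TITLES


def is_suffix(token: str) -> bool:
    """Return True if the token is a known suffix (assumed signature)."""
    return _normalize(token) in SUFFIXES
```

Normalizing before the set lookup keeps the constants small: "Dr", "Dr.", and "DR" all map to the same entry.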
Parsing Flow:
- Extract nickname from quotes/parentheses
- Normalize whitespace and clean input
- Detect format (comma-separated vs standard)
- Parse according to format:
  - Extract titles from start
  - Extract suffixes from end
  - Identify first, middle, and last names
  - Handle prefixes (e.g., "de la", "van der")
- Apply post-processing rules (e.g., handle_firstnames logic)
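The first three steps of the flow above could be sketched roughly as follows. The function names `extract_nickname`, `normalize_whitespace`, and `is_comma_format` are hypothetical, and the comma check is deliberately naive; the real parser may detect the format differently.

```python
import re
from typing import Tuple

# Matches a nickname in double quotes, single quotes, or parentheses.
# The single-quote form must be bounded by whitespace or the string edges,
# so an apostrophe inside a name such as "O'Brien" is left alone.
_NICKNAME_RE = re.compile(
    r'"([^"]+)"'
    r"|(?:(?<=\s)|^)'([^']+)'(?=\s|$)"
    r"|\(([^)]+)\)"
)


def extract_nickname(text: str) -> Tuple[str, str]:
    """Return (text_without_nickname, nickname); nickname is '' if none is found."""
    match = _NICKNAME_RE.search(text)
    if match is None:
        return text, ""
    nickname = next(g for g in match.groups() if g is not None)
    remainder = text[:match.start()] + " " + text[match.end():]
    return remainder, nickname.strip()


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(text.split())


def is_comma_format(text: str) -> bool:
    """Naive format detection: any comma is treated as 'Last, First' style."""
    return "," in text
```

Extracting the nickname before normalizing means the gap it leaves behind is cleaned up by the whitespace step rather than needing special handling.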
Functions
parse_name
def parse_name(full_name: str) -> ParsedName:
Parse a full name string into its components.
Supports formats like:
- "John Smith"
- "Smith, John"
- "Dr. John M. Smith"
- "Smith, John M., JR"
- "John 'Johnny' Smith"
- "De la Vega, Juan"
- "McDonald, James"
The parsing process:
- Extract nickname from quotes/parentheses
- Normalize whitespace and clean input
- Detect format (comma-separated vs standard)
- Parse according to format
- Apply post-processing rules
Arguments
full_name: The full name string to parse
Returns
ParsedName object with parsed components
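To make the comma-separated inputs listed above more concrete, here is a simplified sketch of how a string such as "Smith, John M., JR" might be split. The function `parse_comma_format`, the `_key` helper, and the two sets are hypothetical and cover only the simplest cases; this is not the module's actual implementation, and it returns a plain dictionary rather than a ParsedName.

```python
from typing import Dict

# Hypothetical lookup sets, kept small for the example.
TITLES = {"dr", "mr", "mrs", "ms", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def _key(token: str) -> str:
    """Lowercase a token and drop trailing periods for set lookup."""
    return token.lower().rstrip(".")


def parse_comma_format(text: str) -> Dict[str, str]:
    """Simplified parse of 'Last, [Title] First [Middle ...][, Suffix]'."""
    result = {"title": "", "first": "", "middle": "", "last": "", "suffix": ""}
    parts = [p.strip() for p in text.split(",") if p.strip()]
    if not parts:
        return result
    # Everything before the first comma is the last name, prefixes included.
    result["last"] = parts[0]
    # Peel off trailing comma-separated parts that are known suffixes.
    suffixes = []
    while len(parts) > 1 and _key(parts[-1]) in SUFFIXES:
        suffixes.insert(0, parts.pop())
    result["suffix"] = ", ".join(suffixes)
    # The remaining tokens are an optional title, the first name, then middles.
    tokens = " ".join(parts[1:]).split()
    if tokens and _key(tokens[0]) in TITLES:
        result["title"] = tokens.pop(0)
    if tokens:
        result["first"] = tokens[0]
        result["middle"] = " ".join(tokens[1:])
    return result
```

Because the last name is taken whole from before the first comma, a prefixed surname such as "De la Vega" needs no special handling in this format.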
Classes
ParsedName
class ParsedName(title: str = '', first: str = '', middle: str = '', last: str = '', suffix: str = '', nickname: str = ''):
Represents a parsed human name with its components.
This class holds the result of parsing a full name string into its constituent parts. All fields are strings and may be empty if that component was not found in the input.
Attributes
title: Title or honorific (e.g., "Dr.", "Mr.", "Prof.")
first: First name (given name)
middle: Middle name(s) or initial(s), space-separated if multiple
last: Last name (family name, may include prefixes like "de la", "van", "Mc")
suffix: Suffix (e.g., "JR", "III", "Ph.D."), comma-separated if multiple
nickname: Nickname extracted from quotes or parentheses
Methods
as_dict
def as_dict(self) -> Dict[str, str]:
Return the parsed name as a dictionary.
All fields are normalized to empty strings if None, ensuring consistent output format for serialization and comparison.
Returns
Dictionary with keys: title, first, middle, last, suffix, nickname
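A class with the constructor signature and `as_dict` behavior described above could look like the following sketch. Using a dataclass is an assumption made for brevity; the documentation does not say how ParsedName is actually implemented.

```python
from dataclasses import dataclass, fields
from typing import Dict


@dataclass
class ParsedName:
    """Sketch of the documented container; the dataclass form is assumed."""

    title: str = ""
    first: str = ""
    middle: str = ""
    last: str = ""
    suffix: str = ""
    nickname: str = ""

    def as_dict(self) -> Dict[str, str]:
        # Replace None with "" so every value in the result is a string.
        return {f.name: (getattr(self, f.name) or "") for f in fields(self)}
```

Building the dictionary from `fields(self)` keeps the key order aligned with the attribute order, which makes serialized output stable for comparison.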