labapi.util.extract.extract_etree

labapi.util.extract.extract_etree(
_etree: Element,
schema: EtreeExtractorDict,
) → dict[str, Any]

Extract data from an lxml.etree.Element using a schema dictionary.

This function navigates the XML tree using paths defined in the schema dictionary and applies callable extractors to the text content of the found elements.

Parameters:
  • _etree – The lxml.etree.Element from which to extract data.

  • schema – A dictionary defining the structure and extraction logic. Keys are XML element tags (or paths); each value is either a nested EtreeExtractorDict or a callable that processes the text of the matched element.

Returns:

A dictionary containing the extracted and processed data.

Raises:

ExtractionError – If an element specified in the schema is not found in the etree, or if a callable extractor fails to process a value.
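
Example:

The following is a minimal sketch of the behavior described above, not the library's actual implementation. To stay self-contained it uses the standard library's xml.etree.ElementTree as a stand-in for lxml.etree, and the names extract_sketch and the local ExtractionError class are hypothetical. It also assumes that each result key is the same as the corresponding schema key, which the description above does not confirm.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)
from typing import Any


class ExtractionError(Exception):
    """Hypothetical stand-in for the library's ExtractionError."""


def extract_sketch(element: ET.Element, schema: dict) -> dict[str, Any]:
    """Illustrative re-creation of the documented behavior (not labapi's code)."""
    result: dict[str, Any] = {}
    for path, extractor in schema.items():
        child = element.find(path)
        if child is None:
            # The schema names an element that the tree does not contain.
            raise ExtractionError(f"element {path!r} not found")
        if isinstance(extractor, dict):
            # A nested schema: recurse into the matched child element.
            result[path] = extract_sketch(child, extractor)
        else:
            try:
                # A callable extractor: apply it to the element's text.
                result[path] = extractor(child.text)
            except Exception as exc:
                raise ExtractionError(f"extractor for {path!r} failed") from exc
    return result


doc = ET.fromstring(
    "<entry><id>42</id><title>Notes</title>"
    "<author><name>Ada</name></author></entry>"
)
schema = {"id": int, "title": str, "author": {"name": str}}
data = extract_sketch(doc, schema)
print(data)  # {'id': 42, 'title': 'Notes', 'author': {'name': 'Ada'}}
```

In this sketch, a schema key with no matching element, or an extractor that raises (for example int applied to the text "Notes"), surfaces as ExtractionError, mirroring the Raises section above.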