MDMS Input Validation Rules

To improve the accuracy and reliability of extracted data, the MARS Data Mining Studio (MDMS) lets you configure validation rules within extraction templates or form designs. These rules are applied after the data has been extracted but before it is finalized in the output, so quality checks run automatically.

Types of Validation Rules Supported:

  • Presence Checks:
      • Mandatory Fields: Rules can designate certain fields as mandatory, triggering an error or flag if no data is successfully extracted for that field in a given document.
  • Format & Type Compliance:
      • Data Type Validation: Extracted data is implicitly checked against the defined field type (e.g., validation fails if non-numeric characters are found in a Number field, or if a date string doesn't match the specified format).
      • Pattern Matching: Allows validation against specific regular expression (RegEx) or wildcard patterns. This ensures extracted data adheres to expected structural formats (e.g., a social security number pattern, a specific account code structure, or a postal code format).
  • Length Constraints:
      • Minimum Length: Rules can specify the minimum number of characters required for a field.
      • Maximum Length: Rules can specify the maximum number of characters allowed for a field.
  • Value Range Constraints:
      • Numeric Ranges: For Number or Currency fields, validators can check whether the extracted value falls within a predefined minimum and maximum.
      • Date/Time Ranges: For Date, Time, or DateTime fields, validators can ensure the extracted value falls within an acceptable date/time window.
  • Extraction Confidence / Trigger Validation:
      • Rules can potentially be linked to the confidence score returned by an OCR engine, or can check whether the required text triggers (anchor keywords) used for extraction were found.
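
The rule types listed above can be sketched in Python as follows. This is only an illustration of the checks, not MDMS configuration syntax: FieldRule, validate_field, check_extraction, every parameter name, and the 0.8 confidence threshold are hypothetical and chosen for this example.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime
from typing import Any, Optional


@dataclass
class FieldRule:
    """Hypothetical rule definition; these names are not MDMS settings."""
    name: str
    mandatory: bool = False
    field_type: str = "text"        # "text", "number", or "date"
    date_format: str = "%Y-%m-%d"   # used only when field_type == "date"
    pattern: Optional[str] = None   # regular expression the whole value must match
    min_length: Optional[int] = None
    max_length: Optional[int] = None
    min_value: Any = None           # number or date, matching field_type
    max_value: Any = None


def validate_field(rule: FieldRule, raw: Optional[str]) -> list[str]:
    """Return failure messages; an empty list means the value passed."""
    value = (raw or "").strip()

    # Presence check: nothing was extracted for this field.
    if not value:
        return ["missing mandatory value"] if rule.mandatory else []

    errors = []

    # Length constraints on the extracted text.
    if rule.min_length is not None and len(value) < rule.min_length:
        errors.append(f"shorter than {rule.min_length} characters")
    if rule.max_length is not None and len(value) > rule.max_length:
        errors.append(f"longer than {rule.max_length} characters")

    # Pattern matching against the full value.
    if rule.pattern is not None and re.fullmatch(rule.pattern, value) is None:
        errors.append("does not match pattern")

    # Data type validation: convert the text to a comparable value.
    parsed = None
    try:
        if rule.field_type == "number":
            parsed = float(value.replace(",", ""))
        elif rule.field_type == "date":
            parsed = datetime.strptime(value, rule.date_format).date()
    except ValueError:
        errors.append(f"not a valid {rule.field_type}")

    # Range constraints apply only when the value parsed successfully.
    if parsed is not None:
        if rule.min_value is not None and parsed < rule.min_value:
            errors.append(f"below minimum {rule.min_value}")
        if rule.max_value is not None and parsed > rule.max_value:
            errors.append(f"above maximum {rule.max_value}")

    return errors


def check_extraction(confidence: float, triggers_found: bool,
                     min_confidence: float = 0.8) -> list[str]:
    """Confidence / trigger check; the 0.8 default is an assumed threshold."""
    errors = []
    if confidence < min_confidence:
        errors.append("low OCR confidence")
    if not triggers_found:
        errors.append("anchor keyword not found")
    return errors


# Example usage with made-up field names and limits.
total_rule = FieldRule("invoice_total", mandatory=True, field_type="number",
                       min_value=0, max_value=100000)
date_rule = FieldRule("invoice_date", field_type="date",
                      min_value=date(2020, 1, 1), max_value=date(2030, 12, 31))
print(validate_field(total_rule, "12a"))
print(validate_field(date_rule, "2019-05-01"))
```

Returning a list of messages rather than raising on the first failure lets a single pass report every problem with a field, which suits a review queue better than a fail-fast check would.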

By applying these validation rules during the automated extraction process, MDMS helps identify potential errors or inconsistencies early, improving overall data quality and reducing the burden of manual verification or correction in downstream systems. Validation failures can typically be configured to flag records for review or route them through specific exception handling workflows.
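
One way to picture the failure handling described above is a per-field action setting, where a record is routed according to the most severe action among its failed fields. The sketch below assumes that policy for illustration. The Action names, the route_record function, and the default of flagging for review are hypothetical and do not describe how MDMS itself is configured.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical outcomes, ordered from least to most severe."""
    ACCEPT = 0
    FLAG_FOR_REVIEW = 1
    EXCEPTION_WORKFLOW = 2


def route_record(failures: dict[str, list[str]],
                 actions: dict[str, Action],
                 default: Action = Action.FLAG_FOR_REVIEW) -> Action:
    """Return the most severe action among fields that failed validation.

    failures maps a field name to its list of failure messages.
    actions maps a field name to the action configured for that field;
    fields without an entry fall back to `default`.
    """
    worst = Action.ACCEPT
    for field_name, messages in failures.items():
        if messages:  # an empty list means the field passed
            worst = max(worst, actions.get(field_name, default))
    return worst


# Example usage with made-up field names.
configured = {"account_code": Action.EXCEPTION_WORKFLOW}
result = route_record(
    {"invoice_total": [], "account_code": ["does not match pattern"]},
    configured,
)
print(result.name)
```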