How to Ensure Data Validity with Constr in Pydantic

In software development, ensuring that data is accurate and reliable is crucial. Pydantic, a widely used Python data-validation library, offers “constr” (short for “constrained string”), a type that imposes constraints on string values. These constraints can include a regular-expression pattern, a minimum and maximum length, and more. For rules that go beyond such simple constraints, Pydantic also lets us attach custom validation functions to the fields of a model.

Example:

Let’s implement an example in which Pydantic checks the data against constraints using custom validation methods that we define in the model. If the data violates any of the constraints, Pydantic raises an error.

The program starts by importing the regular expression module, the Pydantic components that are needed to define the data model, and the “ValidationError” exception to handle the validation errors.

import re

from pydantic import BaseModel, ValidationError, validator

We import the Python regular expression module by writing “import re”; it is used for pattern matching and validation. In the next line, we import the necessary components from the Pydantic library: “BaseModel”, “ValidationError”, and “validator”.

“BaseModel” is the core class of the Pydantic library that we use to define our data models. “ValidationError” is the exception class that Pydantic raises when the data doesn’t match the defined constraints. “validator” is a decorator that allows us to define custom validation functions for the fields in our data models. (In Pydantic 2, “validator” still works but is deprecated in favor of “field_validator”.)

Now that the required modules are imported into the project, we define a Pydantic model.

class UserData(BaseModel):
    username: str
    email: str

In this code, we create a Pydantic model named “UserData” that inherits from “BaseModel”. We define two fields in the “UserData” model: username and email. Both fields are annotated as str, which means that they are expected to hold string values.
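To see the bare model in action before any custom validation is added, here is a small sketch. The sample values, including the placeholder address on the reserved “example.com” domain, are assumptions for illustration.

```python
from pydantic import BaseModel, ValidationError


class UserData(BaseModel):
    username: str
    email: str


# Field values are available as attributes on the instance.
user = UserData(username="adam_li123", email="user@example.com")
print(user.username, user.email)

# Both fields are required, so omitting one raises ValidationError.
missing_fields = []
try:
    UserData(username="adam_li123")
except ValidationError as exc:
    missing_fields = [error["loc"][0] for error in exc.errors()]
    print("Missing:", missing_fields)
```

At this stage the model only checks that the fields are present and are strings; any string content is accepted.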

After that, we define the custom validation logic for specific fields in the Pydantic model using the “@validator” decorator which allows us to enforce the constraints and patterns on the data within the specified fields.

    @validator("username")
    def validate_username(cls, value):
        # Reject usernames shorter than five characters.
        if len(value) < 5:
            raise ValueError("Username must be at least 5 characters long.")
        # Allow only letters, digits, and underscores.
        if not re.match(r"^[a-zA-Z0-9_]+$", value):
            raise ValueError("Username can only contain alphanumeric characters and underscores.")
        return value.strip().lower()

In this code snippet, which belongs inside the “UserData” class body, the “@validator("username")” decorator associates the “validate_username” function with the username field. The function takes two parameters: cls (the class) and value (the value of the field that is being validated). It first checks whether the username is shorter than five characters. If it is, a “ValueError” is raised with a message that states the minimum length.

Then, the “re.match()” function is used to match the username against the regular expression “^[a-zA-Z0-9_]+$”. This regular expression ensures that the username consists only of letters, digits, and underscores. If the match fails, a “ValueError” is raised with a message that describes the constraint.
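The character check can be tried on its own with the standard library. The sketch below uses “re.fullmatch()”, which requires the whole string to match; the helper name “has_valid_chars” is an assumption for illustration. It also shows one subtlety of the anchored pattern: “$” matches just before a trailing newline as well.

```python
import re

USERNAME_CHARS = re.compile(r"[a-zA-Z0-9_]+")


def has_valid_chars(value: str) -> bool:
    """Return True if value consists only of letters, digits, and underscores."""
    # fullmatch() requires the entire string to match, so no ^ or $ is needed.
    return USERNAME_CHARS.fullmatch(value) is not None


print(has_valid_chars("adam_li123"))  # True
print(has_valid_chars("adam li"))     # False, contains a space

# "$" also matches just before a trailing newline, so re.match() with the
# anchored pattern accepts "adam_li\n" while fullmatch() rejects it.
print(re.match(r"^[a-zA-Z0-9_]+$", "adam_li\n") is not None)  # True
print(has_valid_chars("adam_li\n"))                            # False
```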

If the username passes both checks, it is stripped of any leading and trailing whitespace and converted to lowercase, and the normalized value is returned and stored in the model.
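Because the character check runs before “strip()”, a username with surrounding spaces is rejected rather than cleaned up. If cleaning up such input is the goal, one option is to normalize first and validate afterwards. The following is a sketch of that variant, not the article’s original validator; the model name “NormalizedUser” and the method name are assumptions.

```python
import re

from pydantic import BaseModel, validator


class NormalizedUser(BaseModel):  # hypothetical model name
    username: str

    @validator("username")
    def normalize_then_check(cls, value):
        # Clean the input first so that the checks see the final value.
        value = value.strip().lower()
        if len(value) < 5:
            raise ValueError("Username must be at least 5 characters long.")
        if not re.fullmatch(r"[a-z0-9_]+", value):
            raise ValueError("Username can only contain alphanumeric characters and underscores.")
        return value


print(NormalizedUser(username="  Adam_Li123 ").username)  # adam_li123
```

Whether to reject or to clean up such input is a design choice; rejecting is stricter, while normalizing first is more forgiving of user typing.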

    @validator("email")
    def validate_email(cls, value):
        # Basic format check; the raw string keeps "\." from being read as a string escape.
        if not re.match(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$", value):
            raise ValueError("Invalid email format.")
        return value

Similar to the previous validator, the “@validator("email")” decorator associates the “validate_email” function with the email field. The “validate_email” function checks whether the value (email) matches the provided regular expression. This regular expression enforces a basic email format of the form “name@domain.tld”. If the email does not match, a “ValueError” is raised with a message that indicates the invalid format. If the email format is valid, the function returns the email value.
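To get a feel for what this pattern accepts, it can be tested directly with the “re” module. The sample addresses below are assumptions chosen for illustration. Note that the “{2,4}” part limits the final label to four letters, so addresses on longer top-level domains are rejected by this basic check.

```python
import re

# The same pattern as in the email validator, written as a raw string.
EMAIL_PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$")

samples = [
    "user@example.com",     # placeholder address, accepted
    "invalid_email",        # no "@", rejected
    "user@example.museum",  # final label longer than 4 letters, rejected
]

results = {address: EMAIL_PATTERN.match(address) is not None for address in samples}
for address, ok in results.items():
    print(address, ok)
```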

After that, we define two dictionaries, “valid_user_data” and “invalid_user_data”, representing the valid and invalid user data.

valid_user_data = {
    "username": "adam_li123",
    "email": "user@example.com"  # placeholder address
}

invalid_user_data = {
    "username": "user",
    "email": "invalid_email"
}

The “valid_user_data” dictionary contains example data that meets the constraints that are defined in our “UserData” model: a username of sufficient length that uses only permitted characters, and a well-formed email address. Its purpose is to test that such data passes validation without raising any errors.

The “invalid_user_data” dictionary contains example data that does not meet the constraints. The username is too short (fewer than five characters), and the email address does not follow the required format. Its purpose is to show how the validation process catches and reports the data that doesn’t meet the required criteria.

Now, we create an instance of the “UserData” model using the provided data and handle the validation errors using the “try” and “except” blocks.

try:
    valid_instance = UserData(**valid_user_data)
    print("Valid user data:", valid_instance)
except ValidationError as e:
    print("Error for valid data:", e.errors())

The previous code attempts to create an instance of the “UserData” class using the “valid_user_data” dictionary. The “**valid_user_data” syntax unpacks the dictionary and provides its contents as keyword arguments to the constructor of the “UserData” class.
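The unpacking step can be illustrated without Pydantic at all. In this sketch, “describe_user” is a hypothetical function that stands in for the model constructor, and the data values are placeholders.

```python
def describe_user(username: str, email: str) -> str:
    """Hypothetical helper that stands in for a model constructor."""
    return f"{username} <{email}>"


data = {"username": "adam_li123", "email": "user@example.com"}

# These two calls are equivalent: ** turns each key into a keyword argument.
unpacked = describe_user(**data)
explicit = describe_user(username="adam_li123", email="user@example.com")
print(unpacked == explicit)  # True
```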

Inside the “try” block, since the data in “valid_user_data” meets the constraints that are defined in the “UserData” model, no validation errors are raised. Ultimately, the created instance is assigned to the “valid_instance” variable.

If a “ValidationError” is raised during the creation of the “valid_instance”, the code inside the “except” block is executed. The “ValidationError” exception instance is caught and assigned to the “e” variable. Then, we use the “e.errors()” method to retrieve a list of validation error details. These error details are printed with the “Error for valid data” message.

Next, the code tries to create an instance of the “UserData” class from the “invalid_user_data” dictionary.

try:
    invalid_instance = UserData(**invalid_user_data)
    print("Invalid user data:", invalid_instance)
except ValidationError as e:
    print("Error for invalid data:", e.errors())

Similar to what we previously did, the “**invalid_user_data” syntax unpacks the dictionary and provides its contents as keyword arguments to the constructor of the “UserData” class. Inside the “try” block, the validation errors are raised because the data in “invalid_user_data” doesn’t meet the constraints that are defined in the “UserData” model.

When the “ValidationError” is raised during the creation of the “invalid_instance”, the code inside the second “except” block is executed. The exception instance is caught and bound to the “e” variable.

The “e.errors()” method is invoked to obtain a list of validation error details, which is printed after the “Error for invalid data:” label.
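Each entry in the list that “errors()” returns is a dictionary whose keys include “loc” (the location of the failing field), “msg” (a human-readable message), and “type”. The sketch below loops over the entries to print one line per failing field. The exact wording of “msg” differs between Pydantic versions, so no sample output is shown; the variable name “failed_fields” is an assumption for illustration.

```python
import re

from pydantic import BaseModel, ValidationError, validator


class UserData(BaseModel):
    username: str
    email: str

    @validator("username")
    def validate_username(cls, value):
        if len(value) < 5:
            raise ValueError("Username must be at least 5 characters long.")
        return value

    @validator("email")
    def validate_email(cls, value):
        if not re.match(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$", value):
            raise ValueError("Invalid email format.")
        return value


failed_fields = []
try:
    UserData(username="user", email="invalid_email")
except ValidationError as exc:
    for error in exc.errors():
        field = error["loc"][0]  # name of the field that failed
        failed_fields.append(field)
        print(f"{field}: {error['msg']}")
```

Pydantic collects the errors for all fields before raising, which is why both problems are reported in a single exception.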

Putting all the pieces together, the complete program looks like this:

import re

from pydantic import BaseModel, ValidationError, validator


class UserData(BaseModel):
    username: str
    email: str

    @validator("username")
    def validate_username(cls, value):
        # Reject usernames shorter than five characters.
        if len(value) < 5:
            raise ValueError("Username must be at least 5 characters long.")
        # Allow only letters, digits, and underscores.
        if not re.match(r"^[a-zA-Z0-9_]+$", value):
            raise ValueError("Username can only contain alphanumeric characters and underscores.")
        return value.strip().lower()

    @validator("email")
    def validate_email(cls, value):
        # Basic format check; the raw string keeps "\." from being read as a string escape.
        if not re.match(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$", value):
            raise ValueError("Invalid email format.")
        return value


valid_user_data = {
    "username": "adam_li123",
    "email": "user@example.com"  # placeholder address
}

invalid_user_data = {
    "username": "user",
    "email": "invalid_email"
}

try:
    valid_instance = UserData(**valid_user_data)
    print("Valid user data:", valid_instance)
except ValidationError as e:
    print("Error for valid data:", e.errors())

try:
    invalid_instance = UserData(**invalid_user_data)
    print("Invalid user data:", invalid_instance)
except ValidationError as e:
    print("Error for invalid data:", e.errors())

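When this code is executed, it prints the validated instance for the valid data, followed by the list of error details for the invalid data: one entry for the username that is too short and one for the malformed email address.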

Conclusion

This discussion explored how Pydantic’s validator and “constr” functionalities help maintain data validity in Python applications. We built an example in which Pydantic checks the data against the constraints that are specified in the model using custom validation functions. When the data violates a constraint, Pydantic raises a “ValidationError” exception, which protects the integrity of the data.

About the author

Omar Farooq

Hello readers, I am Omar, and I have been writing technical articles for the last decade.