PostgreSQL

Postgres data masking with PostgreSQL Anonymizer

While using the PostgreSQL database management system, we need to hide whole or some part of data from the users to keep sensitive data protected and unused. This can be done through different approaches, but here we will explain one of the commonly used processes of data masking.

Data masking

It is the method used to protect important sensitive data by replacing the values with some alternative values that are realistically equivalent. Data masking is also known as an umbrella for the data that hides the part in it and protects the data from unwanted users.

Postgresql Anonymizer

It is a PostgreSQL extension that is created to hide or replace personal details or apply a mask on the information. This technique is applied with the masking procedure by creating roles.

The functionality of data masking

By using data masking, our legal requirements are confirmed to keep the data private, like PCI-DSS and other regulations are created to ensure that data will stay private. It protects the information secrets as many organizations work with high-time critical records, which should be protected from competitors.

Working of the Process:

  • We configure the PostgreSQL system for configuring the data masking.
  • We create a user for the secure backups.
  • Then also, a masked backup is created.
  • The secure backups are provided to the developers.

Role

A role can be a single user or maybe a group of users. A role is that entity of PostgreSQL database that gives PostgreSQL ownership and provides the database privileges. For instance, a login user is a role that makes able the new user get logged in to the PostgreSQL database. Through the commands and the pgAdmin panel options, a role is created.

Implementation

Go to the Admin panel of PostgreSQL. Provide the password for the connection of the database with the server once it is established. Now open the query tool and use the command to create a role. In PostgreSQL, a role is created for the user by applying any condition or logic separately to all the commands. Whenever we use that role in our commands, the logic or the condition is automatically applied to that command. So here, a role named Role1 is created. As a command, we use the query given below.

>> Create ROLE Role1; COMMENT ON ROLE Role1 IS 'MASKED';

This will cause a comment masked on the command where we will use the role. This is the first comment used that will mask or hide the items. Whenever we want to mask any element or column, we will use the specified role in that column. After creating the role, we will now apply the effects on a specific column in a table. So select a table on which you want to apply a mask. We have selected a table named ‘patient’ having three attributes: id, name, and phone of a patient.

>> select * from patients;

This command will display the data in each row. Now we will apply the comment of masked on the column ‘name’. The ‘anon’ is the name of the extension in the PostgreSQL folder. Here the random_name() function is called, which will return an anonymous random name. And that name will be replaced with the original name. This is how a mask is applied for security methods.

>> COMMENT ON COLUMN patient.name IS 'MASKED WITH FUNCTION anon.random_name()';

This will access the table and then the specified column, as you can see that a ‘dot’ method is used here. This mask will hide the original names of the column; when the role is applied, you will see the results.

The next step is to apply the comment of a mask on the column ‘phones’ of the table ‘patient’. Some portion is visible, while others will be hidden. The dollar sign indicates that here the original number will appear. Whereas the ‘*’ shows that the number is not visible. In the place of the number, an asterisk will be placed just like in the password; the letters are hidden.

>> COMMENT ON COLUMN patient.phones IS 'MASKED WITH FUNCTION anon.partial (phones, 2, $$*-***-**$$, 2)';

The parameter ‘2’ means that only two numbers can be visible at both ends. Now we will use a simple select statement to view the results of a specific row. This view will be what a normal user will see, on which we have not applied any mask.

>> select * from patient where id = '3';

You can see that the whole data is visible. A normal user can update, view, delete and perform almost all the operations on the relation of the PostgreSQL database. But to restrict some data, we use the mask as the comment that is applied to the role. By using this mask, a user is unable to perform any update regarding any command on the table like delete, update, or even cannot view the results properly. As we have seen the normal user’s view, now we will see what the masked user will see on the execution of the same query. For this purpose, we need to apply and set the role we have created mask comments on it. Otherwise, the mask will not be applied, and the view will be the same as above.

>> SET ROLE role1; SELECT * FRPM patient WHERE id = '3';

Now recall the comments we have applied; two masks were created. One on the ‘name’ column to apply any random name, and one on the ‘phones’ column to hide partial data. Now on execution, you can see that the name ‘ROBERT JAMES’ at row 3 is replaced with a random name ‘sheetle’, and similarly, the number in the ‘phones’ column is also hidden. Only the first and last two numbers are visible according to the condition we have applied.

These ‘masked’ comments are important to keep privacy. And to keep your data intact and visible only to the people you want.

By applying role1, the user wants to fetch the record of a specific person by applying another query.

>> SET ROLE role1; SELECT * FROM patient WHERE name ILIKE 'sushi Azaar';

The ‘ILIKE’ keyword acts as same as an equal statement. On execution of the command, you will see that 0 rows are shown because of the role added to the command; if you remove the role from the query, the results will be displayed from the table.

Similarly, now applying a delete statement by applying for the role.

>> set role role1 delete from patient where id = '3';

It will not delete any row from the table as the role is applied, and hence the privileges are restricted. And if the user tries to update the table by using a query, he/she will be unable to do so as the role is mentioned in the command.

Conclusion

This article contains information regarding data integrity from the hackers or the person you want to hide the data. The process involves data masking, which is done by creating a role in PostgreSQL. Data can be replaced or partially hidden. Both the types are explained with examples that are implemented on a column of the table.

About the author

Aqsa Yasin

I am a self-motivated information technology professional with a passion for writing. I am a technical writer and love to write for all Linux flavors and Windows.