How to create a DocumentDB cluster on AWS

Amazon DocumentDB is a fully managed NoSQL database service with MongoDB compatibility. It scales storage automatically in 10 GB increments, up to 64 TB. To increase read throughput, you can add up to 15 read replicas to a cluster, and a new replica can be set up in minutes. Because the replicas share the same storage as the primary instance, they add processing power without adding storage cost. The DocumentDB cluster also provides a separate endpoint for read queries that points to the read replicas.
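To illustrate the two endpoints, here is a minimal Python sketch that builds one connection string for writes and one for reads. It assumes a cluster description shaped like one entry of the `DBClusters` list that boto3's `describe_db_clusters` returns (with `Endpoint`, `ReaderEndpoint`, and `Port` keys). The function name `build_uris` and the host names used to try it out are hypothetical.

```python
from urllib.parse import quote_plus


def build_uris(cluster, username, password):
    """Return MongoDB-style URIs for the cluster (write) and reader (read) endpoints."""
    # Escape characters such as "@" or ":" so they cannot break the URI.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    port = cluster["Port"]
    # TLS is enabled by default on DocumentDB clusters, and retryable writes
    # are not supported, so both are stated explicitly here. The driver will
    # also need the Amazon CA bundle, which is not part of this sketch.
    options = "tls=true&retryWrites=false"
    return {
        "write": f"mongodb://{credentials}@{cluster['Endpoint']}:{port}/?{options}",
        "read": f"mongodb://{credentials}@{cluster['ReaderEndpoint']}:{port}/?{options}",
    }
```

An application would send inserts and updates through the `write` URI and read-only queries through the `read` URI.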

Amazon DocumentDB also lets you scale the memory and compute resources of each cluster, and this scaling completes in a few minutes. To isolate the DocumentDB cluster, AWS allows us to run it inside a Virtual Private Cloud (VPC). You can also configure a virtual firewall to control which clients can reach the cluster.

This blog describes the step-by-step procedure to deploy a highly available and scalable DocumentDB cluster on AWS.

Creating a DocumentDB cluster

First, log in to the AWS Management Console and go to the Amazon DocumentDB service.

It will open the DocumentDB console. Click on the Launch Amazon DocumentDB button to create a DocumentDB cluster from the dashboard.

It will open a page asking for DocumentDB cluster configuration, authentication, and other advanced settings.

The cluster identifier is the unique name of the cluster within the region. The engine version is the DocumentDB engine version; for this demo, select the latest one. The instance class specifies the instance type, and therefore the memory and compute power, of each instance in the DocumentDB cluster. The number of instances option specifies the total number of instances in the cluster. One of these instances is the primary instance, and the remaining instances are read replicas that are only used to increase the read throughput. The instances are distributed equally across the availability zones, and a DocumentDB cluster can contain a maximum of 16 instances.
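The same choices can be expressed outside the console. The sketch below builds one request per instance in the shape that boto3's `create_db_instance` call for the `docdb` client expects. The helper name `instance_requests` and the `<cluster>-<n>` naming scheme are assumptions made for illustration only.

```python
MAX_INSTANCES = 16  # one primary instance plus up to 15 read replicas


def instance_requests(cluster_id, instance_class, count):
    """Build one create_db_instance request per instance in the cluster."""
    if not 1 <= count <= MAX_INSTANCES:
        raise ValueError(f"count must be between 1 and {MAX_INSTANCES}")
    return [
        {
            # Hypothetical naming scheme: demo-cluster-1, demo-cluster-2, ...
            "DBInstanceIdentifier": f"{cluster_id}-{index}",
            "DBInstanceClass": instance_class,
            "Engine": "docdb",
            "DBClusterIdentifier": cluster_id,
        }
        for index in range(1, count + 1)
    ]
```

Each dictionary could then be passed to `boto3.client("docdb").create_db_instance(**request)` once the cluster itself exists.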

After configuration, enter the authentication details, that is, the username and password for the DocumentDB cluster.

The master username is the main user for the DocumentDB cluster. The master password is a secret password that is used along with the master username to authenticate to the cluster.
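Rather than inventing a master password by hand, you could generate one. This is a minimal sketch that assumes the constraints I understand the DocumentDB API to place on the master password: 8 to 100 printable ASCII characters, excluding the forward slash, the double quote, and the at sign. Check the current documentation before relying on these limits. The function name is hypothetical.

```python
import secrets
import string

# Characters assumed to be rejected in a DocumentDB master password.
FORBIDDEN = set('/"@')
ALPHABET = [
    char
    for char in string.ascii_letters + string.digits + string.punctuation
    if char not in FORBIDDEN
]


def generate_master_password(length=24):
    """Return a random password drawn from the allowed characters."""
    if not 8 <= length <= 100:
        raise ValueError("length must be between 8 and 100 characters")
    # secrets uses a cryptographically secure random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Store the generated value in a secrets manager rather than in source code or shell history.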

Now click on the Show advanced settings button to configure the advanced settings of the cluster.

The network settings section will ask for the network details like the VPC (virtual private cloud), subnet group, and security group.

The VPC is the virtual private cloud in which the DocumentDB cluster will be deployed. For this demo, we will deploy our DocumentDB cluster inside the default VPC. The subnet group is a group of subnets in the VPC, and all the instances of the DocumentDB cluster will be deployed in the subnets defined in the subnet group. For this demo, we will use the default subnet group. The security group is the firewall in front of the DocumentDB cluster instances; it allows or blocks traffic from specific IP addresses.
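As an example of a security group rule, the sketch below builds the arguments for the EC2 `authorize_security_group_ingress` call so that one IPv4 range can reach the cluster port. The helper name, the rule description, and the group ID used when trying it out are placeholders.

```python
import ipaddress


def docdb_ingress_rule(group_id, cidr, port=27017):
    """Build an ingress rule that lets one IPv4 range reach the cluster port."""
    # IPv4Network raises ValueError for malformed ranges such as 10.0.0.5/16.
    network = ipaddress.IPv4Network(cidr)
    return {
        "GroupId": group_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                "IpRanges": [
                    {"CidrIp": str(network), "Description": "DocumentDB clients"}
                ],
            }
        ],
    }
```

The result could be passed to `boto3.client("ec2").authorize_security_group_ingress(**rule)`. Keeping the range as narrow as possible limits who can attempt a connection.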

The cluster options section asks for the TCP/IP port on which the DocumentDB cluster instances will listen for connections. The cluster parameter group defines the configuration settings that will be applied to the cluster instances. For this demo, enter the default MongoDB port number, 27017, and leave the cluster parameter group blank.

Amazon DocumentDB also provides encryption at rest for enhanced security of the stored data. To encrypt the data stored in the DocumentDB cluster, enable the Encryption-at-rest option. Encryption is performed with an AWS KMS key, and for this demo, we will use the default AWS KMS key for RDS.
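The port and encryption settings map onto parameters of boto3's `create_db_cluster` call for the `docdb` client. The following sketch only assembles those parameters; the function name `cluster_request` is an assumption, and omitting `KmsKeyId` is what selects the default key.

```python
def cluster_request(cluster_id, username, password, *, port=27017,
                    encrypted=True, kms_key_id=None):
    """Assemble create_db_cluster parameters for port and encryption at rest."""
    params = {
        "DBClusterIdentifier": cluster_id,
        "Engine": "docdb",
        "MasterUsername": username,
        "MasterUserPassword": password,
        "Port": port,
        "StorageEncrypted": encrypted,
    }
    if kms_key_id is not None:
        if not encrypted:
            raise ValueError("a KMS key only makes sense when encryption is enabled")
        # Without this key, an encrypted cluster uses the default KMS key.
        params["KmsKeyId"] = kms_key_id
    return params
```

Passing the result as `boto3.client("docdb").create_db_cluster(**params)` would create the cluster without any instances; the instances are added separately.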

With Amazon DocumentDB, you can also schedule automatic backups of the DocumentDB cluster, which are used for point-in-time recovery. These backups are taken daily in a defined window, and you can specify how long they are retained.

For this demo, we will set the retention period to 3 days, so each backup is automatically deleted after 3 days. The retention period can be set from 1 day to 35 days. The backup window is the time range during which the daily backup starts. Always choose a backup window in which the load on the DocumentDB cluster is low, because the performance of the database is affected while a backup is running.
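A small validation helper can catch mistakes in these two settings before they reach AWS. This sketch assumes the `hh24:mi-hh24:mi` (UTC) window format and the 30-minute minimum window length that the API uses for `PreferredBackupWindow`; the function name is hypothetical.

```python
import re

WINDOW = re.compile(r"^([01]\d|2[0-3]):([0-5]\d)-([01]\d|2[0-3]):([0-5]\d)$")


def backup_settings(retention_days, window):
    """Validate the retention period and the daily backup window (UTC)."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention period must be between 1 and 35 days")
    match = WINDOW.match(window)
    if match is None:
        raise ValueError("window must look like 02:00-02:30")
    start_h, start_m, end_h, end_m = map(int, match.groups())
    # The modulo handles windows that cross midnight, e.g. 23:45-00:15.
    length = ((end_h * 60 + end_m) - (start_h * 60 + start_m)) % (24 * 60)
    if length < 30:
        raise ValueError("window must be at least 30 minutes long")
    return {"BackupRetentionPeriod": retention_days, "PreferredBackupWindow": window}
```

For the demo values, `backup_settings(3, "02:00-02:30")` returns the two parameters ready to merge into a `create_db_cluster` request.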

For logging, Amazon DocumentDB provides two types of logs, audit logs and profiler logs, to monitor the activities performed on the DocumentDB cluster. To push the logs to Amazon CloudWatch, an IAM role is automatically created and attached to the DocumentDB cluster; in this demo it is the RDS Service Linked Role. Check both boxes to enable both types of logs in the DocumentDB cluster.
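In API terms, checking both boxes corresponds to the `EnableCloudwatchLogsExports` parameter with the values `audit` and `profiler`. The sketch below builds that parameter and also lists the CloudWatch log groups where, as far as I know, the logs end up (`/aws/docdb/<cluster>/<type>`); treat the group names as an assumption to verify. The function name is hypothetical.

```python
LOG_TYPES = ("audit", "profiler")


def log_export_settings(cluster_id, enabled=LOG_TYPES):
    """Build the log export parameter and the expected log group names."""
    unknown = set(enabled) - set(LOG_TYPES)
    if unknown:
        raise ValueError(f"unsupported log types: {sorted(unknown)}")
    chosen = sorted(set(enabled))
    return {
        "request": {"EnableCloudwatchLogsExports": chosen},
        # Assumed naming pattern for the resulting CloudWatch log groups.
        "log_groups": [f"/aws/docdb/{cluster_id}/{log_type}" for log_type in chosen],
    }
```

Note that exporting is only one half of the setup: auditing and profiling themselves are switched on through cluster parameters, so a cluster using the default parameter group may produce no log entries.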

AWS automatically applies patches and modifications to all the instances of the DocumentDB cluster, and the Maintenance window option lets you select the time window in which they are applied. If you do not specify a window, AWS selects one on your behalf.
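The maintenance window is passed to the API as a string of the form `ddd:hh24:mi-ddd:hh24:mi` in UTC, with a minimum length of 30 minutes. Here is a minimal sketch that formats such a string from a start day, start time, and duration. The helper name is an assumption.

```python
DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]
MINUTES_PER_DAY = 24 * 60


def maintenance_window(day, start_hour, start_minute=0, duration_minutes=30):
    """Format a weekly maintenance window as ddd:hh24:mi-ddd:hh24:mi (UTC)."""
    if day not in DAYS:
        raise ValueError(f"day must be one of {DAYS}")
    if not (0 <= start_hour < 24 and 0 <= start_minute < 60):
        raise ValueError("invalid start time")
    if duration_minutes < 30:
        raise ValueError("window must be at least 30 minutes long")

    def fmt(total_minutes):
        day_index, remainder = divmod(total_minutes, MINUTES_PER_DAY)
        hour, minute = divmod(remainder, 60)
        return f"{DAYS[day_index]}:{hour:02d}:{minute:02d}"

    start = DAYS.index(day) * MINUTES_PER_DAY + start_hour * 60 + start_minute
    # Wrap around the week so a window starting late on Sunday ends on Monday.
    end = (start + duration_minutes) % (7 * MINUTES_PER_DAY)
    return f"{fmt(start)}-{fmt(end)}"
```

The resulting string would be supplied as `PreferredMaintenanceWindow` when creating or modifying the cluster.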

You can also add tags and enable deletion protection on your DocumentDB cluster. Tags add metadata to AWS resources, and deletion protection, if enabled, protects the DocumentDB cluster from accidental deletion. Before deleting the cluster, you must disable deletion protection.
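In the API, tags are a list of key/value pairs and the protection setting is the boolean `DeletionProtection` parameter. The sketch below builds both, plus the `modify_db_cluster` arguments needed to switch the protection off again before a cluster can be removed. Both function names and the example tag are hypothetical.

```python
def tags_and_protection(tags, deletion_protection=True):
    """Build the Tags and DeletionProtection parameters for create_db_cluster."""
    return {
        # Sorting keeps the output stable regardless of dict ordering.
        "Tags": [{"Key": key, "Value": value} for key, value in sorted(tags.items())],
        "DeletionProtection": deletion_protection,
    }


def disable_protection_request(cluster_id):
    """Build modify_db_cluster arguments that turn the protection off."""
    return {"DBClusterIdentifier": cluster_id, "DeletionProtection": False}
```

For example, `tags_and_protection({"env": "demo"})` yields one tag and keeps the protection enabled.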

Now that everything is set up, click on the Create cluster button at the bottom of the page to create the DocumentDB cluster.
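Cluster creation takes some time. If you script the process, you could poll the cluster status until it becomes `available`. This is a minimal sketch that assumes `client` is a boto3 `docdb` client (or anything with a compatible `describe_db_clusters` method); the function name and the polling defaults are arbitrary choices.

```python
import time


def wait_until_available(client, cluster_id, poll_seconds=30, max_polls=40,
                         sleep=time.sleep):
    """Poll the cluster status; return True once it is available."""
    for _ in range(max_polls):
        response = client.describe_db_clusters(DBClusterIdentifier=cluster_id)
        status = response["DBClusters"][0]["Status"]
        if status == "available":
            return True
        # Still creating (or modifying); wait before asking again.
        sleep(poll_seconds)
    return False
```

The `sleep` argument exists so the function can be exercised without real delays, for example by passing `lambda seconds: None` in a test.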

Conclusion

DocumentDB is a managed NoSQL service with MongoDB compatibility provided by AWS. It is a scalable and easy-to-use database service that stores data as JSON documents. You can scale the storage and the provisioned resources at any time without going through any downtime. This blog described the step-by-step procedure to create a highly available and scalable DocumentDB cluster on AWS.

About the author

Zain Abideen

A DevOps engineer with expertise in provisioning and managing servers on AWS and in software delivery lifecycle (SDLC) automation. I'm from Gujranwala, Pakistan, and currently work as a DevOps engineer.